Who Wrote This?
TLDR
The way authors frame conversations, set up stories, or talk about different topics can, even subconsciously, influence how their audience perceives the message, and may reveal the authors' own biases or beliefs in the process. This study investigates what those patterns mean in online news media, both for the way people consume news and for the machine learning systems that recommend or process articles full of those linguistic markers.
Medium Post
While not a replacement for the research paper below, this study includes a short Medium post that summarizes the high-level takeaways in one page.
The Actual Abstract
From the paper... An article's tone and framing not only influence an audience's perception of a story but may also reveal attributes of author identity and bias. Building upon prior media, psychological, and machine learning research, this neural network-based system detects those writing characteristics in ten news agencies' reporting, discovering patterns that, intentional or not, may reveal an agency's topical perspectives or common contextualization patterns. Specifically, learning linguistic markers of different organizations through a newly released open database, this probabilistic classifier predicts an article's publishing agency with 74% hidden test set accuracy given only a short snippet of text. The resulting model demonstrates how unintentional 'filter bubbles' can emerge in machine learning systems and, by comparing agencies' patterns and highlighting outlets' prototypical articles through an open source exemplar search engine, this paper offers new insight into news media bias.
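To make the idea of a probabilistic classifier over short snippets more concrete, here is a minimal sketch. It is not the paper's neural network: it swaps in a much simpler multinomial naive Bayes model over word counts, and the agency names, snippets, and function names (`tokenize`, `train`, `predict_proba`) are all made up for illustration. It only shows the general shape of the task: text in, a probability per agency out.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words per agency from (snippet, agency) pairs."""
    word_counts = defaultdict(Counter)  # agency -> word -> count
    doc_counts = Counter()              # agency -> number of snippets
    vocab = set()
    for snippet, agency in examples:
        tokens = tokenize(snippet)
        word_counts[agency].update(tokens)
        doc_counts[agency] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict_proba(model, snippet):
    """Return a dict mapping each agency to its estimated probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for agency in doc_counts:
        total_words = sum(word_counts[agency].values())
        # Start from the log prior: how often this agency appears in training.
        score = math.log(doc_counts[agency] / total_docs)
        for token in tokenize(snippet):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[agency][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[agency] = score
    # Normalize log scores into probabilities (softmax, shifted for stability).
    peak = max(log_scores.values())
    exp_scores = {a: math.exp(s - peak) for a, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {a: v / norm for a, v in exp_scores.items()}


# Hypothetical toy data: two invented agencies with invented snippets.
EXAMPLES = [
    ("markets rallied as investors cheered earnings", "Agency A"),
    ("investors weigh earnings and markets outlook", "Agency A"),
    ("the team won the championship game", "Agency B"),
    ("fans celebrated the championship win", "Agency B"),
]

model = train(EXAMPLES)
probabilities = predict_proba(model, "investors cheered the markets")
predicted_agency = max(probabilities, key=probabilities.get)
print(predicted_agency, probabilities)
```

Because the output is a probability for every agency rather than a single label, a snippet that could plausibly come from several outlets shows up as a flatter distribution, which is the kind of signal that makes comparing agencies' patterns possible.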
Research Paper
This website is provided as part of the academic research study "Machine Learning Techniques for Detecting Identifying Linguistic Patterns in News Media" by A Samuel Pottinger (Data Driven Empathy LLC). The paper is currently under peer review, but a preprint is available (also on Preprints.org).