Welcome to the Cognition and Media (CogMedia) project, where you'll find aggregation and analysis of headline newsfeeds from major media outlets. Our research goal is to link cognition in news consumers to large-scale trends in media. The project is hosted by the Communicative Mind Laboratory in the Department of Communication at UCLA. Are you an academic researcher or educator? You can get lots of data for free through our API initiative, which lets you import data straight into R.
Our research approach can be called "cognitive analytics." Think of how consumers read the news. It's literally mental. We are testing the hypothesis that subtle but measurable cognitive factors are useful in understanding what consumers read and share. These factors include accessibility, comprehensibility, and even bias. Such aspects of human mental processing could help us understand media data from the level of individual consumers up to more collective levels, such as the distribution of news themes and perhaps even the behavior of major newsmedia.
We are developing cognitive metrics based on newsmedia headlines, such as how simplified or complex a news story's language is, or how recognizable a recent story might be to a reader. The CogMedia project aims to connect these cognitive metrics to broader collective patterns seen in newsmedia stories. What predicts whether a story goes viral? What explains the thematic patterns in news stories? How do cognitive variables relate to partisanship and controversy in the way stories are composed and disseminated?
The Co-Mind Laboratory uses the CogMedia database to conduct specific research projects, linking cognitive processes to newsmedia consumption and social media metrics. We will share active research projects here through working papers. Stay tuned.
CogMedia's core data is available for download in throttled buckets. In order to obtain a free API key, please email firstname.lastname@example.org. The documentation for the API is on GitHub here.
With the API, you can summon data right inside R. The result is a data frame over which you can apply your favorite tools (tidyverse, etc.). Illustrations are on the API GitHub repository above.
Each story is based only on the RSS feed of the news item. We obey all copyright rules of the news source. However, we tag news stories with a variety of additional information, including social media metrics. Each story record includes:
source: News organization (e.g., New York Times).
title: Title of the story.
description: Text associated with the story description in RSS.
alexa: An Alexa rank of the source.
partisanship: A partisanship score, based on AllSides.com.
social_score: An approximate rate of Twitter sharing shortly after story release.
story_neighbors: The number of neighbors on a graph based on shared word triples, among other stories from within +/- 12 hours. This is the basis of our network representation of news media.
url: Full URL to the story's source.
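To make the story_neighbors field more concrete, here is a minimal Python sketch of how such a count could be computed. It is not the CogMedia implementation, and it rests on several assumptions: a "word triple" is taken to mean three consecutive lowercase words in the title, two stories count as neighbors if they share at least one triple and were published within 12 hours of each other, and the `published` timestamp, the `Story` class, and both function names are hypothetical and not part of the documented record.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Story:
    source: str
    title: str
    published: datetime  # hypothetical field, not in the documented record


def word_triples(text: str) -> set[tuple[str, ...]]:
    """Return all runs of three consecutive words (assumed definition of a triple)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def count_neighbors(stories: list[Story],
                    window: timedelta = timedelta(hours=12)) -> list[int]:
    """Count, for each story, the other stories sharing a triple within the window."""
    triples = [word_triples(s.title) for s in stories]
    counts = [0] * len(stories)
    for i in range(len(stories)):
        for j in range(i + 1, len(stories)):
            close = abs(stories[i].published - stories[j].published) <= window
            if close and triples[i] & triples[j]:
                counts[i] += 1
                counts[j] += 1
    return counts


# Made-up example stories, for illustration only.
stories = [
    Story("A", "Senate passes major climate bill", datetime(2019, 3, 1, 9)),
    Story("B", "Senate passes major climate bill after long debate", datetime(2019, 3, 1, 15)),
    Story("C", "Senate passes major climate bill", datetime(2019, 3, 3, 9)),
    Story("D", "Local team wins championship game", datetime(2019, 3, 1, 10)),
]
print(count_neighbors(stories))  # [1, 1, 0, 0]
```

Stories A and B share triples and fall six hours apart, so each has one neighbor. Story C has the same wording but lies outside the 12-hour window, and story D shares no triples with the others. The pairwise loop is quadratic in the number of stories, which is fine for a sketch but would likely need an index over triples at database scale.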
Our database contains news stories collected since early 2019.