Counting on the News: Data and Sentimentality. A 40-year Text Analysis of The New York Times.
This paper analyzes trends in data usage within mainstream American news. Data usage, article subjectivity, and article polarity were measured across a sample of over 60,000 New York Times articles published from 1981 to 2021. The purpose of this analysis was to test whether the prevailing narrative, that the past 20 years are the ‘data age’ and that data usage is greater than ever, holds within the context of print journalism. Public confidence in mainstream newspapers is currently low and readership is decreasing. Thus, the value of journalists as collectors, interpreters, and presenters of data is of increasing importance. Contrary to the original hypotheses, the key finding of this analysis is that data usage has not increased over the past 40 years, either in absolute terms or as a ratio to word count. No substantial trends in either data word usage or raw number usage could be detected. Further, the presence of data within New York Times articles was not found to have any strong correlation with changes in article polarity or subjectivity. These findings raise critical questions about the narrative of the ‘data age’ and about why increased data availability has not resulted in increased data utilization in the context of the New York Times.