Exploring Thematic Diversity In News Coverage And Social Media Activity Of Political Candidates Using Unsupervised Machine Learning

Degree type
Doctor of Philosophy (PhD)
Graduate group
Dissertations (ASC)
Political Campaigns
Semantic Network Analysis
Social Media
Thematic Diversity
Topic Modeling

The relationship between media and politics has been at the core of communication research for over a century. Previous research has examined the impact of both the volume and tone of news coverage of political candidates on their electoral success, as well as the relationship between the volume (though not the tone) of candidates’ social media activity and electoral success. While this research found a positive relationship between these features and electoral success, recent criticisms have called into question whether these media factors operate independently. Moreover, while past research has paid some attention to volume and tone, researchers have yet to examine other key features of the discourse represented in candidates’ coverage as a whole. One such feature is the extent to which a political discourse is unidimensional or multidimensional, referred to in this study as thematic diversity. This gap is due, at least in part, to the complexity of thematic diversity, which makes it difficult to estimate. Analyzing over 120,000 tweets written by 142 U.S. Senate candidates during the 2012-2016 election cycles, as well as over 420,000 news articles covering 330 U.S. Senate candidates during the 2008-2016 election cycles, this study systematically explores the relationship between the electoral success of political candidates and the volume and tone of their news coverage and social media activity. Using a wide array of controls, it examines whether these media features are independent of, or dependent on, one another. More importantly, the study goes beyond these previously studied media features to systematically and empirically explore the relationship between thematic diversity, in both candidates’ news coverage and their social media activity, and electoral success.
Drawing on conceptualizations of diversity in fields ranging from biology to physics and the information sciences, and using two unsupervised machine learning methods (semantic network analysis and topic modeling), this study offers a novel approach to conceptualizing and estimating thematic diversity that accounts for the variety, balance, and disparity of the themes in a given corpus. Using these methods, the study offers evidence of a significant, negative, and semi-independent relationship between thematic diversity and electoral success, in both news media and social media.
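
To make the three components concrete, the following is a minimal illustrative sketch and not the dissertation's own estimator, which the abstract does not specify. It assumes a candidate's corpus has already been summarized as a vector of theme proportions (for example, from a topic model) and that a pairwise theme-dissimilarity matrix is available. The function names, the use of Shannon evenness for balance, and the use of a Rao-Stirling-style index to fold in disparity are all assumptions made for illustration.

```python
import math

def variety(p, eps=1e-12):
    """Number of themes with a non-negligible share of the corpus."""
    return sum(1 for x in p if x > eps)

def balance(p, eps=1e-12):
    """Shannon evenness in [0, 1]: 1.0 when present themes are equally weighted.

    Assumes p holds theme proportions that sum to 1.
    """
    present = [x for x in p if x > eps]
    if len(present) <= 1:
        return 0.0  # a single theme has no spread to balance
    entropy = -sum(x * math.log(x) for x in present)
    return entropy / math.log(len(present))

def rao_stirling(p, d):
    """Sum of p_i * p_j * d_ij over all ordered pairs of distinct themes.

    d is an assumed symmetric dissimilarity matrix with entries in [0, 1];
    larger values mean two themes are further apart.
    """
    n = len(p)
    return sum(p[i] * p[j] * d[i][j]
               for i in range(n) for j in range(n) if i != j)

# Hypothetical inputs: three themes, every pair maximally dissimilar.
dissimilarity = [[0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0],
                 [1.0, 1.0, 0.0]]
focused = [1.0, 0.0, 0.0]        # discourse concentrated on one theme
spread = [1 / 3, 1 / 3, 1 / 3]   # discourse spread evenly over three themes

for name, props in (("focused", focused), ("spread", spread)):
    print(name, variety(props), balance(props),
          rao_stirling(props, dissimilarity))
```

Under these assumptions, a discourse concentrated on a single theme scores zero on the combined index, while an evenly spread discourse over dissimilar themes scores higher. Any real analysis would need to derive the proportions and dissimilarities from the fitted models.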

Michael X. Delli Carpini