Predicting Consumers’ Brand Sentiment Using Text Analysis on Reddit

Discipline
Subject
brand sentiment; Reddit; natural language processing; consumer insight mining
Business
Author
Cen, Puti
Abstract

With the emergence of data privacy regulations around the world (e.g., GDPR and CCPA), practitioners of Internet marketing, the largest digital marketing channel, face a trade-off between user data protection and advertisement targeting accuracy because they currently rely on social media analytics built on personally identifiable information (PII). To address this challenge, this research proposes a predictive model for consumers’ brand sentiment that is based entirely on textual data from Reddit, which makes it fully compliant with current data privacy regulations. The author uses natural language processing techniques to process all post and comment data from the r/gadgets subreddit community in 2018: named entity recognition extracts frequently discussed brands and products, and sentiment analysis generates brand sentiment labels for active users in r/gadgets. The research then uses four supervised learning classifiers to predict brand sentiment for four brand clusters (Apple, Samsung, Microsoft, and Google) based on the self-identified characteristics of Reddit users. Across all four brand clusters, the proposed predictive model achieved an ROC AUC score above 0.7, and above 0.8 for three of the four. This research thus demonstrates the predictive power of self-identified user characteristics for brand sentiment and offers digital marketing practitioners a consumer targeting model that does not require PII.
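For readers unfamiliar with the ROC AUC metric reported in the abstract, the following is a minimal sketch of how such a score can be computed. It is not the author's code: the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration, and the numbers are not results from the thesis. The sketch uses the pairwise interpretation of ROC AUC, i.e. the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Return ROC AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the thesis): 1 = positive brand sentiment.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.7]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy ROC AUC = {toy_auc:.3f}")
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the thresholds of 0.7 and 0.8 cited above. The pairwise loop is quadratic in the number of examples, so a rank-based implementation would be preferable for data of realistic size.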

Advisor
Chris Callison-Burch
Date of degree
2020-05-01