Decision Modelling Driven by Twitter Data: A Case Study of the 2017 Presidential Election in Ecuador

UoM administered thesis: PhD

  • Authors:
  • Lucia Rivadeneira Barreiro

Abstract

This thesis introduces novel approaches to building models from Twitter data in order to analyse decision behaviours in the political arena. The results, presented in the form of three academic papers, apply to problems of sentiment classification and to machine learning approaches used for prediction tasks. The first paper reviews the literature on the use of Twitter as a tool to analyse political behaviour. Particular attention is paid to approaches to user behaviour analysis, anticipation of outcomes, and predictive models. The paper identifies unresolved issues related to data selection and adequacy that can limit the performance of Twitter-based models and that researchers and practitioners, such as political campaigners, have not addressed in depth. In this regard, improvements in sampling, data pre-processing, and data analysis are likely to enhance the understanding of user behaviour in the political context. A practical implication, especially for campaigners, is the use of Twitter-based evidence to tailor communication strategies to entice different target audiences simultaneously, while also monitoring, measuring, managing, and evaluating the performance of campaigns. The second paper introduces two novel approaches to Twitter analysis, intended to enhance the performance of sentiment analysis models and to identify influential users during electoral campaigns. For the former, a novel approach is proposed to pre-process Twitter data for sentiment analysis, which considers features of tweets, namely hashtags, emoticons, and URLs, that have often been discarded or not fully utilised in previous work. As for the latter, a new approach is proposed to identify influential users and their sentiment periodically using a two-stage process. A case study of the 2017 Ecuadorian Presidential election was used to develop and validate these approaches, in which 1.3 million tweets pertinent to the two candidates who received the most votes were retrieved for analysis.
The key findings are as follows. First, the pre-processing approach improves sentiment analysis results in comparison to results obtained using raw tweets. Second, the most frequent type of tweet observed is the retweet, and the most retweeted content is often produced by the two candidates themselves. Third, the number of unique Twitter user accounts producing positive sentiment towards the candidates can provide a measure of vote share. In this study, this measure outperformed the estimates produced by the officially authorised polling firms. These findings have implications for political marketing communication strategies that relate to identifying users' sentiment towards candidates, and influential users, throughout a campaign on Twitter. Finally, the third paper proposes a novel prediction model based on the evidential reasoning (ER) rule, named MAKER-RIMER, to predict whether the impact of a tweet is high or low in terms of the number of retweets it can achieve. The study relies on tweets produced by the two candidates who received the most votes in the 2017 Ecuadorian Presidential election and uses five features of tweets as predictors. The proposed MAKER-RIMER approach yields an interpretable, transparent, and trackable model. In addition, MAKER-RIMER performed better in terms of misclassification errors when compared against alternative machine learning prediction models. Lastly, this study identifies which features of tweets cause a tweet's impact to be high or low for each candidate. These findings support the design of Twitter content based on what users find more attractive to retweet.

Details

Original language: English
Awarding Institution
Supervisors/Advisors
Award date: 31 Dec 2020