Clare Llewellyn and Laura Cram
In our work investigating how people discuss the EU on Twitter, one of the aims is to determine sentiment, both pro- and anti-EU and, in relation to the referendum on the UK’s membership of the EU, pro-Remain or pro-Leave.
Our initial approach has been very straightforward. If Tweeters use hashtags associated with the Leave camp (including #brexit, #no2eu, #notoeu, #betteroffout, #voteout, #eureform, #britainout, #leaveeu, #voteleave, #beleave, #loveeuropeleaveeu), then we judge their sentiment to be pro-Leave. If they use hashtags associated with the Remain camp (#yes2eu, #yestoeu, #betteroffin, #votein, #ukineu, #bremain, #strongerin, #leadnotleave, #voteremain), then we judge their sentiment to be pro-Remain.
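To make the rule concrete, here is a minimal sketch in Python of how a hashtag-based classifier of this kind could look. It is an illustration rather than our production code: the function names are hypothetical, hashtags are matched case-insensitively, and the "mixed" label for tweets that carry hashtags from both camps is an assumption made for the example.

```python
import re

# Hashtag lists taken from the article, lower-cased for matching.
LEAVE_TAGS = {
    "brexit", "no2eu", "notoeu", "betteroffout", "voteout", "eureform",
    "britainout", "leaveeu", "voteleave", "beleave", "loveeuropeleaveeu",
}
REMAIN_TAGS = {
    "yes2eu", "yestoeu", "betteroffin", "votein", "ukineu", "bremain",
    "strongerin", "leadnotleave", "voteremain",
}

HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the set of hashtags in a tweet, lower-cased and without '#'."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def classify_tweet(text):
    """Label a tweet 'leave', 'remain', 'mixed', or None (no campaign hashtag)."""
    tags = extract_hashtags(text)
    has_leave = bool(tags & LEAVE_TAGS)
    has_remain = bool(tags & REMAIN_TAGS)
    if has_leave and has_remain:
        return "mixed"  # assumption: the article does not define this case
    if has_leave:
        return "leave"
    if has_remain:
        return "remain"
    return None  # no campaign hashtag, so the tweet is left out of the counts
```

A tweet such as "We are #StrongerIn" would be labelled "remain", while a tweet without any of the listed hashtags returns None and is excluded.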
If we organise our dataset in this way, we get the following results:
We can see that these results are fairly stable day-to-day:
However, they are not consistent with polling data. The ICM tracker below shows Remain with a higher percentage. So why is there this difference?
First, we need to remember that research with Twitter data can only ever be indicative of Twitter users, who do not necessarily reflect the population at large. Interestingly, this has been a problem for pollsters attempting to include Twitter data in their vote forecasting.
We have been collecting data since 9 August 2015. As a result, we have now built up quite a large dataset, which we are continuing to expand. We have gathered 3,109,130 tweets associated with the EU up to 27 October 2015 (using a variety of EU-based terms). See our previous article for details:
Using social media for political discourse is increasingly becoming common practice, especially around election time. Arguably, one of the most interesting aspects of this trend is the possibility of “pulsing” the public’s opinion in near real-time and, thus, it has attracted the interest of many researchers as well as news organisations.
Second, we also need to keep in mind that, as we discussed in a previous post, people tend to tweet against things rather than for them.
Third, we are currently limiting our analysis to sentiment associated with hashtags. If a Tweet doesn’t have hashtags, it is not included for these purposes.
If we take a look at the data, we can see some differences between the two sides of the campaign. Both camps have distinct styles in their usage of Tweets, and in particular hashtags. The Leave camp tend to use many hashtags. These are some examples from both LEAVE.EU and Vote Leave – especially LEAVE.EU.
LEAVE.EU use hashtags in their Twitter account biography. This may encourage followers to use these hashtags.
The latest Tweets from LEAVE.EU (@LeaveEUOfficial). A cross party & non-political campaign, advocating the vote to leave the EU in the upcoming referendum. #LeaveEU #Brexit
The Remain camp does not tend to use as many hashtags in their posts. For example:
Tweets that would be considered pro-Remain often also include hashtags we have classified as pro-Leave. This could be the result of several factors, such as positioning within the debate by using a popular hashtag or trying to talk to those who hold opposing views.
When we look at the hashtags that are used in conjunction with #strongerin and #brexit, we find, overall, much lower use of #strongerin. When it does appear, #strongerin is used alongside Leave hashtags, especially #brexit, #leaveEU and #voteLeave. In contrast, #brexit is not used with Remain hashtags at all, but instead with other Leave hashtags.
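A co-occurrence count of this kind can be sketched in a few lines of Python. This is a hedged illustration, not our actual pipeline: the function name `cooccurring_hashtags` is hypothetical, tweets are assumed to be available as plain text strings, and the three sample tweets in the usage example are invented for demonstration only.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")


def cooccurring_hashtags(tweets, target):
    """Count the hashtags that appear in the same tweet as `target`.

    Each co-occurring hashtag is counted at most once per tweet, and
    matching is case-insensitive.
    """
    target = target.lower().lstrip("#")
    counts = Counter()
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if target in tags:
            counts.update(tags - {target})
    return counts


# Invented sample tweets, for illustration only.
sample = [
    "Think again: we are #StrongerIn, whatever #Brexit fans say",
    "Take back control #brexit #LeaveEU #VoteLeave",
    "Out is out #brexit #leaveeu",
]
print(cooccurring_hashtags(sample, "#brexit").most_common())
print(cooccurring_hashtags(sample, "#strongerin").most_common())
```

Running the same function once per campaign hashtag gives a simple table of which hashtags travel together, which is the comparison described above.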
In the future, we aim to undertake a more sophisticated form of sentiment analysis including the text of Tweets. This leads on to another problem that is often discussed in sentiment analysis of text – identifying the target of the sentiment.
The target is the item that the sentiment is expressed towards. In the tweet by Richard Corbett MEP above, the ‘Leave’ sentiment is expressed towards the ‘U think being part of EU holds back trade with rest of world?’ and the ‘Remain’ sentiment is expressed within ‘Think again’. Identifying the text associated with the target is not always easy. This is a challenging issue to tackle, and we haven’t even begun to discuss distinguishing jokes and sarcasm!
You might be tempted to ask: if this data is not representative of the general public, why are we looking at it? Our main interest is in how this data changes over time. If we can identify the differences and track the relationships between Twitter data and more general public opinion, we can start to hypothesise about how changes on Twitter relate to public opinion more widely. We’ll speak to this more in the future.
Our project is part of the Economic and Social Research Council’s The UK in a Changing Europe programme. Look out for our regular updates as the project tracks developments in the debate on the UK’s continued membership of the EU and follow us on Twitter @myimageoftheEU.
Laura Cram is Senior Fellow, The UK in a Changing Europe, investigating The European Union in the Public Imagination: Maximising the Impact of Transdisciplinary Insights (ESRC/ES/N003985/1).
This article was originally published on the imagineEurope Storify.
University of Edinburgh
Clare Llewellyn is PhD Candidate in Informatics and Research Fellow in the Neuropolitics Research Lab at the University of Edinburgh. Her research focuses on user-generated content on the Internet. Her research interests include social media, big data and text and data analytics.
University of Edinburgh
Prof Laura Cram is Professor of European Politics at the University of Edinburgh; Senior Fellow, The UK in a Changing Europe; and Academic Editor of European Futures. Her research areas include European public policy, European identity and the neuropolitics of public policy and identity.
Please note that this article represents the view of the author(s) alone and not European Futures, the Edinburgh Europa Institute or the University of Edinburgh.
This article is published under a Creative Commons (Attribution-NonCommercial-NoDerivatives 4.0 International) License.