How do tweets map onto the Brexit vote? Neatly, it turns out. Stefan Bauchowitz and Max Hänska collected more than 7.5 million Brexit-related tweets in the run-up to the referendum. They found Leavers were more active than supporters of Remain, despite the younger profile of social media users.
How did Eurosceptic (Leave) and pro-European (Remain) activity compare on social media in the run-up to the EU referendum, and was there a relationship between social media users and votes? To find out how Leave and Remain compared, we collected more than 7.5 million Brexit-related tweets through Twitter’s streaming API during the 23 days leading up to the referendum. We used a support vector machine to identify which tweets clearly supported the Leave or Remain camp (and manually coded a random sub-sample of those to ensure our allocation was reliable).
ONE HOUR TO GO! Go out and vote if you haven't already! #VoteLeave #IVotedLeave #ProjectHope pic.twitter.com/9YmlHT2Luf
— Vote Leave (@vote_leave) June 23, 2016
Given the polarity of the issue this worked well, and the model correctly identified most tweets. We used the result of this exercise to assign each user in our sample to one of the two camps.
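The step from classified tweets to classified users, and the check against hand-coded tweets, can be pictured with a short Python sketch. It is an illustration only: the majority rule, the decision to leave evenly split users unassigned, and all names and sample data are assumptions, not a description of the pipeline actually used.

```python
from collections import Counter, defaultdict


def assign_users(classified_tweets):
    """Assign each user to the camp that most of their classified tweets support.

    classified_tweets: iterable of (user_id, label) pairs,
    where label is "leave" or "remain".
    Users whose tweets split evenly are left unassigned (an assumption).
    """
    counts = defaultdict(Counter)
    for user_id, label in classified_tweets:
        counts[user_id][label] += 1

    camps = {}
    for user_id, c in counts.items():
        if c["leave"] > c["remain"]:
            camps[user_id] = "leave"
        elif c["remain"] > c["leave"]:
            camps[user_id] = "remain"
        # Ties fall through: the user gets no camp.
    return camps


def agreement(model_labels, hand_coded_labels):
    """Share of hand-coded tweets where the model label matches the human label."""
    matches = sum(1 for m, h in zip(model_labels, hand_coded_labels) if m == h)
    return matches / len(hand_coded_labels)


# Hypothetical classified tweets: user "c" is evenly split.
tweets = [
    ("a", "leave"), ("a", "leave"), ("a", "remain"),
    ("b", "remain"),
    ("c", "leave"), ("c", "remain"),
]
camps = assign_users(tweets)
```

A stricter rule (for example, requiring a minimum number of tweets per user before assigning a camp) would trade coverage for reliability.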
We collected tweets containing the terms ‘Brexit’, ‘EUref’ and ‘EU Referendum’, which were all frequently used to refer to the referendum. While the term Brexit has great currency across both camps, it was used more often by users who wanted to leave the EU, as it lends itself more easily to positive slogans (e.g. “Can’t wait for #Brexit to win!” or “Brexit to save Europe”, also echoed by “Brexit means Brexit”). Even though EUref and EU Referendum are more neutral terms, in both sub-samples we find that support for leaving, measured by number of tweets, outstripped support for remaining, by a factor of 2.3 and 1.75 respectively. These margins confirm a slight bias in the term ‘Brexit’, for which the lead of Leave over Remain was more pronounced. Overall it is clear that the army of Leave users was larger in number and more active in tweeting its cause (see Figure 1).
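The per-term comparison amounts to dividing the number of Leave tweets by the number of Remain tweets within each sub-sample. A minimal Python sketch follows; the function name and the tweet counts are invented for illustration and are not the study's data.

```python
from collections import Counter


def leave_remain_ratio(labelled_tweets):
    """Return {search_term: leave_count / remain_count}.

    labelled_tweets: iterable of (search_term, label) pairs,
    where label is "leave" or "remain".
    Terms with no Remain tweets are skipped to avoid dividing by zero.
    """
    counts = {}
    for term, label in labelled_tweets:
        counts.setdefault(term, Counter())[label] += 1
    return {
        term: c["leave"] / c["remain"]
        for term, c in counts.items()
        if c["remain"] > 0
    }


# Made-up counts, chosen only so the ratios are easy to check by hand.
sample = (
    [("euref", "leave")] * 6 + [("euref", "remain")] * 3
    + [("brexit", "leave")] * 9 + [("brexit", "remain")] * 3
)
ratios = leave_remain_ratio(sample)
```

A ratio above 1 means Leave tweets outnumber Remain tweets for that term; comparing ratios across terms is what exposes any bias built into a term such as ‘Brexit’.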
Other researchers examining Google search trends, Instagram posts and Facebook found a similar tendency: Eurosceptic views were being communicated with greater intensity by a greater number of users. Researchers from Loughborough University revealed that, weighted for circulation, 82% of newspaper articles were pro-Leave. Both in print and on social media, Britons had more exposure to Eurosceptic than pro-European opinions.
We also mapped Twitter activity to local authority districts. To do this, we used Google’s and Bing’s geocoding services to translate user-provided location information into geographical coordinates, which we then matched with local authority districts. This is not an exact science, both because many users provide no location information or fictitious information in their profiles, and because the finer the granularity of geo-location required, the more error-prone the result. As many users specify their location as ‘London’ rather than one of its constituent boroughs, we aggregated all tweets from users located there. We plotted the share of users supporting Remain against the share of the Remain vote. We excluded districts where we identified fewer than 200 users, giving us usable data for 100 local authorities.
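Once each user has a district and a camp, the aggregation described above reduces to counting. The Python sketch below assumes the geocoding has already been done; the borough list is a small illustrative subset, and the function names and sample users are hypothetical. Only the 200-user threshold comes from the analysis itself.

```python
from collections import defaultdict

MIN_USERS = 200  # districts with fewer identified users are excluded

# Illustrative subset of London boroughs, folded into one "London" unit.
LONDON_BOROUGHS = {"Camden", "Hackney", "Lambeth"}


def normalise_district(district):
    """Map any London borough to the single aggregate 'London'."""
    return "London" if district in LONDON_BOROUGHS else district


def remain_share_by_district(users, min_users=MIN_USERS):
    """Return {district: share of users supporting Remain}.

    users: iterable of (district, camp) pairs, camp in {"leave", "remain"}.
    Districts with fewer than min_users users are dropped.
    """
    totals = defaultdict(int)
    remain = defaultdict(int)
    for district, camp in users:
        key = normalise_district(district)
        totals[key] += 1
        if camp == "remain":
            remain[key] += 1
    return {d: remain[d] / n for d, n in totals.items() if n >= min_users}


# Hypothetical users: two boroughs merge into London; Boston is too small.
users = (
    [("Camden", "remain")] * 150
    + [("Hackney", "leave")] * 100
    + [("Boston", "leave")] * 50
)
shares = remain_share_by_district(users)
```

Raising the threshold reduces noise from sparsely covered districts at the cost of leaving fewer districts to compare.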
There is clearly a pattern in the way the referendum campaign unfolded on Twitter, with those wanting to leave communicating in greater numbers and with greater intensity. Districts with a greater share of Twitter users supporting Leave also tended to vote for leaving the EU, so that Twitter activity correlates with voting in the referendum.
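The strength of such a relationship is commonly summarised with a Pearson correlation coefficient between the two district-level shares. The sketch below shows the calculation in plain Python; the post does not say which statistic was used, so the choice of Pearson's r is an assumption, and the example shares are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented district-level figures, one entry per district.
twitter_remain_share = [0.2, 0.3, 0.5, 0.6]   # share of users backing Remain
vote_remain_share = [0.35, 0.4, 0.55, 0.6]    # share of votes for Remain
r = pearson(twitter_remain_share, vote_remain_share)
```

A value of r close to 1 indicates that districts rank similarly on both measures. It says nothing about whether the levels match, which is why a strong correlation can coexist with Twitter overstating the Leave lead.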
Yet we must be careful to avoid over-interpretation, in particular regarding claims that social media can predict election outcomes, the problems of which have been pointedly enumerated. Finding a pattern in the data post hoc is quite a different thing from confidently identifying and interpreting the pattern ex ante. Leave led on social media by a much larger margin than it did in the vote, so it is not at all clear how one should have interpreted results from a Twitter analysis before the vote. The problem is that we lack demographic descriptors of social media users according to which we could weight or interpret results.
Nevertheless, given that Twitter users are generally thought to be younger and young people tended to vote Remain, the result is surprising either way. It seems plausible that Leave voters were more motivated, and consequently more active on Twitter. It also seems likely that slogans such as Vote Leave, take control, or even Brexit better lent themselves to a simple message (particularly useful given the constraints of a tweet), and allowed different interpretations such that users could project their desired meaning onto the slogan.
____
This post was originally published on the LSE Brexit blog.
Stefan Bauchowitz holds a PhD from the London School of Economics.
Max Hänska is a lecturer (Assistant Professor) at De Montfort University, where his research interests centre on social media, political communication and collective decision-making.
From my experience, it seemed to me that a suspicious amount of negative, pro-leave comments to my pro-Remain Tweets came from rather strange looking profiles (I suspected either bots, or astroturfers – not “genuine”, UK Twitter users). Is there any way to explore whether or not this might have been the case on a widespread level, using the same data-set?
I’m