Twitter and other social media platforms represent a large and largely untapped resource for social data and evidence. In this post, Wasim Ahmed updates his recurring series on the Impact Blog to bring you the latest developments in digital methods and methodologies for researching Twitter and other social media platforms.
This post builds upon the 2015 and 2017 editions, captures key trends and events which are shaping social media research for social scientists, and provides a collection of research methods and tools for the analysis of social media data.
Since the 2017 edition of this blog post, I have seen even more unique and interesting uses of social media data across a wide variety of research disciplines, such as sociology, computer science, media and communication, political science, and engineering, to name only a few. Social media platforms generate a vast amount of data on a daily basis on a variety of topics and consequently represent a key source of information for anyone seeking to study 21st-century society.
Twitter remains the most popular platform for academic research, as it still provides its data via a number of Application Programming Interfaces (APIs). In contrast, the aftermath of the Cambridge Analytica ‘data breach’ has led certain social media platforms to limit the data provided through their APIs. However, although it may not be possible to get data from all social media platforms, it is still possible to conduct qualitative and quantitative research, such as interviews and surveys, with members of online communities.
Studies in social media can be framed by drawing on a wide variety of theories, constructs and conceptual frameworks from many disciplines. I would recommend taking a look at this paper, Social media research: Theories, constructs, and conceptual frameworks, which nicely summarises a number of these approaches.
There are also a number of research approaches that can be drawn upon such as Netnography and Digital Ethnography, which provide frameworks for conducting research in the online world. Netnography, for instance, can be based on downloading data directly from a social media platform, noting personal observations of an online community, and interviewing social media users. Furthermore, there are also a number of specific methods for the analysis of social media data summarised in Table 1 below.
Table 1: Overview of research methods
|Content Analysis
|Content analysis can be used for systematically labelling text, audio, and/or visual communication from social media, and can provide a numerical output. An example study using content analysis on tweets can be accessed here. A useful overview of the method can be found here. It is fine to take a systematic random sample of between 1% and 10% of the dataset, depending on the volume of data retrieved.
|Thematic Analysis
|Thematic analysis involves a rigorous process to locate patterns within data through data familiarisation, coding, and developing and revising themes. An example of a study using thematic analysis can be found here. A useful guide to applying thematic analysis can be found here. Thematic analysis may also be known as discourse analysis. As with content analysis, it is fine to take a systematic random sample of between 1% and 10% of the dataset, depending on the volume of data retrieved.
|Social Network Analysis
|Social network analysis can be used to measure and map the relationships between individuals, organisations, web pages, and information and/or knowledge entities. See this article and the supplementary material for insight into how to incorporate social network analysis into a study. A useful resource on analysing social media data using social network analysis can be accessed here.
|Machine Learning
|Machine learning is a type of artificial intelligence which allows computers to learn without being explicitly programmed. It may also involve humans labelling a subset of the data, from which the computer learns to code the remainder. A study using machine learning to analyse Twitter user profiles can be accessed here.
|Semantic Analysis (Linguistics)
|Semantic analysis may examine the meaning of language used and also the relationship between occurrences of words, phrases, and clauses. A useful presentation which uses semantic analysis to examine Twitter can be accessed here. It describes a study by Dr. Mark McGlashan.
|Time Series Analysis
|Time series analysis is rarely used as a method in itself and is usually combined with other methods to complement them. It plots the frequency of social media posts over time. My PhD thesis provides an example of using time series analysis alongside other research methods, such as content and thematic analysis. It can be accessed here.
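The sampling advice and the time series description in Table 1 can be illustrated with a short Python sketch. It is only one possible reading of those steps: the `posts` list, its `created_at` field, the 5% fraction, and the function names `systematic_sample` and `daily_counts` are all hypothetical and not taken from any particular tool. It assumes the records are already in a stable order (for example, chronological), takes every k-th record from a random starting point, and counts posts per day.

```python
import random
from collections import Counter
from datetime import datetime


def systematic_sample(records, fraction, seed=None):
    """Systematic random sample: every k-th record from a random start."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    step = max(1, round(1 / fraction))  # sampling interval k
    start = random.Random(seed).randrange(step)  # random offset in [0, k)
    return records[start::step]


def daily_counts(records):
    """Number of posts per calendar day, in chronological order."""
    counts = Counter(r["created_at"].date() for r in records)
    return sorted(counts.items())


# Hypothetical dataset: 1,000 posts spread evenly over five days.
posts = [
    {"id": i, "created_at": datetime(2019, 5, 13 + i % 5, 12, 0), "text": f"post {i}"}
    for i in range(1000)
]

sample = systematic_sample(posts, fraction=0.05, seed=42)
print(len(sample), "posts selected for manual coding")

for day, n in daily_counts(posts):
    print(day, n)
```

The sample would then be coded by hand for content or thematic analysis, while the daily counts could be plotted to show peaks in activity alongside those findings.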
Table 2 below provides an overview of tools for retrieving social media data.
Table 2: An overview of tools for 2019
|Download and/or access from
|Twitter, Facebook, Instagram, Blogs, Forums, Video
|Twitter, Facebook, YouTube, Instagram, Sina Weibo, VK, QQ, Google+, Pinterest, Online blogs
|Windows (Desktop advisable)
|COSMOS Project (free)
|Windows & Mac OS X
|Twitter, Instagram, Foursquare, Panoramio, AIS Shipping, Sina Weibo, Flickr, YouTube, VK
|Twitter, Instagram, Facebook
|Windows (Desktop advisable)
|Twitter, Facebook, YouTube, RSS Feed
|Twitter, YouTube, Flickr, Wikipedia
|Windows and Mac
|Twitter, Ability to import
|Twitter, Facebook topic data, Online blogs
|Twitter, Facebook, Instagram, YouTube
|Symplur (Healthcare focus)
|Twitter Archiving Google Spreadsheet (TAGS) (free)
|Webometric Analyst (free)
|Twitter (with image extraction capabilities), YouTube, Flickr, Mendeley, Other web resources
*Some tools may allow access to other platforms and the ability to import your own data.
Recently, it has also become increasingly difficult for academics to access historical Twitter data, with a number of services for academics coming to an end. This has given rise to services such as those provided by ScrapeHero, which allow users to pull in historical Twitter data for free using web scraping. However, this form of retrieving Twitter data is not recommended.
For researching other parts of the Internet, such as web forums, blogs and other social media platforms, there are tools such as ScrapeStorm, an AI-powered visual web scraper which claims to be able to retrieve data from almost any platform.
There are also a number of advanced data analysis and statistical applications which can be used to analyse social media data, such as:
These packages should be researched when deciding which application is to be used for a project. I’d also like to mention the Digital Methods Initiative’s list of tools, and Ryerson University’s list of tools from its Social Media Lab. For retrieving Twitter data it is also worth checking out DMI-TCAT (free). A further review of 100 social media tools was recently published by SAGE Ocean.
For image analysis I would recommend checking out Google Cloud Vision AI, and there are also tools such as Instaloader which allow you to download photos from public Instagram accounts. A really interesting study of Instagram analysed the hashtag #CheatMeal using thematic content analysis; it can be accessed here.
Another rapidly developing field of social media research looks at ethics. It is important to conduct ethical social media research and I recently published an open access book chapter, which examines the use of Twitter as a source of data and provides an overview of ethical, legal and methodological challenges. The chapter can be accessed here.
Due to a number of requests, I have also started to run regular training events (see a list here), with virtual attendance possible. The first of these events took place at the London School of Economics and Political Science on May 17th 2019, and our hashtag #SMRM19 contains a host of informative material as the event was live-tweeted.
Note: This article gives the views of the author, and not the position of the LSE Impact Blog, nor of the London School of Economics.