Social Media Data in Research

An ESRC-convened group on Big Data, led by Dave De Roure (Oxford e-Research Centre), is studying the use of social media for social research as part of its focus on new forms of data. The group has recently launched a survey to learn more about how the UK social science research community experiences and responds to the challenges of working with social media data. The evidence gathered will inform decision-making and help build best practice in the research community.

The survey is now live, and the group seeks responses from anyone conducting research with social media data. The survey closes in mid-December and the group will report in the New Year.

This study is relevant to the current third round of the Jisc-ESRC-AHRC funded Digging into Data Challenge, in particular to the projects that are using social media data for their research. Trees and Tweets is one such project, a joint effort between Aston University and the University of South Carolina. The team at Aston University has focussed on the analysis of dialect variation based on a corpus of billions of tweets, while the team at the University of South Carolina is looking at the analysis of migration patterns based on a dataset consisting of millions of family trees. The analysis of large Twitter datasets has produced some interesting results that were not anticipated when the project initially submitted its proposal. These results have caught the attention of the media, for example the findings on the use of “um” and “uh” across the US. More information about this analysis is on the project’s blog, along with information on the aggregation of swearing data and visual representations of how the use of new words spreads geographically.

The Collaborative Online Social Media Observatory (COSMOS) has been working on social media analysis and data mining for a number of years. Originally funded under Jisc’s Virtual Research Environment (VRE) programme, the project has grown and received further funding from the ESRC to see if Big Social Data can predict offline social phenomena. The project has brought together social, computer, political, health and mathematical scientists to study the methodological, theoretical and empirical dimensions of Big Data in technical, social and policy contexts. Much of the analysis of social media data has been in the context of societal safety and security, for example social tension, hate speech, crime reporting, fear of crime and suicidal ideation. The COSMOS system has also been used to provide the BBC’s Radio 5 Live with a chart of the biggest-impact stories across social media and online. Using a specially developed algorithm, it analyses keywords and hashtags on Twitter to evaluate and rank the impact of each story.
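
For readers curious about what keyword and hashtag ranking can look like in practice, here is a minimal Python sketch. It is not the COSMOS algorithm, whose details are not described here; it simply assumes that a story's "impact" can be approximated by the number of posts mentioning any of its terms. The function name, the story labels and the sample posts are all hypothetical.

```python
import re
from collections import Counter


def rank_stories(posts, stories):
    """Rank stories by the number of posts that mention any of their terms.

    posts:   iterable of post texts.
    stories: dict mapping a story label to a set of lowercase
             keywords and hashtags (hypothetical input format).
    """
    # Start every story at zero so stories with no mentions still appear.
    scores = Counter({label: 0 for label in stories})
    for text in posts:
        # Split into lowercase words, keeping a leading '#' on hashtags.
        tokens = set(re.findall(r"#?\w+", text.lower()))
        for label, terms in stories.items():
            # Count each post at most once per story.
            if tokens & terms:
                scores[label] += 1
    # Highest count first.
    return scores.most_common()


# Made-up sample data, for illustration only.
sample_posts = [
    "Big match tonight #football",
    "Budget announced today",
    "#Football fans celebrate",
    "New budget cuts #economy",
    "What a match",
]
sample_stories = {
    "football": {"#football", "match"},
    "budget": {"budget", "#economy"},
}
ranking = rank_stories(sample_posts, sample_stories)
print(ranking)
```

A real system would need to weigh much more than raw counts, such as reach, retweets and change over time, so this should be read only as an illustration of the general idea.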

The above examples show how the analysis of social media is producing valuable research. If you are a researcher working with social media, please complete the survey so that your views can be represented in the report.

By Christopher Brown

Product Manager (Research) at Jisc.