Our social media feeds are full of articles shared by friends and family that make claims about how something can prevent a particular health condition. But how robust is the scientific evidence base underpinning these claims? Noah Haber, Alexander Breskin, Ellen Moscoe and Emily R. Smith, on behalf of the CLAIMS team, report on a systematic review of the state of causal inference in media articles and academic studies at the point of consumption on social media. There is a large disparity between what people see on social media about health research and the underlying strength of evidence, both in the studies themselves and in the media articles describing their findings. The studies tend to imply stronger causal inference than their methods merit, while the media articles reporting on them were found to overstate the findings further and to contain inaccuracies.

Our social media feeds are saturated with articles shared by friends and family that make claims about something trendy preventing just about any health condition. Often, when we as health researchers dig into what our friends and families share, we find that the studies underlying their posts provide weak and misleading evidence. The idea for our study was inspired by our own experiences on social media, scaled up to a systematic review of the state of causal inference in media articles and academic studies at the point of consumption on social media.

What we found is a large disparity between what people are seeing on social media about health research and the underlying strength of evidence, both in the studies themselves and in the media articles describing their findings. The studies people see tend to imply stronger causal inference than their methods merit, and the media articles about them overstated the findings further and were often inaccurate.

But let’s back up for a moment.


There are many steps and processes that can lead to weak evidence, overstatement, and/or misinformation by the time research findings hit our social media streams. These processes, such as publication bias, clickbait, and social media selection, all add up by the time research reaches the public.

We focused on one of the most important and least appreciated aspects of health research: causal inference. Causal inference lets us distinguish between statements such as “people who drink more coffee live longer on average” and “drinking more coffee will increase your life expectancy”. While they sound similar, the former might be true for reasons that have little to do with drinking coffee (for instance, people who drink lattes may also go to the gym more, leading to better health that is correlated with, but does not result from, coffee consumption). If a headline says that “new study finds that X is linked to Y”, most readers tend to assume that changing X causes a change in Y. Weak causal evidence does not mean that such a causal relationship doesn’t exist, just that the study methods and data do not sufficiently eliminate other explanations.
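To make the coffee example concrete, here is a toy simulation. It is not part of the CLAIMS study, and every number in it (the share of gym-goers, the coffee-drinking rates, the four-year health benefit of the gym) is invented purely for illustration. Coffee has no effect on lifespan in this simulated world, yet coffee drinkers still live longer on average because gym-goers drink more coffee.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Simulate people as (goes_to_gym, drinks_coffee, lifespan) tuples.

    All parameters are made up for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gym = rng.random() < 0.5  # hidden common cause
        # Gym-goers are more likely to drink coffee.
        coffee = rng.random() < (0.7 if gym else 0.3)
        # Lifespan depends on the gym only; coffee has NO causal effect.
        lifespan = 78 + (4 if gym else 0) + rng.gauss(0, 5)
        rows.append((gym, coffee, lifespan))
    return rows


def mean_gap(rows):
    """Mean lifespan of coffee drinkers minus that of non-drinkers."""
    drinkers = [life for _, coffee, life in rows if coffee]
    others = [life for _, coffee, life in rows if not coffee]
    return statistics.mean(drinkers) - statistics.mean(others)


rows = simulate()

# "People who drink more coffee live longer on average": true in this data.
naive_gap = mean_gap(rows)

# Comparing like with like (gym-goers with gym-goers) removes the gap.
gap_gym = mean_gap([r for r in rows if r[0]])
gap_no_gym = mean_gap([r for r in rows if not r[0]])

print(f"All people:     coffee drinkers live {naive_gap:+.2f} years longer")
print(f"Gym-goers only: {gap_gym:+.2f} years")
print(f"Non-gym only:   {gap_no_gym:+.2f} years")
```

By construction, the overall gap should come out at around 1.6 years (4 years × the 0.7 − 0.3 difference in gym attendance between drinkers and non-drinkers), while the gap within each group should be close to zero. A study that only reported the first number would support “linked to”, but not “causes”.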

What did we do?

To find out which health news stories were most shared in 2015, we partnered with NewsWhip, a social media analytics company, and identified the health studies most shared on Facebook and Twitter that year. We then filtered the list down to the ones that were about single scientific studies of the form “the association between X and Y”, taking the 50 most-shared academic studies from the 64 most-shared media articles about them. While 50 seems like a small number, these studies represented more than half of all shares of health research on social media in 2015.

Our review process relied on expert reviewers proficient in health research methods. We recruited 21 volunteer reviewers from six institutions and several fields of health science to help us out, mostly doctoral students specialising in epidemiology and econometrics. Each study was reviewed by three randomly selected reviewers from our team, with one person arbitrating and giving the final rating for each article.
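For readers curious what random assignment of this kind might look like in practice, here is a minimal sketch. It is not our actual assignment procedure: the reviewer and study labels are placeholders, the seed is arbitrary, and treating the first reviewer drawn as the arbitrator is an assumption made only to keep the example short.

```python
import random


def assign_reviewers(study_ids, reviewers, per_study=3, seed=2015):
    """Randomly draw a panel of distinct reviewers for each study.

    Assumption (illustrative only): the first reviewer drawn for a
    study also acts as its arbitrator.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    assignments = {}
    for study in study_ids:
        panel = rng.sample(reviewers, per_study)  # sampled without replacement
        assignments[study] = {"reviewers": panel, "arbitrator": panel[0]}
    return assignments


# Placeholder labels: 21 reviewers and 50 studies, matching the counts above.
reviewers = [f"reviewer_{i:02d}" for i in range(1, 22)]
studies = [f"study_{i:02d}" for i in range(1, 51)]

assignments = assign_reviewers(studies, reviewers)

for study in studies[:3]:
    print(study, assignments[study])
```

Drawing reviewers at random, rather than letting them pick studies, helps ensure that no reviewer's interests or expertise systematically determine which studies they rate.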

To assess strength of causal inference, we had to develop a new review toolkit, which guides reviewers through various aspects of the studies and articles before they decide on a summary strength rating. This rating considers both internal validity (i.e. the evidence that X caused Y within the specific context of the study) and generalisability (i.e. whether the evidence could reasonably be applied to a broader context) against a hypothetical ideal scenario. In the end, our ratings are subjective, but they represent the collective opinion of a team of experts who analysed these studies in detail using prespecified criteria for judgement.

Reviewers then explored the language used in the academic article for signs of overstatement of results and reviewed the media article(s) for accurate reporting. Any errors we did not discover, and any misunderstanding due to implied (but not technically stated) causality, would mean that the underlying strength of causal inference and language is actually weaker than the rating given.

What did we find?

Among the most-shared academic health studies measuring the link between some factor and a health outcome:

  • Only 6% demonstrated strong evidence that X actually caused Y
  • 20% of those studies used language which strongly implied causality
  • 34% used causal language considered too strong given their methods and data.

Among the most-shared media articles about those studies:

  • 44% used language which strongly implied causality
  • 48% used language which was stronger than the study it reported on
  • 58% contained at least one major error about the results, research question, population, and/or intervention of its associated study.

What does it mean?

There are many processes along the path from production to consumption which could result in weak, overstated, and inaccurate evidence at the point of sharing on social media. What we can conclude is that among the most-shared media articles about these 50 health studies, nearly half (48%) overstated the evidence as compared with the study authors, and that the study authors themselves overstated their evidence in 34% of cases. However, this study can’t tell us much about how this occurs, because we only looked at the end of the pathway. We simply are not sure how different factors contribute to this result, nor whether changing those factors would cause better information to get to consumers (notice the implied causal question!).

What’s next?

In order to help combat misinterpretation of our own study and discuss issues impacting science communication, we created MetaCausal.com, which contains a public explainer of our study, the protocols we used, the full dataset, suggestions on how and how not to discuss our findings, and more.

Understanding how this kind of misinformation is created, changed, distributed, and selected is one of the next major goals of our research. The next steps are to 1) figure out exactly where things are going wrong; 2) work out how to make the process of review fast and accurate enough to meet the speed of social media; and 3) test out ways to intervene to improve the research-to-consumer pipeline.

This blog post is based on the authors’ co-written article, “Causal language and strength of inference in academic and media articles shared in social media (CLAIMS): A systematic review”, published in PLoS ONE (DOI: 10.1371/journal.pone.0196346).

Featured image credit: William Iven, via Unsplash (licensed under a CC0 1.0 license).

Note: This article gives the views of the authors, and not the position of the LSE Impact Blog, nor of the London School of Economics.

About the authors

Noah Haber led the CLAIMS study during his doctoral studies at the Harvard T.H. Chan School of Public Health. He completed his MSc at LSE in 2012, and is currently a postdoctoral scholar at the Carolina Population Center, University of North Carolina at Chapel Hill. His research spans a wide range of topics in applied quantitative methodologies, focusing on causal inference econometrics, health and behavioral economics, meta-science, statistical inference for HIV in sub-Saharan Africa, and experimental survey methods. Noah also runs MetaCausal.com, where he and his colleagues dive into topics in science, statistics, and health, starting with keeping track of how media and social media are discussing the CLAIMS study. His ORCID iD is 0000-0002-5672-1769, and he can be found on Twitter at @NoahHaber.

Alexander Breskin is a postdoctoral researcher in the Department of Epidemiology at UNC Chapel Hill. He completed his MPH at Columbia University in 2015 and his PhD in Epidemiology at UNC Chapel Hill in 2018, during which he was a member of the CLAIMS study team. His research focuses on developing and implementing methods to improve the internal and external validity of studies, as well as to provide results that are useful for policy decision-making. Substantively, his work explores the prevention and treatment of HIV-related comorbidities. His ORCID iD is 0000-0002-9299-9135.

Ellen Moscoe is a post-doctoral researcher at the University of Pennsylvania. She completed her doctorate at the Harvard T.H. Chan School of Public Health in 2018 and her MA in Economics at McGill University in 2011. Her research lies at the intersection of global health and behavioural economics, and focuses on poor populations in sub-Saharan Africa as well as marginalised groups in the United States. She uses quasi-experimental methods and field experiments to measure the causal effects of policies and interventions on health behaviours and mental health. Find her on Twitter at @ellen_moscoe.

Emily R. Smith is a research associate in the Department of Global Health and Population at the Harvard T.H. Chan School of Public Health and a program officer at the Bill and Melinda Gates Foundation. Her research is focused on generating, analysing, and interpreting epidemiological data needed to improve maternal and child health in low- and middle-income countries. Dr Smith has implemented both clinic-based and population-based randomised controlled trials to evaluate the efficacy of interventions to improve health. Her experience includes analysis of clinical trial data, utilising methods for causal inference with observational data, and conducting meta-analyses. She can be found on Twitter at @emily_ers.
