Lockdown policies are thought to reflect the scientific consensus. But how do we measure that consensus? Daniele Fanelli (LSE) set up a site that enables academics to give their views anonymously on the ‘focused protection’ model endorsed by the ‘Great Barrington Declaration’, and found some striking differences across both countries and genders.
What do experts think about COVID-19 mitigation strategies? Can we know what they really think, right now, across disciplines and countries?
Take national lockdown policies as an example. In one form or another, they have been adopted by most countries around the world, suggesting a strong consensus around their necessity. It is presumably on this basis that the public expression of contrary opinions is discouraged, especially when voiced by prominent scientists, who are seen as unwitting agents of misinformation. And when alternative strategies are proposed, such as the “focused protection” model outlined in the Great Barrington Declaration (GBD), reportedly signed by nearly 13,000 medical scientists and three times as many health practitioners, these are cast aside as a fringe viewpoint that does not reflect the scientific consensus.

It may well be that current policies side with the scientific consensus. But are we measuring such consensus, and how? In which disciplines? In which countries? Moreover, aren’t scientific opinions amenable to change over time, as more evidence is gathered about such a new and complex problem? And how easily can this change of belief occur, if dissent is publicly discouraged?
The problem of online misinformation is real and serious. However, so is the risk of stifling progress by silencing public debates. Moreover, and perhaps most importantly, any action that can be construed as censorship will reinforce conspiratorial narratives, and enlarge the only “fringe” that should really concern us all – that of irredeemable ‘denialists’.
A few weeks ago I decided to experiment with a new way to assess and communicate whether and how experts agree on complex issues like this one. The idea is simple enough: it combines systematic review, online survey and social media methodologies.
I created a public platform where a selected group of experts could answer a specific question anonymously, by using a secret key known only to them. Their answers are displayed on the site, in aggregated and anonymised form, and their optional comments are shown. If they wish to change their answer or input a new comment, they can do so at any time. This approach meets three objectives at once: it informs the public about what academics think about a relevant problem, it helps experts communicate freely, and it produces data about how scientific consensus varies across contexts and over time.
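To make the mechanism concrete, here is a minimal sketch of how such a platform could work. This is not the actual CovidConsensus.org code; the class, the method names and the “slightly” answer label are my own assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

class ConsensusPoll:
    """Minimal sketch of the voting mechanism: each invited expert holds a
    secret code; votes are stored against that code, can be revised at any
    time, and are only ever shown in aggregated, anonymised form."""

    # Five-point Likert scale; "slightly" is an assumed label for the second point.
    OPTIONS = ["none", "slightly", "partially", "mostly", "fully"]

    def __init__(self):
        self._votes = {}  # secret code -> list of (timestamp, answer)

    def cast(self, code: str, answer: str) -> None:
        """Record a vote; casting again with the same code revises the answer."""
        if answer not in self.OPTIONS:
            raise ValueError(f"answer must be one of {self.OPTIONS}")
        self._votes.setdefault(code, []).append((datetime.now(timezone.utc), answer))

    def aggregate(self) -> Counter:
        """Public view: only the latest answer per code, with no codes exposed."""
        return Counter(history[-1][1] for history in self._votes.values())
```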
A few technical hurdles and multiple ethics revisions later, welcome to CovidConsensus.org.
Selection criteria were intentionally broad, in order to capture a large diversity of perspectives. As shown in the flow diagram, using the Web of Science database I identified 1,841 corresponding authors of articles that in title or abstract included any one of a set of key words relevant to COVID-19 mitigation strategies. That’s all. No arbitrary rules involved.
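As a rough illustration of this screening step, the selection could in principle be reproduced along the following lines on records exported from Web of Science. The file name, column names and keyword list below are placeholders, not the ones actually used.

```python
import pandas as pd

# Records exported from Web of Science, one row per article.
# Column names are assumptions for this sketch; the real export may differ.
records = pd.read_csv("wos_export.csv")

# Illustrative terms only; the actual keyword set covered COVID-19 mitigation strategies.
keywords = ["lockdown", "social distancing", "focused protection",
            "non-pharmaceutical intervention"]
pattern = "|".join(keywords)

in_title = records["Title"].str.contains(pattern, case=False, na=False)
in_abstract = records["Abstract"].str.contains(pattern, case=False, na=False)

# Corresponding authors of matching articles, deduplicated by email address.
authors = (records.loc[in_title | in_abstract, "Corresponding Author Email"]
           .dropna()
           .drop_duplicates())
print(f"{len(authors)} corresponding authors selected")
```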
Each author in the list, which is also displayed online for transparency, received an email invitation that included a personal code and all the data that was associated with the anonymised code: research field, country and gender. They could ask to have the data corrected or not to be included at all.
The question asked was designed to be simple and unambiguous:
“In light of current evidence, to what extent do you support a ‘focused protection’ policy against COVID-19, like that proposed in the Great Barrington Declaration?”
Answers were collected on a five-point Likert scale from “none” to “fully”.
Excluding the undelivered emails, a total of 1,755 invitations were sent. At the time of writing, 453 respondents (25.8%) visited the website at least once, spending on average one minute on it. Of these, N=97 (21.4%, 5.5% of invitations) posted an answer, for a total of 132 votes and 58 comments. A small group of countries yielded zero contacts, suggesting that emails failed to reach their authors, perhaps filtered out as spam. However, for the remaining countries the number of contacts was correlated with the total number of invitations, suggesting that the target population was adequately captured.
The response data above suggest that participants voted deliberately. Many of them visited the site – thereby showing an interest in the project – but chose not to vote at all. Others voted multiple times. At least one author did so in an obvious attempt to “game” the results, inputting “none” 15 times in a row. This strategy was futile, as all analyses are based only on the last vote cast by each voter-code.
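On the analysis side, the “last vote counts” rule can be applied mechanically. A minimal sketch, assuming a raw vote log with one row per vote and hypothetical column names:

```python
import pandas as pd

# Raw vote log; file and column names are assumptions for this sketch.
votes = pd.read_csv("votes.csv", parse_dates=["timestamp"])  # columns: code, answer, timestamp

# Keep only the most recent vote per voter-code, so repeat voting cannot "game" the totals.
last_votes = (votes.sort_values("timestamp")
                   .drop_duplicates(subset="code", keep="last"))
```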
What were the results? Briefly, answers are rather spread out and were, right from the beginning of data collection, bimodally distributed around “none” and “partially”. In other words, few appear to fully endorse the GBD, but at least as many are in partial agreement with its principles as are entirely opposed to it.
What level of consensus does this reflect? To measure it, we can use a simple measure of “proportional entropy explained”:

$$ k = 1 - \frac{H(Y)}{\log 5} $$

where H(Y) is the Shannon entropy (information content) of the distribution of answers Y, and log 5 is the maximum entropy attainable with five answer options. This is a simplified version of a K function that I have elsewhere proposed as a general metric of knowledge. Consensus is full when k = 1: all respondents give the same answer, whatever that answer is. Conversely, k = 0 means that all answers are equally likely – in other words, we have no idea what anyone thinks. Applied to all aggregated data, consensus is surprisingly low (Figure 1).
Figure 1: Consensus among respondents
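For concreteness, the metric can be computed in a few lines from a vector of answers. This is a sketch of the formula above, not the site’s code; the function name and the “slightly” label are my own.

```python
import numpy as np
from collections import Counter

def consensus(answers, n_options=5):
    """k = 1 - H(Y)/log(n_options): 1 means everyone gave the same answer,
    0 means answers are spread uniformly across all options."""
    counts = np.array(list(Counter(answers).values()), dtype=float)
    p = counts / counts.sum()
    entropy = -np.sum(p * np.log(p))        # Shannon entropy H(Y), in nats
    return 1.0 - entropy / np.log(n_options)

print(consensus(["none"] * 10))                           # 1.0: full consensus
print(consensus(["none", "slightly", "partially",
                 "mostly", "fully"] * 2))                 # 0.0: maximum spread
```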
Let’s be extremely clear that this does not entail low consensus by scientists on COVID-19 policies – not only because the sample size is small, but also because answers come from a very diverse pool of experts, with different social and academic backgrounds.
But this is precisely where things get interesting, because both agreement and consensus vary significantly across disciplines, countries and even the gender of experts. Looking only at categories in which five or more votes were cast, it would seem that female authors tend to be less favourable to focused protection than male authors and/or authors whose gender could not be determined from their first name (Figure 2).
Figure 2: Consensus by gender
Disciplines also show remarkable differences. In particular, authors of articles in social science or humanities journals show low consensus and/or spread-out distributions overall. Authors in clinical medicine, however, show a strong preference for “partially” agreeing. This is unlike authors in the remaining 18 disciplines (aggregated here as “other”), who have similar levels of consensus but lean against focused protection (Figure 3).
Figure 3: Consensus by specialism
Most intriguing of all, there are significant differences between countries. Authors in India, for example, are much more favourable than others (Figure 4).
Figure 4: Consensus by country
What could explain the sharp difference between countries? The two principal areas of contention in the debate on lockdowns are centred on economics and demographics. On the one hand, there are fears that lockdowns might have a devastating economic impact and increase inequality within and between countries. On the other hand, the focused protection idea of shielding only the most vulnerable is criticised as unethical and unfeasible, especially in conditions of extreme poverty and forced coexistence. This tension was reflected in many of the comments posted on the website, too.
I explored the relative importance of these two dimensions with a multivariable ordinal regression model that included two country-level variables taken from 2019 World Bank data: per capita GDP and the percentage of the population over 65. The former is a proxy for economic factors, the latter for demographic ones. Controlling for discipline, the strongest predictors of agreeing with a focused protection strategy are per capita GDP and gender (Figure 5).
Figure 5: Strongest predictors of agreeing with statement, estimated in a multiple ordinal regression analysis
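A rough sketch of the kind of model involved is shown below, assuming a tidy per-respondent table with the variables mentioned above. The file and column names are hypothetical, and the actual specification used for Figure 5 may differ.

```python
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# One row per respondent; file and column names are assumptions for this sketch.
df = pd.read_csv("responses.csv")  # answer, gdp_per_capita, pct_over_65, discipline, gender

# Outcome: ordered five-point agreement scale ("slightly" is an assumed label).
levels = ["none", "slightly", "partially", "mostly", "fully"]
df["answer"] = pd.Categorical(df["answer"], categories=levels, ordered=True)

# Predictors: country-level GDP per capita and % over 65, plus discipline and gender dummies.
X = pd.get_dummies(df[["discipline", "gender"]], drop_first=True).astype(float)
X["gdp_per_capita"] = df["gdp_per_capita"]
X["pct_over_65"] = df["pct_over_65"]

model = OrderedModel(df["answer"], X, distr="logit")
res = model.fit(method="bfgs", disp=False)
print(res.summary())

# Predicted probabilities for each answer category (rows sum to 1); varying
# gdp_per_capita in a new exog table is how a curve like Figure 6 can be traced.
probs = res.predict(X)
```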
These two effects are striking. For example, this is how they relate to the predicted probability of agreeing, for an author in clinical medicine from a country where over-65-year-olds make up 10% of the population:
Figure 6: Probability of ‘fully’ or ‘mostly’ agreeing with the statement, by per-capita GDP of respondent’s country, controlling for specialism and proportion of population over 65
Although preliminary and derived from a relatively small sample, the relation with GDP seems to offer some support for the economic argument advanced by the GBD. We can hypothesise that scientists in poorer countries are most in favour of it because they are most aware of the economic impact of shutting down local and global economies.
The gender effect is harder to explain, especially against suggestions that female academics pay the heaviest career price due to lockdowns. We might speculate that women, who tend to take on greater responsibility for the care of dependents, are more protective than their male counterparts. However, there could be hidden confounding effects, for example if females are over-represented in subfields that tend to oppose focused protection.
Textual analyses of the comments section, and perhaps analyses on more data, might help assess these interpretations. However, beyond the specific results, which are clearly limited, this project illustrates the importance of probing and studying scientific consensus on matters of societal or scientific controversy, and it also illustrates some of the challenges in doing so. The experience accrued in this pilot will help me build a better and more effective platform, where newer questions will be addressed.
But if you, dear reader, have received an invitation code and haven’t voted yet, I hope that you’ll do that now, and let everybody know what academics really think.
The equation is displayed incorrectly; it should actually be k = 1 - H(Y)/log(5).
This has now been corrected.
Fantastically fast, thank you! My fault for not checking the proofs better.
> Let’s be extremely clear that this does not entail low consensus by scientists on COVID-19 policies – not only because the sample size is small, but also because answers come from a very diverse pool of experts, with different social and academic backgrounds.
Obviously, such a convenience sample would be very prone to self-selection bias.
For example, there could obviously be a self-selection whereby those who hold the view that interventions are counterproductive would be more likely to respond. And there could well be an interaction effect between that bias and something like gender or discipline.
Generalizing from this sample, it seems to me, would largely be an exercise in confirmation bias.
It is important to stress the limitations, but it is not obvious to me what direction the bias would take.
Something worth mentioning is that each voter that goes to the site sees the data up to that point. And invites were sent over a week, plus reminders the following week.
Therefore, unlike in an ordinary survey, here virtually every voter had (and still has) the chance to redress any bias they perceive in the data.
Don’t know why we’re assuming a selection bias in that direction, or at all. That’s a pretty robust (trans-national) theory of mind you’re working with there. One could very nearly assume a self-selection bias in any direction, really. So why pick that one?
The sample size is a weakness, but the author clearly acknowledges it. I appreciate his effort to disentangle a real problem from the superficial certainty that has strangled debate.
I do think that posting such articles does not help to reduce the self-selection bias: if I get the feeling that the article wants to prove that the rich countries, in their haste to impose strong lockdowns, overlook the needs of the poor ones, I feel much less inclined to answer that I would endorse them. Actually, I feel like not answering at all. Take me as anecdotal evidence for exactly that bias.
Idea of the research: great.
Execution: screwed.
(Sorry, if I’m too harsh here.)
Thanks for your feedback. Please note that I am not “trying to prove” anything. I just described what the data suggests. Invited voters are supposed to change their minds if they wish to, based on what other invitees (not me) say. And, again, I can see reactions going both ways.
I wish you were giving out data entry codes to academics in other disciplines too (I'm in an engineering department). I guess it wouldn't be so relevant to ask scientists who haven't published papers relating to COVID responses, but it might start to factor in the consensus of economists, infrastructure specialists, human rights experts, privacy experts… and other experts on aspects of society harmed by lockdowns who just haven't (yet) had a chance to publish on the subject.
Very good study here; it does help to show that those scientists who've been publishing about the pandemic and “panicdemic” responses are a lot less behind the likes of Whitty and Vallance than much of the media would try to claim.
Would you be able to do this on public health experts only? Then it would be more useful in understanding the consensus from that specialism, which is what we really want to know, in my view.
I can use other key words (in all cases referring to the journal in which these authors had published) not shown above. For public health, this mostly corresponds to authors in Social Sciences, General, Arts & Humanities, and Clinical Medicine.
However, I think part of the debatable point is to what extent a single sector can be deemed to have the sole relevant expertise.
I take this opportunity to point out that many more votes have now come in, and that some data have been corrected. Due to an error, some authors were incorrectly classified as Arts & Humanities. However, the picture for the other disciplines is substantially unchanged, and so are the overall patterns noted above.
See the updated graphs here: https://covidconsensus.org/ld1.php