Current citation biases give us only the narrowest slice of scientific support. Bradley Voytek writes that while his proposed citation method, built on brainSCANr, may have flaws, it would give the reader a quick indication of how well supported an academic argument is and could provide a new way of thinking about citations.
Science has a lot of problems. Or rather, scientometrics has a lot of problems. Scientific careers are built on the publish-or-perish foundation of citation counts. Journals are ranked by impact factors. There are serious problems with this system, and many ideas have been offered on how to change it, but so far little has actually changed. Many journals, including the PLoS and Frontiers series, are making efforts to bring about change, but they are mostly taking a social approach: ranking and commenting on articles. I believe these methods are treating the symptom, not the problem.
Publish or perish reigns because scientists gain recognition only when their work is cited. Impact factors are based on these citation counts. Professorships are given and tenure awarded to those who publish in high-ranking journals. However, citations are biased, and critical citations are often simply ignored.
Bear with me here for a minute. How do you spell “fish”? “G-h” sounds like “f”, as in “laugh”. “O” sounds like “i”, as in “women”. “T-i” sounds like “sh”, as in “scientific citations”. This little linguistic quirk is often incorrectly attributed to George Bernard Shaw; it is used to highlight the strange and inconsistent pronunciations found in English. But the example is selective: you can find plenty of spellings that look strange and happen to support your argument about English. Just like scientific citations.
There are a lot of strange things in the peer-reviewed scientific literature. Currently, PubMed contains more than 18 million peer-reviewed articles with approximately 40,000-50,000 more added monthly. Navigating this literature is a crazy mess. When my wife Jessica and I created brainSCANr and its associated peer-reviewed publication in the Journal of Neuroscience Methods (“Automated cognome construction and semi-automated hypothesis generation”), our goal was to simplify complex neuroscience data. But we think we can do more with this system.
At best, as scientists we have to be highly selective about what studies we cite in our papers because many journals limit our bibliographies to 30-50 references. At worst, we’re very biased and selectively myopic. On the flip side, across these 18+ million PubMed articles, a scientist can probably find at least one peer-reviewed manuscript that supports any given statement no matter how ridiculous.
What we need is a way to quickly assess the strength of support for a statement, not an author’s biased account of the literature. By changing the way we cite support for our statements within our manuscripts, we can begin to address problems with impact factors, publish or perish, and other scientometric downfalls.
brainSCANr is but a first step in what we hope will be a larger project to address what we believe is the core issue with scientific publishing: manuscript citation methods.
We argue that, by extending the methods we present in brainSCANr to find relationships between topics, we can adopt an entirely new citation method. Rather than citing only a few articles to support any given statement made in a manuscript, we can create a link to the entire corpus of scientific research that supports that statement. Instead of a superscript number indicating a specific citation within a manuscript, any statement requiring support would be associated with a superscript number that represents the strength of support that statement has based upon the entire literature.
For example, “working memory processes are supported by the prefrontal cortex” gets strong support, and a link to PubMed showing those articles that support that statement. Another statement, “prefrontal cortex supports breathing”, also gets a link, but notice how much smaller that number is? It has far less scientific support. (The method for extracting these numbers uses a simple co-occurrence algorithm outlined in the brainSCANr paper.)
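To make the idea concrete, here is a rough sketch of how such a support score could be computed from PubMed hit counts. This is only indicative: the E-utilities endpoint, the phrase queries, and the Jaccard-style normalisation are illustrative assumptions, not the exact algorithm described in the brainSCANr paper.

```python
# Illustrative sketch only: estimate "strength of support" for a statement
# by counting PubMed articles that mention both of its key terms, via the
# NCBI E-utilities esearch endpoint, and normalising with a Jaccard-style
# index. The actual brainSCANr algorithm is specified in the paper.
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_count(query: str) -> int:
    """Return the number of PubMed records matching a query."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
    resp = requests.get(ESEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return int(resp.json()["esearchresult"]["count"])

def support_score(term_a: str, term_b: str) -> float:
    """Jaccard-style co-occurrence score between two terms (0 = no support)."""
    a = pubmed_count(f'"{term_a}"')
    b = pubmed_count(f'"{term_b}"')
    both = pubmed_count(f'"{term_a}" AND "{term_b}"')
    union = a + b - both
    return both / union if union else 0.0

# A well-supported pairing versus a weakly supported one:
print(support_score("prefrontal cortex", "working memory"))
print(support_score("prefrontal cortex", "breathing"))
```

The score itself is less important than the link behind it: each statement would point back to the full set of PubMed articles that the query returns, rather than to a handful of hand-picked references.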
This citation method removes citation biases. It provides the reader a quick indication of how well supported an argument is. If I’m reading a paper and I see a number indicating strong support for a statement, I might not bother to look it up, given that the scientific consensus is relatively strong. But if I see an author make a statement with low support, that is, a weak scientific consensus, then I might want to be a bit more sceptical about the claim.
We live in a world where the entirety of scientific knowledge is easily available to us. Why aren’t we leveraging these data in our effort to uncover truth? Why are we limiting ourselves to a method of citations that has not substantially changed since the invention of the book? The system as it is currently instantiated is incomplete and we can do more.
This method may have flaws, but it would be much harder to game than the current citation biases that only give us the narrowest slice of scientific support. This citation method entirely shifts the endeavour of science from numbers and rankings of journals and authors (a weak system for science, to say the least!) to a system wherein research is about making truth statements. Which is what science should be.
Note: This article gives the views of the author(s), and not the position of the Impact of Social Sciences blog, nor of the London School of Economics.
This post was originally published on Bradley’s blog, Oscillatory Thoughts, and is reproduced here with permission.
About the author:
Bradley Voytek, PhD is an NIH-funded neuroscience researcher working with Dr. Adam Gazzaley at UCSF. Brad makes use of big data, brain-computer interfacing, and machine learning to figure out cognition and is an avid science teacher, outreach advocate, and world zombie brain expert. He runs the blog Oscillatory Thoughts, tweets at @bradleyvoytek, and co-created brainSCANr.com with his wife Jessica Bolger Voytek.
“(M)any journals limit our bibliographies to 30-50 references.”
I’ve never published in such a journal. How widespread is this (stupid) practice?
“(A)cross these 18+ million PubMed articles, a scientist can probably find at least one peer-reviewed manuscript that supports any given statement no matter how ridiculous.”
This is a nice problem to have, but it’s not a problem everyone has. In a lot of research fields, the problem is not sifting through the masses of published papers. It’s that there are obvious questions that need answering which nobody has addressed yet.
“This method may have flaws, but it would be much harder to game than the current citation biases that only give us the narrowest slice of scientific support.”
It increases the danger of unwarranted acceptance of claims which are supported by only flimsy evidence, yet are still widely held amongst researchers. It’s important to try to keep our arguments and claims as closely tied to the raw data and evidence as possible, and I worry that this sort of system could move us in the wrong direction in this regard. While it is possible to find obscure citations to support almost any statement, and it’s certainly possible to claim a citation supports a statement when it does not, this is simply a reason for readers to try to engage critically with everything they read. Attempts to provide more reliable short cuts could cause more harm than good by leading to yet more unjustified faith in peer-reviewed papers.
Agree that citations leave much to be desired but, as I (admittedly poorly) understand the brainSCANr algorithm, it lacks the accuracy of pointing to the specific papers used to support or refute a scientific position. Rather, it provides a visual weighting or census of the contributions to the field. Citations are used in many ways, and perhaps for some of these instances one could make a strong argument for this type of referencing. However, as you note, people are attracted to consensus, yet for a field to be turned around, only a single initial paper is needed. I would argue that much of what is wrong with citations is related to too much “me-too” incrementalism, which builds a huge body of work that provides added mass but little progression to an idea. It is precisely this sullen inertia that needs countering, and a gallant paper that flows against the tide needs lifting, not burying.
I do understand the benefits, in that there is too much emphasis on balance. Climate change is a good example, where the 1% opinion gets equal weighting with the 99% in debates (especially in the media). But I also worry about the term scientific truth. Absolutism has little place in the scientific method.
Your idea of citing databases is essentially citing automatically generated review articles. The big positive is that you don’t have to wait until someone publishes a review article to cite a summary of the field. The negative, as others here note, is that the database doesn’t automatically distinguish disagreement about connected terms from concordance. For example, in brainSCANr, “posterior cingulate” is strongly connected both to regions in the default mode network and to regions, like the “inferior parietal lobule”, that are considered parts of distinct networks and are frequently contrasted in the same papers.
Assuming publication as discrete works continues, I think there is huge value in citing only things the author has read, or citing review papers that defer to the authority of another expert. The compromise would be citing expert-curated databases, which would be like citing just the reference section of a review paper.
More broadly, I’m not sure about the problem one is trying to solve here. If an author selectively cites sources or doesn’t understand the expanse of the existing literature, that is bad science that should either not make it through the peer review process or not have a very high impact.