The increased significance of research assessments and their implications for funding and career prospects has had a knock-on effect on academic publication patterns. Moqi Groen-Xu, Pedro A. Teixeira, Thomas Voigt and Bernhard Knapp report on research that reveals a marked increase in research productivity immediately prior to an evaluation deadline, which quickly reverses once the deadline has passed. Moreover, the quality of papers published just before deadlines is lower, as measured by citations. Those who design research assessments should consider having cycles of varying lengths across different fields, affording researchers the time and opportunity to pursue more novel, risky projects.
Many scientists face evaluation pressure from their institutions and grant bodies. Regular assessments – such as the UK’s Research Excellence Framework – are used in many countries to encourage research activity and allocate funding, with important financial and career consequences for universities and researchers. As a consequence, researchers often complain that they do not have enough time to pursue novel projects or write more ambitious papers or books.
But do these evaluations affect researchers’ publication patterns? Our research indicates that research output does indeed change around the time researchers are submitted to assessment exercises. Using the ~400,000 outputs submitted to RAE2008 and REF2014, we find sharp changes in research productivity just before the 2008 exercise deadline that reverse abruptly after the deadline. Here is a summary of our key findings:
- 35% more submissions to the REF were published in the year before the deadline, compared to the year after.
- This is most pronounced for “slower-paced” fields such as history; more pronounced for books than for journal papers; and also more pronounced for those departments less reliant on REF-determined funding.
- Among the submissions, papers published in the 12 months immediately prior to the 31 Dec 2007 deadline received fewer citations (12% fewer than papers published in 2008, as of 2016) despite having had more time to collect citations.
- The papers were also published in lower-impact journals, as measured by impact factor, SNIP, IPP, or SJR.
- The variance in journal impact factor is higher for papers published just after the deadline, indicating that researchers did not just time their publications accordingly but possibly also pursued more novel and uncertain research projects when further from the deadline.
- These patterns are consistent with various supplementary tests, including data on aggregate UK research output and data on submission patterns for individual researchers.
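To make the arithmetic behind comparisons of this kind concrete, here is a minimal Python sketch. It is not the analysis from our paper: the records are invented toy values, the names `relative_change` and `summarise` are hypothetical, and treating "the year before the deadline" as a single calendar year is a simplifying assumption. It only shows how a relative change in output counts, a relative change in mean citations, and the variance of journal impact factor could be computed for the year before and the year after a deadline.

```python
from statistics import mean, pvariance


def relative_change(before: float, after: float) -> float:
    """Return by how many percent `before` exceeds `after` (negative if lower)."""
    if after == 0:
        raise ValueError("`after` must be non-zero")
    return (before - after) / after * 100.0


# Hypothetical toy records, NOT real REF data:
# (publication_year, citations, journal_impact_factor)
toy_outputs = [
    (2007, 10, 2.0),
    (2007, 8, 2.5),
    (2007, 12, 2.2),
    (2008, 15, 1.0),
    (2008, 9, 5.0),
]


def summarise(outputs, deadline_year):
    """Compare outputs from the deadline year with those from the following year."""
    before = [r for r in outputs if r[0] == deadline_year]
    after = [r for r in outputs if r[0] == deadline_year + 1]
    return {
        # How many more outputs appeared before the deadline than after it, in percent.
        "count_change_pct": relative_change(len(before), len(after)),
        # Difference in mean citations (before relative to after), in percent.
        "citation_change_pct": relative_change(
            mean(r[1] for r in before), mean(r[1] for r in after)
        ),
        # Spread of journal impact factor on each side of the deadline.
        "if_variance_before": pvariance([r[2] for r in before]),
        "if_variance_after": pvariance([r[2] for r in after]),
    }


summary = summarise(toy_outputs, deadline_year=2007)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A real analysis would also need to control for field, output type, and the time each paper has had to accumulate citations, none of which this sketch attempts.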
Our findings are not only important for the REF, but also for research assessments in general. They imply that researchers facing evaluation pressure publish in lower-impact journals, possibly publishing their research in small chunks instead of more ground-breaking articles or books. In addition, the higher variance in journal quality at the beginning of the assessment period suggests that researchers with more time can afford to take on more novel, risky projects.
After our research was reported in Times Higher Education, Steven Hill – Head of Research Policy at HEFCE – raised concerns about our interpretation of the findings. We appreciate critical comment on our research and would like to address some of the concerns raised and explain why we believe our interpretation of the data is correct.
As Hill points out, many researchers selectively choose their most cited papers to be among their REF submissions. Because older papers had more time to accumulate citations, information about them is more precise at that time, thus biasing the choice of older articles with higher citation counts. Yet, even though such effects are likely to be present, they cannot fully explain our findings, for the following reasons:
- Papers published close to the deadline not only receive fewer citations (in total as well as journal-adjusted), they are also published in journals with a lower impact factor. The argument about researchers being less sure of those papers published closer to the deadline does not account for this observation since researchers always know a journal’s impact factor at the time of submission. Indeed, we use several measures to show that the pattern in research quality is not limited to citations, to account for the oft-discussed weaknesses of the citation measure.
- The argument that researchers are less sure of those papers published closer to the deadline implies that the same papers should be of mixed quality and their citation numbers ultimately more varied. However, we actually observe a higher variance of quality in research published further from the deadline. This is consistent with theory: research in more fundamental and novel areas requires more time since the path to publication is less certain. These results are discussed in more detail in the supplementary material associated with our paper.
- If older papers are submitted because they have had more time to accumulate citations, then we should see more submissions from earlier years. However, papers published just before the deadlines are much more likely to be submitted.
Aggregate research statistics are difficult to interpret because, in many fields, not all listed authors contribute significantly. In contrast, the REF submissions that we use represent a significant contribution by the submitters. This distinction is especially relevant in the UK, a world leader in the number of international collaborations, with some REF submissions listing more than one thousand co-authors.
Hill also writes that there is no evidence of significant shifts in total UK research volume in the reports that Elsevier has produced for the UK government. However, those reports actually document an increase of the UK share of the global output up to the 2008 RAE and the 2014 REF deadlines, followed by subsequent decreases, in line with our results. The argument made is that changes to overall production are attributable to other countries, notably China and India. Yet no other countries, including China and India, exhibit such abrupt changes in their publication share around UK deadlines.
What can be done?
Notwithstanding our differences, we do agree with Hill that our research should not be cause for concern about the REF. Research evaluations set incentives for producing quality research and allocate funding in an objective and transparent way. Assessment-free science could have worse effects on scientific productivity than the side effects that we show. In addition, decoupling staff from output quotas, as planned for the next REF, could help to reduce the effects we document.
We also encourage designers of assessments to consider differences in appropriate period lengths across fields. This applies not only to the REF and other governmental assessments, but to all individual researcher assessments by universities and grant bodies. For example, the LSE recently increased tenure clocks (years until major review) from five to seven years for all departments. This should allow departments with longer research cycles to pursue a more important research agenda.
The project on REF cycles began at the 2015 Science Hackathon, where the authors, previously unknown to one another, were assembled into an interdisciplinary team.
This blog post draws on the preprint “Short-Termism in Science: Evidence from the UK Research Excellence Framework”, available on SSRN (DOI: 10.2139/ssrn.3083692).
Featured image credit: Project Deadline by Kevin, via Unsplash (licensed under a CC0 1.0 license).
Note: This article gives the views of the author, and not the position of the LSE Impact Blog, nor of the London School of Economics. Please review our comments policy if you have any concerns on posting a comment below.
About the authors
Moqi Groen-Xu is an Assistant Professor of Finance at the London School of Economics and Political Science. Her research focuses on CEO contracts, compensation, shareholder communication, and proxy voting. She blogs on moqixu.com and tweets @moqixu.
Pedro A. Teixeira is a PhD candidate in political science at the Free University of Berlin. His research focuses on methodologies in political theory, critical theory and political economy.
Thomas Voigt was an MRes student in Biomedical Imaging at University College London at the time of the project’s initiation. He now works in scientific software development outside academia.
Bernhard Knapp was a “2020 Science” research fellow with the University of Oxford at the time of the project’s initiation. He has since been appointed as Associate Professor at the International University of Catalonia (UIC) in Barcelona. His research focuses on bioinformatics and immunoinformatics.
I wonder why people often assume that lower impact journals publish lower quality papers? To understand whether this is the case, we should not look at citation or journal impact factor as the readers are more likely to cite more famous journals.
Given that all papers are anonymously reviewed in all journals, we need to check whether reviewers deliberately lower their standards when they review for lower-quality journals and whether editors adopt a lower standard for acceptance. In my experience as a reviewer for many journals, I do not even think about the journal’s quality when I comment on a paper. But others might be different.
It is very strange that, on the one hand, so many people have already argued convincingly that journal impact factors are about the journal and have nothing to do with each individual paper’s quality. On the other hand, it is surprising to see that scholars still keep using impact factor to label their own papers.
Shall we instead focus on the phenomenon of masochistic mentality in the academic sector?
Thanks for considering my earlier comments. I still think that the claim of lower quality is unsupported. As is pointed out by Pablo in the comments, neither citations nor Journal Impact Factor reflect quality. Of course we can debate what ‘quality’ means, but in terms of the REF it is defined against specific criteria as judged by expert assessors. So while the work may provide evidence of a lower number of citations or publication in particular classes of journals, this does not necessarily mean lower quality.
A rush of publications just before the REF deadline is not surprising for those researchers who have not published significant amounts in the preceding four years.
However, for those who publish significant papers throughout the REF period, the rush is irrelevant as they have plenty of published work to submit. During our preparation for REF2021 we have found that if asked to put forward papers for review, researchers will often only submit recent papers, regardless of how many excellent papers they may have produced since the previous REF. I assume this is because more recent papers are uppermost in the researchers’ minds when asked to suggest possible publications. As many institutions do select papers by asking researchers to make suggestions, is it possible that this “selective memory effect” might account for some of the bias towards recent outputs reported above?