Artificial intelligence models are particularly good at pattern matching. One potential application is the development of tools that detect and identify author styles, which has implications for nominally blind peer review processes. Drawing on a recent study, Leonard Bauersfeld, Angel Romero, Manasi Muglikar and Davide Scaramuzza show how this could be applied and suggest measures authors might take to maintain anonymity.
Peer review stands at the core of the academic world: researchers diligently review each other's findings before publication, ensuring the quality and integrity of scholarly work. The double-blind review process, adopted by many publishers and funding agencies, plays a vital role in maintaining fairness and impartiality by concealing the identities of both authors and reviewers. However, in the era of artificial intelligence (AI) and big data, a pressing question arises: can an author's identity be deduced even from an anonymised paper (in cases where the authors do not advertise their submitted article on social media)?
In a recent article, we investigate this very question by leveraging an AI model trained on the largest authorship-attribution dataset to date. Built from publicly available manuscripts on the arXiv preprint server, the dataset comprises over 2 million research papers and tens of thousands of authors. Focusing purely on well-established researchers with at least a few dozen publications each, our work demonstrates that reliable author identification is possible.
Our study delves into the capabilities of an advanced AI model that harnesses the textual content of research papers and the references cited by authors to predict the likelihood of a given researcher being the author of a paper. The one with the highest predicted likelihood is the author “guessed” by the model. The AI model correctly predicts authorship for 3 out of 4 papers, even in a dataset with over 2000 possible authors. For prolific researchers with extensive publication records (over 100 papers), the accuracy increases to over 85%.
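To make the selection step concrete, the following is a minimal, purely illustrative Python sketch. `predict_author_probs` is a hypothetical stand-in for a trained attribution model (it is not the API of our released codebase); the point is only that the "guessed" author is the candidate with the highest predicted likelihood.

```python
# Illustrative sketch only: `predict_author_probs` is a placeholder,
# not the actual model or API from the released codebase.

def predict_author_probs(paper_text: str, candidate_authors: list[str]) -> dict[str, float]:
    """Placeholder: return a likelihood score per candidate author.

    A real model would use the paper's text and cited references;
    here we return a dummy uniform distribution for illustration.
    """
    n = len(candidate_authors)
    return {author: 1.0 / n for author in candidate_authors}

def guess_author(paper_text: str, candidate_authors: list[str]) -> str:
    """The candidate with the highest predicted likelihood is the model's guess."""
    probs = predict_author_probs(paper_text, candidate_authors)
    return max(probs, key=probs.get)
```

In our study, this selection runs over a pool of more than 2000 candidate authors, and the top-ranked candidate is correct for roughly 3 out of 4 papers.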
Following the recent successes of AI on language-related tasks (e.g. ChatGPT), these results may not seem surprising, yet they have significant implications for the integrity of the double-blind review process. While our work shows that machine learning can be used to attribute anonymous research papers, understanding how the AI identifies an author yields practical guidelines that authors can follow to increase their anonymity:
- Abstract and introduction: we find that the first 512 words of a paper, typically encompassing the abstract and introduction, provide sufficient information for robust authorship attribution. The AI’s performance is only marginally affected when compared to considering the entire paper. We believe that the abstract and introduction frequently reflect the authors’ creative identity and their research domain. These distinct traits facilitate author identification, particularly as authors often tend to rephrase introductions from their prior works.
- Self-citations: our analysis also highlights the role of self-citations in revealing authors' identities. We confirmed the common hypothesis that authors frequently cite their own work: on average, papers in our dataset contain 10.8% self-citations, an easy giveaway of identity. We therefore encourage authors to omit most self-citations when submitting to a double-blind review to enhance their anonymity.
- Citation diversity: even when self-citations are omitted, the references cited in a paper can still be utilised to identify the author. By including citations from lesser-known papers, authors can bolster their anonymity, while also promoting equal visibility for all research in their field.
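The self-citation guideline above lends itself to a simple pre-submission check. The sketch below is hypothetical (the data structures are illustrative, not our actual pipeline): it flags references that share an author with the submitting team as a rough proxy for self-citations.

```python
# Hypothetical pre-submission check: estimate the share of references
# that overlap with the submitting authors (a proxy for self-citations).
# Data structures are illustrative, not the paper's actual pipeline.

def self_citation_rate(submitting_authors: set[str],
                       references: list[set[str]]) -> float:
    """Fraction of references whose author set overlaps the submitters'."""
    if not references:
        return 0.0
    self_cites = sum(1 for ref_authors in references
                     if ref_authors & submitting_authors)
    return self_cites / len(references)

refs = [{"A. Smith", "B. Jones"}, {"C. Lee"}, {"B. Jones"}, {"D. Wu"}]
rate = self_citation_rate({"B. Jones"}, refs)
print(f"{rate:.0%} self-citations")  # 2 of 4 references overlap -> 50%
```

A rate well above a field's typical level (10.8% on average in our dataset) would suggest trimming self-citations before submission.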
While authorship attribution focuses on anonymous papers, our research also explores applications in the context of signed manuscripts to aid plagiarism and ghostwriting detection. By leveraging the AI model’s probability predictions, one can determine the likelihood of the person who signed the document being the actual author. Similarly, one can query the model for the most likely possible authors of a manuscript (e.g. the top 5 or top 10). This opens avenues for more elaborate methods to cross-validate the model’s initial selection of likely authors.
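Querying for a shortlist of likely authors is a straightforward extension of the per-author scores. As a hedged illustration (the scores below are made up, and this is not the interface of our released code), the top-k candidates are simply the highest-scoring authors:

```python
# Illustrative top-k query over per-author likelihood scores,
# e.g. to shortlist candidates for cross-validating a signed manuscript.
# The scores and names here are invented for demonstration.

def top_k_authors(scores: dict[str, float], k: int = 5) -> list[str]:
    """Return the k authors with the highest predicted likelihood."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

scores = {"alice": 0.61, "bob": 0.22, "carol": 0.09, "dana": 0.08}
print(top_k_authors(scores, k=2))  # -> ['alice', 'bob']
```

If the person who signed the manuscript does not appear in such a shortlist, that discrepancy could prompt a closer look for ghostwriting or plagiarism.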
In small research fields, experienced researchers can often correctly guess which research group an anonymous submission originates from, potentially biasing the peer-review process. Our published article is the first to offer insights into the vulnerabilities of anonymity in the double-blind review process in the age of AI and big data. While our AI model demonstrates the ability to attribute authors to anonymous research papers at scale, we emphasise the importance of preserving the fairness and impartiality that the double-blind review process upholds. At present, simple measures, such as reducing self-citations and embracing citation diversity, could be adopted at the initial submission stage to enhance anonymity.
As peer review is such a fundamental pillar of science, we hope that this study encourages the research community to further explore how AI is changing peer review itself. We have open-sourced our codebase (https://github.com/uzh-rpg/authorship_attribution) in the hope that it serves as a starting point for scholars to pick up our work and build on it. Authorship attribution and plagiarism detection are vital to the continued integrity and trustworthiness of academic publishing, and enhancing these capabilities will benefit the entire scientific community.
The content generated on this blog is for information purposes only. This Article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science. Please review our comments policy if you have any concerns on posting a comment below.
Image Credit: Reproduced with permission of the authors.