AI paper mills and image generation require a co-ordinated response from academic publishers

Rebecca Lawrence and Sabina Alam

December 15th, 2022

Estimated reading time: 6 minutes

The role of AI in the production of research papers is rapidly moving from futuristic vision to everyday reality, a situation with significant consequences for research integrity and the detection of fraudulent research. Rebecca Lawrence and Sabina Alam argue that for publishers, collaboration and open research workflows are key to ensuring the reliability of the scholarly record.


The latest iteration of OpenAI’s Artificial Intelligence (AI) chatbot, ChatGPT, with its almost uncanny capability to write poetry and academic essays that are very difficult to distinguish from human writing, has recently, much like other ventures linked to Elon Musk, caused a stir in the world of research. It raises the spectre of AI in the service of research fraud and a race to the bottom in research output and publication. As John Gapper warned in the Financial Times, “…if an unreliable linguistic mash-up is freely accessible, while original research is costly and laborious, the former will thrive”. Does a new age of desktop paper mills, within easy reach of anyone anywhere, present a real and present danger to research integrity?

In short, the risk is already with us. In May this year, data sleuth Elisabeth Bik tweeted about how image fraud was being boosted by AI, with Generative Adversarial Network (GAN) technology (in which two neural networks are pitted against each other, one generating synthetic data and the other trying to tell it apart from real data) capable of producing deepfakes in the biomedical literature. Ethics and integrity issues are growing rapidly across scholarly communication. F1000’s and Taylor & Francis’ figures tell a story that is reflected across academic publishing, with suspected paper mill cases representing 34% of ethics cases at F1000 and about 50% at T&F. Other major issues include duplicate submissions, data integrity, citation manipulation and authorship integrity. As Sabina noted recently, the problem is significant not just because of the volume and rate of growth of these issues, but also because there are different types of paper mills, and they are all highly adaptive.
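
To make that mechanism concrete, here is a minimal sketch of the adversarial loop that gives GANs their name, written as a toy Python (PyTorch) example on one-dimensional data. It is purely illustrative, not the image-generation tooling used by any paper mill:

```python
# A toy GAN: a generator and a discriminator trained in opposition.
# Illustrative only -- real image-forging models are far larger, but the
# adversarial dynamic is the same.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator: maps random noise to synthetic samples.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
# Discriminator: estimates the probability that a sample is real.
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0  # "real" data drawn from N(3, 0.5)
    fake = G(torch.randn(64, 8))           # synthetic data generated from noise

    # Discriminator step: learn to separate real samples from fakes.
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: learn to make fakes the discriminator accepts as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(5, 8)).detach())  # samples increasingly resembling N(3, 0.5)
```

Scaled up to convolutional networks and image data, the same dynamic is what makes GAN-generated images such as synthetic western blots so difficult to spot by eye.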

Investigating these issues within a context of shifting sands poses many challenges. Nevertheless, publishers play a vital role in ensuring the legitimacy and integrity of what we publish and disseminate across the world. We invest in systems, safeguards and expertise to ensure due process has been applied to the scholarly content we publish. So, when this is manipulated and the integrity of the scholarly record is under threat, it’s vital we take all steps necessary to protect it. Technology is playing an ever more important role for publishers. The ability to detect research integrity and publishing ethics issues needs to be scalable, because some types of misconduct only become noticeable when patterns are detected across a number of different articles and datasets. This is a key area where developers, publishers and other scholarly organisations are collaborating and investing, not just financially, but with time and effort too.

Systems, safeguards and expertise are just one part of the solution. At a recent Westminster Higher Education Forum, there was “wide agreement across the global research system” that open research is critical in reducing research waste and enabling scrutiny of data. We agree. Open data and materials make it harder to fabricate data and conclusions, and access to the underlying data by readers and AI means that issues are more likely to be noticed. As AI and automation, such as Automated Research Workflows, increasingly become an integral part of research (particularly in the analysis of big data), making that data open will significantly strengthen the use of AI to interrogate it for fraud. Furthermore, enabling and encouraging the publication of a broad range of outputs including negative/null findings, protocols and incremental studies – a key element of the open research model – minimises publication and editorial bias and provides additional accessible data for AI tools aimed at combatting research fraud.
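
To illustrate the kind of automated scrutiny that openness enables, the sketch below implements one well-known consistency check, the GRIM test popularised by Nick Brown and James Heathers; it is our illustrative Python example, not a tool referenced above. The test asks whether a reported mean is even arithmetically possible given the sample size, assuming the underlying measurements are integers (for example, Likert-scale responses):

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM test: can a mean of n integer-valued scores round to reported_mean?

    The mean of n integers must equal (integer total) / n, so only the two
    integer totals closest to reported_mean * n need to be checked.
    """
    approx_total = reported_mean * n
    for total in (int(approx_total), int(approx_total) + 1):
        if round(total / n, decimals) == round(reported_mean, decimals):
            return True
    return False

# A mean of 5.19 from 28 integer scores is impossible: no integer total
# divided by 28 rounds to 5.19 -- a flag for human follow-up.
print(grim_consistent(5.19, 28))  # False
print(grim_consistent(5.18, 28))  # True (145 / 28 = 5.1786 -> 5.18)
```

Checks like this are trivial to run at scale, but only when sample sizes, summary statistics and, ideally, the underlying data are openly available.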

Publishers themselves also need to be open to collaboration with stakeholders (including other publishers) across the research ecosystem to tackle the root causes, including a system of rewards and incentives that currently feeds, rather than deters, the use of paper mills. The STM Integrity Hub and its prototype paper mill detector show what can be achieved through cross-publisher collaboration. Automated, AI-driven processes that can spot duplicate publications and other issues across publishers are a crucial development, given publishers’ often distinct submission and publication systems.
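
As a simplified illustration of such cross-publisher matching, the sketch below flags near-duplicate manuscripts by comparing character n-gram fingerprints of titles from two submission systems. It is a toy Python example with invented identifiers, titles and threshold, not the STM Integrity Hub’s actual detector:

```python
# Illustrative only: near-duplicate detection via Jaccard similarity of
# character 5-gram "shingles" across two publishers' submission lists.
def shingles(text: str, k: int = 5) -> set[str]:
    text = " ".join(text.lower().split())  # normalise case and whitespace
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical manuscript records from two different publishers.
publisher_a = {"ms-101": "Effects of compound X on cell proliferation in vitro"}
publisher_b = {"sub-77": "The effects of compound X on in vitro cell proliferation"}

for id_a, text_a in publisher_a.items():
    for id_b, text_b in publisher_b.items():
        score = jaccard(shingles(text_a), shingles(text_b))
        if score > 0.5:  # the threshold here is arbitrary
            print(f"possible duplicate: {id_a} vs {id_b} (similarity {score:.2f})")
```

A real system would need to compare full text and images across millions of records (for example, using locality-sensitive hashing rather than all pairwise comparisons), but the principle of matching fingerprints across publishers’ corpora is the same.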

[Image: abstract microscopic photograph of a Graphics Processing Unit, resembling a floor plan or fractal art]

However, AI tools cannot do this alone and human judgement also plays a crucial role in safeguarding research integrity. From F1000’s experience of open research, we know that rigorous checks prior to publication by both AI and experienced experts are integral to maintaining research integrity in publications.

There is also a crucial need for more training and education for researchers in publishing ethics and research integrity. Many types of misconduct or bad practice are not deliberate, but rather the consequence of inconsistent training. This includes training in good research and publishing practices, including the roles and responsibilities of authors. It is also important for researchers to know what good peer review looks like: because most peer review is still conducted anonymously, researchers typically only see the reports on their own work (unless they also happen to be an editor of a journal). Many stakeholders in the scholarly ecosystem have a role to play in collaborating on such training and making it open, so that researchers, wherever they are based, can access high-quality guidance.

A rapidly accelerating race is already underway between paper mills using ever more complex AI to produce fake papers at scale and publishers employing ever more sophisticated AI to detect them. Ultimately, to paraphrase one of the best-known popular reflections on AI and humanity: the future of research is not set. There is no research integrity but what we make for ourselves.

 


The content generated on this blog is for information purposes only. This Article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science. Please review our comments policy if you have any concerns on posting a comment below.

Image Credit: Fritzchens Fritz, Better Images of AI, GPU shot etched 1, (CC-BY 4.0)



About the authors

Rebecca Lawrence

Rebecca Lawrence is Managing Director of F1000. She was responsible for the launch of F1000Research in 2013. She was a member of the EC’s Open Science Policy Platform and has co-chaired many working groups for the Research Data Alliance (RDA) and ORCID. She is a Steering Committee member of DORA (the San Francisco Declaration on Research Assessment) and has worked in STM publishing for over 20 years.

Sabina Alam

Dr Sabina Alam is the Director of Publishing Ethics and Integrity at Taylor and Francis Group. After a research career in neuroscience, Sabina moved into scientific and medical publishing in 2008, working on open access journals as well as a variety of peer review and publishing models at BMC journals (Springer Nature) and later at F1000 Research. Sabina joined Taylor and Francis in 2018, leading the Publishing Ethics and Integrity team on providing investigative support and expert guidance on editorial policies, publishing ethics and research integrity. Sabina also serves on several cross-industry working groups on publishing ethics and research integrity issues, including paper mills.

Posted In: Academic publishing | AI Data and Society | Research ethics
