Quantitative research in the social and natural sciences is increasingly dependent on new datasets and forms of code. Making these resources open and accessible is a key aspect of open research and underpins efforts to maintain research integrity. Erika Pastrana explains how Springer Nature developed Nature Computational Science to be fully compliant with open research and data principles.
Last year, we asked peers in the publishing industry whether we are providing what researchers need in the transition to Open Science. New findings from our State of Open Data report and recent regional research integrity surveys continue to show that there are gaps, particularly in supporting researchers with code and data sharing – a key aspect of achieving full and sustainable open science.
In our experience, success in addressing these gaps involves a combination of implementing publication policies, leveraging technology and applying editorial expertise. None of these approaches will deliver sufficient results on its own, but a tripartite approach has proven effective in moving the needle towards openness. Importantly, when this recipe meets a scientific community that is supportive of sharing, such as the computational science community, full (i.e. 100%) compliance with open science practices is possible.
Ingredient 1: Policies and best practice
Policies that strongly encourage or require sharing of research objects, during peer review or at publication, are the groundwork that enables publishers to collaborate with researchers in following open science behaviours.
Springer Nature adopts an integrated policy framework that encourages authors to share, as publicly as possible, the key research objects that underlie the work. We have long been committed to advancing reproducibility and open research practices across our journals, such as the steps taken to improve the reproducibility of published research by the Nature portfolio journals, the introduction of Springer Nature Data policies, as well as our support for protocol sharing. Recently, we announced a policy encouraging public code sharing in all of Springer Nature’s journals and books. This policy aims to provide transparency and to help editors and authors work together to increase the sharing of new code that is key to scientific advancement. Data and code availability statements, along with the deposition and encouragement of data sharing, reinforce trust through transparency. These policies enable authors to submit their code for peer review and share verified code upon publication, strengthening their work.
In addition to policies relating to the sharing of objects, publishers can also influence researchers through education about best practices. In the case of code, documentation that enables others to check and re-use the code, including information on dependencies, operating systems, technical requirements, as well as licences and terms of use, is vital. Our editors work with authors and reviewers to meet these standards of sharing. Upon publication, deposition of the code in a repository that assigns a ‘permanent identifier’ or PID (such as a DOI) is strongly encouraged and, in some journals, required.
These best practices ensure the code that was used as part of the work associated with the publication is permanently accessible via a unique identifier, cited in the paper and recognised as a valuable output in its own right.
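As a rough illustration of the documentation items listed above, the sketch below checks a code-sharing record for dependencies, operating systems, technical requirements, a licence and an identifier. The field names, the example record and both helper functions are hypothetical and are not a Springer Nature schema or tool. The DOI check only looks at the general shape of the string and does not confirm that the identifier resolves.

```python
# Hypothetical checklist: field names are illustrative assumptions,
# not part of any publisher's submission schema.
REQUIRED_FIELDS = (
    "dependencies",
    "operating_systems",
    "technical_requirements",
    "licence",
    "persistent_identifier",
)


def missing_sharing_fields(record: dict) -> list[str]:
    """Return the checklist fields that are absent or empty in `record`."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


def looks_like_doi(identifier: str) -> bool:
    """Rough shape check only: a '10.' prefix, a '/', and a non-empty suffix."""
    prefix, sep, suffix = identifier.partition("/")
    return prefix.startswith("10.") and sep == "/" and bool(suffix)


# Made-up record for a code deposit that has not yet been given an identifier.
example_record = {
    "dependencies": ["numpy>=1.24"],
    "operating_systems": ["Linux"],
    "technical_requirements": "8 GB RAM",
    "licence": "MIT",
    "persistent_identifier": "",
}

print(missing_sharing_fields(example_record))
```

For this made-up record, the only item reported as missing is the identifier, which matches the point above that deposition in a repository that assigns a PID is the final step at publication.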
Fig. 1: Example of a Code Availability statement in a Nature journal publication, describing the accessibility of the code.
Ingredient 2: Evolving publishers’ technical capabilities
Any framework of policies must be supported by technical solutions that make it easy for busy researchers to share and link the outputs of research to their publications. Publishers, in collaboration with technological platforms, can create technical solutions to meet these needs as part of the submission process. In turn, these solutions can make reviewers’ lives easier by facilitating visibility, seamless access and technical support for verifying the research objects associated with the article they are reviewing.
In 2022, the submission system for the Nature journals was updated to collect information from authors about their plans to share code and data as part of the submission. This was enhanced by providing a service to facilitate sharing via a platform integration with Code Ocean and Figshare.
Submission data from the journals that offer code sharing show that, from November 2023 to March 2024, among manuscripts in which authors reported having developed new code, between 25% and 41% of authors took up code sharing services, and at each journal at least 25% of reviewers engaged with code review (for some journals, reviewer engagement surpassed 50%). These initial data show that making code sharing a central and integrated part of the submission process helps authors to share their code and data, and makes it easier for reviewers to verify the objects.
Fig. 2: Questions posed to researchers submitting to Nature Journals about code sharing as part of their article submission.
Ingredient 3: Editorial expertise
Policies and technological solutions can go a long way toward promoting open science behaviours, but the human element is essential to ensure authors are supported with their specific needs. Using their editorial expertise, publishers can make it easier for researchers to follow best practice guidance when it comes to metadata standards, protocols for data and code deposition or object citation. As a publisher, we can and should be an active voice in the community, collaborating with partners to help develop standards, tools, and services to better support sustainable open research and, therefore, open science practice. Examples of this include our involvement with the MDAR framework and our ongoing drive of the FAIR data principles. We also prioritise working closely with our editorial and author community to develop best practice tools and accreditations at an author and institutional level and provide specialist data support, building data expertise among our editors and helping researchers to ensure compliance.
Moving the needle – Code sharing at Nature Computational Science
These three ingredients were put into practice in Nature Computational Science. Launched in 2021, the journal enforces code sharing during peer review and strongly encourages public sharing at publication. Code Ocean was offered to authors to support code sharing from the first submission, and editors are proactive and passionate about working with authors to navigate their open science needs. As a result, of the 205 primary research articles published in the journal since launch, all provide a Code Availability section and all share their code publicly, cite it and provide it via a PID. For those interested, the dataset is here, and you can see that 23% of authors chose to share their code via the integrated Code Ocean solution, whereas the rest primarily used Zenodo and GitHub.
From this early data, we can see that linking key research objects to the submission of manuscripts and setting up supportive policies and editorial practices can make fully open science achievable. Of course, different scientific communities have specific needs, and computational science may be the low-hanging fruit of what is to come in promoting data and code sharing. But knowing that a world of full open science is possible should encourage and excite us for the necessary work ahead.
The content generated on this blog is for information purposes only. This article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science.
Image credit: PeopleImages.com – Yuri A on Shutterstock.
Great article. Over 350 years ago, the idea of a fledgling scientific publication was started, and the aim was to create an impact by connecting research to the real world. It took advantage of the printing press and provided a framework allowing ideas to be shared and scrutinised through peer-reviewed research publications. Academic publications have been a great success.
Computers and computer code offer new tools that will enable insights to be captured, codified and shared dynamically, potentially enhancing the opportunity to drive a more significant impact. Sharing access to code is a great start. Still, I wonder if research reporting could be based on active code, where insights can be accessed directly via APIs and tested and evaluated through inputs and outputs. Computer code itself is only valuable for what it can facilitate, and maybe a research process that delivers active code might create a new wave of unrealised impact.