Data sharing has the potential to facilitate wider collaboration and foster scientific progress. But while 88% of researchers in a recent study confirmed they would like to use shared data, only 13% had actually made their own data publicly available. Benedikt Fecher, Sascha Friesike, Marcel Hebing, Stephanie Linek, and Armin Sauermann look at the mismatch between ideal and reality and argue that academia is a reputation […]
Introduction to Open Science: Why data versioning and data care practices are key for science and social science.
A significant shift in how researchers approach their data is needed if transparent and reproducible research practices are to be broadly advanced. Carly Strasser has put together a useful guide to embracing open science, pitched largely at graduate students. But the tips shared will be of interest far beyond the completion of a PhD. If time is spent up front thinking about file […]
Standards for scientific graphic presentation: Interactive figures could significantly improve understanding of data.
Over the past hundred years, a lot of work has gone into standardizing the way scientific data is presented. This knowledge has been largely forgotten. Jure Triglav wants us to bring the past back to life. Drawing on lessons learned from the New York City subway system and the graphic standards of 1914, he argues for the […]
Research funders across the world are implementing data management and sharing policies to maximize openness of data, transparency and accountability of the research they support. This guide aims to cover guidance on how to plan your research using a data management checklist, how to format and organize data, and how to publish and cite data. This is a useful guide for students […]
The Outing of the Medical Profession: Data marathons to open clinical research gates to frontline service providers.
Could greater data transparency across the medical field solve the problem of unreliable evidence? Dr. Leo Anthony Celi charts the efforts to improve the publicly available MIMIC database, a creation of the public-private partnership between MIT, Beth Israel Deaconess Medical Center and Philips Healthcare, through a series of data marathons. Data scientists, nurses, clinicians and doctors are coming together to collaborate and answer clinically […]
Reproducible computing with rctrack: Software package addresses fundamental scientific challenges of Big Data era.
Published descriptions of data sets and analysis procedures are helpful ways to ensure scientific results are reproducible. Unfortunately, this information is often collected and provided by researchers in retrospect and can be fraught with uncertainty. The only solution to this problem is to computationally collect and archive data files, code files, result files, and other details while the data […]
The value of sharing research data is widely recognised by the research community and funders are setting in place stronger policy requirements for researchers to share data. But the costs to researchers in sharing their data can be considerable and the incentives are sometimes few and far between. A recent report from the cross-disciplinary Expert Advisory Group on Data […]
Data sharing may lead to some embarrassment but will ultimately improve scientific transparency and accuracy.
Open Data is important for science but in practice can be difficult for scientists afraid of the potential embarrassment of someone finding a mistake. Dorothy Bishop shares her experience of sharing her own data. When you share data you are forced to ensure it is accurate and properly documented. But she finds that error is inevitable in science, […]
A journal article claiming that moderate amounts of global warming have overall positive benefits has been quietly corrected after Bob Ward pointed out a number of errors. The updated analysis now claims “impacts are always negative”, but the erroneous findings have been used to inform a recent report by the IPCC which still needs to be corrected. This episode underlines the […]
Scientists can be reluctant to share data because of the need to publish journal articles and receive recognition. But what if the data sets were actually a better way of getting credit for your work? Chris Belter measured the impact of a few openly accessible data sets and compared it to that of journal articles in his field. His results provide hard evidence that […]
“Re-purposing” data in the Digital Humanities: Data beg to be taken from one context and transferred to another.
While scientists may be well-versed in drawing on existing data sources for new research, humanists are not conditioned to chop up another scholar’s argument, isolate a detail and put it into an unrelated argument. Seth Long critically examines the practice of re-purposing data and finds data in the digital humanities beg to be re-purposed, taken from one context and […]
The changing cultural norms about open science and government mandates encouraging open data have influenced the increasing availability of research data. Limor Peer reports back from a recent conference involving leading voices in the scholarly community involved with facilitating data sharing and grappling with the challenges of data reuse. Peer argues the first step is to acknowledge data reuse is a problem, but remains optimistic that […]
Changes to the supply and demand of data are restructuring privileged hierarchies of knowledge, with amateur hackers and machine-readable technology becoming a central part of its analysis. Traditional experts may be hoping for a gradual evolution, but a parallel revolution led by practitioners in the private sector may already be underway. Prasanna Lal Das argues that partnerships will need to incorporate […]
Research funders, data managers, librarians, journal editors and researchers themselves are calling for a change in the culture of research to ensure formal data citation is the norm, rather than the exception. Sarah Callaghan looks at the reasons for and against a more fluid data environment and finds that as well as being good for science, data sharing is also good for […]
The Harvard Dataverse Network is an open-source platform that facilitates data sharing. Samuel Moore outlines how this customisable initiative might be adopted by journals, disciplines and individuals. I am a huge fan of grass-roots approaches to scholarly openness. Successful community-led initiatives tend to speak directly to that community’s need and can grow by attracting interest from members on the fringes […]
As more and more funders and journals adopt data policies that require researchers to deposit underlying research data in a data repository, the question over where to store this data and how to choose a repository becomes more and more important. Heinz Pampel is one of the people behind re3data.org, an Open Science tool that helps researchers to easily identify a suitable repository […]
Researchers, publishers, libraries and data centres all have a role in promoting and encouraging data citation.
The key to verifying and validating research is the identification and access of datasets. But cultural and behavioural barriers to sharing data are still widespread. Rachael Kotarski, the Content Expert for scientific datasets at the British Library, explains why citing data, as well as the article, is the way forward. Data are not necessarily fixed, stable or homogenous objects, so citing them […]
A replicated study on nuclear proliferation shows the critical necessity of reviewing accepted scientific results.
In replicating a 2009 study on the role of asymmetric nuclear weapons possession, Mark Bell and Nicholas Miller found that a computational error led to the overestimation of the deterrent effect of nuclear weapons by a factor of several million. It is only through constant re-evaluation of scholarly findings that scholars can reach sufficiently robust conclusions that merit the attention of policymakers. […]
A well-curated data repository is more than a place to put data. Drawing from the experience of the ISPS Data Archive, Limor Peer shows the research community has much to gain from a repository that reviews data to ensure quality and reproducibility. Stewardship of data in this context may be more labour intensive but ultimately provides better quality, […]
Evidence-based social policy depends on access to rich supplies of high-quality data. But how can we create, curate, enrich and reuse data already collected by government departments and researchers? James Nazroo and Matthew Woollard of the UK Data Service explore the network of trust and expertise that ensures a cost-effective pipeline of productive, policy-relevant data. James Nazroo, a Deputy Director […]