Have you ever tried to use metagenomic data from scientific papers? We found that 20% of the papers identified in our literature search, published between 2016 and 2019, did not make their metagenomic data available for others to use, and this proportion has risen each year. The likelihood of data not being shared is higher in certain scientific disciplines, whereas neither the number of citations nor the type of journal predicts whether the data will be available. Papers in high-impact-factor journals, however, are more likely to have their metagenomic data available.

Twenty-first-century science demands compliance with the ethical standard of sharing metagenomes and, more broadly, DNA sequence data. Data accessibility must become a routine, mandatory component of manuscript submission, a requirement that should apply across the growing number of disciplines using metagenomics. Compliance must be ensured and reinforced by funders, publishers, editors, reviewers, and, ultimately, the authors.


While a misconception of "open access" is fuelling the explosion of "pay per print" journals, the real accessibility of data and of scientific results is declining. This lowers the overall quality of scientific output and hinders the spread of knowledge. With this research we call for a more rigorous review process that ensures the underlying data are truly available, and thus that results are reproducible, advancing science in every field.

Gianluca Corno
National Research Council of Italy - Water Research Institute (CNR-IRSA)

This page is a summary of: Every fifth published metagenome is not available to science, PLoS Biology, April 2020, PLOS, DOI: 10.1371/journal.pbio.3000698.
