Reader Comments

Mandatory data archives can’t change the culture of science. (published with the article 2 November 2011)

Posted by rochellet on 31 Aug 2022 at 17:56 GMT

Comment on:
Wicherts JM, Bakker M, Molenaar D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS ONE.

ABSTRACT
Data sharing is essential for reproducibility of scientific research, yet recent estimates of the proportion of investigators who refuse to share their data range from 47% to 73%. Wicherts et al. show in their 2011 PLoS ONE paper that, among psychology papers published in journals requiring signed agreements to share data, weaker evidence is associated with a failure to comply with that signed agreement. They conclude that the relationship between weakness of evidence and failure to comply with journal (contractual) and professional obligations to share data supports a requirement that all data from published papers should be archived by mandate.

Rochelle Tractenberg, the academic editor who handled the peer review of Wicherts et al., is a biostatistician and research methodologist at Georgetown University. She interpreted the association between weak evidence and failures to comply with data sharing requirements in a different way. Instead of supporting mandatory archiving of data, Dr. Tractenberg concludes that readers and reviewers of manuscripts and grant proposals should be notified of the author’s/applicant’s history of compliance with data sharing mandates and policies. Dr. Tractenberg concludes that such documentation will do more to change the culture towards one promoting science and data sharing than mandatory data archiving.




The article by Wicherts et al. (2011) describes their analysis of 49 different papers in two psychology journals, and the refusal of the majority of authors (those who tended to present the weakest science) to share their data. As the editor shepherding this article through the review process, I applaud their effort and its write-up. I disagree with their conclusion that observations of incorrect, inappropriate, and/or unrepeatable results published in peer-reviewed journals (see also Ioannidis, 2005; Wicherts et al., 2006; Tenopir et al., 2011) demonstrate a need for mandatory archiving of data. I believe that journals should instead act as anonymizing third parties, documenting requests for sharing and whether or not authors comply. The archived compliance information should be accessible to all reviewers for journals and grants, as well as to scientists who look to the published literature for replicable results.

Data sharing policies apparently do not support data sharing. Wicherts et al. (2006) found that, while journals published by the APA all require authors whose work is published to sign an agreement to share their data, 73% of authors who were contacted failed to do so. In the current (2011) study by Wicherts et al., of APA journals with the same policy, the refusal rate was “only” 57%.

Mandatory archiving of data might ensure that data will be shared, but even if it becomes a successful policy, this may not support better science. Although few researchers may fake their data (e.g., Fanelli, 2009), many perform inappropriate, incorrect, or otherwise weak analyses; this is widely documented across many different disciplines (as shown by Wicherts et al., 2011, and in the literature they review). Furthermore, many published manuscripts describe only those analyses that led to significant results, failing to identify all the variables that were actually collected and/or used in the analyses that were done. Analyses carried out without a specific intent to deceive can still be misguided, misinterpreted, or otherwise mistaken. An archive of data that are fake, that were not correctly described in a published manuscript, or both, is useless.

A large proportion of data will never be of interest to another investigator. This is not because the data themselves are uninteresting, but because most work involves smaller samples or has other features that limit the likelihood that another investigator would want access to the raw data. The most important information about shared datasets, though, could be how the analyzed data were selected and/or manipulated (transformed, combined, or even standardized) to create the variables that were actually analyzed in any new manuscript. However, most policies on sharing specify that “raw data” are what must be shared. Mandatory archives could waste valuable resources.

Fischer & Zigmond (2010) note that, among other steps that are needed, changing the scientific culture to be more supportive of sharing would foster more data sharing; they state that legal steps should not be taken to force data sharing. Given the prevalence of the problem and the lack of any repercussions for those failing to share, I propose an alternative to mandatory archiving: mandatory tracking of compliance with sharing requirements. An investigator interested in data or resources described in a publication submits a request to the journal in which it appears. If an agreement to share is on record, then the journal submits a formal, anonymous request for sharing. If the data holder ignores this request, that becomes a part of the publication (and public) record. If the holder agrees, that will also become a part of the publication record. The only other responses to a request to share could include:

A) I am still using the data, please re-request in X months.
B) I am not prepared to share the data yet, please re-request in X months.
C) The IRB (letter must be provided) prohibits me from ever sharing these data.
D) The study was concluded five or more years ago; I no longer have/maintain that dataset.

Journals would then simply be responsible for anonymizing requests to share, and for maintaining archives of whether or not authors are compliant with the journal's own sharing policy. Sharing data should not, and would not, be contingent on data holders knowing or ‘approving’ the requestor. Failures to comply with funder, publisher, and/or ethical guidelines can be taken into account by reviewers. Every reviewer, for funding and publication, would be aware of compliance with data sharing policies. Data archives cannot provide this.
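To make the proposed record-keeping concrete, the following is a minimal sketch in Python of what a journal's compliance log might look like. It is only an illustration under stated assumptions: the names Response, SharingRequest, and ComplianceLog are hypothetical, the article identifier is a placeholder, and treating an ignored request as the only undocumented failure to share is my own simplification of the proposal above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Response(Enum):
    """Possible outcomes of an anonymized request (labels are illustrative)."""
    SHARED = "holder agreed and shared the data"
    IGNORED = "holder did not respond to the request"
    STILL_USING = "A: still using the data, re-request in X months"
    NOT_READY = "B: not prepared to share yet, re-request in X months"
    IRB_PROHIBITS = "C: IRB prohibits sharing (letter provided)"
    NO_LONGER_HELD = "D: study concluded 5+ years ago, dataset not maintained"


@dataclass
class SharingRequest:
    article_id: str      # hypothetical identifier for the published article
    response: Response   # outcome recorded by the journal


@dataclass
class ComplianceLog:
    requests: list = field(default_factory=list)

    def record(self, article_id, response):
        # The requester's identity is deliberately never stored,
        # so the archive holds only the article and the outcome.
        self.requests.append(SharingRequest(article_id, response))

    def history(self, article_id):
        # All recorded outcomes for one article, in order of receipt.
        return [r.response for r in self.requests if r.article_id == article_id]

    def has_undocumented_refusal(self, article_id):
        # Assumption: only an ignored request counts as an
        # undocumented failure to share; responses A-D are documented.
        return Response.IGNORED in self.history(article_id)
```

A reviewer-facing system could then query such a log for each co-author's prior articles; how those queries would be authorized and displayed is left open here.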

Journals should not send papers out for review if any co-author has failed to share data in defiance of a signed agreement to do so with respect to any other manuscript published in a journal with a data-sharing mandate, unless there is a documented reason why not. If an agency has a data sharing policy, then it should also have a publicly accessible repository, linked to and described in that policy, where the email exchange outlining a failure to share data should be archived.

This and other recent papers (e.g., Fischer & Zigmond, 2010; Tenopir et al., 2011) identify many reasons why data sharing may be limited. Institutions do not generally support data management and sharing methods (although see Giffels, 2010, who identifies potential institutional resources to use). Institutions will act to prevent limitations on their scientists being funded or published, so an archive of cases where data could be, but are not, shared might result in institutional resources being allocated for data archiving.

Not sharing data harms science, but currently has little impact on scientists. Concrete actions should be taken when investigators do NOT share data and, perhaps equally importantly, institutional and scientific cultures must also become more supportive of sharing, for example by rewarding investigators who DO share (as suggested by Fischer & Zigmond, 2010). Mandatory archiving cannot push institutions to support data management, archiving, or a culture of sharing. Ramifications for failing to share are more likely than mandatory archiving to increase data sharing and to foster a culture that supports it.


References

Fischer BA, Zigmond M. (2010). The essential nature of sharing in science. Science and Engineering Ethics 16:783–799. doi:10.1007/s11948-010-9239-x

Fanelli D. (2009). How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLoS ONE 4(5): e5738. doi:10.1371/journal.pone.0005738

Giffels J. (2010). Sharing Data is a Shared Responsibility. Commentary on: “The Essential Nature of Sharing in Science”. Science and Engineering Ethics 16:801–803. doi:10.1007/s11948-010-9230-6

Ioannidis JPA. (2005). Why most published research findings are false. PLoS Medicine 2: e124.

Tenopir C, Allard S, Douglass K, Aydinoglu AU, Wu L, et al. (2011). Data Sharing by Scientists: Practices and Perceptions. PLoS ONE 6(6): e21101. doi:10.1371/journal.pone.0021101

Wicherts JM, Borsboom D, Kats J, Molenaar D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist 61:726–728.

Wicherts JM, Bakker M, Molenaar D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS ONE.

No competing interests declared.