Citation: Desjardins-Proulx P, White EP, Adamson JJ, Ram K, Poisot T, Gravel D (2013) The Case for Open Preprints in Biology. PLoS Biol 11(5): e1001563. doi:10.1371/journal.pbio.1001563
Published: May 14, 2013
Copyright: © 2013 Desjardins-Proulx et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: PDP is supported by an Alexander Graham Bell scholarship from the National Sciences and Engineering Council of Canada (http://www.nserc-crsng.gc.ca/). EPW is supported by a CAREER Award from the National Science Foundation (DEB-0953694, http://www.nsf.gov/). JJA is supported by NSF DEB-0614166 and NSF DEB-0919018 (http://www.nsf.gov/). TP is supported by a FQRNT-MELS post-doctoral scholarship (http://www.fqrnt.gouv.qc.ca/). KR is supported by NSF DEB-1021553 (http://www.nsf.gov/). DG is funded by a Discovery Grant from the National Sciences and Engineering Council of Canada and by the Canada Research Chair program (http://www.nserc-crsng.gc.ca/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal's policy and have the following conflicts: Ethan White is an academic editor for PeerJ and Timothée Poisot is a member of the figshare advisors program.
Public preprint servers allow authors to make manuscripts publicly available before, or in parallel to, submitting them to journals for traditional peer review. The rationale for preprint servers is fundamentally simple: to make the results of research available to the scientific community as soon as possible, instead of waiting until the peer-review process is fully completed. Sharing manuscripts using preprint servers has numerous advantages, including: 1) rapid dissemination of work-in-progress to a wider audience; 2) immediate visibility of the research output for early-career scientists; 3) improved peer review by encouraging feedback from the entire research community; and 4) a fair and straightforward way to establish precedence.
Open preprint servers offer a great opportunity for open science, especially if the community embraces the idea of discussing preprints. Initiatives like Haldane's Sieve (http://haldanessieve.org/), a new blog discussing arXiv papers in population genetics, can help make arXiv attractive for scientists looking to promote their work . These initiatives are important to fully exploit the potential of open preprint servers. Posting preprints online increases the community of available informal peer reviewers, and uses the internet for its original community-building purposes.
Preprints began to gain popularity 20 years ago with the advent of arXiv, an open preprint server widely used in physics and mathematics . Preprints are also integral to the culture of other scientific fields. Paul Krugman noted that, in economics, the “traditional model of submit, get refereed, publish, and then people will read your work broke down a long time ago. In fact, it had more or less fallen apart by the early 80 s” . In addition to a section on arXiv, economists have the RePEc (Research Papers in Economics) initiative, which aims to create an archive of working papers, manuscripts, and book chapters.
Despite the success of this approach in other fields, most manuscripts in biology are not posted to preprint servers and are therefore not seen by more than a handful of other scientists prior to publication. In this article, we highlight the advantages of open preprint servers for both scientists and publishers, discuss the preprint policies of major publishers in biology, and describe the main options to publish preprints (Box 1, Table 1).
Box 1. Preprint Server Roundup
arXiv (http://arxiv.org/) is the most widely used preprint server today, and its use is almost universal in some branches of mathematics and physics. arXiv has a system of moderators and endorsers. At least one author of a paper must be an endorser that has either previously submitted a paper or has received permission to submit. Moderators have the power to change the classification of a manuscript.
figshare (http://figshare.com) is an open server allowing scientists to submit any research output: manuscript, figures, datasets, videos, theses, presentations, and so on. There are no rules to limit what constitutes a research output and, unlike arXiv, there is no endorser system. A flexible tag system is used to classify each item.
PeerJ (https://peerj.com/) is a new commercial open access publisher focused on the biological sciences that provides a preprint server and a peer-reviewed journal. Preprints can optionally be made private. One preprint per year can be posted for free, with a onetime (i.e., lifetime) fee for unlimited public preprints. Preprints can be posted to PeerJ regardless of where they will be submitted for publication.
Whereas arXiv, figshare, and PeerJ offer an option to submit a manuscript without having it reviewed, papers submitted to F1000Research (http://f1000research.com/) will eventually be reviewed. Thus, F1000Research offers a hybrid model with publicly available manuscripts at time of submission and standard peer reviews that occur as part of the submission process. Manuscripts are considered “accepted” and will only be indexed after two positive referee responses.
This manuscript was developed entirely as an open project on GitHub (https://github.com/). GitHub is one of several hosting services for collaborative development using the Git version control system (VCS). It allows numerous contributers to work asynchronously on the same project, often in parallel branches, all of which can be effortlessly merged and version controlled. Git is primarily used for software development , but it provides a powerful tool to collaborate on every step of the manuscript development process .
Scientific publishing is more diversified than ever. There are now many alternative options for submitting articles before formal publication. For example, social networks such as ResearchGate (http://www.researchgate.net/) can be used to submit preprints . Also, if GitHub pushes openness further by opening the writing process, open notebooks go even further by opening the entire scientific process .
The Case for Public Preprints
The first and most often discussed advantage of open preprints is speed (Figure 1). The time between submission and the official publication of a manuscript can be measured in months, sometimes in years. For all this time, the research is known only to a select few: colleagues, editors, and reviewers. Thus, the science cannot be used, discussed, or reviewed by the wider scientific community. In a recent blog post, C. Titus Brown noted how posting a paper on arXiv quickly led to a citation (arXiv papers can be cited), and his research was used by another researcher . The current system of hiding manuscripts before acceptance poses problems for both scientists and publishers. Manuscripts that are unknown cannot be used and thus take more time to be cited. It has been shown that high-energy physics, with its high arXiv submission rate, has the highest immediacy among physics and mathematics . Immediacy measures how quickly articles are cited.
Meanwhile, few people are aware of the research that has been done since, typically, only close colleagues are given access to the preprints. With public preprint servers, the science is immediately available and can be openly discussed, analyzed, and integrated into current research.
Public preprints can be crucial to early-career scientists. The delay before publication is seldom compatible with the pressure to show an impressive publication record when applying for a scholarship or a position. Increasing the perceived value of preprints as close, or equal, to journal articles will allow young researchers to put their research outcome in the open, and build a reputation for themselves through the diffusion of their work without fear that this work will not be recognized by grant or job committees.
Posting manuscripts as preprints also has the potential to improve the quality of science by allowing prepublication feedback from a large pool of reviewers. In our experience, prepublication reviews by a small network of colleagues are common in the biological sciences and form an important part of the scientific process. These “friendly” reviews increase the chance of errors being caught prior to publication. Furthermore, the formal peer-review process as a whole is critically overloaded. As the number of active scientists increases and the pressure to publish increases, it is becoming difficult for journals to find reviewers . At the same time, rejection rates are high in most journals ,, and when not invited to submit a revision, authors must start the process over again at another journal. As a result, initiatives to reduce time from submission to publication have emerged across the scientific community. Rohr et al.  called for the recycling and reuse of peer reviews: by attaching previous reviews and detailed replies to a new submission, both the editor and the referees can gauge the work done on the manuscript, and perhaps evaluate it with less prejudice. A widespread use of preprint servers can achieve the same goal of reducing the time spent in review. With a rich enough community of scientists depositing preprints, and commenting on them, the process of an open prereview can become widespread and will overall increase the quality of first submissions .
Finally, public preprint servers offer a fair way to establish intellectual priority by making the work available as soon as it is complete. Some manuscripts will spend much more time than others in the review process and/or in production after acceptance. This means that publication and acceptance dates do not accurately characterize who came up with an idea first. For this reason, mathematicians and physicists have embraced arXiv in part to establish priority in a fair way ,.
Preprints in Biological Sciences
In contrast to other disciplines, the field of biology has effectively no preprint culture, with the exception of small pockets of primarily highly quantitative research (e.g., epidemiology, population genetics). While submitting to preprint servers has become more common in the past few years, the number of biology papers submitted to preprint servers still represents only a small fraction of the total research produced in biology (Figure 2).
There are a number of reasons why biologists have not developed a culture of sharing preprints, many of which are based on common misconceptions. For example, in contrast to other fields, there is a perception in biology that public preprints make it easier to steal ideas . In other fields, preprints serve the opposite role: they allow straightforward establishment of precedence, letting a researcher lay claim to an idea, thus preventing it from being “stolen” . Another major concern is based on a certain interpretation of the Ingelfinger rule: scientists should not publish the same manuscript twice . A preprint is simply a document that allows ideas to spread and be discussed, it is not yet formally validated by the peer-review system. This is why almost all the major publishers in biology are preprint-friendly, including: Nature Publishing Group, PLOS, BMC, PNAS, Elsevier, and Springer (Table 2). This year, both the Ecological Society of America and the Genetics Society of America changed their policies to allow public preprints. Nature even felt compelled to respond to the rumor that they refused manuscripts submitted to arXiv by saying that “Nature never wishes to stand in the way of communication between researchers. We seek rather to add value for authors and the community at large in our peer review, selection and editing” . Still, a few journals adopt a “by default” hostile attitude towards preprints, mostly due to the lack of clear policy of the publishers. As an example, Wiley-Blackwell, which publishes some of the leading journals in biology, has no official policy on the matter.
The ongoing discussions on the publication process, peer review, and alternative publication models are all symptoms of the current uneasiness with the ever-growing obsession with bibliographic metrics such as the impact factor . Researchers are pressured to orient their publication strategy to maximize their number of publications and total citations. A well-known consequence is to submit manuscripts first to the most prestigious journals, and then resubmit to “lower level” journals as they are rejected. The numerous negative impacts of such behavior have been discussed in depth  and include a long delay between the time a manuscript is finished and its publication. Research activities and the publication process are drifting away from their fundamental objective, namely the diffusion of novel scientific discoveries.
Developing a preprint culture in biology will not solve all problems with the current publication process. However, it might significantly reduce its negative consequences. The role of peer review is to judge the scientific quality of a study. It is the first barrier against the fraudulent and poor quality science that could impede scientific progress. In practice, the peer-review system is not only used to evaluate scientific quality but also to judge pertinence. On the other hand, preprints are not filtered, neither for their quality nor their pertinence. Widespread adoption of preprint servers has the potential to shift the diffusion strategy: journals would remain important to validate publications, but the relevance of a study should only be judged by many more readers than the typical two–four anonymous reviewers. With a shift in the diffusion strategy, the role of traditional journals and their editors would be to showcase scientific discoveries for specialized readership.
Making publication easier can lead to the proliferation of studies of uneven quality. A trade-off between the intensity of the peer-review filtering and the benefits to science has been hypothesized . With increasingly stringent peer review, the quality of published papers can improve at the cost of an increased load on authors and reviewers and greater delays for publication. Preprints are simply bypassing this model for what we believe is the progress of science: they speed up the dissemination of scientific discoveries and put on readers' shoulders the responsibility to judge originality and pertinence.
We dedicate this article to Aaron H. Swartz (1986–2013). We thank Carl Boettiger, Mark Hahnel, and Hedvig Nenzén for helpful comments on an earlier version of this manuscript.
- 1. Loman N (2012) All the cool kids are on arXiv and Haldane's Sieve… why you should be too. Available: http://pathogenomics.bham.ac.uk/blog/2012/10/all-the-cool-kids-are-on-arxiv-and-haldanes-sieve-why-you-should-be-too/. Accessed 14 April 2013.
- 2. Ginsparg P (2011) ArXiv at 20. Nature 476: 145–147.
- 3. Krugman P (2012) Open science and the econoblogosphere. Available: http://krugman.blogs.nytimes.com/2012/01/17/open-science-and-the-econoblogosphere/. Accessed 14 April 2013.
- 4. Brown C (2012) A good way to publish – arXiv FTW. Available: http://ivory.idyll.org/blog/science-f-yeah.html. Accessed 14 April 2013.
- 5. Prakasan E, Sagar A, Kalyane V, Kumar A, Harnad S (2005) Minimum impact and immediacy of citations to physics open archives of arXiv.org: Science Citation Index based reports. CogPrints 4272.
- 6. Hochberg M, Chase J, Gotelli N, Hastings A, Naeem S (2009) The tragedy of the reviewer commons. Ecol Lett 12: 2–4.
- 7. Aarssen L, Tregenza T, Budden A, Lortie C, Koricheva J, et al. (2008) Bang for your buck: rejection rates and impact factors in ecological journals. The Open Ecology Journal 1: 14–19.
- 8. Rohr J, Martin L (2009) Reduce, reuse, recycle scientific reviews. Trends Ecol Evol 27: 192–193.
- 9. Hochberg M (2012) Good science depends on good peer-review. Available: https://sites.google.com/site/perspectivesinpublishing/our-mission. Accessed 14 April 2013.
- 10. Callaway E (2012) Geneticists eye the potential of arXiv. Nature 488: 19.
- 11. Altman L (1996) The Ingelfinger rule, embargoes, and journal peer review - part 1. The Lancet 347: 1382–1386.
- 12. Board NE (2005) Nature respects preprint servers. Nature 434: 257.
- 13. Aruliah D, Brown C, Hong N, Davis M, Guy R, et al. (2012) Best practices for scientific computing. arXiv 1210.0530.
- 14. Ram K (2013) git can facilitate greater reproducibility and increased transparency in science. Source Code Biol Med 8: 7.
- 15. Lin T (2012 January 16) Cracking open the scientific process. The New York Times
- 16. Sanderson K (2008) Data on display. Nature. Available: http://www.nature.com/news/2008/080915/full/455273a.html. Accessed 16 April 2013.
- 17. Fisher J, Ritchie E, Hanspach J (2012) Academia's obsession with quantity. Trends Ecol Evol 27: 473–474.
- 18. Aarssen L (2012) Are peer-review filters optimal for the progress of science in ecology and evolution? Ideas in Ecology and Evolution 5: 9–12.
- 19. Warner S (2012) Data for arXiv submissions by subject and year. Available: http://dx.doi.org/10.6084/m9.figshare.96966. Accessed 14 April 2013.