Appendix 1: Rationale for a Central Service

Central Service model documents

Do we need new infrastructure and governance for preprints?

Because the preprint server arXiv was born very early in the history of the internet and served its community well, it has become the de facto repository for preprints in the physical, mathematical, and computer sciences without any major competitors. During the past two decades, various scientific disciplines decided to join arXiv rather than start their own servers. Thus, arXiv has become a “central server” for the physical science community and has achieved high visibility.

The success of preprints in physics was aided by the coalescence of a large body of content in one highly visible site (arXiv) that had a high standard for quality, attracted outstanding work, and had a scientist-led governance model. Biologists could attempt to replicate this single server model. However, biologists already deposit work in several existing preprint servers, notably bioRxiv (established 2013), PeerJ Preprints (established 2013), and the q-bio section of arXiv (established 2003). In addition, PLOS has had a long-standing interest in posting pre-peer review manuscripts as an option for submitted papers, which could add considerable content in the near future. F1000 has developed a publishing platform that provides access to manuscripts before formal peer review (effectively a preprint; see definition below) and the Wellcome Trust has adopted the F1000 platform to launch Wellcome Open Research. Other journals or funding agencies may also decide to develop similar dissemination mechanisms for pre-peer review content. Thus, the concept of disseminating “pre-peer review” manuscripts is broadening beyond a traditional “arXiv”-like server.

The future of preprints in biology is now poised at a moment that is both exciting and fragile. If organized and thought through properly, preprints could accelerate scientific communication, serve the public good, clarify priority of discovery, and help career transitions of young scientists (see Preprints for the Life Sciences, Science). The development of preprints could be governed by the scientific community, in partnership with publishers and other service providers, leading to exciting and innovative possibilities for science communication.

However, the future of preprints could be less bright. Preprints could become fragmented among the efforts of multiple competing parties, lack overall visibility and critical mass, fail to harness modern possibilities for dissemination and use, and lack clear governance. Preprints may fail to achieve the level of respectability needed to convince scientists, funders, and universities that the disclosure of work by scientists plays a valuable role in the ecosystem of science communication, along with post-peer review journal publications. If preprints fail to grow substantially in submissions and readership in the next five years (e.g. to the level of arXiv), scientists and funders will view them as a failed experiment. Because the future of preprints is poised in a critical time window, it is important to think through issues of execution that will maximize the chance of preprint adoption by the community.

If the scientific community does not act, continued fragmentation of preprint sites could undermine the potential of this communication system by generating:

  • Ambiguity about what qualifies as an acceptable preprint and a recognized content provider. Currently, funding agencies and universities are considering whether preprints or other “pre-peer review” publications should be included in applications. However, what is defined as a “recognized preprint server” is ambiguous at the moment. Every server or publisher may define its own screening protocol, causing uncertainty about whether a preprint has been screened for plagiarism or adheres to ethical standards. In this current system, each journal, funding agency, and hiring or promotion committee must define a list of approved preprint sources based on its own assessment of preprint servers. This practice, which is already occurring at certain journals, will create a situation that is confusing and discouraging for researchers.
  • Lack of visibility and difficulty of discovery. If preprints are spread across multiple sites, they will become more difficult to find. Maximizing discoverability, visibility, and respectability is key to adoption and widespread use by scientists, as is suggested by the success of arXiv.
  • Variable and potentially limited access to data. In the current system, each server sets its own licensing policies and is responsible for archiving its own content. This puts content in danger of being held under restrictive licenses or lost altogether.
  • Limited potential for technology development. If each server must create or outsource IT infrastructure, overall costs of the preprint system will be high, many servers will not have funds for more advanced IT development, and the potential for using and disseminating information may be limited.

Value of a Central Service

To overcome the deficiencies described above, we believe it would be in the best interest of the scientific community to create a Central Service (name subject to change at a later date) that will aggregate “pre-peer review” manuscripts from several sources, maintain standards of quality for its intake, preserve content for posterity, and disseminate information in a manner that advances scientific progress. The Central Preprint Service would, in essence, function as a database that serves the public good, analogous to the Protein Data Bank or PubMed Central. We envision that a Central Preprint Service will be supported by a consortium of funding agencies for a minimum five-year term of operation. It will be overseen by a governance body that will be 1) international, 2) led by highly respected members of the scientific community, and 3) transparent in all of its proceedings, actions, and recommendations.

Partnerships with journals and servers

The Central Service will host manuscripts that contain 1) data, 2) the methods needed by other scientists to replicate that data, and 3) an interpretation of that data. The governing body will determine how manuscripts are screened for entry into the Service (for example, to exclude content that is plagiarized, non-scientific, or in violation of ethical guidelines). However, the Service will not engage in validation or judgment of the work as is performed by traditional peer review. Thus, the Central Service will work as a partner, and not a competitor, with existing journals.

The Service also seeks to act as a partner with preprint servers and publishers that ingest manuscripts from authors. Partners who can deposit their content into the Central Preprint Service will benefit from additional infrastructure support (e.g. plagiarism detection, conversion tools, etc.) and, most importantly, will have greater appeal to scientists who will want their preprints broadly viewed and recognized by grant and promotion committees.