Category Archives: Central Service

New developments and plans for the Central Service RFA and Governing Body

Funding agencies are crucial to the development of the preprint movement. The adoption of preprints by the life sciences community has been accelerated by grant-making policies that recognize these manuscripts as a valid form of scholarly communication. Investments in preprint infrastructure, services, and technologies are thus necessary to build capacity for growth.

The latest strong support for preprints comes from the Chan Zuckerberg Initiative (CZI), which recently announced financial support for bioRxiv, the leading preprint server in the life sciences, along with resources to develop new open source software, including tools for manuscript conversion to XML. ASAPbio applauds the decision by CZI to provide funding for bioRxiv and further technological development, both of which will very positively advance the growth and readability of preprints.

ASAPbio has also advocated for funder investment in these areas. Together with a group of funders, ASAPbio developed plans for a Central Service: an aggregation site for preprint content that meets certain standards, together with new search tools and software for XML document conversion. Seven applications were received by the April 30 RFA deadline. In parallel, ASAPbio commissioned a 30-person task force to develop bylaws for a community-elected Governing Body; these bylaws have been released for public comment.

In light of the CZI/bioRxiv partnership that was announced on April 26, ASAPbio and the Funders Consortium have jointly decided to suspend the RFA and Governing Body for a four-month period in order to reassess the needs of the scientific community. Since some objectives of the RFA are now being pursued by CZI/bioRxiv, we do not wish to duplicate or compete with their efforts. The need, role, and mandate of any Governing Body may also require re-evaluation. During this four-month period, we will gather more information from CZI/bioRxiv and engage the broad community of scientists, funders, scientific societies, and publishers to learn more about their opinions and needs. This represents an exciting opportunity to further advance scholarly communication by building upon the CZI/bioRxiv initiative and thus better serve the scientific community. Consistent with our mission, we will aim to bring together various stakeholders for conversations, identify and debate opportunities, and encourage input from open discussions with the community. We will release more information in the next few weeks regarding this planning process. Feel free to contact us with your input, suggestions, or questions now or in the future.

We appreciate the support and patience of everyone who has provided feedback on the Central Service governance and RFA process thus far, including the funders who have articulated principles for supporting preprint infrastructure, the RFA respondents who have written thoughtful and in many cases highly collaborative applications, attendees of our technical workshop and other meetings, members of our governance task force, our external reviewers, and many other individuals who have shared their feedback on our draft proposals for infrastructure and governance. We will continue to engage the broader community as we work to advance and accelerate scientific communication.

Requesting your feedback: how should life scientists set standards for preprints?

May 10, 2017 update: ASAPbio has announced a four-month suspension of the RFA process to reassess the preprint ecosystem and community needs. 

Preprints (scientific manuscripts that have been posted prior to completion of peer review) allow for the direct exchange of knowledge between scientists. They constitute a global public good that promotes scientific progress. However, preprint servers, which have a 25-year history in physics, are relatively new to the life sciences, and their recent appearance has raised many questions about how they should best be used. What content should constitute a preprint? What type of information (metadata) should accompany preprints? How should they be licensed? How should preprints be screened? How should preprints and journals interact in productive ways? How should servers handle ethical issues such as human subjects research? These issues are examples of many that have been raised, and more will undoubtedly arise as scholarly communication evolves. Rather than “hard wiring” rules for preprints now, we need to consider a thoughtful mechanism for ensuring that preprints develop and adapt to serve the scientific community, both now and in the future.

Currently, preprints in the life sciences can be found on several different servers and platforms (bioRxiv, arXiv q-bio, PeerJ Preprints, F1000Research, figshare, and preprints.org, with more on the way). A diversity of preprint servers offers more choices for authors, but each has its own metadata, formatting, licensing, screening, and preservation standards. Each makes decisions according to its own board or advisors. Collectively, this can make it more difficult to know which servers conform to policies and technological standards requested by funders, and which ones will be most visible to scientific peers.

Creating an aggregator for life sciences preprints

The provisionally named Central Service is a proposed aggregation site similar to PubMed Central. It would provide convenient access to a corpus of life science preprints for both humans (via a search interface) and machines (via an API and bulk download). This will ensure consistent access to preprints for purposes of archiving, text and data mining, and development of other services. Moreover, the Central Service will be established with a community governance structure to make it responsive to the needs and developing standards of the community. ASAPbio has received a grant to catalyze the development of this service, and we are also working with 11 other funders to establish funding over a 5-year period. We’ve released an RFA for service providers, and expect the technical components of the service to launch in 2018.

However, the technology of the service must be complemented by outstanding leadership and oversight by respected members of the scientific community. Each depends upon the other; the cart and horse must be hooked up together. It is critical that a mechanism for community governance begins operation prior to or at the same time as the Central Service.  

Establishing community governance

What can we learn from other organizations with similar missions of serving the scientific community? Many such organizations operate through governing bodies composed of elected or appointed members from their relevant scientific community. These governing bodies make decisions according to bylaws, which effectively serve as a written constitution for that organization. Even though the elected officials turn over, the bylaws ensure that the organization maintains its operating principles over time. Virtually all scientific societies have established bylaws and elect a governing body. Some scientific resources, such as the Protein Data Bank and the preprint server arXiv, also operate in a similar manner. However, scholarly communication in biology currently lacks community governance; decisions are largely made by individual publishers or mandates by funding agencies. While preprints are just starting to gain acceptance in biology, now is an opportune time for the creation of an independent, scientist-led governing body for preprints with transparent governance processes similar to those of scientific societies and community repositories. We hope to hold elections in July and begin operation of the governing body in September or October.

To prepare for this, ASAPbio has worked with an international task force composed of ~30 scientists from a variety of fields and career stages, as well as experts in scholarly infrastructure, to draft Operating Principles and Bylaws for this governing body. Representatives from the Funders Consortium have also provided valuable edits and feedback on this document.

Now is the time for communities of life scientists to establish a governance structure for preprints, and we are asking for your input. Please provide your feedback on the draft Operating Principles and Bylaws by leaving comments and suggested edits in the Google Doc. You may also email jessica.polka@asapbio.org, but we strongly encourage you to leave comments publicly in order to stimulate a dialog among stakeholders.

One point on which there has been considerable discussion and no clear consensus is the definition of the community that votes to elect Governing Body members. Should it be individuals who have submitted a preprint, those who have published a scientific paper in the past 5 years, or people holding an ORCID number (for which there are no prerequisites)? Please leave your thoughts on this important issue in the comment section below this post.

The commenting period will close on May 21, 2017. We look forward to hearing from you!

RFA questions

When questions of general relevance about the requirements and process of the Central Service RFA are received, we will post anonymized summaries of these questions and their answers here. Please direct any additional questions to jessica.polka@asapbio.org.

Bidders’ information meetings

Audio of February 24th, 2017 bidders’ information meeting. The March 29th 9pm EDT meeting had no participants.

Q&A

Q: What will the governance model look like?

A: ASAPbio will release a draft of a proposal for the governance for public comment shortly. In brief, the current draft describes selection of a governing body by election. The slate of candidates would be selected by a membership committee of the governing body, derived from open nominations. Terms are staggered.

Q: Would metadata-only aggregation (vs full-text aggregation) fulfill the needs of the CS?

A: Our ambition is to facilitate easy and reliable machine access to preprints. Therefore, convenient access to full text is essential, but we do not wish to be overly prescriptive about the approach.

Q: Will the RFA result in the selection of a single bid?

A: Possibly, but not necessarily. As described in the RFA, ASAPbio reserves the right to explore whether multiple organizations, which may not have co-submitted a bid, could work together to develop a more compelling final proposal for presentation to the funders’ consortium.

Q: Are for-profit entities candidates for the CS?

A: Yes. All candidates, regardless of tax status, should be willing to adhere to principles such as openness and community governance as laid out in the RFA.

Q: My organization has developed software that is not open source. Can we use this to develop the CS?

A: All components required to run CS must be broadly available for other parties to use. If you have an existing proprietary closed-source code base upon which new code would depend, this code must be released under an OSI-approved license as well.

Q: Can indirect costs be listed in the budget?

A: Because we do not know the agencies and mechanisms that may fund the CS, we do not know if indirect costs can be provided. Please do not include indirect costs in the budget, but rather list all the costs of delivering the service (rent, utilities, administrative personnel, etc) prorated according to the fraction of your activities occupied by the CS.
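As a worked illustration of the prorating described in this answer, the short Python sketch below scales each cost by the fraction of an organization's activities occupied by the CS. The cost categories echo the answer above, but the dollar figures, the 25% fraction, and the function name `prorate_costs` are hypothetical and chosen only for illustration.

```python
def prorate_costs(annual_costs: dict, cs_fraction: float) -> dict:
    """Scale each cost by the fraction of activity occupied by the CS."""
    if not 0.0 <= cs_fraction <= 1.0:
        raise ValueError("cs_fraction must be between 0 and 1")
    return {item: round(cost * cs_fraction, 2) for item, cost in annual_costs.items()}


# Hypothetical annual figures, for illustration only.
costs = {"rent": 120_000, "utilities": 18_000, "administrative personnel": 90_000}

# Assume (hypothetically) that the CS occupies 25% of the organization's activities.
budget_lines = prorate_costs(costs, cs_fraction=0.25)
total = sum(budget_lines.values())

print(budget_lines)
print(total)
```

Each prorated line would then appear in the budget as a direct cost of delivering the service, rather than as a separate indirect-cost entry.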

Q: Can the service include disciplines other than life science?

A: Given the requirement for independent governance, the ASAPbio effort should focus on the life sciences, at least initially. We could explore ways to expand this to other disciplines over time if desirable to other scientific communities. That said, there is nothing to prevent the inclusion of other domains in the service if supported by other funding.

Q: Can the service provide commenting?

A: Section 2B.2C of the RFA reads: “Respondents are also invited to highlight other functionality they would suggest the site should support, although all features and functionality of the CS will require Governance Body approval.”

ASAPbio awarded $1 million from Helmsley Charitable Trust for next-generation life sciences preprint infrastructure

Date: Thursday, February 23, 2017
Contact: Jessica Polka | Director, ASAPbio | jessica.polka@asapbio.org

ASAPbio, a biologist-driven project to promote the productive use of preprints in the life sciences, has received a $1 million, 18-month grant from The Leona M. and Harry B. Helmsley Charitable Trust to develop a new service to aggregate life sciences preprints and promote their visibility and innovative reuse.

Preprints are complete scientific documents posted online and made freely available to the global scientific community. They are frequently the same version of a paper that is submitted to a journal for peer review. Preprints are widely used in physics, mathematics, and computer science, but are still a new (albeit rapidly-growing) communication system in the life sciences. Mainstream adoption of preprints is challenged by the current difficulties of finding these documents, which are hosted on several unconnected servers; the lack of community governance over the standards that define a preprint; and technological barriers to accessing content for reuse.

The Helmsley award provides funds for ASAPbio to address these problems by constructing a community-governed service that will aggregate, preserve, and deliver life sciences preprints to human and machine readers. It will also develop open-source tools for manuscript screening and conversion to formats such as XML. The guiding principles of this service have been defined by a consortium of funders including the Helmsley Charitable Trust. ASAPbio has issued an RFA to identify potential technical suppliers for the service.

“The grant from the Helmsley Charitable Trust is a giant step forward for the life science community to translate ideas for next-generation preprint services into a reality. This coming summer, we anticipate that other funders will follow the lead of Helmsley and provide further multi-year support for building the technologies for a powerful preprint knowledge repository that facilitates scientific progress through open sharing of data,” says Ron Vale, Founder of ASAPbio. “The support of major funding agencies and the development of new tools for discovering recent scientific findings should encourage life scientists to share their scientific manuscripts in the form of preprints.”

ASAPbio’s work focuses on convening stakeholders for discussions about the role of preprints in the life sciences: namely, an initial conference at HHMI in February of 2016 (see report in Science) and follow-up workshops for funders, technical experts, and scientific societies. Via these meetings, online discussions, and a network of local representatives, ASAPbio seeks to promote the cultural change necessary to complement new developments in technology and policy from funders, universities, and journals.

ASAPbio is additionally supported by the Alfred P. Sloan Foundation, the Gordon and Betty Moore Foundation, the Laura and John Arnold Foundation, and the Simons Foundation. ASAPbio is incorporated as a nonprofit California corporation.

The Benefits of a “Central Service” for Biology Preprints

Preprints are complete and public manuscripts with associated data shared before undergoing peer review. Physicists, mathematicians, and computer scientists post 100,000 preprints per year to arXiv, a scientist-governed preprint server that has been in operation for over a quarter of a century. Preprints in the life sciences are in a more embryonic stage, with fewer than 10,000 posted manuscripts per year. However, several meetings hosted by ASAPbio have ended with the conclusion that preprints, in conjunction with journals, hold great potential for enhancing scholarly communication in biology.

Recently, eleven major international funding agencies (Wellcome Trust, National Institutes of Health, Medical Research Council (UK), Helmsley Trust, Howard Hughes Medical Institute (HHMI), European Research Council, Simons Foundation, Canadian Institutes for Health Research, Alfred P. Sloan Foundation, Department of Biotechnology (Government of India), Laura and John Arnold Foundation) have released a statement calling for further technology development and the creation of a central resource for preprints, which is being provisionally called the Central Service (CS). The CS will be a database that aggregates preprints from multiple sources, making them easier to read by humans and machines. These features will enable scientists to find new knowledge that can accelerate their research. The CS will be overseen by a scientist-led governing body, which will ensure its mission in serving the scientific community and the public good.

ASAPbio (a scientist-driven organization to promote the productive use of preprints in biology) has released a Request for Applications (RFA) for the development of this service, which is open to all. After independent reviewers select the preferred applicant(s), and pending commitment of funders, the CS is expected to launch in 2018. Here we discuss why the Central Service is needed and its potential for advancing knowledge dissemination in the life sciences.

RFA

May 10, 2017 update: ASAPbio has announced a four-month suspension of the RFA process to reassess the preprint ecosystem and community needs.

ASAPbio is releasing a Request for Applications for the development of a Central Service (provisional name) for preprints in the life sciences. This Request is open to all prospective bidders, and we encourage responses from interested parties able to deliver the services described below. For a concise description of the goals of this project, please see our blog post entitled The Benefits of a “Central Service” for Biology Preprints. Proposals are due on April 30, 2017.

Principles for establishing a Central Service for Preprints: a statement from a consortium of funders

At the ASAPbio Funders’ Workshop, representatives from a number of funding agencies asked ASAPbio to “develop a proposal describing the governance, infrastructure and standards desired for a preprint service that represents the views of the broadest number of stakeholders.” Following iterative discussions about the technical and organizational aspects of such a project, ASAPbio is now positioned to issue an RFA for the development of a “Central Service” for preprints. To guide this effort, a group of funders have independently formulated the following principles that will shape the Central Service.

The funders are interested in getting additional funding bodies and research performing organizations to endorse these Principles. If you represent such an agency and are interested in signing on to these principles (or would like to discuss this matter), please contact Robert Kiley, Development Lead, Open Research at the Wellcome Trust (r.kiley@wellcome.ac.uk).

Update on development of a Central Service Request for Applications (RFA)

At the ASAPbio Funders’ Workshop in May of 2016, representatives of funding agencies requested that ASAPbio “develop a proposal describing the governance, infrastructure and standards desired for a preprint service that represents the views of the broadest number of stakeholders.” Toward this end, we proposed a model for a “Central Service” (CS) that would aggregate content from multiple preprint servers, facilitating human and machine access to preprints via a search tool and an API.

Three separate processes are now ongoing to define this service:

Continue reading

Creation of a Central Preprint Service for the Life Sciences

ASAPbio is iteratively seeking community feedback on a draft model for a Central Preprint Service. We will integrate community and stakeholder feedback into a proposal, containing several model variants, to be presented to funders this fall. Please leave your feedback on the utility of the Central Service, its features, and the model described in the Summary in the comment section at the bottom of the page, or email it privately to jessica.polka at gmail.com. More comments are posted on hypothes.is.

Central Service model documents

Summary

At the ASAPbio Funders’ Workshop (May 24, 2016, NIH), representatives from 16 funding agencies requested that ASAPbio “develop a proposal describing the governance, infrastructure and standards desired for a preprint service that represents the views of the broadest number of stakeholders.” We are now holding a Technical Workshop to advise on the infrastructure and standards for a Central Service (CS) for preprints. ASAPbio will integrate the output of the meeting and community and stakeholder feedback into a proposal to funding agencies this fall. The funders may issue a formal RFA to which any interested parties could apply for funding. More details on this process are found at the end of Appendix 2.

Background

The preprint ecosystem in biology is already diverse; major players include bioRxiv, PeerJ Preprints, the q-bio section of arXiv, and others. In addition, platforms such as F1000Research and Wellcome Open Research are producing increasing volumes of pre-peer reviewed content. PLOS has a stated commitment to exploring posting of manuscripts before peer review, and other services may be developed in the future.

Increasing the number of intake mechanisms for the dissemination of pre-peer reviewed manuscripts has several advantages, for example: 1) generating more choices for scientists, 2) promoting innovative author services, and 3) increasing the overall volume of manuscripts, thus helping to establish a system of scientist-driven disclosure of their research. However, an increasing number of intake mechanisms may also lead to confusion and difficulty in finding preprints, heterogeneous standards of ethical disclosure, duplication of effort in creation of infrastructure, and uncertainty of long-term preservation. (See a more complete discussion of why we think it is essential to aggregate content in Appendix 1.)

Based upon funder interest from the May 24th Workshop, ASAPbio will propose that funding agencies support the creation of a Central Service (CS) that will aggregate preprint content from multiple entities. This service will have features of PubMed (indexing/search) and PubMed Central (collection, storage, and output of manuscripts and other data).

The advantages of this system for the scientific community would be:

  1. Oversight by a Governance Body. The content, performance, and services of the CS would be overseen by a Governance Body composed of highly respected scientists and technical experts. The formation of the Governance Body, which will have international representation and be transparent in its operation, will be addressed by a separate ASAPbio task force and will not be discussed in the Technical Workshop. The connection between the CS and a community-led Governance Body will ensure that preprints continue to serve the public good and develop in ways that benefit the scientific community, beyond the needs of individual publishers and servers. The formation of a central, well-functioning Governance Body has been repeatedly described by funders and scientists as an essential element in gaining respectability for preprints and guiding the system in the future.
  2. Guaranteed stable preservation. Archiving content through a CS better assures permanence of the scientific record, even if a preprint server/publisher decides to discontinue its services. This is a key feature for both scientists and funders.
  3. Greater discoverability and visibility for scientists. The CS would become the location for scientists to search for all new pre-peer reviewed content. Lessons from arXiv indicate that a highly visible, highly respected single site for searching for new findings is essential for the scientific community.
  4. Clarity on what qualifies as a respected preprint. Scientists want their preprint to “count” for hiring, promotions, and grant applications. However, universities and funding agencies are concerned about quality control for preprints and how they can guide their scientists and reviewers on what qualifies as a credible preprint or preprint server. The CS/Governance Body will work with universities and funders to apply uniform standards for author identity, plagiarism checks, and moderation of problems, and to create ethical guidelines for research and disclosure. Thus, content on the CS, coming from several sources, will meet uniform guidelines acceptable to funders and universities.
  5. Better services for scientists. Scientists, as consumers, want better ways of viewing content. They want to read manuscripts in an XML format on the web or as a PDF download, more easily link to references, and more easily view figures and movies. The CS would perform document conversion to ease viewing and searching for material, thereby accelerating new discoveries. The CS would have an API to enable innovative reuse by other parties to provide services that could be valuable for scientists beyond the scope of the CS (e.g. evaluations of work, journal clubs, additional search engines).
  6. Reduced overall cost. The central service can efficiently provide services (such as archiving, automated screening, and document conversion) that otherwise would be provided redundantly by each intake server/publisher.

We discussed various models for the CS with stakeholders (see Appendix 2 for types of models and the feedback that we received). This document describes the current iteration of the model, which is still in draft form. We will present several variations to funders this fall, based on feedback received, including the comments here. If you prefer, you may email comments privately to jessica.polka at gmail.com.

The CS would undertake several functions, including centralized document conversion, accrediting (via setting guidelines for intake), archiving, search, and an API for third-party use. We are currently considering that the CS would not display full text, but instead would send the converted full text back to the intake server for display.

In this draft model:

  • Servers would facilitate the submission of a .doc or .tex file and a standardized set of metadata (e.g. author names, potentially ORCID numbers, etc) to the CS. From this file, the CS could extract an HTML or XML file (possibly including links to references, figures, etc).
  • If this file passes CS screening (including plagiarism detection, and potentially human moderation etc), it would be admitted into the central database, assigned a unique ID, and be sent back to the intake provider for display.
  • The CS would archive the original .doc file and other associated files, and also make these available via an API; as reference extraction and related technologies improve, new HTML/XML derivatives can be prepared. The CS would reserve the right to display content if the intake provider is not able to do so or if required by the funders or governance body.
  • Readers could search for preprints (or receive alerts) through CS-hosted tools that would display metadata (including abstracts); readers would be sent to the intake server for full-text display of preprints.
  • All aspects of the central service would be under the control of a governing body, which would have international representation from the scientific community and could develop over time.
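The draft model above can be read as a small intake pipeline. The following Python sketch is purely illustrative and rests on several assumptions: the names `Submission`, `Record`, and `CentralService`, the ID format `CS-000001`, and the metadata keys are all hypothetical and do not come from any ASAPbio document, and the conversion and screening steps are stubs rather than real document conversion or plagiarism detection.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from xml.sax.saxutils import escape


@dataclass
class Submission:
    """A manuscript forwarded by an intake server (hypothetical structure)."""
    server: str                 # intake server name, e.g. "bioRxiv"
    filename: str               # original .doc or .tex file name
    body: str                   # manuscript text, standing in for file contents
    metadata: dict = field(default_factory=dict)  # authors, title, abstract, ...


@dataclass
class Record:
    """What the CS keeps once a submission is admitted."""
    cs_id: str
    original: Submission        # archived original file and metadata
    converted: str              # derived XML (placeholder conversion)


class CentralService:
    """Toy model of the draft workflow; not a real implementation."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}
        self._next_number = 1

    def convert(self, sub: Submission) -> str:
        # Stub for document conversion: wrap the escaped text in XML.
        return "<article><body>" + escape(sub.body) + "</body></article>"

    def screen(self, sub: Submission) -> bool:
        # Stub for screening: require some text and at least one author.
        return bool(sub.body.strip()) and bool(sub.metadata.get("authors"))

    def submit(self, sub: Submission) -> Optional[Record]:
        """Convert, screen, assign an ID, archive, and return the record
        so the intake server can display the converted full text."""
        converted = self.convert(sub)
        if not self.screen(sub):
            return None
        cs_id = "CS-{:06d}".format(self._next_number)
        self._next_number += 1
        record = Record(cs_id, sub, converted)
        self._records[cs_id] = record
        return record

    def search(self, term: str) -> List[dict]:
        """Return metadata only; readers go to the intake server for full text."""
        term = term.lower()
        hits = []
        for rec in self._records.values():
            meta = rec.original.metadata
            text = (meta.get("title", "") + " " + meta.get("abstract", "")).lower()
            if term in text:
                hits.append({"cs_id": rec.cs_id, "server": rec.original.server, **meta})
        return hits


cs = CentralService()
record = cs.submit(Submission(
    server="bioRxiv",
    filename="paper.tex",
    body="Motor proteins step along microtubules.",
    metadata={"authors": ["A. Author"], "title": "Motor protein stepping"},
))
print(record.cs_id if record else "rejected")
print(cs.search("stepping"))
```

In this sketch the search result carries the intake server name instead of the full text, mirroring the draft's intent that readers are sent to the intake server for display.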

The Technical Workshop will discuss the features, mechanisms, existing infrastructure, potential concerns and challenges, and timelines for implementation for the elements in orange on the diagram below. 

CS model v2

(previous version)

ASAPbio will continue to modify the model before and after the Technical Workshop before presenting several variations to funders in the fall.

Below: possible early-stage implementation

CS model v2 initial

(previous version)