Towards principled metrics of scientific influence with automatic curation of preprints.

Organizer

Thomas Lemberger, EMBO

Website or social media links

https://eeb.embo.org

Current stage of development

EEB is an experimental platform under development, used as a sandbox to test ideas about aggregation and human- or machine-mediated curation of preprints. It is linked to EMBO Press and ASAPbio’s Review Commons journal-agnostic preprint review platform and to the SourceData curation platform.

Project duration

3 years

Update

How has your project changed?

In view of the feedback received we have decided to fuse our two proposals “Early Evidence Base…” and “Towards Principled Metrics…” into a single project. We feel that presenting the Early Evidence Base (EEB) as a single resource that combines aggregation of refereed preprints, rendering and summarization of peer reviews and automatic mining of the scientific content of preprints will provide a more concrete view of our ideas on how to increase engagement of authors, readers and reviewers with refereed preprints.

Have you integrated any feedback received?

One point of discussion was whether it was premature to build advanced platforms such as EEB when the number of peer reviewed preprints remains low. In our view, it is key to increase engagement and trust not only of reviewers but above all of authors and readers. Readers should have an easier time finding preprints that they can trust and that interest them, and authors should be convinced that posting preprints and their reviews is an efficient and visible way of sharing findings. In view of this feedback, we will integrate more preprint reviewing services into the EEB platform to further raise awareness about peer reviewed preprints across a broader range of disciplines.
On the idea of finding ‘principled metrics’ related to novelty, depth and significance, one of the major issues raised during the discussion was the need to motivate such metrics and to be mindful of their potential misuse. To get a better sense of, and some data on, whether ranking metrics might help users filter and prioritize content, we have already included ranking mechanisms based on the automated analysis of the knowledge graph that supports Early Evidence Base, and we will add more. These methods are not presented to users as ‘metrics’ (no scores are displayed), to avoid overinterpretation and misuse of the rankings while allowing us to analyze their utility in filtering large numbers of preprints.
Following positive feedback on the idea of identifying studies that potentially bridge fields, we have developed methods that automatically identify fields of research in an unsupervised way, based exclusively on the scientific content of preprints. These methods are successful in identifying emerging fields, such as research on COVID-19/SARS-CoV-2, and open the door to finding studies that belong to more than one field of research.
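To make the idea concrete, here is a minimal sketch of one possible way to group content into fields without supervision and to flag preprints that touch more than one field. It is not the method used in EEB: the function names, the `min_cooccurrence` threshold, the term lists and the preprint identifiers are all assumptions chosen for illustration. Terms are linked when they co-occur in at least two preprints, fields are taken to be the connected components of that term graph, and a preprint is assigned to every field whose terms it mentions.

```python
from collections import Counter
from itertools import combinations


def find_fields(preprints, min_cooccurrence=2):
    """Group terms into fields (hypothetical approach, not EEB's).

    A field is a connected component of the graph whose edges link terms
    that co-occur in at least `min_cooccurrence` preprints.
    """
    # Count in how many preprints each pair of terms appears together.
    pair_counts = Counter()
    for terms in preprints.values():
        for pair in combinations(sorted(set(terms)), 2):
            pair_counts[pair] += 1

    # Keep only pairs that co-occur often enough.
    neighbors = {}
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    # Connected components via depth-first search, in sorted term order.
    fields, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            term = stack.pop()
            if term in component:
                continue
            component.add(term)
            stack.extend(neighbors[term] - component)
        seen |= component
        fields.append(component)
    return fields


def fields_of(terms, fields):
    """Indices of all fields that share at least one term with a preprint."""
    return [i for i, field in enumerate(fields) if field & set(terms)]


# Toy data: identifiers and term lists are invented for illustration.
preprints = {
    "p1": ["SARS-CoV-2", "ACE2", "spike"],
    "p2": ["SARS-CoV-2", "ACE2"],
    "p3": ["spike", "SARS-CoV-2"],
    "p4": ["autophagy", "ATG5"],
    "p5": ["autophagy", "ATG5", "mTOR"],
    "p6": ["SARS-CoV-2", "autophagy"],
}
fields = find_fields(preprints)
cross_field = [pid for pid, terms in preprints.items()
               if len(fields_of(terms, fields)) > 1]
```

In this toy example only `p6` mentions terms from both groups, so it is the single candidate for a study spanning two fields. Requiring repeated co-occurrence keeps one such study from merging the two fields into one.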
The suggestion was made that different sections of the referee reports might be used to guide readers in selecting preprints and identify studies in specific fields or with a multi- or cross-disciplinary scope. We are therefore starting to integrate powerful automatic summarization methods to expose specific statements from referee reports, for example in order to highlight the expertise of the reviewers as a proxy for the depth of the reviewing and of the fields covered by a study.

Have you started any collaborations?

We are collaborating with Peer Community In (PCI) to integrate PCI into the EEB platform. This will allow us to develop the necessary interface with CrossRef which has just started to support registration of peer review material linked to preprints.

Project aims

Background information on current practices

Preprints are extraordinarily attractive for authors, in part because of the ease and speed with which they disseminate new findings. This simplicity comes, however, at the expense of supervised aggregation around a scientific scope, expert filtering and certification, which are traditionally the functions of journals. In the absence of such prioritization tools, navigating the rapidly increasing volume of preprints is becoming difficult.

While a number of tools exist that derive article-level metrics based on citations or social network activities, a particularly difficult challenge is to define sorting or classification metrics that are more directly and intrinsically linked to the scientific content presented in a preprint.

The progress in the automated comprehension of natural language using artificial intelligence provides the opportunity to analyze the content of preprints exposed in various text and data mining resources, for example those provided by bioRxiv/medRxiv or other organizations, such as the COVID-19 Open Research Dataset (CORD-19). Several initiatives have derived large ‘knowledge graphs’ from the automated processing of such compendia.

It is therefore timely to attempt building AI-generated knowledge graphs and other tools to derive principled metrics that attempt to evaluate how a piece of work fits into the pre-existing knowledge graph and estimate its (potential) contribution to the current scientific literature. Ideally, these metrics should avoid taking into account authorship, citation patterns or social network activity and rather be exclusively based on the scientific content (data, evidence, claims) in as transparent a way as possible.

Such automated principled metrics of potential influence, especially if combined in a complementary way with human expert peer review, could provide an attractive solution that preserves the simplicity and speed of preprints while complementing them with powerful prioritization methods.

Overview of the challenge to overcome

To start experimenting with combining human and machine curation, we have built the experimental platform Early Evidence Base (EEB, https://eeb.embo.org). EEB combines human curation, through peer review, and machine curation, with text mining, to aggregate and filter refereed preprints.

The challenges are considerable, both at the conceptual and technical levels. At the conceptual level, principles should be found that make it possible to define various dimensions of ‘scientific advance’, such as novelty, nature of the advance, depth or completeness of the analysis, reproducibility, etc. Ideally these principles should be articulated in terms of measurable and explainable properties when the content of a preprint can be represented in a structured machine-readable way (for example in the form of a knowledge graph).

At the technical level, tools need to be developed that extract and derive the appropriate representation of the content such that suitable properties are exposed and quantified in a way that can be benchmarked with suitable reference sets.

As a specific example of such an approach, in EMBO’s SourceData project, we are attempting to mine experimentally tested hypotheses based on the information provided in figure legends, build a knowledge graph from this representation and derive metrics that indicate the potential contribution of a result in bridging disparate fields.
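As an illustration only, the sketch below shows one simple way such a bridging indicator could be computed. It is not the SourceData implementation: the representation of a hypothesis as a pair of entities, the function names and the score formula `1 - 1/d` are assumptions made for this example, and the entity pairs are toy data. The idea is to measure how far apart the two entities of a newly tested hypothesis were in the existing graph: the larger the distance, the more the new result connects previously separate areas.

```python
from collections import deque


def build_graph(hypotheses):
    """Undirected graph: nodes are entities, edges are tested relationships."""
    graph = {}
    for a, b in hypotheses:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, source, target):
    """Shortest-path length by breadth-first search.

    Returns None when either entity is unknown or no path exists.
    """
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def bridging_score(graph, hypothesis):
    """Hypothetical score in [0, 1], computed on the graph before the
    new hypothesis is added: 0 for an already known link, 1 when the two
    entities were previously unconnected, 1 - 1/d otherwise."""
    a, b = hypothesis
    d = distance(graph, a, b)
    if d is None:
        return 1.0
    if d == 0:
        return 0.0
    return 1.0 - 1.0 / d


# Toy knowledge graph: entity pairs are for illustration only.
graph = build_graph([("TP53", "MDM2"), ("MDM2", "CDKN1A"), ("ATG5", "ATG7")])
known = bridging_score(graph, ("TP53", "MDM2"))      # already linked
nearby = bridging_score(graph, ("TP53", "CDKN1A"))   # two steps apart
bridge = bridging_score(graph, ("TP53", "ATG5"))     # previously unconnected
```

A distance-based score like this is explainable, since the path (or absence of a path) between the two entities can be shown to the reader, but it ignores the strength and direction of the evidence, which a realistic metric would need to take into account.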

The ideal outcome or output of the project

Demonstration of the feasibility of automated content-centric metrics and their value when juxtaposed with human-based peer review.

Description of the intervention

1. Identification of features related to scientific advance that can in principle be measured provided a suitable computable representation of the results, methodologies and claims reported in a preprint.
2. Assembly of demonstration and benchmark datasets.
3. Development of tools extracting the representations necessary for the computation of metrics defined in #1 and illustrated in #2.

Plan for monitoring project outcome

The metrics should be evaluated by researchers for the ability to produce results that make sense and that are motivated by a clear set of principles. This would favour metrics that are ‘explainable’, at least to some extent, over ‘black box’ metrics learned by machine learning based on complex combinations of features.

What’s needed for success

Additional technology development

Customized AI tools to extract and represent specific aspects of the scientific content.
Analytical methods, including graph-based approaches, to derive ranking metrics.

Feedback, beta testing, collaboration, endorsement

The project would foster a tight collaboration between machine learning specialists, data scientists and professionals in editorial curation.

Funding

Support for an interdisciplinary team composed of:

Machine learning specialists and data scientists
Scientometrics specialists
Editorial curation specialists