Micropublications for Automating Genome Sequence Variant Interpretation from Medical Literature

NIH RePORTER · NIH · R44 · $855,024

Abstract

PROJECT SUMMARY: Accurate and efficient interpretation of genomic variants for clinical decision making depends on ready access to, and extraction of information from, the medical literature. The sheer number of potentially relevant articles that must be examined during this process makes clinical interpretation time-consuming, error-prone, and highly user-dependent, which poses a significant challenge to its accuracy and reproducibility. To address this, we have developed the Mastermind Genomic Search Engine, a commercial database that automatically organizes disease, gene, and variant information from the medical literature by systematically indexing millions of scientific articles. Mastermind is used by over 9,100 variant scientists in more than 100 countries to interpret genetic variants more quickly in clinical settings. In Phase I of this project, we developed and tested a micropublication platform within Mastermind that assembles literature curation along with population frequency data, computational predictions of pathogenicity, and automated ACMG/AMP classifications. The platform improves the speed of variant interpretation by more than 70% and increases the sensitivity of these results by 2-20x. The present proposal seeks to build on the success of Phase I by 1) integrating the micropublication platform into Mastermind with migration of collaborative features for community-based evaluation of variant interpretations; 2) optimizing and improving automated variant interpretation/prioritization of articles and implementing a rigorous quality assurance process; and 3) using these improvements to curate all evidence for all variants in all genes comprising the entire human genome, beginning with the clinical exome.
Integration of the pre-curated genome data in the micropublication platform will result in Mastermind Enterprise, allowing for immediate and accurate genome-wide variant interpretations with collaborative curation in real time at the point of interaction with source material (i.e., individual references). This work will mitigate reproducibility challenges plaguing other large-scale crowd-sourced projects, including those undertaken by groups like NIH’s ClinVar and QIAGEN’s HGMD. In addition, our novel approach will not suffer from poor sensitivity because it relies on a comprehensive source of medical literature pre-annotated based on genetic content. This work will permit dramatic scaling of variant interpretation activities and allow for complete and accurate curation of the entire human genome within 2 years, a feat that could not be completed using current manual methods for variant interpretation. Mastermind Enterprise will be revolutionary in the genomics industry and will represent a natural next step that builds on the achievements of the Human Genome Project and the reduced cost of next-generation sequencing. It will substantially improve diagnostic rates and accuracy in the clinic, especially in rare disease,...

Key facts

NIH application ID: 10255401
Project number: 2R44HG010446-02
Recipient: GENOMENON, INC.
Principal Investigator: Mark Julin Kiel
Activity code: R44
Funding institute: NIH
Fiscal year: 2021
Award amount: $855,024
Award type: 2
Project period: 2019-05-01 → 2023-08-31