Drug biomarker resources for precise translational research

NIH RePORTER · NIH · OT2 · $58,210

Abstract

One goal of precision medicine is to select the optimal therapy for each patient based on drug biomarkers as well as disease signs and symptoms [1–3]. Clinicians have begun to treat patients based on biomarkers. Examples include gefitinib, used to treat lung cancer patients with mutant EGFR, and vemurafenib, used to treat melanoma patients with the BRAF V600E mutation. Clinical trials have also been tailored to recruit patients who carry specific biomarkers. A variety of preclinical studies have been conducted to discover biomarkers of investigational drugs. Recent large-scale molecular profiling of cell lines, combined with pharmacogenomics, even enables the prediction of biomarkers in silico. All of these confirmed or investigational biomarkers (in silico, preclinical, in clinic) have emerged as critical components of modern translational research. However, current knowledge about biomarkers is scattered across different sources, including FDA labels, clinical trial descriptions, and publications, which presents a significant barrier to integrating it into knowledge graphs to augment reasoning. Therefore, we propose to create a novel composite knowledge source for biomarker discovery. This new source will improve the quality and quantity of connections among drugs, biomarkers, diseases, and patients, and will synthesize new knowledge for precision medicine research. To comply with established standards and aid the implementation of data and software standards for Translator, we will first develop an ontology to define biomarkers and their relationships with other biomedical entities. Next, we will leverage state-of-the-art deep learning methods to extract biomarkers from publications and clinical trials. We will further adopt a crowd-sourcing approach in which a large pool of medical students manually inspects and curates the biomarkers prioritized by our machine learning models. The machine learning models will be iteratively improved through a semi-supervised approach.
To ensure the quality of the knowledge provided, multiple lines of evidence (in silico, preclinical, in clinic), along with confidence scores, will be associated with each biomarker. In collaboration with NCATS staff, we will link biomarkers to other available resources to augment reasoning. We expect that the resource will be a critical component of a knowledge graph, enabling novel queries related to precision medicine and the building of AI models. For example, can drug x work in a mouse model y where gene z is mutated? In what patient population may drug x be effective? Can drug x be repurposed to treat condition m where the biomarker of drug x is present? Can we find new drugs or targets for patients who lack the biomarker for the approved drug? Moreover, the labeled and well-curated data, along with molecular profiles, provide AI-ready resources for novel biomarker discovery that could be further validated by bench scientists. To achieve the goal, we have assembl...

Key facts

NIH application ID: 10056488
Project number: 1OT2TR003426-01
Recipient: MICHIGAN STATE UNIVERSITY
Principal Investigator: Bin Chen
Activity code: OT2
Funding institute: NIH
Fiscal year: 2020
Award amount: $58,210
Award type: 1
Project period: 2020-01-24 → 2020-04-07