# Drug biomarker resources for precise translational research

> **NIH OT2** · MICHIGAN STATE UNIVERSITY · 2020 · $58,210

## Abstract

One goal of precision medicine is to select the optimal therapy for each patient based on drug
biomarkers as well as disease symptoms and signs [1–3]. Clinicians have begun to treat patients
based on biomarkers: Gefitinib is used to treat lung cancer patients with mutant EGFR, and
Vemurafenib is used to treat melanoma patients with the BRAF V600E mutation. Clinical trials
have also been tailored to recruit patients who carry specific biomarkers. A variety of
preclinical studies have been conducted to discover biomarkers of investigational drugs. Recent
large-scale molecular profiling of cell lines and pharmacogenomics even enables the prediction
of biomarkers in silico. All these confirmed or investigational biomarkers (in silico,
preclinical, in clinic) have emerged as critical components of modern translational research.
However, current knowledge about biomarkers is scattered across different places, including FDA
labels, clinical trial descriptions, and publications, which presents a significant barrier to
integrating it into knowledge graphs to augment reasoning. Therefore, we propose to create a
novel composite knowledge source for biomarker discovery. This new source will improve the
quality and quantity of connections among drugs, biomarkers, diseases, and patients and
synthesize new knowledge for precision medicine research.

To comply with established standards and aid the implementation of data and software standards
for Translator, we will first develop an ontology that defines biomarkers and their
relationships with other biomedical entities. Next, we will leverage state-of-the-art deep
learning methods to extract biomarkers from publications and clinical trials. We will further
adopt a crowd-sourcing approach in which a large pool of medical students manually inspects and
curates the biomarkers prioritized by our machine learning models. The machine learning models
will be iteratively improved through a semi-supervised approach. To ensure the quality of the
knowledge provided, multiple lines of evidence (in silico, preclinical, in clinic), along with
confidence scores, will be associated with each biomarker. In collaboration with NCATS staff, we
will link biomarkers to other available resources to augment reasoning.

We expect the resource to be a critical component of a knowledge graph, enabling novel queries
related to precision medicine and the building of AI models. For example, can drug x work in a
mouse model y where gene z is mutated? In what patient population may drug x be effective? Can
drug x be repurposed to treat condition m where the biomarker of drug x is present? Can we find
new drugs or targets for patients who lack the biomarker for the approved drug? Moreover, the
labeled and well-curated data, along with molecular profiles, provide AI-ready resources for
novel biomarker discovery that could be further validated by bench scientists.

To achieve the goal, we have assembl...
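
The abstract describes biomarker records that carry several lines of evidence with confidence scores, and it lists example questions the resource should answer. The following is a minimal sketch of that idea, not the project's actual schema or code: the class names, field names, evidence labels, threshold, and confidence values are all assumptions made for illustration. Only the two drug-biomarker pairs are taken from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical labels for the three lines of evidence named in the abstract.
EVIDENCE_LEVELS = ("in_silico", "preclinical", "clinical")


@dataclass(frozen=True)
class Evidence:
    level: str         # one of EVIDENCE_LEVELS
    confidence: float  # placeholder score, assumed to lie in [0, 1]


@dataclass
class Association:
    """One drug-biomarker-disease link with its supporting evidence."""
    drug: str
    biomarker: str
    disease: str
    evidence: List[Evidence] = field(default_factory=list)

    def best_confidence(self) -> float:
        # Strongest single line of evidence; 0.0 when nothing is attached.
        return max((e.confidence for e in self.evidence), default=0.0)


def populations_for_drug(
    associations: List[Association], drug: str, min_confidence: float = 0.5
) -> List[Tuple[str, str]]:
    """Sketch of 'in what patient population may drug x be effective?'"""
    return sorted(
        (a.disease, a.biomarker)
        for a in associations
        if a.drug == drug and a.best_confidence() >= min_confidence
    )


def drugs_for_biomarker(associations: List[Association], biomarker: str) -> List[str]:
    """Sketch of a repurposing lookup: which drugs are tied to this biomarker?"""
    return sorted({a.drug for a in associations if a.biomarker == biomarker})


# The two drug-biomarker pairs come from the abstract; the scores are made up.
ASSOCIATIONS = [
    Association("Gefitinib", "mutant EGFR", "lung cancer", [Evidence("clinical", 0.9)]),
    Association("Vemurafenib", "BRAF V600E", "melanoma", [Evidence("clinical", 0.9)]),
]

print(populations_for_drug(ASSOCIATIONS, "Vemurafenib"))
print(drugs_for_biomarker(ASSOCIATIONS, "mutant EGFR"))
```

Taking the strongest single line of evidence is only one possible way to combine scores. A real resource might instead weight clinical evidence above in silico predictions, which the abstract leaves unspecified.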

## Key facts

- **NIH application ID:** 10056488
- **Project number:** 1OT2TR003426-01
- **Recipient organization:** MICHIGAN STATE UNIVERSITY
- **Principal Investigator:** Bin Chen
- **Activity code:** OT2
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $58,210
- **Award type:** 1
- **Project period:** 2020-01-24 → 2020-04-07

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10056488

## Citation

> US National Institutes of Health, RePORTER application 10056488, Drug biomarker resources for precise translational research (1OT2TR003426-01). Retrieved via AI Analytics 2026-05-29 from https://api.ai-analytics.org/grant/nih/10056488. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*