Cancer Deep Phenotype Extraction from Electronic Medical Records

NIH RePORTER · NIH · U24 · $924,689

Abstract

Summary: Precise phenotype information is needed to advance translational cancer research, particularly to unravel the effects of genetic, epigenetic, and systems changes on tumor behavior and responsiveness. Examples of phenotypic variables in cancer include tumor morphology (e.g., histopathologic diagnosis), co-morbid conditions (e.g., associated immune disease), laboratory findings (e.g., gene amplification status), specific tumor behaviors (e.g., metastasis), and response to treatment (e.g., effect of a chemotherapeutic agent on tumor). Current models for correlating EMR data with -omics data largely ignore the clinical text, which remains one of the most important sources of phenotype information for cancer patients. Unlocking the value of clinical text has the potential to enable new insights about cancer initiation, progression, metastasis, and response to treatment. We propose further collaboration to enhance the DeepPhe platform with new methods for cancer deep phenotyping. Several aims propose investigation of biomedical information extraction in areas where there has been little or no previous work (e.g., clinical genomics). Visualization of extracted data, usability of the software, and dissemination are also emphasized. A diverse set of oncology studies led by accomplished translational investigators in Breast Cancer, Melanoma, Ovarian Cancer, Colorectal Cancer, and Diffuse Large B-cell Lymphoma will demonstrate the utility of the software. These labs will contribute phenotype variables for extraction, test the utility and usability of the software, and provide the setting for an extrinsic evaluation. The proposed research bridges novel methods to automate cancer deep phenotype extraction from clinical text with emerging standards in phenotype knowledge representation and NLP.
This work is highly aligned with recent calls in the scientific literature to advance scalable and robust methods of extracting and representing phenotypes for precision medicine and translational research.

Key facts

NIH application ID: 10058470
Project number: 1U24CA248010-01A1
Recipient: BOSTON CHILDREN'S HOSPITAL
Principal Investigator: HARRY S HOCHHEISER
Activity code: U24
Funding institute: NIH
Fiscal year: 2020
Award amount: $924,689
Award type: 1
Project period: 2020-09-24 → 2025-08-31