# Deep phenotyping in Electronic Health Records for Genomic Medicine

> **NIH R01** · COLUMBIA UNIVERSITY HEALTH SCIENCES · 2020 · $74,999

## Abstract

Sharable, innovative and scalable methods for abstracting relevant characteristic patient phenotypes from
electronic health records (EHRs) and for systematically understanding disease relationships are critical for
accomplishing precise disease diagnoses and personalized disease prevention and treatment for patients.
As of May 28, 2020, there are 5,716,271 confirmed 2019 Novel Coronavirus (COVID-19) cases worldwide,
including 1,699,933 cases in the United States, and 356,124 deaths across over 200 countries, areas, and
territories including 100,442 deaths in the United States, with the numbers continually climbing. The pandemic
has had profound economic, social, and public health impact. As Columbia University Irving Medical Center
(CUIMC) has been fighting the virus on the frontline in the epicenter of New York City and treating more than
4,100 SARS-CoV-2 positive patients, we aim to address the urgent COVID-19 Public Health need by
developing sharable phenotyping methods to identify and characterize COVID-19 cases using our EHR data
and multiple data standards, including the Observational Medical Outcomes Partnership (OMOP) Common
Data Model (CDM) and the Human Phenotype Ontology (HPO), and generate novel knowledge about COVID-19,
such as its risk factors, disease subtypes, and temporal clinical courses.

Our specific aims for this supplement are as follows:

- **Extension to the original Aim 1:** Develop and validate scalable and sharable approaches to abstracting characteristic phenotypes of COVID-19 from both structured and unstructured EHR data, and standardize the concept representations of these EHR phenotypes using widely adopted data standards, including the OMOP CDM, HPO, SNOMED-CT, UMLS, and RxNorm.
- **Extension to the original Aim 3:** Develop and validate methods for temporal phenotyping of COVID-19 and methods for identifying disease subtypes with varying clinical outcomes among heterogeneous populations, using deep characteristic EHR phenotypes of COVID-19.

We will share the resulting methods and knowledge with the broad scientific community and the nation.
We will also leverage this supplement to create research and training opportunities for postdocs and graduate
students from biomedical informatics, data science, and computer science, advancing interdisciplinary
collaborations in data science and biomedical informatics to combat COVID-19 and other health problems.

## Key facts

- **NIH application ID:** 10175742
- **Project number:** 3R01LM012895-03S1
- **Recipient organization:** COLUMBIA UNIVERSITY HEALTH SCIENCES
- **Principal Investigator:** CHUNHUA WENG
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $74,999
- **Award type:** 3 (supplement)
- **Project period:** 2020-07-01 → 2021-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10175742

## Citation

> US National Institutes of Health, RePORTER application 10175742, Deep phenotyping in Electronic Health Records for Genomic Medicine (3R01LM012895-03S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10175742. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
