Quantitative disease risk scores for common diseases, with applications to eMERGE

NIH RePORTER · NIH · R21 · $445,500

Abstract

Labeling clinical data from electronic health records (EHR) in health systems requires extensive human expert knowledge, is time-consuming, and leads to inconsistencies in case definitions across different phenotyping algorithms. There is increasing recognition that common diseases are not discrete entities but rather reside on a continuum. We propose to take advantage of the rich phenotype data in electronic health records by developing quantitative disease risk scores based on unsupervised methods that require minimal input from clinicians. We will implement the proposed methods as R packages and make them available to the scientific community. Furthermore, we propose applications to phenotypic and genomic data on approximately 100,000 individuals in the eMERGE network and 500,000 individuals in the UK Biobank. We will design a website containing the results of these analyses, including summary statistics from the GWAS analyses for these phenotypes. We believe the proposed research is timely and novel, and has the potential to facilitate genomic research that uses rich phenotype data from electronic health records.

Key facts

NIH application ID
10151385
Project number
1R21HG012345-01A1
Recipient
COLUMBIA UNIVERSITY HEALTH SCIENCES
Principal Investigator
Iuliana Ionita
Activity code
R21
Funding institute
NIH
Fiscal year
2021
Award amount
$445,500
Award type
1
Project period
2021-09-08 → 2024-08-31