# PheBC: bias correction methods for EHR derived phenotype

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2021 · $373,240

## Abstract

Project Summary
In response to PAR-18-896, the overarching goal of this proposal is to fully develop
a joint effort among statisticians, medical informaticians, and clinicians, focused on developing
a rigorous bias correction framework through modern knowledge engineering and data-driven
statistical modeling, to reduce bias in, and improve the reproducibility of, research driven by
health system data.
This proposal has three aims: (1) develop a novel prior-knowledge-guided integrated
likelihood approach that enables bias correction by incorporating prior knowledge of phenotyping
accuracy; (2) develop methods and algorithms to account for EHR phenotyping errors in both
outcomes and predictors; and (3) validation, application, and software development. We will apply
the proposed bias correction methods to several EHR datasets to replicate existing findings and
investigate new hypotheses in multiple datasets at the University of Texas and the University of
Pennsylvania. We will also develop software for the proposed methods to facilitate ongoing
EHR-based clinical studies.

## Key facts

- **NIH application ID:** 10101041
- **Project number:** 1R01LM013519-01
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Yong Chen
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $373,240
- **Award type:** 1
- **Project period:** 2021-09-01 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10101041

## Citation

> US National Institutes of Health, RePORTER application 10101041, PheBC: bias correction methods for EHR derived phenotype (1R01LM013519-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10101041. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
