# Discovering and Applying Knowledge in Clinical Databases

> **NIH R01** · COLUMBIA UNIVERSITY HEALTH SCIENCES · 2020 · $37,243

## Abstract

PROJECT SUMMARY
The long-term goal of our parent project, “Discovering and applying knowledge in clinical
databases,” is to learn from data in the electronic health record (EHR) and to apply that
knowledge to understand and improve health. Its first two aims are as follows: (1) Taking a
knowledge engineering approach, study the effect of preprocessing and analytic choices on
reducing health care process bias, and using machine learning techniques, learn more about
health care process bias. (2) Taking a more empirical approach, use dynamic latent factor
modeling and variational inference to accommodate health care process bias, learning how a
patient's health state and health processes affect censoring, exploiting information from many
variables at once.
For this supplement, we plan to focus on COVID-19. The emergence of the virus SARS-CoV-2
and its corresponding disease, COVID-19, has led to about 100,000 deaths in the US and great
economic loss and human suffering. Carrying out randomized clinical trials to assess treatment
is essential but stymied by the difficulty of recruiting sufficient patients and the urgency of the
question. Clinical databases are beginning to fill with COVID-19 patients, but the acuity and
severity of the disease make drawing causal conclusions much more difficult, resulting in a
literature filled with conflicting observational studies.
We propose to employ structural causal modeling in the study of COVID-19, engaging expertise
in such modeling. We will use the Columbia University Irving Medical Center's clinical data
warehouse with over 6000 patients testing positive for SARS-CoV-2 and the Observational Health Data
Science and Informatics (OHDSI) network, which includes most COVID-19 patients in Korea,
Spain, the US Veterans Administration, Stanford, Tufts, and new sites coming on board.

## Key facts

- **NIH application ID:** 10175300
- **Project number:** 3R01LM006910-20S1
- **Recipient organization:** COLUMBIA UNIVERSITY HEALTH SCIENCES
- **Principal Investigator:** GEORGE M HRIPCSAK
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $37,243
- **Award type:** 3
- **Project period:** 2000-04-01 → 2024-02-28

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10175300

## Citation

> US National Institutes of Health, RePORTER application 10175300, Discovering and Applying Knowledge in Clinical Databases (3R01LM006910-20S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10175300. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
