# COVID-19 disease course analysis using multi-site large-scale EHR data

> **NIH R21** · OHIO STATE UNIVERSITY · 2022 · $170,884

## Abstract

Since its first case reported in December 2019, the coronavirus disease-2019 (COVID-19) has caused a pandemic in 188 countries/regions, and has precipitated an unprecedented health, economic and social crisis. In order to cope with the volatile dynamic and severity of the pandemic, it is imperative that we characterize the various clinical courses of COVID-19 infection, and determine whether and how demographic, clinical and other variables influence them. Knowledge of the disease's transmission, symptomatology, clinical course, treatment and outcomes is rapidly evolving based on many sources. An important source for advancing this knowledge is data from electronic health records (EHR) and health information exchanges (HIE) because they can provide a real-time, unvarnished view of the disease. Using large-scale, well-integrated and rich EHR data enables comprehensive profiling and quantification of the COVID-19 disease course that can directly inform clinical practice. The long-term goal of our research is to develop Artificial Intelligence (AI) tools to facilitate access to and analysis of clinical data. The goal of this application is to develop effective algorithms and tools to mine clinical data to categorize disease courses of COVID-19, and determine the effect of clinical and other variables associated with them. We will develop our algorithms using data from a large and comprehensive health information exchange, the Indiana Network for Patient Care (INPC), which has about 40,000 COVID-19 patients and fairly complete EHR data about them. We will evaluate the algorithms against other data sets, including EHR data from the OSU Wexner Medical Center and the National COVID Cohort Collaborative (N3C). The specific aims of this project are to (1) develop COVID-19 disease course groupings, (2) relate comorbidities and other clinical variables to the COVID-19 disease course, and (3) validate the developed algorithms on N3C data. This proposal is significant because the methods developed in this project have the potential to significantly increase our capability for computational analysis of large and rich patient data during the pandemic and beyond; the knowledge derived from our comprehensive profiling of COVID-19 courses over large, inclusive patient populations supported by rich EHR data can positively impact clinical practice; and the tools developed in this project will be released to the public as a free COVID-19 research resource. It is innovative because our methods integrate novel methods such as patient clustering using clinical variables and disease progression trajectories, and patient trajectory comparison, with established univariate and predictive analysis; our primary approach will leverage the oldest and one of the country's largest HIEs to derive detailed and comprehensive knowledge about a large patient population; and the strong preliminary data generated by this project can help improve COVID-...

## Key facts

- **NIH application ID:** 10380682
- **Project number:** 5R21LM013678-02
- **Recipient organization:** OHIO STATE UNIVERSITY
- **Principal Investigator:** XIA NING
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $170,884
- **Award type:** 5
- **Project period:** 2021-04-01 → 2024-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10380682

## Citation

> US National Institutes of Health, RePORTER application 10380682, COVID-19 disease course analysis using multi-site large-scale EHR data (5R21LM013678-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10380682. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
