# Informatics Approach to Identification and Deep Phenotyping of PASC Cases

> **NIH R21** · UNIVERSITY OF SOUTH CAROLINA AT COLUMBIA · 2022 · $217,865

## Abstract

Reports are increasing of persistent symptoms and multi-organ, multi-system manifestations (e.g.,
pulmonary, cardiovascular, renal, and neurological) among individuals who have recovered from the acute phase
of COVID-19, denoted as Post-Acute Sequelae of SARS-CoV-2 infection (PASC). Given that 76.7 million people
are known to have been infected in the US as of February 2022, millions of people will potentially experience
PASC. This projected disease burden will have a profound public health impact with respect to patients' clinical
outcomes and US health systems during post-COVID-19 care. Timely identification of individuals with PASC
from existing COVID-19 cohorts and newly identified COVID-19 patients is urgently needed for PASC clinics and
longitudinal cohort studies on PASC. Building on biomedical informatics methodologies, we propose a high-
throughput and semi-supervised Deep Phenotyping approach to identifying individuals with PASC and
characterizing their phenotypes. Our approach is based on a graph representation model built
on the South Carolina COVID-19 Cohort (S3C), funded by the National Institute of Allergy and Infectious
Diseases (NIAID) (R01A127203-4S1). S3C (n ≈ 1,400,000 COVID-19 patients as of February 2022) is a
multi-modal data repository consisting of EHR, health systems data, community-based health services data, and
claims data, with the complete temporal trajectory of every datum at the individual level. Building on the graph
model, we will detect phenotypes of candidate PASC patients by using unsupervised clustering algorithms. We
will then identify and validate clinically plausible PASC cases and corresponding phenotypes by incorporating
clinical evaluation and supervised algorithms. This study will result in a high-throughput algorithm application
for identifying and characterizing PASC cases from COVID-19 EHR cohorts. The resulting EHR and machine
learning models will be interpretable and generalizable, and will form a foundation for testing and implementation in
statewide and national post-COVID clinics and programs.
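
The abstract does not specify the graph construction or the algorithms used, so the following is only a minimal sketch of the general pattern it describes: build a patient graph, cluster it without labels, then use a small set of clinician-reviewed cases to label the clusters. Everything here is an assumption made for illustration: the toy records, the ICD-10-style codes, the Jaccard similarity threshold, the use of connected components as the clustering step, and the label names such as `PASC-respiratory`. None of it is drawn from S3C or from the project's actual method.

```python
from itertools import combinations

# Hypothetical toy data: patient id -> set of post-acute diagnosis codes.
# These records are synthetic and are not drawn from S3C.
patients = {
    "p1": {"R06.02", "R05", "R53.83"},
    "p2": {"R06.02", "R05"},
    "p3": {"R06.02", "R53.83", "R05"},
    "p4": {"I49.9", "R00.2"},
    "p5": {"I49.9", "R00.2", "R07.9"},
    "p6": {"Z00.00"},
}


def jaccard(a, b):
    """Jaccard similarity between two code sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(records, threshold=0.5):
    """Link patients whose code sets have Jaccard similarity >= threshold."""
    graph = {pid: set() for pid in records}
    for u, v in combinations(records, 2):
        if jaccard(records[u], records[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Unsupervised step: treat each connected component as a candidate cluster."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def propagate_labels(clusters, seed_labels):
    """Semi-supervised step: spread reviewed seed labels within each cluster
    by majority vote; clusters without any seed stay 'unreviewed'."""
    labels = {}
    for cluster in clusters:
        votes = [seed_labels[p] for p in cluster if p in seed_labels]
        # Sorting the candidates makes tie-breaking deterministic.
        label = max(sorted(set(votes)), key=votes.count) if votes else "unreviewed"
        for p in cluster:
            labels[p] = label
    return labels


graph = build_graph(patients)
clusters = connected_components(graph)

# Hypothetical clinician-reviewed seed cases.
seeds = {"p1": "PASC-respiratory", "p4": "PASC-cardiac"}
labels = propagate_labels(clusters, seeds)

for pid in sorted(labels):
    print(pid, labels[pid])
```

In this toy run the respiratory-coded patients and the cardiac-coded patients fall into separate clusters and inherit the label of their reviewed seed, while the patient with no shared codes remains unreviewed. A real pipeline would also need to account for the temporal ordering of events, which this sketch ignores.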

## Key facts

- **NIH application ID:** 10574753
- **Project number:** 1R21AI169139-01A1
- **Recipient organization:** UNIVERSITY OF SOUTH CAROLINA AT COLUMBIA
- **Principal Investigator:** Xiaoming Li
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $217,865
- **Award type:** 1
- **Project period:** 2022-09-06 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10574753

## Citation

> US National Institutes of Health, RePORTER application 10574753, Informatics Approach to Identification and Deep Phenotyping of PASC Cases (1R21AI169139-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10574753. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
