# Chronic lung disease phenotyping and genomics in the Veterans Health Administration

> **NIH VA I01** · VA BOSTON HEALTH CARE SYSTEM · 2024 · —

## Abstract

Chronic lung diseases (CLDs), including asthma, chronic obstructive pulmonary disease (COPD), and interstitial
lung disease (ILD), cause significant functional limitation and disability and were, collectively, the fourth leading
cause of death in the United States in 2019. Veterans are enriched for environmental exposures that contribute
to the pathogenesis of CLDs (e.g., smoking, environmental toxins) and have a higher burden of CLD relative to
the general population. However, significant heterogeneity in CLD development and progression exists.
Investigations into the genomic contributions towards CLD susceptibility within the Veterans Health
Administration (VHA) have been limited by the lack of robust CLD phenotypes. Key challenges in clinically based
CLD phenotyping include the lack of blood-based biomarkers, variable implementation of diagnostic
testing (e.g., underutilization of spirometry), and inconsistent availability of testing results in the VA electronic
health record (EHR). These barriers result in frequent misclassification and present challenges in identifying
controls as well as cases for large-scale epidemiological and genomic studies.

To address these knowledge gaps, we propose a multi-faceted approach to CLD phenotyping and
validation, followed by genome-wide association studies (GWAS), construction of polygenic risk scores, and
examinations of gene-by-environment interactions including pharmacogenomics within the Million Veteran
Program (MVP). First, we have developed a novel natural language processing (NLP)-boosted EHR-based
phenotyping algorithm which will generate quantitative probabilities for the presence and absence of multiple
CLDs (COPD, emphysema subtype, asthma, interstitial lung abnormalities (ILA), fibrosis subtype) for all VHA
users (~16.8 million) through 2018. Following algorithm optimization and internal validation against 500
gold-standard charts (already adjudicated in duplicate), prospective validation using mortality and respiratory-related
healthcare utilization data collected after 2018 will be performed (Aim 1). Second, we propose independent
phenotyping through quantitative imaging analysis (QIA) of chest computed tomography (CT) data available in
a subset of participants enrolled in MVP (Aim 2). In preliminary work, a secure prototype pipeline behind the VA
firewall for the analysis of archived clinical chest CT data using VA Technical Reference Manual (TRM)-approved
software platforms (3D Slicer, Chest Imaging Platform [13,14]) has been established. For the current proposal,
through a national network of collaborating VA pulmonary investigators (J. Curtis, VISN10; C. Wendt, VISN23;
V. Fan, VISN20; C. Wells, VISN7; F. Kheradmand, VISN16), full-resolution clinical chest CT data from a subgroup
of individuals enrolled in MVP (n=6,000-9,000) will be analyzed to generate objective, quantitative
measurements of parenchymal lung disease (e.g., percent emphysema and ILA). Each of these QIA-based
phenotypes will b...
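
The percent emphysema measurement named in the abstract is commonly computed as the share of lung voxels whose CT attenuation falls below a fixed threshold. The following is a minimal sketch of that arithmetic only. The -950 Hounsfield unit threshold is a widely used convention and an assumption here, not something the abstract states, and `percent_emphysema` is a hypothetical helper, not part of 3D Slicer or the Chest Imaging Platform.

```python
def percent_emphysema(lung_hu, threshold_hu=-950.0):
    """Percent of lung voxels with attenuation below threshold_hu.

    lung_hu: iterable of Hounsfield unit values for voxels already
             segmented as lung (segmentation is out of scope here).
    threshold_hu: assumed cutoff; -950 HU is a common convention,
                  not a value taken from the abstract.
    """
    values = list(lung_hu)
    if not values:
        raise ValueError("no lung voxels supplied")
    # Count voxels strictly below the threshold (low-attenuation area).
    low = sum(1 for hu in values if hu < threshold_hu)
    return 100.0 * low / len(values)
```

For example, four lung voxels at -1000, -960, -900, and -800 HU would give two of four below the assumed threshold.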

## Key facts

- **NIH application ID:** 10583622
- **Project number:** 1I01BX005957-01A1
- **Recipient organization:** VA BOSTON HEALTH CARE SYSTEM
- **Principal Investigator:** Emily S Wan
- **Activity code:** I01
- **Funding institute:** VA
- **Fiscal year:** 2024
- **Award amount:** —
- **Award type:** 1
- **Project period:** 2024-01-01 → 2027-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10583622

## Citation

> US National Institutes of Health, RePORTER application 10583622, Chronic lung disease phenotyping and genomics in the Veterans Health Administration (1I01BX005957-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10583622. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
