# Mitigating Genomic Research Disparities in the Million Veteran Program

> **NIH VA I01** · JAMES J PETERS VA MEDICAL CENTER · 2024 · —

## Abstract

To study and improve healthcare outcomes for our Veterans, the Million Veteran Program (MVP) has collected
genetic and electronic health record (EHR) data from more than 650,000 Veterans. As the largest biobank in
the world, the MVP presents a unique opportunity to advance the detection and treatment of serious mental
illnesses (SMI) in Veterans. Nevertheless, prior research has indicated that translational research is often
not equitable across demographic groups, because of differences in sample sizes and healthcare disparities
that affect the quality of the EHR data. Consequently, this proposal leverages novel statistical
approaches to boost the effective sample sizes across demographic groups.

To achieve these goals, genome-wide and transcriptome-wide association studies will be meta-analyzed using
a novel meta-analytic approach, PheMED, that can integrate data of heterogeneous quality to boost sample
sizes and detect novel genetic risk loci correlated with SMI-related traits. First, PheMED will be leveraged to
integrate results from different trait definitions within a given demographic group for 21 different SMI-related
traits. For example, cases of bipolar disorder can be defined using different code count
thresholds, as recorded in the EHR. By leveraging a data-driven approach for integrating different thresholds
for defining bipolar disorder cases, PheMED boosts the number of cases across different demographic groups,
such as those defined by sex and ancestry. These PheMED meta-analyses will facilitate the discovery of new
genetic risk loci linked to SMI-related traits. The PheMED genome-wide and transcriptome-wide association
meta-analyses will then be employed to create polygenic risk scores on a 50% held-out set across different
demographic groups. As the prevalence of many SMI-related traits depends on sex, this
proposal will then leverage existing imputed genetic data to conduct association tests on the X chromosome to
improve polygenic risk stratification and detection of genetic variants linked to SMI-related phenotypes. Code
for implementing this pipeline will be shared with the broader MVP research community to promote equity
in research and advance targeted genetic findings for both male and female Veterans.

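
The idea of defining bipolar disorder cases at several code count thresholds can be illustrated with a minimal sketch. This is not the MVP pipeline or PheMED itself: the function names, the patient IDs, and the toy counts are all hypothetical, and the sketch assumes each patient's number of qualifying diagnosis codes has already been tallied from the EHR.

```python
def define_cases(code_counts, threshold):
    """Return the patient IDs with at least `threshold` qualifying codes.

    code_counts: dict mapping a patient ID to the number of qualifying
    diagnosis codes recorded for that patient (hypothetical input format).
    """
    return {pid for pid, n in code_counts.items() if n >= threshold}


def cases_by_threshold(code_counts, thresholds):
    """Build one case definition per code count threshold."""
    return {t: define_cases(code_counts, t) for t in thresholds}


# Toy data, invented for illustration only.
code_counts = {"P1": 1, "P2": 3, "P3": 0, "P4": 2, "P5": 5}

definitions = cases_by_threshold(code_counts, thresholds=(1, 2, 3))
for threshold, cases in definitions.items():
    print(f"threshold >= {threshold}: {len(cases)} cases {sorted(cases)}")
```

A stricter threshold yields fewer cases, so each definition trades sample size against how confidently a patient is labeled. In this sketch each definition would feed its own association analysis, and those results are what a meta-analysis would later combine.
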
Subsequently, PheMED will be used to synthesize findings across different demographic groups. Notably,
PheMED can also adjust for diluted genetic effects that are driven by hidden gene-environment interactions
and phenotype data quality issues that are confounded with sex or ancestry, such as cryptic healthcare
disparities coded in the EHR data. These cross-demographic meta-analyses will then be used to discover new
genetic variants and to generate improved polygenic risk scores to better detect SMI-related traits.

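
To make the notion of adjusting for diluted genetic effects concrete, the sketch below uses a textbook inverse-variance weighted estimator extended with a per-study scaling factor. It is not the PheMED algorithm: it assumes the dilution factors are already known, whereas the abstract describes a data-driven approach, and the function name, the factor values, and the effect sizes are hypothetical. The assumed model is that study k reports an estimate centered on c_k times the true effect, with standard error s_k.

```python
import math


def dilution_adjusted_ivw(betas, ses, dilutions):
    """Combine per-study effect estimates under an assumed dilution model.

    Assumes beta_k ~ Normal(c_k * beta, s_k**2), where c_k is a known
    dilution factor (1.0 means no dilution). The weighted least squares
    estimate of beta is
        sum(c_k * beta_k / s_k**2) / sum(c_k**2 / s_k**2)
    with standard error 1 / sqrt(sum(c_k**2 / s_k**2)).
    """
    if not (len(betas) == len(ses) == len(dilutions)):
        raise ValueError("inputs must have the same length")
    numerator = sum(c * b / s**2 for b, s, c in zip(betas, ses, dilutions))
    information = sum(c**2 / s**2 for s, c in zip(ses, dilutions))
    return numerator / information, 1.0 / math.sqrt(information)


# Toy numbers, invented for illustration only.
betas = [0.10, 0.05]      # per-study effect estimates for one variant
ses = [0.02, 0.02]        # per-study standard errors
dilutions = [1.0, 0.5]    # second study assumed to observe half the effect

adjusted, adjusted_se = dilution_adjusted_ivw(betas, ses, dilutions)
naive, naive_se = dilution_adjusted_ivw(betas, ses, [1.0, 1.0])
print(f"adjusted: {adjusted:.3f} (SE {adjusted_se:.4f})")
print(f"naive:    {naive:.3f} (SE {naive_se:.4f})")
```

With every dilution factor set to 1.0 the estimator reduces to the standard fixed-effect inverse-variance weighted meta-analysis. In the toy data that naive estimate is pulled toward the diluted study, while the adjusted estimate recovers the effect size of the undiluted one.
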
Finally, this proposal will construct data-driven phenotypes for three serious mental illness traits: schizophrenia,
bipolar disorder and major depressive disorder...

## Key facts

- **NIH application ID:** 10924783
- **Project number:** 1I01BX006500-01
- **Recipient organization:** JAMES J PETERS VA MEDICAL CENTER
- **Principal Investigator:** David Burstein
- **Activity code:** I01
- **Funding institute:** VA
- **Fiscal year:** 2024
- **Award amount:** —
- **Award type:** 1
- **Project period:** 2024-07-01 → 2028-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10924783

## Citation

> US National Institutes of Health, RePORTER application 10924783, Mitigating Genomic Research Disparities in the Million Veteran Program (1I01BX006500-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10924783. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
