Mitigating Genomic Research Disparities in the Million Veteran Program

NIH RePORTER · VA · I01

Abstract

To study and improve healthcare outcomes for our Veterans, the Million Veteran Program (MVP) has collected genetic and electronic health record (EHR) data from more than 650,000 Veterans. As the largest biobank in the world, the MVP presents a unique opportunity to advance the detection and treatment of serious mental illnesses (SMI) in Veterans. Nevertheless, prior research indicates that translational research is often inequitable across demographic groups, because of differences in sample size and healthcare disparities that affect the quality of the EHR data. Consequently, this proposal leverages novel statistical approaches to boost effective sample sizes across demographic groups. To this end, genome-wide and transcriptome-wide association studies will be meta-analyzed with PheMED, a novel meta-analytic approach that can integrate data of heterogeneous quality, boosting sample sizes to detect novel genetic risk loci correlated with SMI-related traits. First, PheMED will be used to integrate results from different trait definitions within a given demographic group for 21 SMI-related traits. For example, bipolar disorder cases can be defined using different code-count thresholds applied to the EHR. By taking a data-driven approach to integrating these thresholds, PheMED boosts the number of cases across demographic groups, such as those defined by sex and ancestry. These PheMED meta-analyses will facilitate the discovery of new genetic risk loci linked to SMI-related traits. The PheMED genome-wide and transcriptome-wide association meta-analyses will then be used to create polygenic risk scores on a 50% hold-out set across demographic groups.
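As an illustration of the threshold-based case definitions mentioned above, the following minimal Python sketch derives several candidate case sets from per-patient code counts. The function name, the thresholds (1, 2, 3), and the toy patient IDs are hypothetical and chosen only for illustration; the abstract does not specify which thresholds or codes are used, and this is not PheMED itself.

```python
from collections import Counter


def define_cases(code_events, thresholds=(1, 2, 3)):
    """Map each threshold to the set of patients with at least that many codes.

    code_events: iterable of patient IDs, one entry per qualifying code
    recorded in the EHR (hypothetical input format).
    """
    counts = Counter(code_events)
    return {
        t: {patient for patient, n in counts.items() if n >= t}
        for t in thresholds
    }


# Hypothetical toy data: p1 has three qualifying codes, p2 has two, p3 has one.
events = ["p1", "p1", "p1", "p2", "p2", "p3"]
cases = define_cases(events)
```

A stricter threshold yields fewer but presumably cleaner cases, while a looser one yields more cases of lower average quality. Each resulting case set could feed a separate association analysis, which is the kind of heterogeneous-quality input the abstract says PheMED is meant to integrate.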
As the prevalence of many SMI-related traits depends on sex, this proposal will then leverage existing imputed genetic data to conduct association tests on the X chromosome, improving polygenic risk stratification and the detection of genetic variants linked to SMI-related phenotypes. Code implementing this pipeline will be shared with the broader MVP research community to promote equity in research and advance targeted genetic findings for both male and female Veterans. Subsequently, PheMED will be used to synthesize findings across demographic groups. Notably, PheMED can also adjust for diluted genetic effects driven by hidden gene-environment interactions and by phenotype data quality issues that are confounded with sex or ancestry, such as cryptic healthcare disparities encoded in the EHR data. These cross-demographic meta-analyses will then be used to discover new genetic variants and to generate improved polygenic risk scores that better detect SMI-related traits. Finally, this proposal will construct data-driven phenotypes for three serious mental illness traits: schizophrenia, bipolar disorder and major depressive disorder...
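For orientation, the sketch below shows the standard fixed-effect inverse-variance meta-analysis, the generic baseline for pooling per-group effect estimates for a single variant. It is not PheMED: it assumes every input measures the same undiluted effect and makes no adjustment for data quality. The function name and the example effect sizes and standard errors are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Pool (beta, se) pairs with fixed-effect inverse-variance weights.

    Returns (pooled_beta, pooled_se, z_score).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-group estimates for one variant: (effect size, standard error).
group_estimates = [(0.10, 0.05), (0.20, 0.10)]
pooled_beta, pooled_se, z = inverse_variance_meta(group_estimates)
# pooled_beta is approximately 0.12: the more precise estimate gets 4x the weight.
```

Because the weights depend only on the standard errors, a large group with noisy phenotyping can dominate the pooled estimate and pull it toward a diluted effect, which is the situation the abstract says PheMED is designed to adjust for.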

Key facts

NIH application ID: 10924783
Project number: 1I01BX006500-01
Recipient: JAMES J PETERS VA MEDICAL CENTER
Principal Investigator: David Burstein
Activity code: I01
Funding institute: VA
Fiscal year: 2024
Award amount: (not listed)
Award type: 1
Project period: 2024-07-01 → 2028-06-30