# Large-scale transcriptome and epigenome association analysis across multiple traits

> **NIH VA I01** · JAMES J PETERS VA MEDICAL CENTER · 2024

## Abstract

Precision Psychiatry is an emerging approach that considers patients’ characteristics to customize prevention
and treatment for serious mental illness. The Million Veteran Program (MVP) is the largest and most
comprehensive biobank in the world, currently involving multi-ancestry genetic data from more than 650,000
Veterans and highly dense electronic health record information that fully captures the clinical characteristics of
each participant. Given the high prevalence of serious mental illness among our Veterans, MVP provides a
unique opportunity to perform large-scale genetic discovery that will further our understanding of the
pathophysiology of serious mental illness and promote Precision Psychiatry. While well-powered genome-wide
association studies (GWAS) have identified multiple risk variants across serious mental illness, there have been
limited conclusive findings on the functional relevance of most discovered loci due to small effect sizes, overlap
with non-coding regions of the genome, and unclear mechanisms through which they act. Our group and others
have shown that a large portion of phenotypic variability in disease risk can be explained by regulatory variants
with cell type specificity, i.e. genetic variants that affect epigenetic mechanisms and the expression levels of
genes. Studying gene expression and epigenome changes directly in MVP samples is not feasible as such data
are not available. To overcome this limitation, we propose to take advantage of large-scale datasets with
genotyping and multiscale molecular profiling that our group and others have generated in human brain tissue
and apply machine learning approaches to directly impute genome-wide transcriptomes, epigenomes and
proteomes in MVP samples using the existing MVP genotypes. The primary goals of our project are threefold:
First, imputed MVP transcriptomes, epigenomes and proteomes will be meta-analyzed into single tissue-specific
gene dysregulation scores for each individual via a novel method, called PolyXcan, which leverages a
data-driven, correlation-aware meta-analytical framework and performs joint multi-omics-wide association studies. For
each serious mental illness, key gene drivers and molecular pathways will be identified with a structured,
interpretable deep learning approach, and gene-gene interaction effects will be identified by leveraging patient
subtypes derived from semi-supervised graph-based clustering methods; both of these approaches are only possible with
well-powered individual-level (genotypic and phenotypic) data of the scale that exists in MVP, and we expect them to
enhance efforts for gene target prioritization and drug discovery. Second, imputed gene dysregulation for each
individual in MVP will be integrated with perturbagen reference libraries (describing the effect of therapeutic
compounds on gene expression) to identify the extent to which compounds could be therapeutic by antagonizing
the predicted gene dysregulation. We have validated ...
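
The two ideas the abstract describes, imputing molecular traits from genotypes and scoring compounds by how strongly they oppose the predicted dysregulation, can be illustrated with a minimal Python sketch. This is not PolyXcan or the project's pipeline. It assumes a simple additive linear predictor and a negated Pearson correlation as the antagonism measure, neither of which the abstract specifies, and every function name, SNP ID, gene label, and weight below is hypothetical.

```python
from math import sqrt


def impute_expression(dosages, weights):
    """Predict one gene's genetically regulated expression from allele dosages.

    Assumption: an additive linear model; SNPs absent from `dosages`
    contribute nothing to the prediction.
    """
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def antagonism_score(dysregulation, compound_signature):
    """Score how strongly a compound opposes a predicted dysregulation profile.

    Assumption: the score is the negated Pearson correlation over shared
    genes, so +1 means the compound perfectly reverses the profile.
    """
    shared = sorted(set(dysregulation) & set(compound_signature))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    r = pearson([dysregulation[g] for g in shared],
                [compound_signature[g] for g in shared])
    return -r


# Hypothetical inputs, for illustration only.
weights = {"rs1": 0.5, "rs2": -0.2}          # per-SNP prediction weights
dosages = {"rs1": 2, "rs2": 1}               # one individual's allele dosages
predicted = impute_expression(dosages, weights)

profile = {"GENE_A": 1.0, "GENE_B": -1.0, "GENE_C": 0.5}     # imputed dysregulation
compound = {"GENE_A": -1.0, "GENE_B": 1.0, "GENE_C": -0.5}   # perturbagen signature
score = antagonism_score(profile, compound)
```

A higher score marks a compound whose expression signature runs opposite to the individual's predicted dysregulation. A real analysis would need to handle tissue specificity, correlated predictors, and multiple testing, all of which this sketch leaves out.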

## Key facts

- **NIH application ID:** 10830268
- **Project number:** 5I01BX004189-06
- **Recipient organization:** JAMES J PETERS VA MEDICAL CENTER
- **Principal Investigator:** Panagiotis Roussos
- **Activity code:** I01
- **Funding institute:** VA
- **Fiscal year:** 2024
- **Award amount:** —
- **Award type:** 5
- **Project period:** 2018-10-01 → 2028-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10830268

## Citation

> US National Institutes of Health, RePORTER application 10830268, Large-scale transcriptome and epigenome association analysis across multiple traits (5I01BX004189-06). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10830268. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*