Large-scale transcriptome and epigenome association analysis across multiple traits

NIH RePORTER · VA · I01

Abstract

PROJECT SUMMARY

Precision Psychiatry is an emerging approach that considers patients’ characteristics to customize prevention and treatment for serious mental illness. The Million Veteran Program (MVP) is the largest and most comprehensive biobank in the world, currently involving multi-ancestry genetic data from more than 650,000 Veterans and highly dense electronic health record information that fully captures the clinical characteristics of each participant. Given the high prevalence of serious mental illness among our Veterans, MVP provides a unique opportunity to perform large-scale genetic discovery that will further our understanding of the pathophysiology of serious mental illness and promote Precision Psychiatry. While well-powered genome-wide association studies (GWAS) have identified multiple risk variants across serious mental illness, there have been limited conclusive findings on the functional relevance of most discovered loci due to small effect sizes, overlap with non-coding regions of the genome, and unclear mechanisms through which they act. Our group and others have shown that a large portion of phenotypic variability in disease risk can be explained by regulatory variants with cell type specificity, i.e., genetic variants that affect epigenetic mechanisms and the expression levels of genes. Studying gene expression and epigenome changes directly in MVP samples is not feasible because such data are not available. To overcome these limitations, we propose to take advantage of large-scale datasets with genotyping and multiscale molecular profiling that our group and others have generated in human brain tissue and apply machine learning approaches to directly impute genome-wide transcriptomes, epigenomes and proteomes in MVP samples using the existing MVP genotypes.
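As a rough illustration of what genotype-based imputation of a molecular trait can look like, the sketch below predicts one gene's expression as a weighted sum of allele dosages. It is not the project's method: the function name `impute_expression`, the SNP identifiers, and the weights are hypothetical, and in practice such weights would be learned beforehand from a reference panel with paired genotype and expression data.

```python
def impute_expression(dosages, weights, intercept=0.0):
    """Predict one gene's expression from allele dosages.

    dosages: dict mapping SNP id -> allele dosage (0, 1 or 2) for one individual.
    weights: dict mapping SNP id -> pre-trained effect weight (hypothetical values).
    SNPs missing from `dosages` contribute nothing (a simplifying assumption).
    """
    return intercept + sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


# Hypothetical weights for one gene in one tissue.
example_weights = {"rs_a": 0.5, "rs_b": -0.25}

# One individual's genotype, coded as allele dosages.
example_dosages = {"rs_a": 2, "rs_b": 1}

predicted = impute_expression(example_dosages, example_weights)
```

Repeating this for every gene and tissue with available weights would give one imputed profile per individual, which is the kind of input the association analyses described in the abstract operate on.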
The primary goals of our project are threefold. First, imputed MVP transcriptomes, epigenomes and proteomes will be meta-analyzed into single tissue-specific gene dysregulation scores for each individual via a novel method, called PolyXcan, which leverages a data-driven, correlation-aware meta-analytical framework and performs joint multi-omics-wide association studies. For each serious mental illness, key gene drivers and molecular pathways will be identified with a structured, interpretable deep learning approach, and gene-gene interaction effects will be identified by leveraging patient subtypes found with semi-supervised graph-based clustering methods; both of these approaches are only possible with well-powered individual-level (genotypic and phenotypic) data of the scale that exists in MVP, and we expect them to enhance efforts for gene target prioritization and drug discovery. Second, imputed gene dysregulation for each individual in MVP will be integrated with perturbagen reference libraries (describing the effect of therapeutic compounds on gene expression) to identify the extent to which compounds could be therapeutic by antagonizing the predicted gene dysregulation. We have validated ...
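Two ideas in this paragraph can be sketched generically. The code below is not PolyXcan or the project's scoring pipeline, whose details the abstract does not give. It assumes a simple equal-weight combination of per-tissue z-scores that accounts for their correlation, and a compound "antagonism" score defined as the negated Pearson correlation between predicted dysregulation and a compound's expression signature. All function names and values are hypothetical.

```python
import math


def combined_z(z_scores, corr):
    """Equal-weight combination of correlated z-scores.

    The sum of the z-scores has variance equal to the sum of all entries
    of their correlation matrix, so dividing by its square root gives a
    combined statistic with unit variance under the null.
    """
    n = len(z_scores)
    variance = sum(corr[i][j] for i in range(n) for j in range(n))
    return sum(z_scores) / math.sqrt(variance)


def antagonism_score(dysregulation, signature):
    """Negated Pearson correlation over shared genes.

    A score near +1 means the compound signature opposes the predicted
    dysregulation; a score near -1 means it mimics it.
    """
    genes = sorted(set(dysregulation) & set(signature))
    x = [dysregulation[g] for g in genes]
    y = [signature[g] for g in genes]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return -cov / (sd_x * sd_y)


# Four tissues with identical evidence: independent vs. fully correlated.
independent = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
redundant = [[1.0] * 4 for _ in range(4)]
z_independent = combined_z([1.0, 1.0, 1.0, 1.0], independent)
z_redundant = combined_z([1.0, 1.0, 1.0, 1.0], redundant)

# A hypothetical compound whose signature exactly reverses the dysregulation.
dysregulation = {"G1": 1.0, "G2": -1.0, "G3": 0.5}
reversing_compound = {g: -v for g, v in dysregulation.items()}
score = antagonism_score(dysregulation, reversing_compound)
```

In the toy example, the four independent tissues reinforce one another while the four fully correlated tissues count as a single observation, which is the practical reason a meta-analysis across tissues needs to account for correlation.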

Key facts

NIH application ID: 10830268
Project number: 5I01BX004189-06
Recipient: JAMES J PETERS VA MEDICAL CENTER
Principal Investigator: Panagiotis Roussos
Activity code: I01
Funding institute: VA
Fiscal year: 2024
Award amount: —
Award type: 5
Project period: 2018-10-01 → 2028-03-31