Estimating The Fraction of Variance Explained by Genetics and Neuroanatomy in Neuropsychiatric Conditions

NIH RePORTER · NIH · R01 · $556,500

Abstract

Mental health problems such as autism are highly prevalent in the population and incur great suffering and financial costs. Yet there is currently a dearth of biomarkers that accurately predict their diagnosis or prognosis. Characterizing the contributions of high-dimensional biomarkers to susceptibility to such complex disorders is critically important for advancing our understanding of their etiology and for developing new treatments. The fraction of variance explained (FVE) by a set of biomarkers measures the total amount of information about an outcome contained in the predictor variables. It is a fundamental quantity in much of mental health-related research, including studies of the human microbiome, proteomics, and gene expression.
Canonical examples where the FVE is of fundamental interest include Genome-Wide Association Studies (GWAS) and neuroimaging, both crucial tools for understanding the biological basis of mental health disorders. GWAS have successfully mapped thousands of genetic factors by mass-univariate association of millions of single nucleotide polymorphisms (SNPs), but the top significant associations, even in aggregate, account for only a small proportion of susceptibility. To assess the amount of information in GWAS, the SNP-heritability, h²_SNP, quantifies the FVE by all GWAS SNPs in aggregate, regardless of significance. Similarly, the FVE by brain imaging measures captures variation in the brain related to mental illness, which again appears to be highly distributed. In both the genetic and brain imaging domains, the number of predictors is extremely large, on the order of thousands to millions, far larger than the number of subjects. As a result, the specific association with each individual predictor cannot be estimated, and effects of specific loci are extremely difficult to identify. In contrast, the FVE can be reliably estimated from data, even if only univariate summary statistics are available.
Estimating FVE requires sophisticated statistical methods designed for this kind of high-dimensional data. We propose a general framework for FVE estimation, applicable to high-dimensional data in both GWAS and brain imaging settings. We will develop foundational theory establishing the validity and consistency of FVE estimation, new methods for evaluating the required conditions in real data, and methods for partitioning the FVE into more local components, so that the distribution of contributions to susceptibility can be understood in a top-down approach. We will apply these methods to the Adolescent Brain Cognitive Development (ABCD) Study, which comprises longitudinal, multi-modal brain imaging, GWAS data, and autism-related assessments for 11,875 participants aged 9-10 at baseline and followed into early adulthood.
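
As a rough illustration of why the FVE can remain estimable when predictors outnumber subjects, the Python sketch below simulates a linear model with many small effects and recovers the FVE with a simple moment estimator in the style of Haseman-Elston regression. It is not the framework proposed in this project. The function names simulate and estimate_fve, the independent Gaussian predictors, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np


def simulate(n, p, fve, rng):
    """Draw (X, y) from a linear model whose true FVE is about `fve`.

    Assumption: predictors are independent standard normals and every
    predictor carries a small random effect (a highly distributed signal).
    """
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p) * np.sqrt(fve / p)    # many tiny effects
    noise = rng.standard_normal(n) * np.sqrt(1.0 - fve)
    return X, X @ beta + noise


def estimate_fve(X, y):
    """Moment (Haseman-Elston-style) estimate of the FVE.

    With standardized data, the product y_i * y_j for two different subjects
    has expectation FVE * K_ij, where K is the subject-by-subject similarity
    matrix. The estimate is the slope of a regression through the origin of
    those products on K_ij, using off-diagonal pairs only.
    """
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each predictor
    y = (y - y.mean()) / y.std()                # standardize the outcome
    K = Z @ Z.T / p                             # n x n similarity matrix
    off = ~np.eye(n, dtype=bool)                # mask selecting pairs i != j
    yy = np.outer(y, y)
    return float(np.sum(K[off] * yy[off]) / np.sum(K[off] ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Twice as many predictors as subjects: individual effects cannot be
    # estimated by ordinary regression, but the aggregate FVE still can.
    X, y = simulate(n=2000, p=4000, fve=0.5, rng=rng)
    print(f"estimated FVE: {estimate_fve(X, y):.2f} (true value 0.5)")
```

The estimate never fits the 4,000 individual effects. It uses only how the similarity between pairs of subjects' predictors tracks the similarity of their outcomes, which is why it stays stable when the number of predictors exceeds the number of subjects. Real GWAS and imaging data have correlated predictors, which this sketch ignores.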

Key facts

NIH application ID: 10875541
Project number: 5R01MH128923-03
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Principal Investigator: Armin Schwartzman
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $556,500
Award type: 5
Project period: 2022-08-15 → 2027-06-30