Scalable tool and comprehensive maps to interpret structural variation across the neuropsychiatric spectrum

NIH RePORTER · NIH · R01 · $760,312

Abstract

Structural variants (SVs), defined as rearrangements of ≥50 DNA nucleotides, are a major source of genetic diversity among humans and an important component of the architecture of neuropsychiatric disorders (NPDs). Despite their etiological significance, remarkably little is known about the consequences of SV formation across the genome, as there is a dearth of accurate measures to assess the genome-wide impact of gains or losses of DNA (‘dosage sensitivity’). In contrast, robust models of mutation intolerance in genes have been derived from single nucleotide variants (SNVs), which occur at ~200-fold higher frequency in the genome than SVs. These metrics of negative selection against loss-of-function mutations within genes (e.g., LOEUF from the Genome Aggregation Database [gnomAD]) have been critical to gene and locus discovery across NPDs and Mendelian disorders. The absence of equivalent measures for SVs, however, has hindered discovery. This renewal seeks to build on the foundational tools, maps of genomic variation, and association studies across NPDs completed during the initial funding period to now define the landscape of SVs across diverse global populations and determine their relative contribution to individual and cross-disorder NPD risk. To accomplish these goals, we will leverage the coalescence of massive-scale biobank and NPD study initiatives led by members of our research team with the development of new tools and resources that can scale to millions of individuals. We will first aggregate and harmonize SV callsets generated using our GATK-SV and GATK-gCNV tools across >2.6 million samples with genome and exome sequencing data to create expansive SV maps across diverse populations. We will then apply new statistical approaches to predict SV mutation rates and develop models of genome-wide dosage sensitivity (Aim 1).
These new SV classes and dosage sensitivity metrics will be integrated into family-based and case-control association studies of NPDs across 387,675 cases from ongoing cohort collections (Aim 2). Notably, these datasets will include significant initiatives led by members of our team to investigate the dimensions of NPDs across diverse populations that are currently under-represented in NPD studies. Finally, we will use innovative approaches to investigate the influence of SVs that have been cryptic to discovery from existing technologies but are now accessible to long-read sequencing, and we will apply new analysis methods to explore their potential influence on NPDs (Aim 3). Overall, each aim addresses a current void in neuropsychiatric genomics, and success in any one area would represent an important advance for the field. We have assembled an outstanding team of experts across all domains of computational and statistical genomics, as well as the phenotypic dimensions of neuropsychiatric conditions, and at its conclusion this proposal will yield novel tools and resources at an unprecedented...

Key facts

NIH application ID
10868517
Project number
5R01MH115957-06
Recipient
BROAD INSTITUTE, INC.
Principal Investigator
MICHAEL E TALKOWSKI
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$760,312
Award type
5
Project period
2018-08-10 → 2028-04-30