The origin, maintenance, and adaptive consequences of variation in genome structure

NIH RePORTER · NIH · R35 · $380,865

Abstract

Background: Understanding the origin and fate of genetic variation relies on accurate measurement of that variation. Until recently, whole-genome sequencing approaches had blind spots covering vast swaths of genomes composed of repetitive regions, despite repeated demonstrations of the importance of mutations involving gene duplications, tandem duplications, TE insertions, and other structural changes associated with repeats. Indeed, such mutations are often targets of adaptation, are associated with variation in human traits, play causative roles in many genetic diseases, and often play key roles in important phenotypes in species that coexist with humans (e.g., conferring pesticide resistance to human disease vectors). Despite this, surveys of genetic variation typically continue to select techniques based on convenience and cost-effectiveness. Indeed, even standard long-read sequencing approaches fail to recover >15% of the genome. A better way would be to apply emerging methods capable of resolving all regions, particularly ensemble approaches combining highly accurate and ultra-long sequencing technologies with long-range scaffolding techniques. Such approaches have already proven capable both of reducing the uncertainty in inferring structural mutations and of saving analysis time. Projects can now plausibly aim to obtain accurate, full genetic catalogs of each chromosome, from telomere to telomere. Now is the ideal time to discover and make inferences on the full spectrum of genetic mutations.

Proposal: The Emerson lab’s research focuses on the evolution of genome structure, particularly mutations that add, subtract, or otherwise refashion genome sequence on large scales. We apply cutting-edge sequencing, computational, and statistical techniques to discover and interpret structural genetic variation in the most recalcitrant regions of the genome, using Drosophila melanogaster as a model system.
Over the next five years, our goal is to identify all structural genetic variation in samples within and between species, infer the evolutionary forces acting on it, and understand its functional consequences. We will adapt cutting-edge telomere-to-telomere approaches to extend our reach into every region of the genome to obtain an exhaustive inventory of genetic variation within and between species, eliminating the thorny problem of genotype-based ascertainment bias and error in evolutionary inference. In doing so, we will develop tools to aid in genome assembly, structural variant genotyping, and evolutionary analysis. We will also use functional genomics techniques to understand how perturbing primary genome structure changes genome function. Finally, we will identify individual candidate mutations for functional characterization using reverse genetics. With such comprehensive surveys of genetic variation, we can finally meet the challenge of discovering all classes of genetic variation to study...

Key facts

NIH application ID: 10842876
Project number: 1R35GM153327-01
Recipient: UNIVERSITY OF CALIFORNIA-IRVINE
Principal Investigator: James Jordan Emerson
Activity code: R35
Funding institute: NIH
Fiscal year: 2024
Award amount: $380,865
Award type: 1
Project period: 2024-05-10 → 2029-02-28