# Sequence resolution of complex human genome structural variation

> **NIH R01** · UNIVERSITY OF WASHINGTON · 2024 · $498,246

## Abstract

Understanding the genetic basis of human disease requires a comprehensive assessment of the full spectrum
of human genetic variation. While the majority of structural variation can now be routinely discovered by
application of long reads and phased genome assembly, inversions have proven more difficult to characterize
due to their association with repetitive DNA and their location/structure in the genome. These represent some
of the largest forms of naturally occurring human genetic variation but are the least understood, because
these loci are preferentially associated with gaps and are frequently subject to the highest rates
of recurrent mutation, making them difficult to genotype using standard approaches. In this competing renewal,
we focus on the complete sequence resolution of inversion hotspots of structural variation flanked by multi-
copy segmental duplications. We apply long-read high-fidelity sequencing, ultra-long-read sequencing, and
Strand-seq data to fully phase and assemble all inversion polymorphisms and flanking sequence for 120
human genomes (Aim 1). We use the associated haplotype data to develop methods to identify inversions
associated with recurrent mutation and then test whether recurrent mutations are preferentially associated with
altered structural configurations of flanking segmental duplications by genotyping these variants in a diversity
panel of 3,200 human genomes where short-read whole-genome sequence data are available (Aim 2). Finally,
we use the sequence-resolved haplotype structures coupled to long-read sequencing of patients to delineate
breakpoints of rearrangements identified in 125 individuals harboring de novo large-scale deletions or
duplications (Aim 3). This aim will test whether certain structural configurations are predisposed to recurrent
rearrangement and improve breakpoint mapping associated with de novo rearrangement events. New
sequence-based methods will also be developed to characterize more complex forms of human genetic
variation and provide fundamental insight into their diversity, mechanism of origin, and mutational properties.
This research has the additional benefit that it will improve genome assembly, characterize a large class of
missing genetic variation, and provide us with the ability to more systematically explore this form of human
genetic variation as part of future disease-association studies.

## Key facts

- **NIH application ID:** 10908745
- **Project number:** 5R01HG010169-06
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** Evan Eichler
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $498,246
- **Award type:** 5
- **Project period:** 2018-09-06 → 2027-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10908745

## Citation

> US National Institutes of Health, RePORTER application 10908745, Sequence resolution of complex human genome structural variation (5R01HG010169-06). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10908745. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
