# Sequence-resolved structural variation of human genomes

> **NIH R01** · UNIVERSITY OF WASHINGTON · 2020 · $645,000

## Abstract

Understanding the genetic basis of human disease requires a comprehensive assessment of the full spectrum
of human genetic variation. Genome structural variation, including larger deletions, insertions, and inversions
(>50 bp), has been more difficult to characterize because of its association with repetitive DNA. The majority of
structural variation, including common structural variants or SVs, has not yet been discovered using short-read
whole-genome datasets and standard SV callers. Advances in sequencing technology over the last three
years, however, have made the systematic discovery of this variation possible for the first time. This proposal
focuses on the discovery, sequence resolution, and genotyping of the most complex and under-ascertained
forms of human genetic variation, including multi-copy number variants (mCNVs), inversions, and intermediate-size
insertions and deletions. We target a diversity panel of 34 human genomes and partition long-read single-molecule,
real-time sequencing data using 10X linked reads and Strand-seq data in order to fully phase and
sequence-resolve SVs on each human haplotype. Using these long-read sequence data, we further develop a
computational graph-based approach to distinguish and assemble distinct copies underlying large mCNVs
mapping to high-identity segmental duplications. Finally, we take advantage of the sequence structure,
including breakpoints and sequence differences among the copies, to more accurately genotype these variants
in a diversity panel of >2,800 human genomes where short-read whole-genome sequence data are already
available. The work will develop new methods to characterize more complex forms of human genetic variation
and provide fundamental insight into their diversity, mechanism of origin, and mutational properties. This
research has the additional benefit that it will improve genome assembly, characterize new human genome
sequence, identify a large class of missing genetic variation, and provide us with the ability to systematically
explore this form of human genetic variation as part of disease-association studies.

## Key facts

- **NIH application ID:** 9957137
- **Project number:** 5R01HG010169-03
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** Evan Eichler
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $645,000
- **Award type:** 5
- **Project period:** 2018-09-06 → 2022-06-30
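
Several of the fields above (award type, activity code, support year) can be read directly off the project number, which appears to follow the usual NIH layout of type, activity code, institute code, serial number, and support year. The following is a minimal sketch of that decomposition, assuming that layout holds for this record; `ProjectNumber` and `parse_project_number` are hypothetical names introduced for illustration and are not part of any NIH tool or API.

```python
import re
from typing import NamedTuple


class ProjectNumber(NamedTuple):
    """Components of an NIH-style project number (hypothetical helper type)."""
    award_type: str      # single digit, e.g. "5" for a non-competing continuation
    activity_code: str   # three characters, e.g. "R01"
    institute_code: str  # two letters, e.g. "HG"
    serial_number: str   # six digits
    support_year: str    # two digits following the hyphen
    suffix: str          # optional trailing code (empty when absent)


# Assumed layout: <type><activity><institute><serial>-<year>[suffix]
_PATTERN = re.compile(
    r"^(?P<award_type>\d)"
    r"(?P<activity_code>[A-Z][A-Z0-9]{2})"
    r"(?P<institute_code>[A-Z]{2})"
    r"(?P<serial_number>\d{6})"
    r"-(?P<support_year>\d{2})"
    r"(?P<suffix>[A-Z0-9]*)$"
)


def parse_project_number(text: str) -> ProjectNumber:
    """Split a project number into its parts, or raise ValueError."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized project number: {text!r}")
    return ProjectNumber(**match.groupdict())


if __name__ == "__main__":
    print(parse_project_number("5R01HG010169-03"))
```

Applied to `5R01HG010169-03`, the sketch yields award type `5`, activity code `R01`, institute code `HG`, and support year `03`, which agrees with the award type, activity code, and third-year project period listed above.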

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9957137

## Citation

> US National Institutes of Health, RePORTER application 9957137, Sequence-resolved structural variation of human genomes (5R01HG010169-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/9957137. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
