# Sequence and Assembly of Segmental Duplications

> **NIH R01** · UNIVERSITY OF WASHINGTON · 2021 · $608,375

## Abstract

Despite the high quality of the human genome, important gaps remain in our understanding of its sequence
organization, function, and variation. Our genome is particularly enriched for interspersed segmental
duplications, which harbor rapidly evolving genes and predispose our species to recurrent rearrangements
associated with disease. The long-term objective of our research has been to develop computational and
experimental methods to understand the organization, genetic diversity, and disease impact of segmental
duplications. The goal of this competing renewal is to begin to understand the function and variation of the
duplicated genes themselves. We propose to focus here on human- and great ape-specific gene families
mapping within the most complex and duplicated regions of our genome. There are four aims: (1) determine
the sequence structure of these recent duplications by generating high-quality reference sequences using
clone-based resources and long-read sequencing technologies; (2) understand the genetic diversity of this
structure focusing on those that have most likely been targets of selection; (3) completely annotate the gene
content to distinguish protein-encoding innovations from pseudogenes; and (4) test for neurodevelopmental
disease association by comparing the burden of loss-of-function mutations in patients versus controls using
available genome sequence data and molecular inversion probe assays. We hypothesize that segmental
duplications have played an important role in human neurocognitive adaptation and that patterns of copy
number polymorphisms and substitution will differ significantly between functional and nonfunctional paralogs.
This research has the additional benefit that it will add new sequence to reference genomes, identify missing
genes, and provide us with the ability to systematically explore genetic variation of regions frequently
overlooked as part of disease-association studies.
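
The burden comparison in aim (4) can be illustrated with a small sketch. The abstract does not name a statistical test, so the one-sided Fisher's exact test below is an assumption, chosen as a common way to compare carrier counts between patients and controls. The function name `fisher_exact_greater` and all counts are hypothetical and purely illustrative; they are not data from this project.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test p-value.

    Returns the probability, under the hypergeometric null of no association,
    of seeing at least `case_carriers` loss-of-function carriers among cases.
    """
    total = case_total + control_total            # all sequenced individuals
    carriers = case_carriers + control_carriers   # all carriers, cases + controls
    denom = comb(total, case_total)               # ways to choose the case group
    upper = min(carriers, case_total)             # most carriers cases could hold
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts for one duplicated gene (invented for illustration).
    p = fisher_exact_greater(
        case_carriers=12, case_total=2000,
        control_carriers=3, control_total=2000,
    )
    print(f"one-sided burden p-value: {p:.4g}")
```

An exact test is used here because loss-of-function carriers in any single gene are typically rare, which makes large-sample approximations unreliable. A real analysis across many paralogs would also need multiple-testing correction, which this sketch omits.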

## Key facts

- **NIH application ID:** 10149373
- **Project number:** 5R01HG002385-20
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** Evan Eichler
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $608,375
- **Award type:** 5
- **Project period:** 2001-09-21 → 2022-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10149373

## Citation

> US National Institutes of Health, RePORTER application 10149373, Sequence and Assembly of Segmental Duplications (5R01HG002385-20). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10149373. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
