# Center for Human Reference Genome Diversity

> **NIH U01** · UNIVERSITY OF CALIFORNIA SANTA CRUZ · 2022 · $3,400,956

## Abstract

The goal of our Center for Human Reference Genome Diversity is to generate genome assemblies that are as
error-free, gapless, complete, and correctly haplotype-phased as possible from a set of 350 individuals who together
capturing the full extent of human diversity. We aim to capture >99% of allelic variants with >1% allele
frequency, and to provide these genomes as a resource to the international community to enable genomic
medicine and research addressing fundamental unanswered questions in biology and disease. We will employ
a multi-platform approach using cutting-edge long-read and linked-read technologies to obtain the highest
quality phased genomes. Aim 1 will focus on sample collection and procuring cell lines from at least 350
individuals with a specific emphasis on filling in gaps in human diversity. Aim 2 will generate highly contiguous
chromosome-level assemblies that are over 99% haplotype-phased for at least 700 haploid genomes from 350
diploid samples. Aim 3 will finish these genomes to be gapless from telomere-to-telomere (T2T) for each
chromosome. Aim 4 will evaluate the genomes for accuracy and completeness and perform initial variant
calling to assess the level of human diversity. We will use a novel combination of technologies, sequencing
strategies, and algorithms that we and others developed to produce the highest quality and most complete
genome assemblies to date. Our effort will specifically target regions that have been excluded by other efforts,
including segmental duplications, centromeres, and acrocentric DNA. To achieve these aims, we have
assembled an exceptional team consisting of leaders from around the world in consent ethics, sample
collection, sample extraction, and high-quality genome sequencing, assembly, finishing and evaluation. The
team also has expertise in using genomic technologies to address a broad range of scientific questions and so is
highly cognizant of the practical needs of biomedical researchers who will use this resource. The high-quality
genomes produced will be passed to the Human Genome Reference Center (HGRC) and Genome Reference
Representation (GRR) groups for curation and release. The result will be a pan-human genome reference,
representing important human diversity not present in the current reference genome. The data we generate will
enable a fundamental shift in human genetics, fostering new discoveries from the single-nucleotide to
chromosomal levels and revealing a more accurate and global view of the human population.
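
The abstract pairs a cohort of 350 diploid individuals (700 haploid genomes) with a target of capturing more than 99% of variants at allele frequencies above 1%. As a rough sanity check on how those numbers relate, the sketch below computes the chance that an allele at exactly 1% frequency appears at least once among 700 haplotypes. It is not taken from the grant: it assumes each haplotype is an independent random draw from a single well-mixed population, which ignores population structure and relatedness, and the function name `detection_probability` is purely illustrative.

```python
def detection_probability(allele_frequency: float, n_haplotypes: int) -> float:
    """Chance that an allele is observed at least once in n_haplotypes draws.

    Assumption (not from the source): haplotypes are independent random
    draws from one well-mixed population.
    """
    # P(at least one copy) = 1 - P(no copies in any haplotype)
    return 1.0 - (1.0 - allele_frequency) ** n_haplotypes


n_individuals = 350               # cohort size stated in the abstract
n_haplotypes = 2 * n_individuals  # two haplotypes per diploid sample
p = detection_probability(0.01, n_haplotypes)

print(f"{n_haplotypes} haplotypes, P(observe a 1% allele at least once) = {p:.4f}")
```

Under these simplified assumptions the probability comes out just above 99.9%, and it is higher still for more common alleles, so the cohort size is at least compatible with the stated goal. Real sampling across structured populations would change the figure.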

## Key facts

- **NIH application ID:** 10488272
- **Project number:** 5U01HG010971-04
- **Recipient organization:** UNIVERSITY OF CALIFORNIA SANTA CRUZ
- **Principal Investigator:** Evan Eichler
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $3,400,956
- **Award type:** 5
- **Project period:** 2019-09-18 → 2024-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10488272

## Citation

> US National Institutes of Health, RePORTER application 10488272, Center for Human Reference Genome Diversity (5U01HG010971-04). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10488272. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
