New York Genome Characterization Center: Somatic Mosaicism across Human Tissues

NIH RePORTER · NIH · UM1 · $1,500,000

Abstract

Large-scale sequencing efforts over the last two decades have focused on generating DNA sequence datasets from readily available tissues such as blood or saliva to identify germline variants associated with disease phenotypes. However, limited progress has been made in characterizing somatic variants in healthy tissues and their contribution to health and disease over the course of the human lifespan. Somatic variation has historically been studied in the context of tumor biology; however, there is mounting evidence that somatic variation plays an important role in the aging process, as well as in cardiovascular, neurodegenerative, immunologic, and neurodevelopmental diseases. There is therefore a critical need to characterize the somatic variant landscape in healthy human tissues in individuals of diverse race and ethnicity across the human lifespan. The Somatic Mosaicism across Human Tissues (SMaHT) program will address this gap by establishing a cohesive Network that will work together to create a high-quality somatic variant catalog: one that is broadly shareable across the scientific community, that enables studies investigating the rates and patterns of somatic mosaicism across cell populations and tissues, that can elucidate the mechanisms underlying clonal development, evolution, and expansion, and that enables studies of the role of somatic mutation in disease pathogenesis and progression. The New York Genome Characterization Center (NYGCC) will work collaboratively with other SMaHT Network Centers to generate a high-quality somatic variant catalog using three core high-depth sequencing assays: duplex whole genome sequencing (WGS), mRNA sequencing, and long-read Oxford Nanopore WGS. These three core assays will provide an unprecedented and comprehensive view of somatic mutations across a variety of healthy tissues.
The data from deep WGS will enable discovery of somatic SNVs, indels, mobile elements, copy number changes, and structural variants. The RNA sequencing data will be used to confirm the presence of those variants that fall in expressed genes and to further evaluate their effect on splicing. Long-read WGS will be used as a complement to short-read WGS to confirm and enhance discovery of mobile elements, copy number changes, and structural variants. To these core assays we propose adding single-cell WGS using Direct Library Preparation Plus (DLP+) and genotyping of transcriptomes (GoT). DLP+ is an amplification-free single-cell WGS assay that allows high-sensitivity detection of copy number changes, loss of heterozygosity, and structural variation. It further enables the study of replication timing, clonal expansion, and fitness, and is compatible with pooled pseudo-bulk analysis to compare against deep bulk WGS. The genotyping of transcriptomes assay will allow us to explore, for expressed somatic variants, the cell type or lineag...

Key facts

NIH application ID: 10834251
Project number: 5UM1DA058236-02
Recipient: NEW YORK GENOME CENTER
Principal Investigator: Samuel Aparicio
Activity code: UM1
Funding institute: NIH
Fiscal year: 2024
Award amount: $1,500,000
Award type: 5
Project period: 2023-05-01 → 2025-04-30