Development and Application of Computational Methods for Single Cell DNA Sequencing Data

NIH RePORTER · NIH · R01 · $747,149

Abstract

PROJECT SUMMARY Whole-genome sequencing has become a popular approach for comprehensive genome-wide characterization of genomic alterations, ranging from single nucleotide variants and indels to copy number changes and complex structural alterations. However, standard bulk sequencing provides information only on the population average of the cells, and our understanding of genetic heterogeneity and clonal dynamics remains inadequate. In the proposed work, we aim to develop computational methods for the analysis of single cell whole-genome sequencing data. Due to the allelic bias and artifacts associated with the DNA amplification step, accurate identification of genomic alterations is challenging. In Aim 1, we will develop methods to identify single nucleotide variants and indels, building on our experience in the analysis of single neurons and utilizing the latest amplification techniques. In Aim 2, we will focus on methods to detect copy number variants, structural variants, and tandem repeat mutations. We will employ machine learning models including graph- and autoencoder-based deep learning approaches. In Aim 3, we will apply the methods devised in the first two aims to several important biological questions that can be best resolved by single cell DNA sequencing. These include identification of off-target effects and on-target efficiency of genome editing, lineage tracing in development using somatic mutations as endogenous barcodes, correlation of driver mutations and copy number alterations in cancer cells, and quantification of the impact of environmental exposure on the mutational landscape. A single cell view of these biological phenomena will yield new insights into the underlying processes, and the tools developed in this project will be applicable to a wide range of biological and biomedical problems.

Key facts

NIH application ID: 10500708
Project number: 1R01HG012573-01
Recipient: HARVARD MEDICAL SCHOOL
Principal Investigator: Peter J Park
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $747,149
Award type: 1
Project period: 2022-09-01 → 2026-06-30