Development and Application of Computational Methods for Single Cell DNA Sequencing Data

NIH RePORTER · NIH · R01 · $747,149

Abstract

PROJECT SUMMARY

Whole-genome sequencing has become a popular approach for comprehensive genome-wide characterization of genomic alterations, ranging from single nucleotide variants and indels to copy number changes and complex structural alterations. However, standard bulk sequencing provides information only on the population average of the cells, and our understanding of genetic heterogeneity and clonal dynamics remains inadequate. In the proposed work, we aim to develop computational methods for the analysis of single cell whole-genome sequencing data. Due to the allelic bias and artifacts associated with the DNA amplification step, accurate identification of genomic alterations is challenging. In Aim 1, we will develop methods to identify single nucleotide variants and indels, building on our experience in the analysis of single neurons and utilizing the latest amplification techniques. In Aim 2, we will focus on methods to detect copy number variants, structural variants, and tandem repeat mutations. We will employ machine learning models, including graph- and autoencoder-based deep learning approaches. In Aim 3, we will apply the methods devised in the first two aims to several important biological questions that can be best resolved by single cell DNA sequencing. These include identification of off-target effects and on-target efficiency of genome editing, lineage tracing in development using somatic mutations as endogenous barcodes, correlation of driver mutations and copy number alterations in cancer cells, and quantification of the impact of environmental exposure on the mutational landscape. A single cell view of these biological phenomena will yield new insights into the underlying processes, and the tools developed in this project will be applicable to a wide range of biological and biomedical problems.

Key facts

NIH application ID
10500708
Project number
1R01HG012573-01
Recipient
HARVARD MEDICAL SCHOOL
Principal Investigator
Peter J Park
Activity code
R01
Funding institute
NIH
Fiscal year
2022
Award amount
$747,149
Award type
1
Project period
2022-09-01 → 2026-06-30