# Development and Application of Computational Methods for Single Cell DNA Sequencing Data

> **NIH R01** · HARVARD MEDICAL SCHOOL · 2024 · $658,984

## Abstract

PROJECT SUMMARY
Whole-genome sequencing has become a popular approach for comprehensive genome-wide characterization of
genomic alterations, ranging from single nucleotide variants and indels to copy number changes and complex
structural alterations. However, standard bulk sequencing provides information on the population average of the
cells, and our understanding of genetic heterogeneity and clonal dynamics remains inadequate. In the proposed
work, we aim to develop computational methods for analysis of single cell whole-genome sequencing data. Due
to the allelic bias and artifacts associated with the DNA amplification step, accurate identification of genomic
alterations is challenging. In Aim 1, we will develop methods to identify single nucleotide variants and indels,
building on our experience in analysis of single neurons and utilizing the latest amplification techniques. In Aim
2, we will focus on methods to detect copy number variants, structural variants, and tandem repeat mutations.
We will employ machine learning models including graph- and autoencoder-based deep learning approaches.
In Aim 3, we will apply the methods devised in the first two aims to several important biological questions
that can be best resolved by single cell DNA sequencing. These include identification of off-target effects and
on-target efficiency of genome editing, lineage tracing in development using somatic mutations as endogenous
barcodes, correlation of driver mutation and copy number alterations in cancer cells, and quantification of impact
of environmental exposure on the mutational landscape. A single cell view of these biological phenomena will
yield new insights into the underlying processes, and the tools developed in this project will be applicable to a
wide range of biological and biomedical problems.

## Key facts

- **NIH application ID:** 10874655
- **Project number:** 5R01HG012573-03
- **Recipient organization:** HARVARD MEDICAL SCHOOL
- **Principal Investigator:** Peter J Park
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $658,984
- **Award type:** 5
- **Project period:** 2022-09-01 → 2026-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10874655

## Citation

> US National Institutes of Health, RePORTER application 10874655, Development and Application of Computational Methods for Single Cell DNA Sequencing Data (5R01HG012573-03). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10874655. Licensed CC0.

