# Single Cell Transcriptomic and Genetic Diversity by Single Molecule Long Read Sequencing

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2020 · $585,388

## Abstract

Defining the features of cellular mixtures, where diverse cell types with distinct genomic characteristics are
physically intermingled, is a central problem in biology. During the past decade, single cell
sequencing technologies have enabled a new era of high throughput and high resolution interrogation of cell
type diversity, vastly expanding our understanding of the role that cell types play in development and disease.
Yet current studies in single cell genomics rely on short-read sequencing and thus suffer from limitations,
including: (1) Most studies rely on short-read counting, which limits the study of alternative splicing. (2) Cell
states are captured as static snapshots, and while population dynamics can be deduced through trajectory and
RNA velocity estimation, robust estimation of these parameters remains a major challenge. (3) Despite
advances in single-cell DNA sequencing, there is not yet a cost-effective way to simultaneously characterize both
the genetic variants and transcriptome-level changes in a cell, which is crucial for diseases such as cancer.
This proposal is motivated by technological breakthroughs in single-molecule sequencing (SMS) and the
recent adaptation of SMS to the massively parallel sequencing of single cell transcriptomes in our lab. We
propose to develop computational methods to harness the power of SMS in single cell transcriptomics. In
particular, we have developed a new genomic approach which allows one to repeatedly interrogate complete
transcripts from single cells using SMS long reads, rather than 3' or 5' counting with short reads. This
technology allows experimental designs where specific transcript subsets and/or cellular subsets can be
repeatedly targeted for deeper joint short and long read analysis over many iterations, which we will exploit to
conduct analyses that were previously intractable.

## Key facts

- **NIH application ID:** 10050892
- **Project number:** 2R01HG006137-10
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Hanlee P Ji
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $585,388
- **Award type:** 2 (competing renewal)
- **Project period:** 2011-07-06 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10050892

## Citation

> US National Institutes of Health, RePORTER application 10050892, Single Cell Transcriptomic and Genetic Diversity by Single Molecule Long Read Sequencing (2R01HG006137-10). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10050892. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
