# Development of methods for transcript quantification and differential expression analysis using long-read sequencing technologies.

> **NIH R21** · UNIVERSITY OF FLORIDA · 2020 · $35,145

## Abstract

The rapid development of third-generation long-read sequencing (LRS) platforms such as PacBio and Oxford
Nanopore Technologies (ONT) has enabled increasingly precise and higher-throughput sequencing of
transcripts. Long reads can produce full-length transcript sequences, overcoming much of the uncertainty of
short-read methods in accurately defining transcripts, particularly for genes with alternative splicing (more
than 90% of human genes), which short-read sequencing has thus far struggled to resolve. LRS is therefore the
natural choice for the study of the expression of transcript variants and of the role of alternative isoforms in
disease and development. While the first iterations of long-read technologies did not produce enough reads
to quantify more than the most highly expressed transcripts, the current sequencing depth of up to 8 million reads
per SMRT cell on the Sequel II platform promises reliable quantification of more modestly expressed genes.
Significant yield increases have also been reported for Nanopore. This suggests that LRS may have reached
sufficient throughput to enable accurate quantification of gene expression and differential expression analyses.
LRS transcriptomics data have, however, specific properties that are absent in other transcriptomics
technologies, such as partial matches to reference transcript models. Therefore, specific methods for
quantification and statistical analysis need to be developed. In this project, we aim to characterize in detail the
data distribution of long-read data, propose strategies to deal with its particular read uncertainty issues, and
develop new strategies for differential expression analysis. The overarching goal is to create the analytical
framework to fully leverage LRS technologies for the study of isoform dynamics in relation to biomedically relevant
questions.

## Key facts

- **NIH application ID:** 10041221
- **Project number:** 1R21HG011280-01
- **Recipient organization:** UNIVERSITY OF FLORIDA
- **Principal Investigator:** Ana Victoria Conesa Cegarra
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $35,145
- **Award type:** 1
- **Project period:** 2020-09-01 → 2021-05-06

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10041221

## Citation

> US National Institutes of Health, RePORTER application 10041221, Development of methods for transcript quantification and differential expression analysis using long-read sequencing technologies. (1R21HG011280-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10041221. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*