# A unified probabilistic model and software implementation for analysis of nascent RNA sequencing data

> **NIH R01** · COLD SPRING HARBOR LABORATORY · 2024 · $594,501

## Abstract

The process by which RNA molecules are assembled from DNA templates, called transcription, is
fundamental to all life and dysregulated in many human diseases. Over the past 15 years, studies of the
mechanisms and dynamics of transcription have increasingly relied on a family of techniques for isolating and
sequencing newly transcribed, or “nascent” RNAs. In contrast to standard RNA-seq, these nascent RNA
sequencing (NRS) methods enable transcription to be measured separately from RNA degradation, respond
rapidly to changes in transcription, and reveal the positions of RNA polymerases along a DNA template.
However, NRS data require sophisticated computational and statistical methods for analysis, which are only
beginning to emerge.

Here, we propose to develop a powerful and flexible probabilistic modeling framework for the analysis of NRS
data. Our framework is based on a highly general “unified model” that mathematically describes both the
kinetics of transcription initiation, elongation, and promoter-proximal pause release, and the generation of
sequencing read counts. It can be used to estimate transcriptional rates directly from NRS data, in either a
steady-state or nonequilibrium setting. Our proposal includes three specific aims, focused on the
development of (1) a series of statistical tests and machine-learning methods for differential analysis of
transcription-associated rates; (2) a statistical and machine-learning framework and new experimental
methods for characterizing variation in elongation rate and its dependency on genomic and epigenomic
covariates; and (3) an open-source software package implementing these new methods in the R programming
environment (STADyUM), integrated with the Bioconductor, AnVIL, and PyTorch environments. Successful
completion of these aims will result in powerful, versatile, and highly accessible new computational tools that
will accelerate progress in transcriptional research.

## Key facts

- **NIH application ID:** 10801419
- **Project number:** 1R01HG012944-01A1
- **Recipient organization:** COLD SPRING HARBOR LABORATORY
- **Principal Investigator:** Adam Charles Siepel
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $594,501
- **Award type:** 1
- **Project period:** 2024-09-16 → 2028-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10801419

## Citation

> US National Institutes of Health, RePORTER application 10801419, A unified probabilistic model and software implementation for analysis of nascent RNA sequencing data (1R01HG012944-01A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10801419. Licensed CC0.

