# Computational tools for estimating cell-type-specific effects in bulk RNA-seq and spatial transcriptomics data, using reference single-cell RNA-seq datasets

> **NIH R35** · ST. JUDE CHILDREN'S RESEARCH HOSPITAL · 2020 · $436,749

## Abstract

RNA-seq is a powerful tool for studying molecular biology. However, without cell sorting (or related techniques),
conventional RNA-seq applied to tissue samples cannot determine gene expression in underlying cell-types.
This is problematic because differential gene expression observed at the tissue level is not necessarily reflected
in underlying cell-types, which obscures biological insight. For example, Schmiedel et al. recently applied
RNA-seq to 13 purified blood cell-types from 106 individuals [1], which uncovered the molecular basis of sex-specific
differences in immune response. However, this was obscured when they applied RNA-seq to whole blood only.
Single-cell RNA-seq is the obvious candidate to probe cell-type-specific effects more broadly. However, for most
tissues, single-cell RNA-seq has been restricted to small sample sizes, due to specialized dissociation protocols
and cost. Thus, only bulk-tissue RNA-seq data are available for large sample sizes. Crucially, many of these
bulk data are paired to enormous stores of informative clinical phenotypic data and additional -omics data. These
datasets include large NIH initiatives such as GTEx, TCGA, and All of Us, which have collected data on genetics,
disease status, outcome, drug treatments, ethnicity, sex, and much more. The critical gap is that we cannot
currently study the relationship between cell-type level gene expression and any of these phenotypes.
To overcome this limitation, we will develop computational tools for estimating cell-type-specific differential
expression from bulk RNA-seq data, when a small reference single-cell RNA-seq dataset is available from the
same tissue-type. This will allow us to study the cell-type-specific differences in expression that drive human
phenotypes and diseases, unlocking the tens of thousands of bulk RNA-seq samples paired to phenotypic data.
The basis for this research program is a previous study where we developed a method to recover the
cell-type-specific effects of inherited genetic variation on gene expression in bulk breast-tumor RNA-seq data. This method
allowed us to discover a novel breast cancer risk gene, which was obscured using conventional methods.
Here, we posit that a similar mathematical framework can be adapted to recover any cell-type-specific effect
from bulk-tissue RNA-seq. Hence, we can develop specific tools to perform multiple commonly applied analyses
at cell-type-specific resolution from bulk-tissue RNA-seq by leveraging matched single-cell data, including
differential expression, correlation, and gene set enrichment analyses.
Finally, new spatial transcriptomics technologies are emerging that enable spatially resolved gene expression to
be measured directly in tissue sections. These platforms quantify gene expression in situ in ~100 μm barcoded
spots. Each spot captures a small cluster of cells, akin to a miniaturized bulk-tissue RNA-seq experiment.
Hence, the same abstract mathematical framewo...
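
The idea in the abstract, recovering a cell-type-specific effect from bulk measurements when cell-type proportions are known from a reference, can be illustrated with a small simulation. The sketch below is not the project's method, which the abstract does not specify. It assumes a simple linear interaction model (bulk expression regressed on cell-type proportions and on proportions multiplied by a phenotype), uses simulated data only, and all function names (`simulate`, `estimate_cell_type_effects`, and so on) are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(design, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def simulate(n_samples=600, seed=0):
    """Simulate bulk expression of one gene across samples (assumed model)."""
    rng = random.Random(seed)
    baseline = [5.0, 3.0, 1.0]  # per-cell-type expression when phenotype = 0
    effect = [2.0, 0.0, 0.0]    # phenotype changes expression in cell type 0 only
    props, pheno, bulk = [], [], []
    for _ in range(n_samples):
        g = [rng.gammavariate(2.0, 1.0) for _ in baseline]
        total = sum(g)
        p = [v / total for v in g]  # Dirichlet-distributed proportions
        x = rng.randint(0, 1)       # binary phenotype, e.g. case vs control
        y = sum(pk * (mu + b * x) for pk, mu, b in zip(p, baseline, effect))
        y += rng.gauss(0.0, 0.05)   # measurement noise
        props.append(p)
        pheno.append(x)
        bulk.append(y)
    return props, pheno, bulk, effect

def estimate_cell_type_effects(props, pheno, bulk):
    """Fit bulk ~ sum_k p_k * (mu_k + beta_k * x); return (mu_hat, beta_hat)."""
    # No intercept column: the proportions already sum to 1 in every sample.
    design = [p + [pk * x for pk in p] for p, x in zip(props, pheno)]
    coef = least_squares(design, bulk)
    k = len(props[0])
    return coef[:k], coef[k:]

props, pheno, bulk, true_effect = simulate()
baseline_hat, effect_hat = estimate_cell_type_effects(props, pheno, bulk)
print("true effects:     ", true_effect)
print("estimated effects:", [round(b, 2) for b in effect_hat])
```

In this toy setting the interaction coefficients play the role of cell-type-specific effects: the phenotype effect is placed in one cell type during simulation, and the regression attributes it to that cell type rather than spreading it across the tissue. Real bulk data would additionally need estimated (not known) proportions, many genes, and count-based noise, none of which this sketch attempts.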

## Key facts

- **NIH application ID:** 10028501
- **Project number:** 1R35GM138293-01
- **Recipient organization:** ST. JUDE CHILDREN'S RESEARCH HOSPITAL
- **Principal Investigator:** Paul Geeleher
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $436,749
- **Award type:** 1
- **Project period:** 2020-08-01 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10028501

## Citation

> US National Institutes of Health, RePORTER application 10028501, Computational tools for estimating cell-type-specific effects in bulk RNA-seq and spatial transcriptomics data, using reference single-cell RNA-seq datasets (1R35GM138293-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10028501. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
