# Integrative clustering of cells and samples using multi-modal single-cell data

> **NIH R01** · BOSTON UNIVERSITY MEDICAL CAMPUS · 2021 · $358,875

## Abstract

Single-cell genomic technologies such as single-cell RNA-seq have emerged as powerful techniques to quantify
molecular states of individual cells and can be used to elucidate the cellular building blocks of complex tissues
and diseases. Given recent rapid advances in single-cell technologies, novel statistical and computational
approaches are needed to efficiently analyze large-scale single-cell datasets with multiple data types such as
gene and protein expression. Discrete Bayesian hierarchical models have been widely used for unsupervised
modeling of discrete data types in fields such as Natural Language Processing (NLP). We have developed a
Bayesian hierarchical model called Cellular Latent Dirichlet Allocation (Celda) to perform bi-clustering of genes
into modules and cells into subpopulations. We will develop novel models that can perform clustering of cells
into subpopulations using multi-modal genomic data or clustering of patients into subgroups using both single-
cell data and patient-level characteristics. These novel methods will be made available in a scalable and
interpretable cloud-based framework accessible to both computational and non-computational users. The aims
of this study are to (1) develop novel models to perform integrative multi-modal and multi-level clustering with
single-cell data, (2) develop an R package and cloud-based platform with a web interface for rapid inference and
visualization of large-scale datasets, and (3) apply Celda models to single-cell datasets from a variety of
biological settings including cancer, lung development, and immunology. Overall, these aims will be
accomplished by an interdisciplinary team with strong expertise in computational biology and bioinformatics,
biostatistics, computer science, and molecular and cellular biology.
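
The bi-clustering idea in the abstract (genes grouped into modules, cells grouped into subpopulations, counts modeled with a discrete hierarchical model) can be illustrated with a small generative simulation. This is a minimal sketch under stated assumptions, not the Celda implementation: the function name, parameters, Dirichlet concentrations, and the exact hierarchy below are hypothetical and chosen only to show how a Dirichlet-multinomial hierarchy can produce a cell-by-gene count matrix with block structure.

```python
import numpy as np

def simulate_bicluster_counts(n_cells=60, n_genes=40, n_populations=3,
                              n_modules=4, depth=500, seed=0):
    """Simulate counts from a toy Dirichlet-multinomial bi-clustering model.

    Hypothetical illustration only; not the model specified by Celda.
    """
    rng = np.random.default_rng(seed)

    # Hard assignments: each cell gets one population, each gene one module.
    cell_pop = rng.integers(0, n_populations, size=n_cells)
    gene_mod = rng.integers(0, n_modules, size=n_genes)

    # Each population draws its own mixture over gene modules.
    pop_module_probs = rng.dirichlet(np.full(n_modules, 0.5), size=n_populations)

    # Within each module, member genes share the module's probability mass.
    gene_weight = np.zeros(n_genes)
    for m in range(n_modules):
        members = np.flatnonzero(gene_mod == m)
        if members.size:
            gene_weight[members] = rng.dirichlet(np.ones(members.size))

    counts = np.zeros((n_cells, n_genes), dtype=np.int64)
    for i in range(n_cells):
        # Gene probability = P(module | population) * P(gene | module).
        probs = pop_module_probs[cell_pop[i], gene_mod] * gene_weight
        probs = probs / probs.sum()  # renormalize in case a module is empty
        counts[i] = rng.multinomial(depth, probs)

    return counts, cell_pop, gene_mod
```

Calling `simulate_bicluster_counts()` returns the count matrix together with the simulated cell and gene labels. Inference runs in the opposite direction: given only the counts, a bi-clustering method estimates the two sets of labels.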

## Key facts

- **NIH application ID:** 10215623
- **Project number:** 5R01LM013154-03
- **Recipient organization:** BOSTON UNIVERSITY MEDICAL CAMPUS
- **Principal Investigator:** Joshua D Campbell
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $358,875
- **Award type:** 5
- **Project period:** 2019-08-01 → 2024-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10215623

## Citation

> US National Institutes of Health, RePORTER application 10215623, Integrative clustering of cells and samples using multi-modal single-cell data (5R01LM013154-03). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10215623. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
