# Quantitative Definition of Cell Identity by Integrating Transcriptomic, Epigenomic, and Spatial Features of Individual Cells

> **NIH R01** · UNIVERSITY OF MICHIGAN AT ANN ARBOR · 2022 · $235,054

## Abstract

Defining the molecular features that identify the myriad specialized subsets of cells within the human body is foundational to a genomic approach to medicine. High-throughput single-cell sequencing has recently opened the door to comprehensively characterizing the molecular identities of human cells. Multiple types of features contribute to cell identity, including gene expression, epigenomic modifications, and spatial location within a tissue, but it is not currently possible to simultaneously measure all of these modalities within the same single cells. Each experimental context and measurement modality provides a different glimpse into cellular identity, and how to combine these views into a unified picture of cell identity remains unclear.

Computational integration of multiple single-cell experiments performed on different individual cells provides a way forward despite these challenges. However, existing approaches are not sufficiently robust to integrate single-cell data across the full range of biological contexts, nor flexible enough to leverage the unique properties of different single-cell modalities, and they require recalculating results each time new data points arrive.

We recently developed LIGER, a highly robust and flexible algorithm that can integrate single-cell data sharing a common set of gene-centric features across a wide range of biological contexts and modalities. A key property of our approach is the ability to identify both shared and dataset-specific features that define cell types across biological contexts. Additionally, LIGER is built upon a powerful matrix factorization framework that is readily extensible. In preliminary analysis, we showed that our approach can identify cell-type-specific sexually dimorphic gene expression and human subject variation, map cell types across species, and jointly define cell types from multiple single-cell modalities that share corresponding features.

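
The idea of separating shared from dataset-specific features in a matrix factorization can be illustrated with a small NumPy sketch. It assumes an objective of the form `sum_i ||X_i - H_i (W + V_i)||_F^2 + lam * sum_i ||H_i V_i||_F^2`, where `W` holds gene loadings shared by all datasets, `V_i` holds loadings specific to dataset `i`, and `H_i` holds per-cell factor loadings. The objective, the function names, the multiplicative update rule, and the toy Poisson data are all assumptions made for illustration; the abstract does not state the objective or the solver, and this is not the LIGER implementation.

```python
import numpy as np

def inmf_loss(Xs, Hs, W, Vs, lam):
    """Assumed objective: sum_i ||X_i - H_i (W + V_i)||_F^2 + lam * ||H_i V_i||_F^2."""
    return sum(
        np.linalg.norm(X - H @ (W + V)) ** 2 + lam * np.linalg.norm(H @ V) ** 2
        for X, H, V in zip(Xs, Hs, Vs)
    )

def fit_shared_and_specific(Xs, k=5, lam=5.0, n_iter=200, seed=0, eps=1e-10):
    """Xs: list of nonnegative (cells x genes) arrays that share the same gene columns."""
    rng = np.random.default_rng(seed)
    g = Xs[0].shape[1]
    W = rng.random((k, g))                          # shared gene loadings
    Vs = [rng.random((k, g)) for _ in Xs]           # dataset-specific gene loadings
    Hs = [rng.random((X.shape[0], k)) for X in Xs]  # per-cell factor loadings
    history = [inmf_loss(Xs, Hs, W, Vs, lam)]
    for _ in range(n_iter):
        for i, X in enumerate(Xs):
            B = W + Vs[i]
            # Multiplicative updates keep every factor nonnegative.
            Hs[i] *= (X @ B.T) / (Hs[i] @ (B @ B.T + lam * Vs[i] @ Vs[i].T) + eps)
            HtH = Hs[i].T @ Hs[i]
            Vs[i] *= (Hs[i].T @ X) / (HtH @ (W + (1 + lam) * Vs[i]) + eps)
        # The shared factor pools evidence from every dataset.
        num = sum(H.T @ X for H, X in zip(Hs, Xs))
        den = sum(H.T @ H @ (W + V) for H, V in zip(Hs, Vs))
        W *= num / (den + eps)
        history.append(inmf_loss(Xs, Hs, W, Vs, lam))
    return W, Vs, Hs, history

# Toy usage: two hypothetical datasets measured over the same 30 genes.
rng = np.random.default_rng(1)
Xs = [rng.poisson(2.0, size=(60, 30)).astype(float),
      rng.poisson(3.0, size=(40, 30)).astype(float)]
W, Vs, Hs, history = fit_shared_and_specific(Xs, k=4, n_iter=50)
```

In this sketch, a larger `lam` penalizes the dataset-specific term more heavily and pushes structure into the shared factor `W`, while a smaller `lam` lets each `V_i` absorb more of the differences between datasets.
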
Here, we build upon LIGER in several ways to develop a comprehensive framework that can most effectively leverage the unique aspects of transcriptomic, epigenomic, and spatial data for quantitative definition of cell identity. First, we develop an “online learning” algorithm that readily scales to millions of cells and can continually incorporate new data, allowing iterative refinement of cell identity (Aim 1). Second, we develop novel approaches to integrate single-cell modalities that assay different types of features (such as genes and intergenic peaks) and contain missing data (as in spatial transcriptomic datasets), enabling inference of epigenomic regulation and cross-modal data imputation (Aim 2). In collaboration with biomedical scientists, we apply our approach to newly generated single-cell transcriptomic and single-cell epigenomic data from mouse skeletal stem cells and experimentally ...
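
The "online learning" idea in Aim 1, refining a factorization as new cells arrive without revisiting old ones, can be sketched in a few lines of NumPy. This is a generic mini-batch scheme under stated assumptions, not the algorithm proposed in the grant: it keeps only two running summaries, `A` (the sum of `H.T @ H`) and `B` (the sum of `H.T @ X`), and updates a single shared factor `W` from them. The function name, the multiplicative updates, the omission of dataset-specific factors, and the toy Poisson stream are all hypothetical choices made for brevity.

```python
import numpy as np

def online_shared_factor(batches, k=4, n_inner=20, seed=0, eps=1e-10):
    """Stream nonnegative (cells x genes) mini-batches and refine a shared factor W."""
    rng = np.random.default_rng(seed)
    W = A = B = None
    for X in batches:
        if W is None:
            g = X.shape[1]
            W = rng.random((k, g))   # shared gene loadings
            A = np.zeros((k, k))     # running sum of H.T @ H
            B = np.zeros((k, g))     # running sum of H.T @ X
        # Step 1: fit loadings for the newly arrived cells only, holding W fixed.
        H = rng.random((X.shape[0], k))
        WWt, XWt = W @ W.T, X @ W.T
        for _ in range(n_inner):
            H *= XWt / (H @ WWt + eps)
        # Step 2: fold the batch into the fixed-size summaries; X can then be discarded.
        A += H.T @ H
        B += H.T @ X
        # Step 3: refine W from the summaries alone, without revisiting earlier cells.
        for _ in range(n_inner):
            W *= B / (A @ W + eps)
    return W, A, B

# Toy usage: ten hypothetical mini-batches of 50 cells over 30 genes.
rng = np.random.default_rng(2)
stream = (rng.poisson(2.0, size=(50, 30)).astype(float) for _ in range(10))
W_online, A, B = online_shared_factor(stream, k=4)
```

Because `A` and `B` have sizes that depend only on the number of factors and genes, the memory used by this sketch does not grow with the number of cells processed.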

## Key facts

- **NIH application ID:** 10428484
- **Project number:** 5R01HG010883-04
- **Recipient organization:** UNIVERSITY OF MICHIGAN AT ANN ARBOR
- **Principal Investigator:** Joshua Welch
- **Activity code:** R01
- **Funding institute:** NIH, National Human Genome Research Institute (institute code HG in the project number)
- **Fiscal year:** 2022
- **Award amount:** $235,054
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2019-09-03 → 2024-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10428484

## Citation

> US National Institutes of Health, RePORTER application 10428484, Quantitative Definition of Cell Identity by Integrating Transcriptomic, Epigenomic, and Spatial Features of Individual Cells (5R01HG010883-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10428484. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*