# Analytical Infrastructure for Multiple Sample Single Cell Genomic Data

> **NIH R01** · JOHNS HOPKINS UNIVERSITY · 2024 · $361,069

## Abstract

Single cell genomic technologies are evolving quickly and can measure a variety of omics data modalities in individual cells. With a wide range of applications, such as discovering new cell types, mapping temporal and spatial cellular programs in development, tissues, and organs, and studying cell-cell interactions in tumor microenvironments, these technologies are rapidly transforming biomedical research. Early single cell genomic studies focused primarily on characterizing how cells differ within a sample. Recently, however, single cell studies increasingly produce a large number of biological or patient samples, creating new opportunities and growing demands for studying how samples differ and how omics programs are associated with sample phenotype. Despite these new opportunities and demands, sample-level heterogeneity represents an extra layer of complexity not fully addressed by existing data analysis methods.

This proposal aims to develop new analytical methods and software tools to address three open challenges in the unsupervised analysis of multi-sample single cell genomic data at population scale, across data modalities, and across species. Aim 1 will address the challenge that longitudinal data with densely sampled time points from the same patient are difficult to obtain for studying dynamic cellular programs along disease progression. We will develop an alternative strategy and analytical method, sample trajectory analysis, to infer the temporal progression of sample phenotype and its associated cellular programs using cross-sectionally collected samples. Aim 2 will tackle the problem of integrating data across samples and modalities to allow analyses of sample similarities and differences. We will develop a systematic sample harmonization method to address challenges in harmonizing data with unmatched features, choosing the optimal feature type and resolution, and removing unwanted technical noise while keeping meaningful biological variation. We will also create an analytical framework to support systematic unsupervised analysis of sample heterogeneity. Aim 3 will develop a solution for comparative analysis of multi-sample single cell data across species, which is important for identifying conserved and diverged biological processes between humans and animal models. Such knowledge is fundamental for designing and interpreting animal model experiments for studying human diseases.

Upon completion of this proposal, we will deliver our methods through open-source software tools. These tools will be widely useful for analyzing single cell genomic data with multiple samples. By addressing several major challenges in single cell genomic data analyses, our new methods and tools will help unleash the full potential of single cell genomic technologies for biomedical research and can have a major impact on advancing our understanding of both basic biology and human diseases.

## Key facts

- **NIH application ID:** 10989329
- **Project number:** 1R01HG013409-01A1
- **Recipient organization:** JOHNS HOPKINS UNIVERSITY
- **Principal Investigator:** Hongkai Ji
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $361,069
- **Award type:** 1
- **Project period:** 2024-09-01 → 2028-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10989329

## Citation

> US National Institutes of Health, RePORTER application 10989329, Analytical Infrastructure for Multiple Sample Single Cell Genomic Data (1R01HG013409-01A1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10989329. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
