# Developing informatics tools to predict virtual spatial transcriptomics data with single-cell resolution in large-scale studies

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2024 · $362,813

## Abstract

Recent developments in spatial transcriptomics (ST) technologies have enabled the comprehensive profiling of
gene expression across tissues while preserving crucial gene location information. However, the current
accessibility of commercialized ST platforms remains constrained by their high costs and long turn-around time,
limiting their utility in large-scale ST studies that involve hundreds or thousands of samples. In contrast,
hematoxylin and eosin (H&E)-stained histology images are much cheaper to generate. Previous studies have
demonstrated correlations between gene expression patterns and histological image features, suggesting the
potential to predict spatial gene expression from histology images. Therefore, the integration of histology image
data for predicting spatial gene expression in large-scale ST studies has emerged as a promising strategy. This
novel approach facilitates the generation of virtual ST data at significantly reduced costs and time commitment.
Through these predictive models, we can investigate the intricate connections between spatial gene expression
variations and clinical outcomes of interest, with a particular focus on population-based inquiries, such as those
in biobank samples. Leveraging our team's expertise in ST data analysis and computational pathology, we
propose to develop a suite of informatics tools to harmonize information from histology images and integrate it
with ST data with diverse spatial resolutions and gene coverages to build gene expression prediction models.
We will further investigate optimal sample selection strategies for ST when conducting a large-scale ST study.
These tools will make it possible to predict ST data at single-cell resolution from samples where only histology
images are available. To enhance the impact of our research, we will also develop open-source software
packages and build a cloud-based computing platform for ST data prediction and visualization in large-scale
studies. Realizing the proposed research would signify a transformative advancement in the field, potentially
leading to a paradigm shift.

## Key facts

- **NIH application ID:** 10942156
- **Project number:** 1R01LM014592-01
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Mingyao Li
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $362,813
- **Award type:** 1 (new application)
- **Project period:** 2024-08-27 → 2028-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10942156

## Citation

> US National Institutes of Health, RePORTER application 10942156, Developing informatics tools to predict virtual spatial transcriptomics data with single-cell resolution in large-scale studies (1R01LM014592-01). Retrieved via AI Analytics 2026-05-26 from https://api.ai-analytics.org/grant/nih/10942156. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
