Developing informatics tools to predict virtual spatial transcriptomics data with single-cell resolution in large-scale studies

NIH RePORTER · NIH · R01 · $362,813

Abstract

PROJECT SUMMARY

Recent developments in spatial transcriptomics (ST) technologies have enabled the comprehensive profiling of gene expression across tissues while preserving crucial gene location information. However, access to commercial ST platforms remains constrained by their high cost and long turnaround time, which limits their utility in large-scale ST studies involving hundreds or thousands of samples. In contrast, hematoxylin and eosin (H&E)-stained histology images are much cheaper to generate. Previous studies have demonstrated correlations between gene expression patterns and histological image features, suggesting that spatial gene expression can be predicted from histology images. Using histology image data to predict spatial gene expression has therefore emerged as a promising strategy for large-scale ST studies. This approach makes it possible to generate virtual ST data at significantly reduced cost and time. With these predictive models, we can investigate the connections between spatial gene expression variation and clinical outcomes of interest, with a particular focus on population-based studies such as those using biobank samples.

Leveraging our team's expertise in ST data analysis and computational pathology, we propose to develop a suite of informatics tools that harmonize information from histology images and integrate it with ST data of diverse spatial resolutions and gene coverages to build gene expression prediction models. We will further investigate optimal strategies for selecting which samples to profile with ST when conducting a large-scale ST study. These tools will make it possible to predict ST data at single-cell resolution for samples where only histology images are available. To enhance the impact of our research, we will also develop open-source software packages and build a cloud-based computing platform for ST data prediction and visualization in large-scale studies. If realized, the proposed research would be a transformative advance for the field and could lead to a paradigm shift.

Key facts

NIH application ID: 10942156
Project number: 1R01LM014592-01
Recipient: UNIVERSITY OF PENNSYLVANIA
Principal Investigator: Mingyao Li
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $362,813
Award type: 1
Project period: 2024-08-27 → 2028-07-31