A deep-transfer-learning framework to transfer clinical information to single cells and spatial locations in cancer tissues

NIH RePORTER · NIH · R21 · $213,483

Abstract

SUMMARY In the past 10 years, there has been an explosion of new high-resolution molecular data that have revolutionized the way cancer is understood and treated. These include single-cell transcriptomics, spatial transcriptomics, and computational image analysis. However, the study of how those data associate with clinical outcomes such as survival, relapse, metastasis, and drug response has lagged behind. Meanwhile, the field of deep learning is maturing rapidly, with many diverse applications, including to biological data. Deep learning typically uses multi-layer neural network models to learn and extract highly non-linear representations of data. Transfer learning is a subfield of machine learning that focuses on transferring knowledge learned from a set of source examples to another type of sample. Combining these two approaches constitutes deep transfer learning, a promising way to investigate and understand the association of the high-resolution components of these new cancer data with the corresponding clinical outcomes. Here we propose to use deep transfer learning to transfer patient outcome information learned from large patient transcriptomics cohorts to cells, cell types, spatial regions, and image features, which can then be prioritized by their assigned risks and evaluated as potential targets in aggressive cancers. Specifically, we will develop the deep transfer learning framework DEGAS for cell type prioritization and test it on glioblastoma and multiple myeloma single-cell data to validate the approach. It will then be applied to single-cell data from more aggressive cancer types, such as triple-negative breast cancer, pancreatic ductal adenocarcinoma, non-small-cell lung cancer, and gastric cancer, to prioritize high-risk cells and cell types. Next, it will be modified for use with spatial transcriptomic (ST) data to prioritize high-risk spatial regions of breast cancer and pancreatic ductal adenocarcinoma tumors.
Since ST data can act as a bridge between single-cell and patient-level transcriptomics and histology images, we will further leverage our framework to identify high-risk image features by linking histology image features to patient risk via ST data. Finally, our framework will be built into R and Python packages available through GitHub and Bioconductor for use by the broader cancer research community.

Key facts

NIH application ID
10424763
Project number
1R21CA264339-01A1
Recipient
INDIANA UNIVERSITY INDIANAPOLIS
Principal Investigator
Travis Steele Johnson
Activity code
R21
Funding institute
NIH
Fiscal year
2022
Award amount
$213,483
Award type
1
Project period
2022-07-01 → 2024-06-30