# Distributed Learning of Deep Learning Models for Cancer Research

> **NIH U01** · STANFORD UNIVERSITY · 2020 · $394,824

## Abstract

Deep learning methods are showing great promise for advancing cancer research and could potentially
improve clinical decision making in cancers such as primary brain glioma, where deep learning models have
recently shown promising results in predicting isocitrate dehydrogenase (IDH) mutation and survival in these
patients. A major challenge thwarting this research, however, is the requirement for large quantities of labeled
image data to train deep learning models. Efforts to create large public centralized collections of image data
are hindered by barriers to data sharing, costs of image de-identification, patient privacy concerns, and control
over how data are used. Current deep learning models that are being built using data from one or a few
institutions are limited by potential overfitting and poor generalizability. Instead of centralizing or sharing patient
images, we aim to distribute the training of deep learning models across institutions with computations
performed on their local image data. Although our preliminary results demonstrate the feasibility of this
approach, there are three key challenges to translating these methods into research practice: (1) data are
heterogeneous among institutions in amount and quality, which could impair the distributed
computations, (2) there are data security and privacy concerns, and (3) there are no software packages that
implement distributed deep learning with medical images. We tackle these challenges by (1) optimizing and
expanding our current methods of distributed deep learning to address data variability and data
privacy/security, (2) creating a freely available software system for building deep learning models on
multi-institutional data using distributed computation, and (3) evaluating our system on deep learning problems
in example use cases of classification and clinical prediction in primary brain cancer. Our approach is
innovative in developing distributed deep learning methods that will address variations in data among different
institutions, that protect patient privacy during distributed computations, and that enable sites to discover
pertinent datasets and participate in creating deep learning models. Our work will be significant and impactful
by overcoming critical hurdles that researchers face in tapping into multi-institutional patient data to create
deep learning models on large collections of image data that are more representative of disease than data
acquired from a single institution, while avoiding the hurdles to inter-institutional sharing of patient data.
Ultimately, our methods will enable researchers to collaboratively develop more generalizable deep learning
applications to advance cancer care by unlocking access to and leveraging huge amounts of multi-institutional
image data. Although our clinical use case in developing this technology is primary brain cancer, our methods
will generalize to all cancers, as well as to othe...

## Key facts

- **NIH application ID:** 10018827
- **Project number:** 5U01CA242879-02
- **Recipient organization:** STANFORD UNIVERSITY
- **Principal Investigator:** Jayashree Kalpathy-Cramer
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $394,824
- **Award type:** 5
- **Project period:** 2019-09-16 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10018827

## Citation

> US National Institutes of Health, RePORTER application 10018827, Distributed Learning of Deep Learning Models for Cancer Research (5U01CA242879-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10018827. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*