A Fully Decentralized Federated Learning Framework for Automated Image Segmentation in Cancer Radiotherapy

NIH RePORTER · NIH · R21 · $442,664

Abstract

PROJECT SUMMARY: While the recent surge of artificial intelligence (AI) has brought remarkable progress in various image analysis tasks, the performance of AI models across a broad range of clinical environments is largely restricted by their limited ability to generalize to new data, primarily because most models have been trained on data from a single institution or on public datasets of limited size. Aggregating data from different institutions could improve model training, but such centralized data sharing is practically challenging due to various technical, legal, privacy and data ownership barriers. This proposal aims to address these barriers by developing a novel gossip federated learning (GFL) framework that builds an effective AI model by learning from different data sources without the need to share patient data. Compared with traditional client/server federated learning such as FedAvg, the proposed framework is fully decentralized in that the models trained on local datasets communicate directly with each other in a peer-to-peer manner, making our method more robust and efficient. We will develop and evaluate the proposed scheme on the task of automated organ segmentation in CT images for liver and head and neck (H&N) cancer patients treated with radiation therapy (RT), because accurate, robust and efficient delineation of the organs at risk (OARs) is a clinically important but technically challenging problem. We hypothesize that a model trained with our framework can achieve segmentation performance not inferior to that of a model trained on data pooled from all the sources. The dynamics of our recently created healthcare system mimic a diverse multi-institutional environment, which places us in an ideal setting to systematically evaluate our framework.
Our specific aims are to: 1) establish the GFL-based automated OAR segmentation framework and develop the supporting software infrastructure; 2) optimize the GFL-based autosegmentation; 3) evaluate the GFL-based OAR segmentation framework with data from 400 liver and 400 H&N cancer patients collected from four hospitals within a metropolitan health system. This proposal addresses two key research priorities for NIBIB: machine learning based segmentation, and approaches that facilitate interoperability among annotations used in image training databases. The success of this project will substantially increase the amount and variety of data available for model training without sacrificing patient privacy, and thus improve the performance and generalization of the segmentation model on new data. We will open-source this framework, which may enable multi-institutional collaboration on a larger scale and could expedite the clinical adoption of AI-driven autosegmentation in RT. More importantly, this framework provides a flexible and robust solution to the primary barrier to applying AI in the medical domain, where learning from multi-institutional data is impeded by patient privacy concerns, ...
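
The peer-to-peer exchange that distinguishes a gossip scheme from client/server approaches such as FedAvg can be illustrated with a minimal sketch. The abstract does not specify the GFL protocol, so the code below shows only generic pairwise gossip averaging: each site holds a parameter vector, repeatedly picks a random peer, and both adopt the average of their parameters. No central server is involved and only parameters, not patient data, are exchanged. All names (`gossip_round`, `spread`, the toy parameter values) are hypothetical, and the local training step that a real system would run between exchanges is omitted.

```python
import random
from typing import List

Params = List[float]  # stand-in for a model's weights (assumption: a flat list)


def average(a: Params, b: Params) -> Params:
    """Element-wise mean of two parameter vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def gossip_round(models: List[Params], rng: random.Random) -> None:
    """One gossip round: every site averages with one random peer, in place."""
    n = len(models)
    for i in range(n):
        # Pick any peer other than site i; no central coordinator is needed.
        j = rng.choice([k for k in range(n) if k != i])
        merged = average(models[i], models[j])
        models[i] = merged
        models[j] = list(merged)


def spread(models: List[Params]) -> float:
    """Largest disagreement between any two sites on any single parameter."""
    return max(max(col) - min(col) for col in zip(*models))


# Four hypothetical sites with toy, hand-picked parameters (not real weights).
rng = random.Random(0)
models = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
initial_spread = spread(models)
for _ in range(20):
    gossip_round(models, rng)
final_spread = spread(models)
print(initial_spread, final_spread)
```

Because each pairwise exchange replaces two vectors with their mean, the network-wide average of the parameters is preserved while the disagreement between sites shrinks, which is the sense in which the sites drift toward a shared model without pooling data.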

Key facts

NIH application ID: 10831775
Project number: 7R21EB030209-02
Recipient: COLUMBIA UNIVERSITY HEALTH SCIENCES
Principal Investigator: Yading Yuan
Activity code: R21
Funding institute: NIH
Fiscal year: 2021
Award amount: $442,664
Award type: 7
Project period: 2023-09-15 → 2025-08-31