Distributed approaches to train machine learning models in diabetic retinopathy

NIH RePORTER · NIH · R15 · $459,516

Abstract

This project aims to establish distributed federated learning (FL) approaches for training robust, clinically deployable machine learning (ML) models for (i) multi-class classification of diabetic retinopathy (DR) and (ii) prediction of proliferative DR (PDR) progression in optical coherence tomography (OCT) angiography (OCTA). DR is one of the leading causes of severe vision loss. Early detection, prompt intervention, and reliable assessment of treatment outcomes are essential to prevent irreversible vision loss from DR. Quantitative OCTA analysis and OCTA-ML models have recently been applied to diagnose, classify, and understand the progression trends of DR. Despite promising results, the clinical utility of OCTA-based diagnostic algorithms is not yet fully determined, owing to small OCTA data cohorts in clinical institutions and the lack of widespread validation. More specifically, a major limitation of OCTA-ML models is that robust performance requires large, well-curated datasets from a diverse subpopulation. Moreover, efforts towards large, centralized datasets for ML research are hindered by significant barriers to data sharing and by privacy concerns. In this project, we aim to establish novel federated ML approaches in which model training is distributed across institutions: patient data are not shared, and only the model parameters are sent to a central server. This enables collaborative insights, e.g., in the form of a consensus model, without moving patient data beyond the firewalls of the institutions. Three data cohorts from Stanford University, the University of Illinois Chicago (UIC), and National Taiwan University (NTU) will be used to test the hypothesis that the accuracy of OCTA-ML models trained with a federated approach is more robust than that of models built on single-institution datasets. Our first aim is to establish an FL framework with adaptive domain alignment and enhanced data representation learning capability. The key success criterion of Aim 1 is to successfully integrate the pilot institutions into the FL framework for distributed training of models for multi-class DR classification, backed by comprehensive OCTA features (textural, geometric, and differential artery-vein (AV)). The second aim is to validate the FL-trained OCTA-ML models and differential AV complexity features for PDR progression on new longitudinal data from UIC and NTU. The key success criterion of Aim 2 is to validate OCTA-ML model performance and identify AV features that provide sensitive biomarkers for predicting PDR in patients with DR. As an alternative approach, we propose a vision transformer deep learning model for PDR prediction. The attention mechanism of a transformer model can identify features of DR that provide new information on the specific onset of PDR progression. Further investigation of the relationship between the new features learned through the transformer model and clinical biomarkers will allow us to optimize the design for bet...
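The parameter-sharing scheme described in the abstract can be illustrated with a minimal sketch. This is not the project's framework: it assumes plain federated averaging (FedAvg) as the aggregation rule, which the abstract does not name, and it replaces the OCTA-ML model with a toy one-dimensional linear model. The site names, the functions `local_update`, `federated_average`, and `run_rounds`, and all data values are hypothetical. Each site trains on its own data and returns only parameters, and the server combines them into a consensus model weighted by cohort size.

```python
from typing import Dict, List, Tuple

Params = List[float]                 # [w, b] for the toy model y = w * x + b
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the site


def local_update(global_params: Params, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Params:
    """Run a few gradient-descent steps on one site's private data.

    Only the updated parameters are returned; the raw data stay local.
    """
    w, b = global_params
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2.0 * err * x
            grad_b += 2.0 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return [w, b]


def federated_average(site_params: List[Params],
                      site_sizes: List[int]) -> Params:
    """Server step: average site parameters, weighted by cohort size."""
    total = sum(site_sizes)
    dim = len(site_params[0])
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(sites: Dict[str, Dataset], rounds: int = 200) -> Params:
    """Alternate local training and server aggregation for several rounds."""
    global_params: Params = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_params, d) for d in sites.values()]
        sizes = [len(d) for d in sites.values()]
        global_params = federated_average(updates, sizes)
    return global_params


if __name__ == "__main__":
    # Hypothetical cohorts, both drawn from y = 2x + 1 (illustrative only).
    sites = {
        "site_a": [(0.0, 1.0), (1.0, 3.0)],
        "site_b": [(2.0, 5.0), (3.0, 7.0), (1.0, 3.0)],
    }
    print(run_rounds(sites))
```

Weighting by cohort size is one common choice. The adaptive domain alignment mentioned under the first aim would change how site updates are produced or combined, and is not modeled here.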

Key facts

NIH application ID: 10795486
Project number: 1R15EY035804-01
Recipient: UNIVERSITY OF NORTH CAROLINA CHARLOTTE
Principal Investigator: Minhaj Nur Alam
Activity code: R15
Funding institute: NIH
Fiscal year: 2024
Award amount: $459,516
Award type: 1
Project period: 2024-03-01 → 2027-02-28