Abstract: This project aims to establish distributed federated learning (FL) approaches for multi-task training of foundational machine learning (ML) models for diabetic retinopathy (DR), using multi-modal, real-world optical coherence tomography (OCT) data (OCT cross-sections, OCT angiography (OCTA), and en face OCT). DR is one of the leading causes of severe vision loss. Early detection, prompt intervention, and reliable assessment of treatment outcomes are essential to prevent irreversible vision loss from DR. However, there are major challenges in developing clinically relevant, holistic algorithms that can perform multiple tasks, i.e., multi-class classification of disease stages (diagnosis), prediction of the onset and progression of disease stages (prognosis), and assessment of treatment outcomes. Such algorithms require large, well-curated, labelled datasets drawn from diverse sub-populations to perform robustly. Moreover, efforts to build large, centralized datasets for ML research are hindered by significant barriers to data sharing and by privacy concerns. In this project, we propose to develop foundational ML models that efficiently learn feature representations from a large corpus of ophthalmic imaging data for various downstream tasks, breaking the task-specific paradigm of current ML models. We will also establish novel federated ML approaches in which model training is distributed across institutions instead of sharing patient data. Our first aim is to establish and validate a domain-adaptive FL framework for DR diagnosis across four independent institutions. We propose a novel ophthalmic adaptive personalized FL (optho-APFL) technique to tackle the domain shift caused by heterogeneous data distributions at different institutions (due to differences in sub-population density and in OCT devices and imaging protocols). We will conduct experiments on FL deployment in a clinical setting and integrate a granular differential privacy (DP) algorithm into our FL framework