# Privacy-Aware Federated Learning for Breast Cancer Risk Assessment

> **NIH U24** · INDIANA UNIVERSITY INDIANAPOLIS · 2024 · $696,689

## Abstract

Federated learning (FL) has gained a lot of attention recently, as it enables analyses of data from numerous
collaborating sites without the need to share data, i.e., each collaborator’s data are always retained within their
site. FL is advantageous as it can: 1) overcome cultural/ownership, privacy, and regulatory concerns (since data
never leave the local site), 2) provide access to restricted data, 3) allow the collection of meaningful amounts of
data for analyses of rare diseases, and 4) address health disparities and inequities. Thus, FL can be regarded as a
novel paradigm for multi-site collaborations, enabling access to ample and importantly diverse data, essential to
developing robust models generalizable in unseen data. To this end, we have developed the Federated Tumor
Segmentation (FeTS) platform and the Open Federated Learning (OpenFL) library, as open-source tools with a
commercially friendly license that have facilitated a) the largest to-date real-world federation, involving 3D brain
tumor MRI data from 71 sites across 6 continents, and b) the very first computational challenge in FL, forming
the first benchmarking environment and dataset in the field. This FeTS-OpenFL infrastructure has further been
used to c) identify tumor-infiltrating lymphocytes in histopathology images and d) segment dense tissue in 2D
digital mammography (DM), highlighting its generalizability in different imaging and disease types. Building upon
our successful FeTS-OpenFL infrastructure, we propose to enhance its functionality with new developments on
privacy-aware FL towards classification workloads and evaluate it on a first-of-its-kind use case on breast cancer
(BC) risk assessment. BC is the most diagnosed cancer in the US, the 2nd leading cause of death from cancer
in women, and screening is performed routinely with 2D digital mammography (DM) for women in their 40s-50s.
However, DM yields a lot of false positives and unnecessary subsequent procedures. To alleviate these issues,
3D Digital Breast Tomosynthesis (DBT) has been developed and is increasingly replacing DM. Our group has
developed novel volumetric breast density (VBD) measures from DBT scans. Building upon our team’s collective
pioneering work in FL and BC risk assessment, in this proposal we focus on developing a trustworthy FL framework,
built on a zero-code principle, for training AI-based classification models, with built-in functionality to i) generate realistic
synthetic data, matching local population characteristics, for data augmentation & privacy preservation, and ii)
automatically determine quantitative & interpretable settings of optimal privacy preservation. We will use this
framework to perform the largest to-date evaluation of training deep-learning models for BC risk assessment
using DBT VBD measures and other established risk factors while leveraging multi-site, ethnically diverse data
of women undergoing BC screening. We will also disseminate resources via distribution of source code...
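
The core idea stated in the abstract, that sites collaborate on one model while their data stay local, can be illustrated with a minimal federated averaging sketch. This is not the FeTS or OpenFL API and not the project's method: the function names, the logistic-regression model, the hyperparameters, and the toy data are all assumptions made for illustration. Each simulated site trains on its own records and returns only model weights, which a coordinator combines in proportion to site size.

```python
import math
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Train logistic regression by SGD on one site's data; return new weights only."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def federated_average(site_weights, site_sizes):
    """Combine per-site weights, weighting each site by its number of samples."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def run_federation(sites, dim, rounds=10):
    """Repeat: broadcast global weights, train locally, average the returned weights."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in sites]
        global_w = federated_average(updates, [len(d) for d in sites])
    return global_w


def make_site(n, seed):
    """Hypothetical toy data: a bias term plus one feature; label is 1 when the feature is positive."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        data.append(([1.0, x1], 1 if x1 > 0 else 0))
    return data


# Three simulated sites of different sizes; raw records never leave make_site's output lists.
sites = [make_site(40, 1), make_site(80, 2), make_site(20, 3)]
model = run_federation(sites, dim=2)
```

Only the weight vectors cross the site boundary in this sketch. Sharing weights alone does not guarantee privacy, which is presumably why the proposal pairs FL with additional privacy-preserving functionality.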

## Key facts

- **NIH application ID:** 10932257
- **Project number:** 5U24CA279629-02
- **Recipient organization:** INDIANA UNIVERSITY INDIANAPOLIS
- **Principal Investigator:** Spyridon Bakas
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $696,689
- **Award type:** 5
- **Project period:** 2023-09-20 → 2028-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10932257

## Citation

> US National Institutes of Health, RePORTER application 10932257, Privacy-Aware Federated Learning for Breast Cancer Risk Assessment (5U24CA279629-02). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10932257. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
