# Population-Based Evaluation of Artificial Intelligence for Mammography Prior to Widespread Clinical Translation

> **NIH R01** · UNIVERSITY OF WASHINGTON · 2022 · $679,234

## Abstract

PROJECT SUMMARY
Multiple artificial intelligence (AI) technologies are now commercially available for automated interpretation of
screening mammography. These AI technologies hold promise for improving screening performance and
outcomes for the 40 million U.S. women who undergo routine breast cancer screening each year. Federal
regulatory approval of new AI technologies requires only a demonstration of non-inferior accuracy to existing
computer-aided detection systems in small, retrospective reader studies, but their widespread clinical
translation is contingent upon more robust population-based evaluation. Specifically, the impact of these AI
technologies on actual patient outcomes needs to be assessed, including whether or not they lead to improved
detection of clinically meaningful cancers in the general screening population. Robust external validation of AI
algorithms for mammography screening has thus far been limited by use of single institution datasets not
representative of the entire target population, use of AI algorithms that are not publicly available, comparison to
radiologist performance in enriched case sets, limited follow-up time for cancer diagnoses influencing ground
truth labels, and evaluation on 2D digital mammography rather than 3D digital breast tomosynthesis (DBT)
exams. Our study objective is to conduct a comparative evaluation of five commercially available AI
technologies for automated DBT screening interpretation that overcomes all of these limitations and then
estimate the long-term benefits, harms, and costs of AI-driven DBT screening at the U.S. population level.
Specifically, we will 1) use a centralized honest broker, model-to-data paradigm infrastructure to perform an
independent, external validation of five leading commercial AI technologies for DBT screening using
prospectively collected data obtained from eight diverse U.S. regional breast imaging registries; 2) stratify AI
vs. radiologist performance on detailed woman-, exam-, radiologist-, and tumor-level characteristics to inform
targeted algorithm training and refinement efforts to ensure generalizability of the AI algorithms; 3) explore
targeted approaches for improving clinical workflow efficiency by using AI to safely triage exams highly likely to
be negative; and 4) use a validated breast cancer microsimulation model to determine population-level, long-term
health benefits, harms, and costs associated with AI technologies for DBT screening both as a standalone
screening tool and as a second independent reader to radiologist interpretation. Our proposed study will
represent the most objective and rigorous evaluation of deep learning algorithms for DBT screening
interpretation in the U.S. to date. Our results will provide urgently needed evidence to inform key stakeholders
including women, physicians, payers, industry partners, and policymakers regarding how to maximize the
value of AI technologies for DBT screening prior to their widespread clinical t...
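
The triage idea in aim 3 (using AI to set aside exams highly likely to be negative) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the `Exam` record, the `ai_score` field, the `triage_summary` helper, the threshold, and the toy data are hypothetical and do not come from the study or from any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class Exam:
    ai_score: float  # hypothetical AI malignancy score in [0, 1]
    cancer: bool     # hypothetical ground-truth label from follow-up


def triage_summary(exams, threshold):
    """Summarize a rule that sets aside exams scoring below `threshold`."""
    cleared = [e for e in exams if e.ai_score < threshold]
    cancers = [e for e in exams if e.cancer]
    missed = [e for e in cleared if e.cancer]
    return {
        # Share of all exams that would skip radiologist review.
        "fraction_triaged": len(cleared) / len(exams),
        # Cancers that fall below the threshold and would be missed.
        "cancers_missed": len(missed),
        # Share of cancers still sent on for radiologist review.
        "sensitivity_retained": (
            1 - len(missed) / len(cancers) if cancers else None
        ),
    }


# Toy data, invented for this sketch (not study data).
exams = [
    Exam(0.02, False), Exam(0.05, False), Exam(0.10, False),
    Exam(0.40, False), Exam(0.08, True), Exam(0.90, True),
]
summary = triage_summary(exams, threshold=0.06)
print(summary)
```

Raising the threshold sets aside more exams but risks missing cancers, which is the trade-off a population-based evaluation would need to quantify.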

## Key facts

- **NIH application ID:** 10445206
- **Project number:** 1R01CA262023-01A1
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** CHRISTOPH I LEE
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $679,234
- **Award type:** 1 (new application)
- **Project period:** 2022-07-01 → 2027-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10445206

## Citation

> US National Institutes of Health, RePORTER application 10445206, Population-Based Evaluation of Artificial Intelligence for Mammography Prior to Widespread Clinical Translation (1R01CA262023-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10445206. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
