# Can machines be trusted? Robustification of deep learning for medical imaging

> **NIH R01** · UNIVERSITY OF WISCONSIN-MADISON · 2020 · $318,155

## Abstract

Machine learning algorithms have become increasingly popular in medical imaging, where highly functional
algorithms have been trained to recognize patterns or features within image data sets and perform clinically
relevant tasks such as tumor segmentation and disease diagnosis. In recent years, an approach known as
deep learning has revolutionized the field of machine learning, by leveraging massive datasets and immense
computing power to extract features from data. Deep learning is ideally suited for problems in medical imaging,
and has enjoyed success in diverse tasks such as segmenting cardiac structures, tumors, and tissues.
However, research in machine learning has also shown that deep learning is fragile in the sense that carefully
designed perturbations to an image can cause the algorithm to fail. These perturbations can be designed to be
imperceptible to humans, so that a trained radiologist would not make the same mistakes. As deep learning
approaches gain acceptance and move toward clinical implementation, it is therefore crucial to develop a
better understanding of the performance of neural networks. Specifically, it is critical to understand the limits of
deep learning when presented with noisy or imperfect data. The goal of this project is to explore these
questions in the context of medical imaging—to better identify strengths, weaknesses, and failure points of
deep learning algorithms.

We posit that malicious perturbations, of the type studied in theoretical machine learning, may not be
representative of the sort of noise encountered in medical images. Although noise is inevitable in a physical
system, the noise arising from sources such as subject motion, operator error, or instrument malfunction may
have less deleterious effects on a deep learning algorithm. We propose to characterize the effect of these
perturbations on the performance of deep learning algorithms. Furthermore, we will study the effect of random
labeling error introduced into the data set, as might arise due to honest human error. We will also develop new
methods for making deep learning algorithms more robust to the types of clinically relevant perturbations
described above.

In summary, although the susceptibility of neural networks to small errors in the inputs is widely recognized in
the deep learning community, our work will investigate these general phenomena in the specific case of
medical imaging tasks, and also conduct the first study of average-case errors that could realistically arise in
clinical studies. Furthermore, we will produce novel recommendations for how to quantify and improve the
resiliency of deep learning approaches in medical imaging.

## Key facts

- **NIH application ID:** 9972588
- **Project number:** 1R01LM013151-01A1
- **Recipient organization:** UNIVERSITY OF WISCONSIN-MADISON
- **Principal Investigator:** John William Garrett
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $318,155
- **Award type:** 1
- **Project period:** 2020-07-02 → 2024-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9972588

## Citation

> US National Institutes of Health, RePORTER application 9972588, Can machines be trusted? Robustification of deep learning for medical imaging (1R01LM013151-01A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9972588. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*