# Machine learning algorithms to analyze large medical image datasets

> **NIH R01** · BOSTON CHILDREN'S HOSPITAL · 2024 · $369,580

## Abstract

Machine learning (ML) is poised to enable faster and more accurate interpretation of medical images by
augmenting the capabilities of experts. The cost and difficulty of generating expert-quality labeled image data
is the primary limitation preventing faster progress and deployment in more domains. Success of ML
techniques for medical image interpretation may reduce the burden on radiologists, reducing errors arising
from fatigue or interruption, while simultaneously reducing costs and increasing speed and accuracy for
patients. Our overall objective for this research is to dramatically reduce the burden of creating high-quality
reference labels by requiring only a small set of such labels from experts. We propose to address this problem
by creating innovative algorithms that will construct reference-quality labeled data with little input from domain
experts, thus dramatically reducing the cost of labeling. This will enable us to apply ML techniques to generate
high-quality labels for the large amounts of unlabeled data that are already available, which in turn will facilitate
the assessment of potential quantitative imaging biomarkers. We will develop, extend and evaluate novel
algorithms that represent three distinct strategies for reducing labeling cost. These three strategies are:
learning from unlabeled data, incorporating a novel strategy for characterizing uncertainty; optimizing sample
selection for expert-quality labeling with a novel form of active learning especially suited to deep learning;
and reducing the cost of achieving quality labeling by replacing or augmenting an expert with a crowd of
non-experts. We will then implement and distribute these novel algorithms, facilitating the replication of our
experiments. Finally, we will demonstrate the practical efficacy of these three strategies by applying them to
the important challenge of identifying quantitative imaging biomarkers that best capture alterations in brain
structure that are associated with characteristics of ASD. These fundamental advances in informatics
algorithms will reduce the cost and increase the rate of obtaining quality labels, which will in turn facilitate the
widespread adoption and deployment of machine learning algorithms for image interpretation. Ultimately, this
will stimulate the development of new imaging biomarkers that hold the potential to dramatically improve
clinical decision-making and patient outcomes.

## Key facts

- **NIH application ID:** 10818374
- **Project number:** 5R01LM013608-04
- **Recipient organization:** BOSTON CHILDREN'S HOSPITAL
- **Principal Investigator:** SIMON K WARFIELD
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $369,580
- **Award type:** 5
- **Project period:** 2021-07-01 → 2026-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10818374

## Citation

> US National Institutes of Health, RePORTER application 10818374, Machine learning algorithms to analyze large medical image datasets (5R01LM013608-04). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10818374. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*