Machine learning algorithms to analyze large medical image datasets

NIH RePORTER · NIH · R01 · $369,580

Abstract

Machine learning (ML) is poised to enable faster and more accurate interpretation of medical images by augmenting the capabilities of experts. The cost and difficulty of generating expert-quality labelled image data is the primary limitation preventing faster progress and deployment in more domains. Success of ML techniques for medical image interpretation may reduce the burden on radiologists and the errors that arise from fatigue or interruption, while lowering costs and increasing speed and accuracy for patients. Our overall objective for this research is to dramatically reduce the burden of creating high-quality reference labels by requiring only a small set of such labels from experts. We propose to address this problem by creating innovative algorithms that construct reference-quality labelled data with little input from domain experts, thus dramatically reducing the cost of labelling. This will enable us to apply ML techniques to generate high-quality labels for the large amounts of unlabelled data that are already available, which in turn will facilitate the assessment of potential quantitative imaging biomarkers. We will develop, extend and evaluate novel algorithms that represent three distinct strategies for reducing labelling cost: (1) learning from unlabelled data, incorporating a novel strategy for characterizing uncertainty; (2) optimizing sample selection for expert-quality labelling with a novel form of Active Learning especially suited for deep learning; and (3) reducing the cost of achieving quality labelling by replacing or augmenting an expert with a crowd of non-experts. We will then implement and distribute these novel algorithms, facilitating the replication of our experiments.
Finally, we will demonstrate the practical efficacy of these three strategies by applying them to the important challenge of identifying the quantitative imaging biomarkers that best capture alterations in brain structure associated with characteristics of autism spectrum disorder (ASD). These fundamental advances in informatics algorithms will reduce the cost and increase the rate of obtaining quality labels, which will in turn facilitate the widespread adoption and deployment of machine learning algorithms for image interpretation. Ultimately, this will stimulate the development of new imaging biomarkers that hold the potential to dramatically improve clinical decision-making and patient outcomes.

Key facts

NIH application ID: 10818374
Project number: 5R01LM013608-04
Recipient: BOSTON CHILDREN'S HOSPITAL
Principal Investigator: SIMON K WARFIELD
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $369,580
Award type: 5
Project period: 2021-07-01 → 2026-03-31