Cloud strategies for improving cost, scalability, and accessibility of a machine learning system for pathology images

NIH RePORTER · NIH · R01 · $347,144

Abstract

PROJECT SUMMARY

Machine learning (ML) has seen tremendous advances in the past decade, fueled by growth in computing power and the availability of large labeled datasets. While the impact of these advances on clinical and biomedical research is potentially significant, these domains face unique challenges because labels are difficult to acquire from experts. This proposal will develop new methodology and open-source software that biomedical data scientists can use in their applications to (1) improve data labeling by identifying the samples whose labels provide the most benefit for training ML algorithms; (2) improve generalization of ML models across institutions; and (3) perform this work on Amazon Web Services. These methods and software will be developed in digital pathology applications using multi-institutional datasets.

This supplemental funding will enhance cloud support beyond the original proposal, leveraging recent advances in inference server technology that can accelerate our software by orders of magnitude, providing significant savings to users. It will support the additional implementation required to incorporate this technology into our software and will help us benchmark the cost-benefit ratio of different compute and storage asset classes on Amazon Web Services, Microsoft Azure, and Google Cloud Platform. The deliverables from this work include enhanced software that supports inference serving on multiple cloud service providers, along with best practices and recommendations for our users.

Key facts

NIH application ID: 10824959
Project number: 3R01LM013523-03S1
Recipient: NORTHWESTERN UNIVERSITY
Principal Investigator: Lee Cooper
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $347,144
Award type: 3
Project period: 2021-09-01 → 2025-05-31