Cloud strategies for improving cost, scalability, and accessibility of a machine learning system for pathology images

NIH RePORTER · NIH · R01 · $347,144

Abstract

PROJECT SUMMARY

Machine learning (ML) has seen tremendous advances in the past decade, fueled by growth in computing power and the availability of large labeled datasets. While the impact of these advances on clinical and biomedical research is potentially significant, these domains face unique challenges because labels are difficult to acquire from experts. This proposal will develop new methodology and open-source software that biomedical data scientists can use in their applications to (1) improve data labeling by identifying the samples that provide the most benefit for training ML algorithms; (2) improve generalization of ML models across institutions; and (3) perform this work on Amazon Web Services. These methods and software will be developed in digital pathology applications using multi-institutional datasets.

This supplemental funding will enhance cloud support beyond the original proposal, leveraging recent advances in inference server technology that can accelerate our software by orders of magnitude, providing significant savings to users. It will support the additional implementation required to incorporate this technology into our software and will help us benchmark the cost-to-benefit ratio of different compute and storage asset classes on Amazon Web Services, Microsoft Azure, and Google Cloud Platform. The deliverables from this work include enhanced software that supports inference serving on multiple cloud service providers, along with best practices and recommendations for our users.

Key facts

NIH application ID: 10824959
Project number: 3R01LM013523-03S1
Recipient: NORTHWESTERN UNIVERSITY
Principal Investigator: Lee Cooper
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $347,144
Award type: 3
Project period: 2021-09-01 → 2025-05-31