# A Unified Machine Learning Package for Cancer Diagnosis

> **NIH U01** · UNIVERSITY OF WASHINGTON · 2020 · $323,139

## Abstract

The long-term goal of this project is to develop a unified software package for sharing image
analysis and machine learning tools that improve the accuracy and efficiency of cancer
diagnosis, and thereby the quality of both cancer research and clinical practice. Because
diagnostic errors can alter treatment recommendations and cause significant patient harm, both
cancer researchers and clinical pathologists need tools that semi-automate the diagnostic
process to improve its efficiency, reliability, and accuracy. Our work will produce a set of
machine learning tools that identify regions of interest (ROIs) and classify these regions into
the full spectrum of diagnostic categories (ranging from benign, to pre-invasive/risk lesions,
to invasive cancer), culminating in a unified software package for cancer diagnosis that can be
shared among cancer researchers and clinicians. Our aims are:

- **Aim 1: Regions of Interest.** Produce (1a) an ROI finder classifier and associated tools
  that researchers or pathologists can use to automatically identify potential ROIs on whole
  slide images of breast biopsy slides, and (1b) an ROI analysis classifier and associated
  tools that point out image regions that tend to cause misdiagnosis and produce suitable
  warnings as to why such regions may either be distractors or indicate cancer.
- **Aim 2: Diagnosis.** Produce a diagnostic classifier and associated tools that not only
  suggest the potential diagnosis of a whole slide image, but also give the reasons for the
  diagnosis in terms of regions on the image, their color, their texture, and their structure.
- **Aim 3: Dissemination.** Develop a unified software package containing this suite of tools,
  so they can be easily shared and provided (standalone and through the existing PIIP
  platform) to both cancer researchers and clinical pathologists. In addition to specific
  classifiers for breast cancer research, we will provide the methodology to train related
  classifiers for other biopsy-diagnosed cancers, such as melanoma, prostate, lung, and colon
  cancer.

Our highly innovative diagnostic tools will include our state-of-the-art (2017) feature
extraction and machine learning methods, in contrast to the generic image management tools of
the current ITCR projects. Our classifiers were trained on a unique data set acquired in a
carefully designed breast cancer research study, so they are immediately useful in the breast
cancer domain, but they are designed to be easily retrained for other data sets and cancers.
The addition of our unique multi-disciplinary team and these tools to the ITCR program will be
an important aid to cancer research and will improve the process of diagnosis for clinicians.

## Key facts

- **NIH application ID:** 9997819
- **Project number:** 5U01CA231782-03
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** JOANN G ELMORE
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $323,139
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2018-09-07 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9997819

## Citation

> US National Institutes of Health, RePORTER application 9997819, A Unified Machine Learning Package for Cancer Diagnosis (5U01CA231782-03). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9997819. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
