Guiding humans to create better labeled datasets for machine learning in biomedical research

NIH RePORTER · NIH · R01 · $332,168

Abstract

The whole-slide images used in digital and computational pathology are stored in a tiled pyramidal format to support smooth visualization. While a number of tools can read these images, there is a lack of adequate tools for easy and fast conversion of data into these tiled pyramidal formats. This limits the ability of investigators who generate image analyses or other visualizations to view their results using pathology software tools. The result is a disconnect between pathology software tools and general-purpose software tools for data and image analysis, such as NumPy. In this proposal we will create optimized and easy-to-use open-source programming interfaces that allow generation of tiled pyramidal images from a variety of popular array and vector data formats. This will allow users to create arbitrarily large tiled pyramidal images from NumPy, Zarr, and Dask arrays, and from vector formats like Scalable Vector Graphics. First, we will generate and document a modular and general-purpose tiling interface for use in Python. Second, we will implement support for the most popular input and output formats. Third, we will focus on software engineering to ensure that the software is maintainable and extensible by the research community. This includes documenting the code and providing usage examples, implementing testing and code review, and packaging the software for package managers and cloud readiness. Altogether, this will allow investigators to better visualize the results of their analyses, and will better integrate the now disconnected domains of digital pathology software and general-purpose scientific and data analysis software.
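To make the tiled pyramidal layout concrete, the following minimal sketch computes the geometry such a format implies: a stack of progressively halved resolution levels, each cut into fixed-size tiles. It is an illustration only and not the interface proposed in this project. The function names `pyramid_levels` and `tile_bounds`, the 256-pixel default tile size, and the choice to halve each level until it fits in a single tile are all assumptions made for this example.

```python
def _ceil_div(a, b):
    """Integer ceiling division, safe for arbitrarily large image sizes."""
    return -(-a // b)


def pyramid_levels(width, height, tile_size=256):
    """Return (width, height, tiles_x, tiles_y) for each level.

    Level 0 is full resolution. Each later level halves the previous
    one (rounding up) until the whole image fits inside one tile.
    The halving rule and stopping condition are assumptions for this sketch.
    """
    levels = []
    w, h = width, height
    while True:
        levels.append((w, h, _ceil_div(w, tile_size), _ceil_div(h, tile_size)))
        if w <= tile_size and h <= tile_size:
            break
        w, h = _ceil_div(w, 2), _ceil_div(h, 2)
    return levels


def tile_bounds(width, height, tile_size=256):
    """Yield (x0, y0, x1, y1) pixel bounds of each tile in row-major order.

    Tiles on the right and bottom edges are clipped to the image extent.
    """
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            yield (x0, y0, min(x0 + tile_size, width), min(y0 + tile_size, height))


if __name__ == "__main__":
    for level, (w, h, tx, ty) in enumerate(pyramid_levels(1000, 600)):
        print(f"level {level}: {w}x{h} pixels, {tx}x{ty} tiles")
```

With bounds like these, an array-like source (for example a NumPy, Zarr, or Dask array) could be sliced as `image[y0:y1, x0:x1]` one tile at a time, so that an arbitrarily large image never needs to be held in memory at once.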

Key facts

NIH application ID: 10609284
Project number: 3R01LM013523-02S1
Recipient: NORTHWESTERN UNIVERSITY
Principal Investigator: Lee Cooper
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $332,168
Award type: 3
Project period: 2021-09-01 → 2025-05-31