# Practical Data-Centric AI/ML for Biomedical Researchers

> **NIH T32** · UNIVERSITY OF DELAWARE · 2024 · $80,983

## Abstract

The biomedical research landscape is shifting towards data-driven discovery, fueled by massive datasets and
powerful Artificial Intelligence (AI)/Machine Learning (ML) technologies. However, many researchers lack the
necessary skills to harness this potential. This project proposes a cloud-based NIGMS Sandbox learning module,
"Practical Data-Centric AI/ML for Biomedical Researchers," to bridge this gap and empower researchers with
the expertise needed to unlock the transformative power of AI/ML. The module is based on the summer 2022
training workshop sponsored by the Chemistry-Biology Interface T32 Program and Center for Bioinformatics and
Computational Biology at the University of Delaware with funding support from the National Institute of General
Medical Sciences (T32GM133395-03S1). The teaching materials of that workshop were subsequently
incorporated into the Introduction to Data Science course offered by the Bioinformatics Data Science program
at the University of Delaware. The proposed module will cover essential data-centric AI/ML topics like data
preparation, feature engineering, model building and interpretation, with hands-on exercises using real-world
datasets and popular tools like Pandas, Scikit-learn, TensorFlow, and PyTorch. The module will be delivered
using Google Cloud Platform (GCP) for accessibility and efficient data analysis, removing technological barriers
and empowering researchers without significant institutional resources. The module emphasizes immediate
application of learned knowledge through capstone projects tackling real-world problems, and integrates FAIR
data principles and responsible AI development methods to ensure ethical and sustainable research practices.
The module will employ engaging videos, interactive demos, quiz-based knowledge assessments, and hands-on
project challenges to cater to various learning preferences. The proposed module will break down barriers to
cutting-edge technology and knowledge, fostering a more inclusive and collaborative research landscape,
regardless of researchers’ institutional resources. It will expand the skilled workforce by contributing to a future
where AI/ML expertise is readily available for driving biomedical advancements. The module will also improve
research quality by focusing on practical skills and FAIR principles to ensure responsible and reproducible
research. The proposed learning module will be disseminated via NIGMS Sandbox to have broad reach and
accessibility and will be integrated into the curriculum of the University of Delaware to enrich
undergraduate/graduate bioinformatics and data science education. In addition, the module can be adapted
for tailored training workshops for researchers new to AI/ML and for online MOOCs for individual professional
development, and it can foster knowledge sharing and peer support for new researchers.

## Key facts

- **NIH application ID:** 11037154
- **Project number:** 3T32GM142603-03S1
- **Recipient organization:** UNIVERSITY OF DELAWARE
- **Principal Investigator:** Shawn W Polson
- **Activity code:** T32
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $80,983
- **Award type:** 3
- **Project period:** 2022-07-01 → 2027-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/11037154

## Citation

> US National Institutes of Health, RePORTER application 11037154, Practical Data-Centric AI/ML for Biomedical Researchers (3T32GM142603-03S1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/11037154. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
