Practical Data-Centric AI/ML for Biomedical Researchers

NIH RePORTER · NIH · T32 · $80,983

Abstract

PROJECT SUMMARY The biomedical research landscape is shifting towards data-driven discovery, fueled by massive datasets and powerful Artificial Intelligence (AI)/Machine Learning (ML) technologies. However, many researchers lack the skills needed to harness this potential. This project proposes a cloud-based NIGMS Sandbox learning module, "Practical Data-Centric AI/ML for Biomedical Researchers," to bridge this gap and give researchers the expertise needed to apply AI/ML in their work. The module is based on the summer 2022 training workshop sponsored by the Chemistry-Biology Interface T32 Program and the Center for Bioinformatics and Computational Biology at the University of Delaware, with funding support from the National Institute of General Medical Sciences (T32GM133395-03S1). The teaching materials from that workshop were subsequently incorporated into the Introduction to Data Science course offered by the Bioinformatics Data Science program at the University of Delaware. The proposed module will cover essential data-centric AI/ML topics such as data preparation, feature engineering, and model building and interpretation, with hands-on exercises using real-world datasets and popular tools such as Pandas, Scikit-learn, TensorFlow, and PyTorch. The module will be delivered on Google Cloud Platform (GCP) for accessibility and efficient data analysis, removing technological barriers for researchers without significant institutional resources. The module emphasizes immediate application of learned knowledge through capstone projects that tackle real-world problems, and it integrates FAIR data principles and responsible AI development methods to ensure ethical and sustainable research practices. The module will employ engaging videos, interactive demos, quiz-based knowledge assessments, and hands-on project challenges to cater to various learning preferences.
The proposed module will break down barriers to cutting-edge technology and knowledge, fostering a more inclusive and collaborative research landscape, regardless of researchers’ institutional resources. It will expand the skilled workforce, contributing to a future where AI/ML expertise is readily available for driving biomedical advancements. The module will also improve research quality by focusing on practical skills and FAIR principles to ensure responsible and reproducible research. The proposed learning module will be disseminated via the NIGMS Sandbox for broad reach and accessibility and will be integrated into the curriculum of the University of Delaware to enrich undergraduate and graduate bioinformatics and data science education. In addition, the module can be adapted for tailored training workshops for researchers new to AI/ML and for MOOCs supporting individual professional development, and it can foster knowledge sharing and peer support among new researchers.

Key facts

NIH application ID
11037154
Project number
3T32GM142603-03S1
Recipient
UNIVERSITY OF DELAWARE
Principal Investigator
Shawn W Polson
Activity code
T32
Funding institute
NIH
Fiscal year
2024
Award amount
$80,983
Award type
3
Project period
2022-07-01 → 2027-06-30