# Big Data Training for Cancer Research

> **NIH R25** · UNIVERSITY OF CALIFORNIA-IRVINE · 2023 · $108,000

## Abstract

PROJECT SUMMARY

Following the NIH Big Data to Knowledge (BD2K) initiative, we have been continuously funded by NCI to create
a two-week summer training program for cancer researchers who are novices in big data analysis. During the
past seven years, we have successfully organized both in-person and online hands-on training opportunities for
traditionally trained biomedical and cancer researchers. Our current workshop uses applications involving cancer
data to teach valuable data science and bioinformatics approaches. However, we strongly believe that these
data science skills are somewhat general. With this funding opportunity, we are excited to extend our workshop
materials with an additional module utilizing data familiar to infectious and immune-mediated disease (IID)
researchers and relevant approaches such as scripting in R, exploratory analysis, data wrangling, and
visualization of longitudinal data. The proposed supplement is directly responsive to NOT-AI-23-010 and will
enable IID researchers to more confidently explore existing IID data, set up their own analysis plans, and
communicate within research teams.

Our proposed supplement course has three goals: (1) Develop two new IID-related case studies for teaching
purposes, both relevant to IID researchers and a more general audience of biomedical researchers; (2) Create
publicly accessible, reusable online materials for IID and cancer research communities that provide instruction
in exploratory data analysis, data wrangling, quality control, and computer programming; and (3) Add these new
case studies and materials to supplement the currently funded workshop on big cancer data as a new hybrid
(in-person and online) pre-module. Like our original R25 workshop, this course will target graduate students,
postdoctoral trainees, physician-scientists, and biomedical scientists, with strong IID backgrounds yet limited
advanced coursework in statistics, bioinformatics, and computer science.

We plan to offer this new module as an addition to our current course rather than as a separate course. Even a
brief scan of the current literature will provide evidence of links between cancer, immunology (including COVID), and the
microbiome. We expect additional benefits for participants based on interdisciplinary interactions and knowledge
they will obtain through participation in the combined course. Finally, many participants in past courses state that
having dedicated time to interact with faculty and other participants and to explore topics of interest independently
gave them the confidence to ask and answer questions—in essence to be self-directed learners.

## Key facts

- **NIH application ID:** 10785775
- **Project number:** 3R25CA233429-05S1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA-IRVINE
- **Principal Investigator:** MIN ZHANG
- **Activity code:** R25
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $108,000
- **Award type:** 3
- **Project period:** 2023-09-01 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10785775

## Citation

> US National Institutes of Health, RePORTER application 10785775, Big Data Training for Cancer Research (3R25CA233429-05S1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10785775. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
