Abstract
Data exploration, analysis, and visualization are integral to modern pharmacological research. We propose a cloud-based learning module focused on best practices in these areas, with an emphasis on creating reproducible workflows and understanding statistical principles relevant to large datasets. This module will support research training under the parent T32 ‘Pharmacological Sciences Training Grant’ by enabling trainees to use cloud computing effectively and to establish reproducible data analysis workflows with JupyterLab Notebooks and GitHub. It will also facilitate the application of contemporary software tools for data analysis and visualization, and provide a solid understanding of the key statistical principles involved. In summary, this module will offer a comprehensive educational experience in data science and statistical principles, tailored specifically for large genomic datasets. Its reach extends beyond our T32 trainees to biomedical graduate students from various disciplines and to an indeterminate number of online learners, broadening its impact and accessibility in the field of pharmacological research.