A cloud-based learning module to analyze ATAC-seq and single cell ATAC-seq data

NIH RePORTER · NIH · P20 · $106,732

Abstract

The recent surge in genomic datasets can be attributed to advances in genomic methodologies and the availability of user-friendly kits, which enable genome-wide research across a diverse array of scientific disciplines. These datasets harbor invaluable information and support groundbreaking conclusions. Among these methodologies, ChIP-seq, which measures the chromatin occupancy of proteins, has been a staple for many researchers. CUT&RUN and CUT&Tag are two newer methods for measuring protein occupancy, and each requires its own bioinformatic considerations. These technologies have transformed our understanding of transcription factor (TF)-mediated gene regulation and the responsiveness of chromatin modifications. Despite the recognized value of these data, analysis often becomes a bottleneck because it requires bioinformatic expertise and specific computational resources. We will address these challenges and help researchers integrate chromatin occupancy data with the chromatin accessibility (ATAC-seq) and gene expression (RNA-seq) analyses taught in other NIH/NIGMS Sandbox modules. To that end, we propose developing cloud-based training focused on Chromatin Occupancy by Next-Generation Sequencing, including ChIP-seq, CUT&RUN, and CUT&Tag. With this training integrated with the other Sandbox modules (ATAC-seq and RNA-seq), users will learn how to perform basic processing steps as well as critical downstream analyses. Our approach will use interactive lessons in Jupyter notebooks that follow an example analysis of p63 and H3K27ac in relation to the BAF complex. This module targets a broad audience, including students, postdocs, INBRE scholars, and researchers with minimal or no programming background. The project leaders, who are experienced in teaching and using these methods, are uniquely qualified to develop this training module. Through this initiative, we aim to democratize access to bioinformatics training and facilitate genome-wide analysis of chromatin occupancy.

Key facts

NIH application ID
11037534
Project number
3P20GM103427-23S1
Recipient
UNIVERSITY OF NEBRASKA MEDICAL CENTER
Principal Investigator
PAUL L SORGEN
Activity code
P20
Funding institute
NIH
Fiscal year
2024
Award amount
$106,732
Award type
3
Project period
2001-09-30 → 2026-04-30