# Genome-wide structural organization of proteins within human gene regulatory complexes

> **NIH R01** · PENNSYLVANIA STATE UNIVERSITY, THE · 2020 · $66,712

## Abstract

The DNA sequence of the human genome specifies the composition of the proteins that make up
healthy cells, as well as the altered compositions that give rise to diseased cells. How protein production is
controlled through regulation of the genes that encode those proteins is critically important in both healthy and diseased cells.
Knowing precisely where gene regulatory proteins bind and how they are organized throughout the genome,
including their interactions with each other, shows how genes are regulated and mis-regulated. Because there
are potentially thousands of different kinds of regulatory proteins, and thousands of different human
cell types and environmental responses that result from various subsets of those proteins, the
entire “universe” of gene regulatory events is very large and consequently costly to identify. One
of the main bottlenecks in the analysis of genomic data is the lack of efficient and scalable visualization approaches. The
PEGR open source platform will provide programmatic access to any number of sequencing
datasets from human cells, at any stage of next-generation sequencing (NGS) processing, with pipeline analysis results available for high-throughput
machine learning testing and development. This project will empower discovery through the automated
analysis and visualization of results from both small- and large-scale datasets. This architecture will include
the following features: 1) a secure, cloud-based, metadata management system that instills best practices of
experimental rigor, reproducibility, and data sharing; 2) automated Galaxy-based epigenomic data
processing pipelines that provide easy-to-use “wizards” for standardized processing of common epigenomic
data types; and 3) an easily deployable, open source software package as a means to disseminate data, tools,
and discoveries via cloud services.

## Key facts

- **NIH application ID:** 10166093
- **Project number:** 3R01GM125722-03S1
- **Recipient organization:** PENNSYLVANIA STATE UNIVERSITY, THE
- **Principal Investigator:** Shaun Aengus Mahony
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $66,712
- **Award type:** 3
- **Project period:** 2018-01-19 → 2021-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10166093

## Citation

> US National Institutes of Health, RePORTER application 10166093, Genome-wide structural organization of proteins within human gene regulatory complexes (3R01GM125722-03S1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10166093. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
