# Developing Informatics Technologies to Model Cancer Gene Regulation

> **NIH U24** · DANA-FARBER CANCER INST · 2022 · $647,787

## Abstract

How trans-acting factors regulate genome-wide gene expression in cancer is poorly understood, motivating an
increasing number of ChIP-seq, DNase-seq, and ATAC-seq (simplified as “cistrome”) experiments to map
genome-wide transcription factor binding sites and chromatin status. Significant biological insights have been
gained through the computational analysis of cistrome data, especially when integrated with other published
cistrome and gene expression data sets. However, most cancer biologists find computational data analysis and
integration of cistrome and epigenome data to be the single most limiting bottleneck in their cancer gene
regulation studies due to the lack of informatics expertise and computational infrastructure relative to the
extraordinary volume of publicly available data. We have previously developed Cistrome Analysis Pipeline (AP)
and Cistrome Data Browser (DB) to overcome this challenge. The objective of this proposal is to expand
the functionality of Cistrome AP and DB to improve the collection, management, analysis, integration,
visualization, and dissemination of cistrome and related data types. A flexible and intuitive user
experience will empower experimental cancer biologists to create more insightful models of transcriptional and
epigenetic gene regulation in cancer research.
Specifically, we propose to improve and extend our existing Cistrome Analysis Pipeline (http://cistrome.org/ap)
and Cistrome Data Browser (http://cistrome.org/db) infrastructure and interface by developing informatics
technologies that address four critical aspects of cistrome data analysis. First, we will design, develop, and
deploy software through a user-friendly interface to improve automated data collection, processing, and
annotation. This will enable unpublished and public cistrome data to be jointly analyzed and converted into
formats and statistical expressions that can be used for integrative analysis. Second, we will develop methods
to use all available cistrome data to impute TF binding/cell-type combinations that are not represented in public
repositories. Third, we will develop systems to allow gene expression data to be integrated with cistrome data
to elucidate regulatory mechanisms. Fourth, we will develop interactive web-based tools for the visualization of
hundreds of cistrome samples at multiple resolutions. Finally, we will engage in outreach activities to improve
Cistrome functions, user interface, and interoperability with other tools, and to promote the use of Cistrome data
in cancer research.

## Key facts

- **NIH application ID:** 10359846
- **Project number:** 5U24CA237617-04
- **Recipient organization:** DANA-FARBER CANCER INST
- **Principal Investigator:** Clifford Meyer
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $647,787
- **Award type:** 5
- **Project period:** 2019-03-15 → 2024-02-29

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10359846

## Citation

> US National Institutes of Health, RePORTER application 10359846, Developing Informatics Technologies to Model Cancer Gene Regulation (5U24CA237617-04). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10359846. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
