# Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis

> **NIH U24** · BROAD INSTITUTE, INC. · 2020 · $941,937

## Abstract

The Cancer Genome Atlas (TCGA) set the standards for large-scale cancer genome
projects worldwide. In the next phase, the National Cancer Institute and its Center for
Cancer Genomics are planning large-scale projects closely tied to clinical questions and
trials. In order to perform the analysis of these data, the NCI is creating a Genome Data
Analysis Network (GDAN) of different types of Genome Data Analysis Centers (GDACs).
Central to this Network is a single Processing GDAC, which will take all the harmonized
data, as stored in the NCI's Genomic Data Commons, and perform higher-level integrated
analyses on these data to support both the Analysis Working Groups (AWGs) within the
Network (which will be formed for each project to perform special analyses of the data and
write manuscripts) as well as the entire biomedical research community.
Herein we propose to build the centralized Processing GDAC on top of our FireCloud
platform, an infrastructure to run large-scale computation on the cloud in a fully rigorous
and reproducible fashion. FireCloud development was based on our experience with
Firehose, the Broad internal platform on which the standard TCGA data and analyses
currently run. We propose to create and operate the GDAN Standard Workflow,
incorporating tools actively developed and used within the GDAN and across the entire
field, with particular emphasis on clinical tools. This Workflow will serve as the starting
point for AWGs and set the highest standards of transparency, reproducibility and rigor for
cancer genome analysis. The results of the Standard Workflow will be stored in a public
database, made accessible via standard APIs, and used together with a continuously
updated database of prior knowledge to create scientific reports that will be made available
to the community before publication. Finally, a major innovation is that AWG
members will be able to log in to FireCloud and rerun the entire workflow, or parts of it,
with their own parameters and subsets of the data – thus making the entire GDAN analysis
fully reproducible and scalable.
Our goals are therefore to: (1) create a global infrastructure for collaborative extreme-scale
cancer analysis; (2) operate the Standard Workflows at scale; (3) rapidly and
continuously evolve the Standard Workflows; and (4) create improved capabilities for
reporting, exploring the results, clinical diagnostics, and reproducibility.

## Key facts

- **NIH application ID:** 10011769
- **Project number:** 5U24CA210999-05
- **Recipient organization:** BROAD INSTITUTE, INC.
- **Principal Investigator:** GAD A GETZ
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $941,937
- **Award type:** 5
- **Project period:** 2016-09-20 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10011769

## Citation

> US National Institutes of Health, RePORTER application 10011769, Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis (5U24CA210999-05). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10011769. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
