# Cancer-specific gene set testing

> **NIH R21** · DARTMOUTH COLLEGE · 2020 · $451,000

## Abstract

Cancer develops when pathways controlling cell survival, cell fate or genome maintenance are disrupted
by the somatic alteration of key driver genes. Understanding the mechanism and impact of pathway
disruption is therefore essential for an accurate characterization of cancer biology and identification of
therapeutic targets. A common approach for studying pathway dysregulation in cancer involves the analysis
of tumor gene expression data using gene set testing or pathway analysis techniques. Gene set testing
is an effective and widely applied hypothesis aggregation method that uses prior knowledge regarding
gene function to test a smaller number of more biologically meaningful hypotheses and thereby improve
interpretation, replication and power relative to a gene-level analysis. Although the gene set analysis
of large cancer gene expression data sets has successfully identified pathways commonly impacted in
human cancer, existing pathway analysis methods have two important limitations when applied to
cancer gene expression data. First, most existing gene set collections model the pattern of gene activity
found in normal tissues, which can differ significantly from the pattern found within tumors. Using these
gene sets to analyze cancer gene expression data can result in misleading results with the potential
for a significantly inflated type II error rate. Second, standard gene set testing methods leverage only
the gene expression data for the analyzed samples. Although there are some cancer-specific pathway
analysis methods that consider multiple omics modalities, e.g., expression and mutations, information
regarding the expression of genes in the associated normal tissue is not utilized by existing techniques.
Ignoring normal tissue gene expression can result in a cancer-focused analysis that simply recapitulates
the phenotype of the associated normal tissue rather than capturing cancer-specific activity. To address
these challenges, we will develop novel and innovative bioinformatics algorithms that 1) optimize
existing gene set collections to reflect the pattern of gene activity found in dysplastic tissue, and 2) leverage
information regarding normal tissue gene activity during gene set analysis.

## Key facts

- **NIH application ID:** 10058552
- **Project number:** 1R21CA253408-01
- **Recipient organization:** DARTMOUTH COLLEGE
- **Principal Investigator:** Hildreth Frost
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $451,000
- **Award type:** 1
- **Project period:** 2020-09-01 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10058552

## Citation

> US National Institutes of Health, RePORTER application 10058552, Cancer-specific gene set testing (1R21CA253408-01). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10058552. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
