# Integration of Genomic and Clinical Data to Enhance Subtyping of Colon Cancer

> **NIH R01** · MAYO CLINIC ROCHESTER · 2020 · $413,430

## Abstract

Colon cancer (CC) is a clinically and molecularly heterogeneous disease. Although TCGA data have implicated
numerous molecular aberrations in cancer etiology and mechanisms, a direct link between genomic events
and patient outcomes is lacking. The TNM (tumor, node, metastasis) staging system is widely used
and provides prognostic information, yet CCs show considerable stage-independent variability in outcome,
indicating that more robust classifiers are needed for prognostic stratification. Prognostic information is critical
to guide patient management and surveillance after cancer resection and can inform treatment selection.
Using only gene expression data, we identified four consensus molecular subtypes (CMS) of CC with distinct
prognoses. We hypothesize that inclusion of additional genomic features will enable more granular molecular
subtyping by identifying additional molecular patterns. Toward this objective (Aim 1), we will utilize multi-omics
data sets generated from two completed phase III adjuvant chemotherapy trials in CC (NCCTG N0147,
NSABP C-08). We will also develop a supervised prognostic model by integrating comprehensive molecular
data with clinicopathological variables and outcome data (Aim 2). Our unique resource for supervised learning
is the high-quality survival data from the clinical trial cohorts. We hypothesize that integration of genomic
alterations within clinically relevant genes and gene expression levels with clinicopathological variables can
improve the prediction of recurrence/survival compared to traditional TNM staging alone. To optimize predictive
performance, we will add features to our training models in a step-wise fashion: expression of selected genes
and miRNAs, somatic mutations, minor allele frequencies, and somatic copy number alterations, as well as CMS
and clinical features. Given that immune and stromal infiltrating cells are well recognized as determinants of
prognosis in CC, we propose to characterize tumor immune and stromal markers among distinct CC molecular
subtypes and determine their contribution to prognosis (Aim 3). Specifically, we will characterize these
transcriptomic markers computationally, and determine whether they can refine molecular subtypes and
improve prognostic modeling. Our proposal represents the first comprehensive prediction of CC patient
survival using features from both genomic and transcriptomic alterations that will be integrated with immune
and stromal markers using state-of-the-art supervised learning approaches. The impact of this work is
substantial in that it will identify determinants of recurrence at the molecular pathway level or in the tumor
microenvironment, which will help prioritize targets for therapeutic intervention. Furthermore, the outcome of
this grant is expected to have practice-changing implications that can further advance the field of precision
oncology.

## Key facts

- **NIH application ID:** 9842277
- **Project number:** 5R01CA210509-04
- **Recipient organization:** MAYO CLINIC ROCHESTER
- **Principal Investigator:** Frank A. Sinicrope
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $413,430
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2017-01-18 → 2021-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9842277

## Citation

> US National Institutes of Health, RePORTER application 9842277, Integration of Genomic and Clinical Data to Enhance Subtyping of Colon Cancer (5R01CA210509-04). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9842277. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
