# Transcriptome-wide association study to identify susceptibility genes for colorectal cancer

> **NIH R37** · VANDERBILT UNIVERSITY MEDICAL CENTER · 2022 · $614,645

## Abstract

PROJECT SUMMARY
Genetic factors play an important role in the etiology of colorectal cancer (CRC). To date, approximately 50
genetic loci have been identified for CRC through genome-wide association studies (GWAS). However, these
loci explain only a small fraction of heritability. Moreover, target genes and underlying mechanisms for most of
these risk loci remain unclear. The large majority of risk variants are noncoding, and many have been shown to
regulate gene expression. Recent studies suggest that ~80% of disease heritability can be explained by
regulatory variants. However, these variants are each associated with only a small alteration in disease risk;
thus they are difficult to identify using GWAS. Recently, a novel approach, the transcriptome-wide association
study (TWAS), was developed to systematically investigate the transcriptome's association with disease risk.
In TWAS, models are built to predict gene expression with cis-SNPs using a reference transcriptome, and then
applied to GWAS data to evaluate their associations with disease risk. Here, we propose to use this innovative
approach to scan the whole transcriptome to discover novel CRC susceptibility genes and uncover likely
causal genes in loci revealed in previous GWAS. In Aim 1, we will conduct a TWAS in European descendants.
We will build expression prediction models for coding genes and non-coding RNAs in hundreds of colorectal
tissue samples, in multiple other tissues, and across tissues using transcriptome and high-density genotyping data from
individuals of European ancestry in the Genotype-Tissue Expression (GTEx) project. The models will be used
to predict gene expression levels using GWAS data from approximately 27,911 CRC cases and 23,059
controls included in the ColoRectal Transdisciplinary Study (CORECT) and the Genetics and Epidemiology of
Colorectal Cancer (GECCO) consortia, and then to evaluate their associations with CRC risk. In Aim 2, we will
conduct a TWAS in East-Asian descendants. We will generate transcriptome data and high-density genotyping
data from 400 CRC patients of Asian ancestry from the Asia Colorectal Cancer Consortium (ACCC). We will
use these data to build expression prediction models for coding genes and non-coding RNAs and perform a
TWAS in approximately 18,999 CRC cases and 31,269 controls from the ACCC. In Aim 3, we will
experimentally evaluate the biological function of the top 30 genes identified in Aims 1 and 2. Based on the association
direction between their expression levels and CRC risk, we will either suppress expression using CRISPRi or
promote it using CRISPRa in multiple normal colon epithelial and CRC cell lines. We will then perform in vitro
assays and analyze bioinformatics evidence to examine the biological functions of these selected genes and to
assess their potential roles in regulating known cancer-related pathways. Our proposed study is extremely
cost-efficient, as both the transcriptome dataset (GTEx) for European descendants and the G...
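
The two-stage TWAS logic described in the abstract (train an expression prediction model on cis-SNPs in a reference panel, then apply it to GWAS genotypes and test the predicted expression against disease status) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated, the function names are hypothetical, ridge regression stands in for whatever prediction model the project actually uses, and a simple correlation-based t-statistic stands in for a full case-control regression with covariates.

```python
import numpy as np


def fit_expression_model(genotypes, expression, ridge_penalty=1.0):
    """Fit ridge-regression weights predicting expression from cis-SNP dosages."""
    X = genotypes - genotypes.mean(axis=0)   # center each SNP column
    y = expression - expression.mean()       # center the expression values
    n_snps = X.shape[1]
    # Closed-form ridge solution: (X'X + lambda * I)^-1 X'y
    return np.linalg.solve(X.T @ X + ridge_penalty * np.eye(n_snps), X.T @ y)


def predict_expression(genotypes, weights):
    """Apply trained weights to (centered) genotypes from another cohort."""
    return (genotypes - genotypes.mean(axis=0)) @ weights


def association_t_stat(predicted, phenotype):
    """t-statistic for the correlation between predicted expression and phenotype."""
    r = np.corrcoef(predicted, phenotype)[0, 1]
    n = len(phenotype)
    return r * np.sqrt((n - 2) / (1 - r**2))


# --- Simulated data (illustrative sizes, not the study's) ---
rng = np.random.default_rng(0)
n_ref, n_gwas, n_snps = 300, 5000, 10
true_weights = rng.normal(0, 0.5, n_snps)

# Stage 1: reference panel with both genotypes and measured expression
ref_geno = rng.binomial(2, 0.3, (n_ref, n_snps)).astype(float)
ref_expr = ref_geno @ true_weights + rng.normal(0, 1, n_ref)
weights = fit_expression_model(ref_geno, ref_expr)

# Stage 2: GWAS cohort with genotypes and case/control status only
gwas_geno = rng.binomial(2, 0.3, (n_gwas, n_snps)).astype(float)
gwas_expr = gwas_geno @ true_weights                  # unobserved in practice
liability = 0.3 * gwas_expr + rng.normal(0, 1, n_gwas)
is_case = (liability > np.median(liability)).astype(float)

predicted = predict_expression(gwas_geno, weights)
t_stat = association_t_stat(predicted, is_case)
print(f"TWAS t-statistic for the simulated gene: {t_stat:.2f}")
```

In a real analysis this procedure would be repeated for every gene with a usable prediction model, and the resulting statistics would be corrected for multiple testing. The sign of the statistic corresponds to the association direction that the abstract uses to choose between CRISPRi and CRISPRa.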

## Key facts

- **NIH application ID:** 10620374
- **Project number:** 4R37CA227130-05
- **Recipient organization:** VANDERBILT UNIVERSITY MEDICAL CENTER
- **Principal Investigator:** Xingyi Guo
- **Activity code:** R37
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $614,645
- **Award type:** 4N
- **Project period:** 2018-09-04 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10620374

## Citation

> US National Institutes of Health, RePORTER application 10620374, Transcriptome-wide association study to identify susceptibility genes for colorectal cancer (4R37CA227130-05). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10620374. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
