# Integrating genomic and transcriptomic data to identify breast cancer susceptibility genes

> **NIH R01** · VANDERBILT UNIVERSITY MEDICAL CENTER · 2022 · $661,923

## Abstract

Genetic factors play an important role in the etiology of both sporadic and familial breast cancer.
Since 2007, common genetic variants in ~200 loci have been identified in genome-wide
association studies (GWAS) in relation to breast cancer risk. However, it is often difficult to
translate GWAS findings to disease prevention and treatment since causal genes in the large
majority of GWAS-identified loci are unknown. Furthermore, a large fraction of breast cancer
heritability remains unexplained. Recent studies suggest that nearly 80% of disease heritability
can be explained by genetic variants regulating gene expression. Herein, we propose three
well-powered transcriptome-wide association studies (TWAS) to systematically investigate the
association of breast cancer risk with gene expression across the transcriptome in women of
African, Asian and European descent. In Aim 1, we will perform RNA sequencing and high-density
genotyping assays using normal breast tissue samples and build race-specific gene expression
prediction models using data from 1000 women of African, Asian and European descent. These
models will be applied to the GWAS data generated from approximately 320,000 breast cancer
patients and controls to impute gene expression for association analyses of predicted gene
expression with risk of breast cancer overall and by estrogen receptor and HER2 status. In Aim
2, we will select the top 50 genes identified in Aim 1 for in vitro functional assays to assess their
influence on major cell functions related to cancer biology. In Aim 3, we will evaluate whether
TWAS-identified genes are expressed differently in normal breast tissues and breast cancer
tissues collected from women of African, Asian, and European descent, to assess whether these
genes may contribute to racial differences in breast cancer risk by molecular subtype. With
strong methodology and a large sample size, we believe that this proposed study should be
able to identify and characterize a large number of novel genes related to breast cancer risk.
Uncovering breast cancer susceptibility genes will greatly improve the understanding of the
genetic and biological basis for breast cancer and accelerate the translation of genetic findings
to disease prevention and patient care.
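
The Aim 1 workflow (fit an expression prediction model in a genotyped reference panel, impute expression into GWAS samples, then test predicted expression against case status) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated, the function names are hypothetical, and ridge regression stands in for whatever prediction model the study actually uses, which the abstract does not specify.

```python
import numpy as np

def fit_expression_model(genotypes, expression, ridge_penalty=1.0):
    """Fit ridge weights predicting one gene's expression from SNP dosages."""
    snp_means = genotypes.mean(axis=0)
    centered = genotypes - snp_means
    gram = centered.T @ centered + ridge_penalty * np.eye(genotypes.shape[1])
    weights = np.linalg.solve(gram, centered.T @ (expression - expression.mean()))
    return snp_means, weights

def impute_expression(genotypes, snp_means, weights):
    """Predict (mean-centered) expression for samples that only have genotypes."""
    return (genotypes - snp_means) @ weights

def association_t(predicted_expression, is_case):
    """t statistic for the correlation of predicted expression with case status."""
    r = np.corrcoef(predicted_expression, is_case)[0, 1]
    n = len(is_case)
    return r * np.sqrt((n - 2) / (1.0 - r**2))

# Simulated stand-ins for the reference panel and the GWAS cohort (not real data).
rng = np.random.default_rng(0)
n_snps = 10
true_effects = rng.normal(size=n_snps)

# Reference panel: genotypes plus measured expression of a single gene.
ref_genotypes = rng.binomial(2, 0.3, size=(200, n_snps)).astype(float)
ref_expression = ref_genotypes @ true_effects + rng.normal(scale=0.5, size=200)
snp_means, weights = fit_expression_model(ref_genotypes, ref_expression)

# GWAS cohort: genotypes and case/control status only, no measured expression.
gwas_genotypes = rng.binomial(2, 0.3, size=(2000, n_snps)).astype(float)
true_expression = gwas_genotypes @ true_effects
liability = 0.5 * (true_expression - true_expression.mean())
is_case = (rng.random(2000) < 1.0 / (1.0 + np.exp(-liability))).astype(float)

predicted_expression = impute_expression(gwas_genotypes, snp_means, weights)
t_stat = association_t(predicted_expression, is_case)
print(f"TWAS-style association t statistic: {t_stat:.2f}")
```

In a real analysis this test would be repeated for every gene with a usable prediction model, with models fit separately per ancestry group and significance corrected for multiple testing.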

## Key facts

- **NIH application ID:** 10440254
- **Project number:** 5R01CA235553-04
- **Recipient organization:** VANDERBILT UNIVERSITY MEDICAL CENTER
- **Principal Investigator:** Jirong Long
- **Activity code:** R01
- **Funding institute:** National Cancer Institute (NCI), NIH
- **Fiscal year:** 2022
- **Award amount:** $661,923
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2019-07-01 → 2025-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10440254

## Citation

> US National Institutes of Health, RePORTER application 10440254, Integrating genomic and transcriptomic data to identify breast cancer susceptibility genes (5R01CA235553-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10440254. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*