# Development of an integrated prognostic score using common risk variants and RNA-expression data from primary breast cancers for improved prognostication of outcomes across populations

> **NIH F32** · UNIVERSITY OF CHICAGO · 2020 · $75,930

## Abstract

My short-term goal is to integrate training in bioinformatics with a postdoctoral fellowship in clinical oncology and prior training in statistical genetics. The relevance of germline data to somatic tumor development and clinical outcomes remains poorly understood because germline and somatic testing are separated in clinical practice. Somatic expression from breast tumors offers clinical prognostication and prediction only in hormone-receptor positive patients. This limitation particularly affects women of African ancestry, who experience higher rates of hormone-receptor negative and HER2-positive disease. Somatic expression assays also do not accurately reflect outcomes in hormone-receptor positive women of African ancestry. Primary tumors from this population demonstrate more aggressive molecular features (such as more homologous recombination deficiency, greater chromosomal instability, and more TP53 mutations) relative to those from European-ancestry patients, suggesting that common germline variation may influence the trajectory of somatic cancer development and clinical outcomes. We hypothesize that integration of common germline variation and somatic tumor expression data will yield prognostic information for women across breast cancer subtypes and populations. To test this, we will:

(1) Develop an integrated prognostic score incorporating common germline breast cancer risk variants and RNA-expression data from breast cancer patients and test for prediction of clinical outcomes. Summary-PrediXcan is a computational method that takes summary-level GWAS data and a phenotype of interest as inputs, tests how expression changes in each gene affect the phenotype, and outputs gene-level association results. We will use summary-level GWAS data from the largest meta-analysis of breast cancer risk, with more than 225,000 European-ancestry cases and controls. We will refine and train this score using RNA-expression data from 1,096 patients in The Cancer Genome Atlas (TCGA) and 932 patients from METABRIC. We will test the score's power to predict clinical outcomes tracked in TCGA, including overall survival, disease-specific survival, and progression-free survival, using Cox regression analysis.

(2) Test translation across populations of an integrated germline/somatic approach to breast cancer outcome prediction using patients of African ancestry. We will derive an independent score using Summary-PrediXcan with summary-level GWAS data from a three-consortium meta-analysis of breast cancer risk that includes 6,522 African-American breast cancer patients and 7,643 controls. We will test outcome prediction using 183 African-American TCGA patients with RNA-expression and survival data. We will also pilot validation of this score in two unique cohorts of 78 Nigerian patients and 35 African-American patients with RNA-seq and survival data.

As we enter a new era in precision oncology, patients with breast cancer do not have access to prognostic stratification that p...
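
The gene-level test that the abstract attributes to Summary-PrediXcan can be illustrated with a minimal sketch. It assumes the commonly published approximation in which a gene's z-score is a weighted sum of per-SNP GWAS z-scores, with each weight scaled by the SNP's standard deviation and divided by the standard deviation of predicted expression. The function name `spredixcan_z` and all numbers are hypothetical and for illustration only. They are not taken from the grant, and this is not the project's actual pipeline or the MetaXcan software API.

```python
import math


def spredixcan_z(weights, snp_cov, gwas_z):
    """Approximate gene-level z-score from summary statistics.

    weights : expression-prediction weights w_l for the SNPs in one gene model
    snp_cov : SNP-SNP covariance matrix from a reference panel (list of lists)
    gwas_z  : per-SNP GWAS z-scores (effect estimate / standard error)

    Illustrative assumption: Z_g ~ sum_l w_l * (sd_l / sd_g) * z_l,
    where sd_g^2 = w' * Cov * w is the variance of predicted expression.
    """
    n = len(weights)
    if len(gwas_z) != n or len(snp_cov) != n:
        raise ValueError("weights, snp_cov and gwas_z must have matching sizes")

    # Variance of predicted expression: w' * Cov * w
    var_g = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression has non-positive variance")

    # Weighted sum of SNP z-scores, each scaled by that SNP's standard deviation
    numerator = sum(
        weights[i] * math.sqrt(snp_cov[i][i]) * gwas_z[i] for i in range(n)
    )
    return numerator / math.sqrt(var_g)


# Hypothetical two-SNP gene model (toy values, not real data)
toy_weights = [0.5, -0.2]
toy_cov = [[0.4, 0.1], [0.1, 0.3]]
toy_gwas_z = [3.0, -1.5]
toy_gene_z = spredixcan_z(toy_weights, toy_cov, toy_gwas_z)
print(f"gene-level z-score: {toy_gene_z:.3f}")
```

In a single-SNP model the gene-level z-score reduces to the SNP's own z-score, signed by the weight, which is a convenient sanity check for a sketch like this.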

## Key facts

- **NIH application ID:** 9911887
- **Project number:** 1F32CA247244-01
- **Recipient organization:** UNIVERSITY OF CHICAGO
- **Principal Investigator:** Padma Sheila Rajagopal
- **Activity code:** F32
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $75,930
- **Award type:** 1
- **Project period:** 2020-07-01 → 2021-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9911887

## Citation

> US National Institutes of Health, RePORTER application 9911887, Development of an integrated prognostic score using common risk variants and RNA-expression data from primary breast cancers for improved prognostication of outcomes across populations (1F32CA247244-01). Retrieved via AI Analytics 2026-06-03 from https://api.ai-analytics.org/grant/nih/9911887. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
