Transcriptome-wide association studies and genetic risk prediction for breast cancer integrating RNA splicing and gene expression from multiple tissues

NIH RePORTER · NIH · R01 · $361,829

Abstract

Breast cancer is the most common cancer in women in the United States and worldwide. Although genome-wide association studies have identified multiple loci for breast cancer, most of the heritability remains unexplained. To date, transcriptome-wide association studies (TWAS) have been performed to quantify associations of genetically predicted gene expression with breast cancer risk. Our recent work showed that genetic variants that affect RNA splicing are important contributors to complex traits but were previously missed when only the genetic effects on gene expression were considered. Therefore, evaluating associations of genetically predicted splicing (as a linear combination of SNPs) with phenotypes holds great promise for discovering novel candidate disease genes. Splicing events in local regions (such as intron excision clusters) can be highly correlated. However, existing statistical methods for TWAS do not account for correlation among splicing events and thus may lose power to detect disease genes. Additionally, splicing levels (quantified as relative count ratios) in a gene and the overall gene expression level have not been considered together in previous gene mapping methods. For breast cancer prevention, stratifying women by their risk of developing the cancer could improve risk reduction and screening strategies by targeting those most likely to benefit. SNP-based polygenic risk scores have been developed to predict breast cancer, but their prediction accuracy remains low. To increase prediction accuracy, there is a need to incorporate information from genetically predicted expression and splicing. Recently, several transcriptome studies, such as GTEx, have collected DNA and RNA from multiple tissue samples; integrating information across multiple tissues into TWAS could significantly improve the identification of disease genes.
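The two quantities named in the paragraph above, splicing levels as relative count ratios and genetically predicted splicing as a linear combination of SNPs, can be illustrated with a minimal sketch. The function names, read counts, SNP weights, and genotype dosages below are hypothetical toy values chosen for illustration; they are not taken from the project, and the sketch does not represent the statistical methods the project proposes.

```python
def excision_ratios(counts):
    """Turn read counts for the introns of one cluster into relative ratios."""
    total = sum(counts)
    if total == 0:
        # No reads in the cluster: return zeros rather than dividing by zero.
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def predicted_splicing(dosages, weights):
    """Genetically predicted splicing: a weighted sum of SNP dosages (0, 1, or 2)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * g for w, g in zip(weights, dosages))


# Hypothetical read counts for three introns in one excision cluster.
cluster_counts = [30, 10, 60]
ratios = excision_ratios(cluster_counts)

# Hypothetical SNP weights, as if learned in a reference panel with both
# genotype and RNA data, applied to three individuals' genotype dosages.
snp_weights = [0.5, -0.25, 0.1]
genotypes = [[0, 1, 2], [2, 0, 1], [1, 1, 0]]
predicted = [predicted_splicing(g, snp_weights) for g in genotypes]

print(ratios)
print(predicted)
```

In a TWAS, predicted values of this kind would then be tested for association with the phenotype; the ratios within a cluster sum to one, which is one reason splicing events in the same cluster are correlated.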
In addition, African Americans (AAs) have different linkage disequilibrium (LD) patterns from Europeans, so genetic variants that affect RNA splicing and disease phenotypes could be ethnicity-specific. The objective of this study is to develop effective methods for gene mapping and genetic risk prediction of complex traits such as breast cancer by integrating multi-omics data from multiple tissues. Specifically, we will 1) develop methods for TWAS that leverage information on RNA splicing and expression from multiple tissues and apply the methods to identify novel breast cancer susceptibility genes; 2) develop joint polygenic risk prediction scores for breast cancer that model different LD patterns in distinct populations (including AAs) and incorporate information on genetically predicted splicing and gene expression from multiple tissues. We will account for correlation among splicing events in local regions and across multiple tissues. We expect that the proposed methods have higher power in gene mapping or higher accuracy in prediction of breast cancer than exi...
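As a rough illustration of the second aim, a joint risk score could add terms for genetically predicted expression and splicing to a conventional SNP-based score. The additive form, the function names, and every number below are assumptions made for this sketch; the abstract does not specify how the project combines these components or how it models LD differences between populations.

```python
def weighted_sum(values, weights):
    """Sum of element-wise products; both inputs must have the same length."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def joint_risk_score(dosages, snp_effects, predicted_features, feature_effects):
    """Hypothetical additive score: SNP component plus predicted-feature component."""
    snp_part = weighted_sum(dosages, snp_effects)
    feature_part = weighted_sum(predicted_features, feature_effects)
    return snp_part + feature_part


# Toy inputs for one individual (all values hypothetical).
dosages = [1, 0, 2]                # genotype dosages at three SNPs
snp_effects = [0.5, 0.25, 0.25]    # per-SNP effect sizes
predicted_features = [0.5, -1.0]   # e.g. one predicted expression, one predicted splicing value
feature_effects = [1.0, 0.25]      # per-feature effect sizes

score = joint_risk_score(dosages, snp_effects, predicted_features, feature_effects)
print(score)
```

Population-specific LD could enter such a sketch through separate effect-size vectors per population, but that is likewise an assumption rather than something the abstract states.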

Key facts

NIH application ID: 10456122
Project number: 5R01CA242929-04
Recipient: UNIVERSITY OF CHICAGO
Principal Investigator: Guimin Gao
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $361,829
Award type: 5
Project period: 2019-09-13 → 2024-08-31