Methods for RNA splicing variations detection, quantification, visualization, and association from large heterogeneous datasets

NIH RePORTER · NIH · R01 · $432,541

Abstract

The goal of this research program is to develop methods and tools for analyzing large, heterogeneous RNA-seq data sets to better understand RNA splicing. The vast majority of human genes are alternatively spliced, and variation in splicing has been shown to be associated with complex disease risk. Despite the widespread adoption of affordable high-throughput sequencing, variation in RNA splicing has remained understudied due to the limitations of short-read sequencing data and the computational challenges associated with accurate transcript-level quantification of gene expression. We propose to develop methods to improve the detection, quantification, and visualization of complex splicing events. We will further develop methods to identify genetic variants associated with complex splicing variation and to characterize the mechanisms by which splicing variation affects complex traits. Importantly, the variations and mechanisms predicted by our methods will be replicated in independent cohorts and experimentally validated using orthogonal methods. The computational methods and software we develop will be applied both to publicly available data and to data generated by our groups. We propose to leverage not only our expertise but also our existing code base and tools. The tools will support both standalone and cloud-based execution for scaling up analysis, and will integrate with existing tools for downstream analysis.

Key facts

NIH application ID
9857039
Project number
5R01GM128096-03
Recipient
UNIVERSITY OF PENNSYLVANIA
Principal Investigator
Yoseph Barash
Activity code
R01
Funding institute
NIH
Fiscal year
2020
Award amount
$432,541
Award type
5
Project period
2018-05-01 → 2022-01-31