Abstract

The goal of this research program is to develop methods and tools for analyzing large, heterogeneous RNA-seq data sets to better understand RNA splicing. The vast majority of human genes are alternatively spliced, and variation in splicing has been shown to be associated with complex disease risk. Despite the widespread adoption of affordable high-throughput sequencing, variation in RNA splicing has remained understudied because of the limitations of short-read sequencing data and the computational challenges of accurate transcript-level quantification of gene expression. We propose to develop methods that improve the detection, quantification, and visualization of complex splicing events. We will further develop methods to identify genetic variants associated with complex splicing variation and to characterize the mechanisms by which splicing variation affects complex traits. Importantly, the variations and mechanisms predicted by our methods will be replicated in independent cohorts and experimentally validated using orthogonal methods. The computational methods and software we develop will be applied both to publicly available data and to data generated by our groups. We propose to leverage not only our expertise but also our existing code base and tools. The tools will support both standalone and cloud-based execution to scale up analysis, and will integrate with existing tools for downstream analysis.