Project Summary

By creating seminal tools for the computational analysis of massively parallel sequencing data, the Getz group has built vital pipelines and an analysis framework for large-scale processing and systematic analysis of cancer genome datasets. Through collaborations with an international network of investigators, we have gathered whole-exome, matched transcriptome, and methylome data from >1000 patients with chronic lymphocytic leukemia (CLL). Using saturation analysis and statistical modeling, we estimate that this collection of samples provides sufficient statistical power to detect all intermediate- and high-frequency genetic drivers of this disease (94% power to detect events in >2% of patients), given the background mutation frequency of CLL. The goals of our analyses are to: (1) build a comprehensive catalog of all genetic and epigenetic drivers of CLL and their interdependencies, both clonal and subclonal, integrating information on somatic point mutations, copy-number changes, and DNA methylation; (2) integrate all genomic data modalities to identify molecular subtypes of CLL and associate them with drivers, cellular processes, and cancer hallmarks; and (3) develop new models to predict outcome based on the genomic map of CLL subtypes. The focus of this project is to design a framework and tools that maximize our understanding of the genetic and epigenetic determinants of CLL development and response to treatment. The results and the framework will provide a valuable resource for the CLL community and the broader cancer research community.