PROJECT SUMMARY / ABSTRACT
The past three decades have witnessed an accumulating body of evidence that epigenetic mechanisms play an instrumental role in human cancer. Epigenetic alterations can serve as driver events in cancer by inactivating tumor-suppressor genes. The finding that these silencing events are mutually exclusive with structural or mutational inactivation of the same gene reinforces the functional significance of epigenetic silencing. The majority of cases of microsatellite instability in sporadic human tumors can be attributed to epigenetic silencing of the MLH1 mismatch repair gene. One of the most striking discoveries to emerge from cancer genome projects has been the previously unappreciated preponderance of somatic mutations in epigenetic regulators in most types of human cancer. Clearly, epigenetic mechanisms play a key role in human cancer, and a comprehensive molecular characterization of cancer should include epigenomic profiling. We propose to create an Integrative Cancer Epigenomic Data Analysis Center (ICE-DAC) to provide specialized analysis pipelines and expertise as part of the Genome Data Analysis Network (GDAN). We anticipate that epigenomic data will be provided as bisulfite-based sequence data or as DNA methylation BeadArray data, and we provide an analysis workflow that can accommodate either. We propose to apply specialized epigenetic analyses that we have developed for both data types through our extensive experience in cancer genome consortia. In Specific Aim 1, we will develop, improve, and implement bioinformatic tools for epigenomic data analysis, including improvements to analysis tools for processing bisulfite sequence data. We will continue the development of analysis tools that use DNA methylation data to analyze tumor heterogeneity and subclonal structure, including the deconstruction of the non-malignant cellular composition of the tumor.
In Specific Aim 2, we will provide advanced specialized analysis of cancer epigenomic data generated by the Genome Characterization Center and/or provided through the Data Processing GDAC. Our automated workflow will provide timely primary data analysis for Analysis Working Groups (AWGs), and can accommodate either sequence-based or array-based DNA methylation data. This workflow will call differentially methylated regions (DMRs), identify cancer subtypes through stratified cluster analysis, analyze CpH methylation, and analyze tumor purity and subclonal heterogeneity. Performing variant analysis on bisulfite sequence data will allow us to determine the impact of non-coding mutations on epigenetic state. In Specific Aim 3, we will integrate epigenomic data with other genomic, transcriptomic, proteomic, and clinical data to derive novel, biologically and clinically relevant insights. Integration of DNA methylation and RNA-Seq data will be used for epigenetic silencing calls and for our custom ELMER pipeline for enhancer identification, both of which will feed into pathway and network analyses.