Project Summary
B-progenitor acute lymphoblastic leukemia (B-ALL) remains a leading cause of childhood cancer death. With advances in RNA sequencing (RNA-seq) technology, many recurrent chimeric genes have been identified, leading to refined classification of B-ALL and tailored therapies. Still, around 10-30% of B-ALL cases cannot be classified into the established subtypes and are termed “B-other”; these patients receive general chemotherapy, and the outcome for many is poor. This study will apply integrative genomic data analysis to identify novel B-ALL subtypes, with a focus on B-other cases. Building on the experience and skills from prior work, I will analyze RNA-seq data from over 2000 childhood and adult ALL cases and define novel subtypes based on distinct gene expression profiles and shared genetic alterations. Cases lacking driver lesions detectable by RNA-seq will be subjected to whole genome sequencing (WGS) to identify other genetic alterations. Cases that remain unclassified and carry genetic alterations in non-coding regions will be studied with functional genomic data (ChIP-seq and ATAC-seq) to provide mechanistic annotation. Furthermore, functional experiments will be performed to explore the role of the newly identified subtype-defining genetic alterations. In the pilot study, I analyzed 1,988 RNA-seq samples and defined 23 distinct B-ALL subtypes, 8 of which are novel. Besides the subtypes defined by gene rearrangements, I also observed that point mutations in key transcription factors can play a potent role in defining novel subtypes, including PAX5 P80R (n=44) and IKZF1 N159Y (n=8). In this proposal, I will expand the sample size and interrogate the remaining B-other cases with WGS to define the residual novel subtypes. Through this study, I will provide a definitive set of B-ALL subtypes and maximize the potential for defining new ones among B-other cases.
As an exemplar of a subtype defined by a single point mutation, PAX5 P80R will be thoroughly studied in this proposal. Specifically, I will perform ChIP-seq for PAX5 and for key activating and repressing chromatin marks to identify PAX5 P80R-specific binding sites, coupled with chromatin accessibility information from ATAC-seq. Using the CRISPR/Cas9 knock-in Pax5 P80R mouse model, I will apply single-cell sequencing to preleukemic and leukemic B cells to elucidate the correlation between genetic alterations and deregulated genes at the cellular level. Moreover, MEGF10 (Multiple Epidermal Growth Factor-Like Domains Protein 10), a gene markedly overexpressed in the PAX5 P80R group, will be explored in in vitro and ex vivo models to test its role in cellular localization and leukemogenesis. Knockdown or knockout of MEGF10 by RNAi or CRISPR will be applied in human P80R xenografts to test whether MEGF10 could be a potential target for tailored therapy. The mentored phase of this proposal will occur at St. Jude Children’s Research Hospital, under Dr. Charles Mullighan, and will finish the aim of characterizin...