PROJECT SUMMARY
Alternative splicing (AS) is a major driver of protein isoform diversity and is regulated in a highly cell type-specific manner. A better understanding of the cell type-specific splicing code will not only provide novel insights into the role of AS in disease and development but will also yield novel genetic tools for perturbing and interrogating cell types of interest. Synthetic splicing constructs have been used successfully to target activation of reporter and therapeutic genes to cancer cells carrying mutations in splice factors, and to make gene therapies conditional on a small-molecule trigger. These examples highlight the potential of AS as a programmable control mechanism but do not provide a clear path towards engineering splice regulatory sequences that can target gene expression to any desired cell type or state. Here, we propose to combine synthetic biology with machine learning to generate highly cell type-specific splicing constructs. Building on our earlier work, we will first quantify cell type-specific AS using splicing massively parallel reporter assays (MPRAs). We will focus on exon skipping and intron retention because they are among the most common forms of AS and can be highly cell type-specific. For each type of AS, we will create libraries of hundreds of thousands to millions of reporters, with sequence variation targeted to regions of interest. We will then measure AS for these libraries in a panel of cell lines and cultured primary cells (Specific Aim 1). Next, we will use these data to train machine learning models that accurately predict AS isoform abundance from reporter gene sequence. We will systematically compare different network architectures and approaches, including convolutional and recurrent neural networks. We will then combine these models with sequence design approaches previously developed in the lab to generate synthetic sequences with enhanced target cell specificity.
We aim to show that we can generate reporter constructs that are specific to any cell type in our panel. We will validate predictions experimentally and use the resulting data to iteratively improve the models (Specific Aim 2). Finally, we will generalize our approach to an experimental setting that more accurately reflects the diversity and complexity of cell types encountered in multicellular biological systems. Specifically, we will perform splicing MPRAs in organotypic slice cultures of developing rat brain. We will optimize conditions for library delivery to slice cultures, as well as approaches for reading out splicing MPRAs at the single-cell level. We will combine the resulting data with the generative models from Specific Aim 2 to design reporter constructs that precisely target protein expression to cell types of interest (Specific Aim 3). We believe that this work will result in novel genetic tools for biology research and provide a path towards gene therapies with i...