Project Summary
High-throughput profiling of hundreds of thousands of cells in the central nervous system (CNS) is currently underway. One of the goals of the BRAIN Initiative is to build a census of cell types in the CNS. However, previous work in single-cell RNA sequencing (scRNAseq) has demonstrated that reliance on small collections of marker genes for cell type/state/position classification is insufficient to account for the dynamic nature of, and variation in, cellular classes/states. Previous work, both my own and that of others, has demonstrated that latent space methods, which identify low-dimensional patterns in high-dimensional profiling data, can discover molecular drivers of cell types and states in scRNAseq. However, the use of algorithms that are untethered to biological constraints or that have not been extensively functionally validated can lead to the arbitrary delineation of cell class/state and the trivial designation of “novel” cell types. As proper development of the CNS requires precise regulation and coordination of spatial and temporal cues, the overall objective of this application is to develop analytic and experimental methods that integrate spatiotemporal information with scRNAseq to learn meaningful latent spaces. Specifically, I will 1) generate a comprehensive collection of transcriptional signatures for spatial features of the brain, 2) build dimension reduction software that encodes spatial and cell cycle information to account for the highly specific organization of cells in the CNS, 3) derive a statistic, projectionDrivers, that quantifies the gene drivers of differential latent space usage, and 4) define a statistic, proMapR, that estimates the probability that a cell occupies a particular location in the brain at a given point in time, based on the cell's transcriptional signature.
The ability to define and validate biologically meaningful latent spaces not only enables multi-omic data integration and exploratory analysis of scRNAseq data that draws on the large volume of publicly available datasets, but also lays the groundwork for multimodal data integration, a necessary next step to characterize how individual cells and complex neural circuits interact in both time and space.