The ability to measure which genes are expressed in cells has revolutionized our understanding of biological systems. Discoveries range from pinpointing what makes different cell types unique (e.g., a skin cell vs. a brain cell) to explaining how diseases emerge from genetic mutations. Gene expression data is now a standard tool in every cell biologist’s toolbox. However, the mathematical theory for reliably extracting insight from this data has lagged behind the rapid progress of the techniques for collecting it. This CAREER project will develop key theoretical foundations for analyzing imaging data of gene expression. The advances span theory to practice, including mathematical models and machine-learning approaches that will be applied to data from experimental collaborators. Altogether, the project aims to create a new gold standard of techniques for studying spatial imaging data of gene expression and to enable new biological and biomedical insights. In addition, the research will involve interdisciplinary graduate students and local community college undergraduates, training the next generation of scientists at the ever-evolving intersection of data science, biology, and mathematics. Alongside research activities, the project will create mentorship networks to support first-generation student scientists, with the goal of diversifying the STEM workforce. The supported research is a comprehensive program for studying single-molecule