Project Summary

Background
Single-cell sequencing data has enormous potential to improve our understanding of human health, with direct applications in diagnosis and therapeutic selection. Single-cell sequencing of mRNA expression levels (scRNA-Seq) initially focused on understanding fundamental biological systems at the single-cell level, but there is an increasing emphasis on using scRNA-Seq to understand the role of single-cell variability in human health outcomes. While the exploration of single-cell human variability and its relationship to disease is advancing, the corresponding statistical methodology for handling this type of data at the human population level lags behind.

Project Objectives
Broadly, the long-term goal of this proposal is a coherent methodological framework for analyzing the effect of single-cell variability on patient phenotypes. This proposal considers the setting of population scRNA-Seq studies, where scRNA-Seq data is collected from many patients representing populations with differing health outcomes. The proposed research consists of the development and evaluation of statistical methodologies for these kinds of scRNA-Seq population studies. The methodology developed by this proposal will fill a critical gap, helping to unlock the potential of scRNA-Seq data for improving human health.

Project Methods
The proposed research program focuses on three specific aims that target the most common analysis needs in scRNA-Seq population studies.
Aim 1: Patient-level representation for scRNA-Seq data. This aim will develop a summary representation of a patient's scRNA-Seq profile and create statistical methods for comparing this summary profile between different patient populations.
Aim 2: Predicting patient phenotypes based on scRNA-Seq data. This aim will develop models that predict health phenotypes from a patient's scRNA-Seq measurements.
Aim 3: Identifying cell-level and gene-level biomarkers for patient phenotypes. The methods developed in this aim will identify genes and cell populations that differ at the single-cell level between patient populations. The biomarkers identified by these methods will generate testable hypotheses for future exploration of the mechanistic relationship between single-cell variability and patient outcome.