PROJECT SUMMARY
Together with the ability to measure genome-wide expression of millions of individual cells, single-cell technologies have also brought the challenge of translating such data into a better understanding of the underlying biological phenomena. Existing computational methods and software for single-cell data analysis have critical limitations related to scalability, accuracy, usability, and interpretation capabilities. The main goal of this project is to pioneer a new platform for the analysis of single-cell data that is capable of: i) accurately identifying cell types and their composition in complex tissues, ii) inferring cell developmental stages and pseudo-time trajectories, and iii) identifying cell-type-specific pathways and putative mechanisms in a phenotype comparison. The proposed platform will also be able to deconvolve bulk expression data to identify the cell type composition of each bulk sample. The significance of the proposed work lies in its potential to provide new methodologies for single-cell data analysis that far exceed the performance of current state-of-the-art techniques. Accurate deconvolution will also allow researchers to extract more information from the vast repositories of existing bulk data, including GDC/TCGA, NCBI SRA, GEO, and ArrayExpress, which currently contain data from bulk experiments that collectively cost over a billion dollars. The hypothesis driving this work is that single-cell data analysis and cellular deconvolution of bulk data can greatly benefit from: i) systems-level knowledge that captures key characteristics of cellular development, and ii) the valuable information contained in the validated cell types and reference single-cell datasets available in single-cell atlases.
Indeed, our preliminary work shows that single-cell data analysis and cellular deconvolution can achieve an accuracy of approximately 90 to 100% when reference single-cell datasets and pathway knowledge are properly utilized. The proposed platform will be extensively validated by comparing its capabilities against state-of-the-art software in both single-cell data analysis (cell type identification, developmental states and time-trajectory inference, systems-level analysis) and cellular deconvolution of bulk expression data. This validation will use 663 datasets representing 279 cell types and 116 human organ parts (including bulk data, single-cell data, and matched cell flow cytometry). The pathway analysis and mechanism inference capabilities will be further validated using real knock-out datasets, in which the true cause of the phenotype is known. The company, Advaita, has a strong IP portfolio, an experienced team, and a proven track record in this area, having developed and commercialized similar analysis platforms. Advaita's existing products are currently used by top principal investigators, core facilities, and pharmaceutical companies around the world.