Project Abstract
The paradigmatic approach to chemotherapy has been to identify and target driver mutations. However, after an initial response to therapy, many patients develop recurrent drug-resistant disease, leading to high mortality rates. This resistance may be encoded, driven by somatic mutations, or adaptive, where changes in epigenetic programs result in phenotypic plasticity. Critically, the relative contributions of encoded versus adaptive mechanisms of drug resistance, and how these impact therapeutic response, are poorly understood. Advances in single-cell multiomics have been crucial for detecting rare genetic and epigenetic events that may drive resistance and cannot be observed by bulk sequencing. However, progress has been limited because most experiments profile only the encoded (via genome sequencing) or the adaptive (via transcriptome or epigenome profiles) state. Only recently have new techniques made it possible to measure these modalities from the same cell or population of cells. This project proposes the development of a new class of scalable statistical models that will help identify causal determinants of treatment failure in small cell lung cancer (SCLC) and of metastasis in high-grade serous ovarian cancer (HGSOC) and gastric adenocarcinoma (GAC), all diseases with significant morbidity and low cure rates. These cancers each exemplify components of intratumoral heterogeneity and its interplay with the tumor microenvironment. Each translational study in this project generates datasets comprising high-dimensional covariates that require scalable computational methods to analyze. Machine learning methods are highly scalable but have difficulty with actionable interventional and counterfactual queries, and do not account for confounding factors, that is, covariates that affect both an intervention and its target. Causal models, on the other hand, are designed to account for confounding factors but do not scale well.
Here, we address these two needs by developing novel computational methods at the intersection of multiview learning and causal inference. In the K99 phase, the focus will be on developing a causal inference framework and software to identify the impact of cell-intrinsic processes on patient response to therapy, inferred from high-dimensional multiomic single-cell data. In the R00 phase, this framework will be extended to cell-extrinsic processes, including profiling the tumor microenvironment and cell-cell interactions. The methods developed here will be applicable to any type of cancer. Thus, we anticipate that this project will not only improve our understanding of SCLC, GAC, and HGSOC progression, but also have a broader impact on cancer research as major consortia release similar data to the public. I have put together an interdisciplinary mentorship group with expertise in genomics, phenotypic plasticity, and causal machine learning. This proposal also details a training program that will help me successfull...