Sepsis causes an estimated one in five deaths globally, including approximately 190,000 deaths per year in the United States. Given the complexity and heterogeneity of the condition, a “one-size-fits-all” approach to sepsis care, which is largely the approach taken by clinical guidelines, is unlikely to be the most effective. Yet it is not feasible to conduct a randomized controlled trial (RCT) in each patient subgroup that can be formed from the hundreds of possible combinations of important sociodemographic (e.g., age group) and clinical (e.g., comorbidities or cause of sepsis) patient characteristics. Observational studies using electronic health record (EHR) data could circumvent this feasibility constraint thanks to the large size and “real-life” representativeness of EHR data. However, such observational studies have the critical disadvantage that they are thought to yield merely associations rather than causal effect estimates, because they rely on the untestable and frequently implausible assumption that all confounders were perfectly measured and adjusted for in the analysis. The objective of this New Innovator Award is to develop and test a new study design for clinical research on sepsis – machine-learning-facilitated regression discontinuity (ML-facilitated RD) – that would allow researchers to determine causal effects of common sepsis care interventions in large-scale EHR data without needing to rely on confounder adjustment. ML-facilitated RD combines machine learning with a novel causal inference technique (regression discontinuity) in order to improve the robustness of the technique for causal effect estimation, to improve its ability to reliably determine causal effects in each of a large number of highly granular patient subgroups, and to ascertain the optimal threshold in continuous variables (e.g., in mean arterial pressure) at which the intervention of interest should be initiated in each patient subgroup.
We will additionally extend RD so that it can be applied to the multi-factorial decisions that are common in clinical care for sepsis. This project has two steps. In the first step, we will develop these methodological innovations with the aid of extensive simulation exercises. In the second step, we will test the feasibility and validity of ML-facilitated RD for each of 12 common clinical interventions for sepsis in each of 15 EHR datasets from a variety of clinical settings. The key innovation of this project is that it aims to establish a study design for EHR data on sepsis that takes a fundamentally different approach to causal effect estimation than current state-of-the-art methods. By providing clinical researchers with a new tool for determining the causal effects of clinical interventions for sepsis in routine care and among highly granular patient subgroups (including which threshold in continuous clinical measurements is optimal for initiating these interventions in each subgroup), this research would constitute a major step forward...
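To make the regression discontinuity idea concrete, the following is a minimal sketch of a standard sharp RD estimate on simulated data. It is not the ML-facilitated design proposed here. It assumes a hypothetical rule under which an intervention is started whenever mean arterial pressure falls below a cutoff of 65 mmHg, a linear outcome trend on each side of the cutoff, and a fixed bandwidth. The function names, the cutoff, the bandwidth, and the simulated effect size are all illustrative assumptions rather than values from this project.

```python
import random
from statistics import fmean


def intercept_at_cutoff(xs, ys):
    """OLS intercept of y on x, where x is already centred at the cutoff."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar


def sharp_rd_estimate(running, outcome, cutoff, bandwidth):
    """Sharp RD estimate with a separate linear fit on each side of the cutoff.

    Assumes treatment is given exactly when running < cutoff. Returns the
    fitted outcome just below the cutoff (treated) minus the fitted outcome
    just above it (untreated).
    """
    below_x, below_y, above_x, above_y = [], [], [], []
    for r, y in zip(running, outcome):
        d = r - cutoff
        if abs(d) > bandwidth:
            continue  # keep only observations near the cutoff
        if d < 0:
            below_x.append(d)
            below_y.append(y)
        else:
            above_x.append(d)
            above_y.append(y)
    return intercept_at_cutoff(below_x, below_y) - intercept_at_cutoff(above_x, above_y)


def simulate(n=20000, cutoff=65.0, true_effect=-0.05, seed=0):
    """Simulated (hypothetical) data: risk score falls linearly with MAP,
    plus a jump of `true_effect` for patients treated below the cutoff."""
    rng = random.Random(seed)
    map_mmhg, risk = [], []
    for _ in range(n):
        m = rng.uniform(45.0, 85.0)
        treated = 1.0 if m < cutoff else 0.0
        y = 0.40 - 0.004 * (m - cutoff) + true_effect * treated + rng.gauss(0.0, 0.05)
        map_mmhg.append(m)
        risk.append(y)
    return map_mmhg, risk


if __name__ == "__main__":
    map_mmhg, risk = simulate()
    estimate = sharp_rd_estimate(map_mmhg, risk, cutoff=65.0, bandwidth=10.0)
    print(f"Estimated jump in risk at the cutoff: {estimate:.4f}")
```

The estimate comes from comparing patients just below and just above the cutoff, who should be similar in every respect except treatment, which is why no confounder adjustment enters the calculation. In routine care the treatment rule is rarely followed perfectly, so a real analysis would more likely need a fuzzy RD variant and a data-driven bandwidth; both are omitted here for brevity.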