SUMMARY

Atrial fibrillation (AF), the most common cardiac arrhythmia, affects over 6 million people in the U.S. and is a major public health concern. AF is costly to the health care system and leads to significant health consequences (e.g., stroke, heart failure, dementia, decreased quality of life). Over time, AF patients experience more frequent and longer AF episodes. The unpredictable occurrence of AF episodes and the need for anticoagulation to prevent stroke make AF difficult to manage. Many AF patients seek out atrial fibrillation ablation (AFA) to improve quality of life and decrease AF episodes. AFA, the cauterization of areas of the left atrium, is the most effective treatment for persistent or paroxysmal AF. AFA success rates vary, however, and many patients are not AF-free following the procedure. At leading AFA centers, AF-free rates at one and two years after initial AFA were 40% and 37%, respectively. Given these modest success rates, patient selection for AFA should receive more attention. Sociodemographic and clinical phenotype data have been used to predict AFA response, but collectively they have poor predictive ability. The widespread adoption of electronic health record (EHR) systems presents an opportunity for a paradigm shift in predicting AFA outcomes. A better understanding of the patient-specific factors that predict AFA outcome will inform patient selection for this procedure. To this end, we propose to use machine learning techniques to develop predictive models for outcomes of primary AFA procedures, addressing the following specific aims and research questions:

1. Aim 1: Predict adverse AFA outcomes using machine learning.
   • How well do existing risk scores predict AFA complications before the initial procedure?
   • Can a machine learning model trained on EHR data predict AFA complications better?
2. Aim 2: Data-driven AFA outcome subgroup identification.
   • Can cluster analysis identify useful subgroups based on outcome trajectories?
   • Are other unsupervised machine learning algorithms, such as sequential pattern mining, viable alternatives for analyzing patient outcome trajectories?
3. Aim 3: Develop an open-source software toolkit.

This project will lay the foundation for future refinement of existing machine learning methods, as well as the development of new methods, to improve prediction of AF recurrence following AFA.