7. Project Summary/Abstract
Adverse events pose a significant challenge to medical interventions (drugs, devices, and others), with an estimated 2.3 million cases of adverse drug events between 1969 and 2002. Adverse events are responsible for longer hospital stays, higher healthcare costs, and higher mortality. There is a clear need for adverse event surveillance, but the current standards, manual chart review and voluntary reporting, are time-consuming and unsustainable. Voluntary reporting also misses most adverse event cases. With their widespread adoption, electronic health records (EHRs) now capture medical data for the majority of US patients and present an opportunity for sustainable adverse event surveillance via automated strategies. However, there are two barriers to automating adverse event surveillance. First, adverse events are poorly represented by International Classification of Diseases (ICD) diagnosis codes. This has inhibited efforts to use simple rule-based code or flag/trigger approaches, while complex, high-performing text-mining approaches are thwarted by the difficulty of adapting them to other healthcare sites and large data networks for wider surveillance. Second, the temporal information in the EHR that is inherent to adverse event timing and sequencing is challenging to capture. Existing approaches face three challenges: they treat related medical concepts as independent entities, the rapid explosion of data inhibits scaling to large numbers of medical concepts, and their outputs have limited human interpretability. Our overarching goal is to expand on existing biomedical informatics tools to better capture adverse events and to more comprehensively represent the full patient medical trajectory in order to identify archetypes of adverse event development. We will pilot these methods in cancer patients undergoing immune checkpoint inhibitor (ICI) therapy.
In Specific Aim 1, we will incorporate medical concept embedding and clustering methods to draw a “map” of disease, segmented into “neighborhoods” labeled for the conditions they describe, including adverse events. In Specific Aim 2, we will test a novel method for tracking patient trajectories on this map of disease; we hypothesize that time-series clustering can identify archetypal patient trajectories that have different clinical outcomes. This work addresses gaps in EHR-based phenotyping and adverse event surveillance. It has the potential to inform risk factor identification, prediction of adverse event development, and prognostication of patient outcomes, and to lay a crucial stepping stone for further progress in EHR-based phenotyping in biomedical informatics. This fellowship award will enable me to develop my skills in biomedical informatics methods, integrate a clinical perspective into my research, hone my writing and presentation skills, and expand my professional network. At the conclusion of this award, I will have made strides towards becoming an independent physician-informaticist, fusing clinical exper...