PROJECT SUMMARY/ABSTRACT
Electronic health records (EHRs) are now ubiquitous in routine cancer care delivery. The large volumes of data that EHRs contain could constitute an important resource for research and quality improvement, but to date, EHRs have not fully realized this potential. Important clinical endpoints, such as disease histology, stage, response, progression, and burden, are often recorded in the EHR only in unstructured free-text form. Even when structured data are available, they may be recorded only at one point in time, such as diagnosis, and may not be as relevant later in a patient's dynamic disease trajectory. These barriers prevent scalable analysis of EHR data for even relatively straightforward research tasks, such as identification of a cohort of patients potentially eligible for clinical trials. Identifying patients for trials is an important challenge in cancer research, since under 5% of adults with cancer have historically enrolled in therapeutic trials. Tools are in development to better match patients to trials, but no such tools are both publicly available and capable of incorporating time-specific patient phenotypes generated using unstructured EHR data. Recent rapid innovation in deep learning techniques could provide novel solutions to these challenges. In ongoing work, I have found that natural language processing based on a neural network architecture can reliably extract clinically relevant oncologic endpoints from free-text radiology reports. My goal is to develop an independent research program focused on leveraging such methods to put the EHR to use at scale for discovery and improving cancer care delivery.
My specific aims are (1) to develop and validate a clinically relevant, dynamic, pre-trained cancer trajectory model by applying deep learning to integrated structured and unstructured EHR data; (2) to apply transfer learning to this pre-trained cancer trajectory model to match patients to clinical trials using EHR data and clinical trial protocols; and (3) to pilot the incorporation of cancer trajectory modeling into an institutional clinical trial matching tool. In the near term, this work will facilitate accrual to clinical trials at our institution. During the independent research portion of the proposal, this work will constitute the basis for a general framework for conducting scalable cancer research using EHR data.