ABSTRACT Alzheimer’s disease (AD) and AD-related dementias (AD/ADRD), the 6th leading cause of death in the United States (US), are aging-related neurodegenerative diseases that affected an estimated 6.2 million Americans in 2021. Both the pathogenic mechanisms and the pathophysiology of AD/ADRD are complex, which has made it difficult to find effective new treatment or prevention strategies despite significant investments over the last decade. On the other hand, the proliferation of large clinical research networks (CRNs) with real-world data (RWD), such as electronic health records (EHRs), claims, and billing data, offers unique opportunities to generate real-world evidence (RWE) that will have direct translational impact on AD/ADRD. In the past, RWD such as EHRs have seen limited use in AD/ADRD drug repurposing and have served primarily to validate and evaluate hypotheses generated by molecular-level predictions of AD/ADRD repurposing agents, partly because of several key methodological gaps: (1) the lack of integration with the rich existing biological and pathophysiological knowledge of AD/ADRD for hypothesis generation, (2) the lack of validated computable phenotyping (CP) and natural language processing (NLP) algorithms and tools that can accurately define study populations and extract key patient characteristics and meaningful outcomes (e.g., MMSE scores to determine severity), (3) the lack of consideration of disease heterogeneity (i.e., AD/ADRD subtypes), and (4) the lack of recognition of the inherent biases in RWD and of the need to apply causal inference principles. The goal of this project is to develop a comprehensive machine learning-based causal inference framework that generates high-quality drug repurposing hypotheses for AD/ADRD at high throughput by integrating heterogeneous information sources. The project has three aims.
Aim 1 will develop computable phenotypes to extract from RWD the key patient characteristics and outcomes relevant to AD/ADRD drug repurposing studies. Aim 2 will develop three components: a learning-based causal inference framework for generating drug repurposing hypotheses from RWD, a deep knowledge embedding framework for generating drug repurposing hypotheses from biomedical knowledge bases (BKBs), and a mutual information enhancement framework that combines information from both RWD and BKBs to further improve the quality of the generated hypotheses. Aim 3 will validate the generated hypotheses with diverse data sources and approaches. The project will leverage patient data from two large CRNs that contribute to the national Patient-Centered Clinical Research Network (PCORnet) and cover ~15 million Floridians and ~11 million New Yorkers. The developed algorithms and software will be released as open source and widely disseminated within the CRNs and the AD/ADRD research communities.