Summary

Clinical data on Alzheimer's disease (AD) and multiple chronic conditions (MCC) are fragmented across electronic health records (EHRs) and clinical trials, which makes the relationship between the two hard to study. The data are spread across different platforms and databases and are often incomplete, so a complete picture of a patient is difficult to assemble. This leads to gaps in research and missed opportunities to understand the contribution of MCC to AD progression. To overcome these challenges, we will develop interoperable EHRs with an application programming interface (API) that follows a standard data format, Fast Healthcare Interoperability Resources (FHIR). Partnering with ACTIVE MIND, an interventional trial that examines the potential efficacy of cognitive training (CT) in reducing dementia incidence, we will consent ~1,000 patients and link, extract, and harmonize their local EHRs and other relevant health information. We will develop ontology models and use them to guide natural language processing (NLP) models that distill, organize, and convert MCC and related concepts into FHIR-accessible data. Using these data together with FHIR-mapped structured data, we propose a demonstration project to develop novel missing-data imputation and computational phenotyping models that stratify heterogeneous subpopulations by longitudinal MCC patterns and predict their risk of AD onset.
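
As an illustration of what "FHIR-accessible data" could look like for a single chronic condition, the following is a minimal sketch, not the project's actual pipeline. It assumes an NLP step has already produced a coded concept; the function name `to_fhir_condition`, the patient identifier, and the example values are hypothetical. The field names follow the FHIR R4 Condition resource, and the condition is coded with SNOMED CT.

```python
import json
from datetime import date

# Code system URIs used by FHIR for SNOMED CT and for clinical status.
SNOMED = "http://snomed.info/sct"
CLINICAL_STATUS = "http://terminology.hl7.org/CodeSystem/condition-clinical"


def to_fhir_condition(patient_id, snomed_code, display, onset, note_text=None):
    """Build a FHIR R4 Condition resource (as a dict) for one coded concept.

    Hypothetical helper for illustration only: it assumes the concept has
    already been extracted and mapped to a SNOMED CT code upstream.
    """
    resource = {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "clinicalStatus": {
            "coding": [{"system": CLINICAL_STATUS, "code": "active"}]
        },
        "code": {
            "coding": [
                {"system": SNOMED, "code": snomed_code, "display": display}
            ],
            "text": display,
        },
        # FHIR dateTime values may be given at day precision (YYYY-MM-DD).
        "onsetDateTime": onset.isoformat(),
    }
    if note_text:
        resource["note"] = [{"text": note_text}]
    return resource


# Hypothetical example: type 2 diabetes with an onset date taken from a note.
example = to_fhir_condition(
    patient_id="example-001",
    snomed_code="44054006",
    display="Diabetes mellitus type 2",
    onset=date(2015, 3, 1),
    note_text="Extracted from a clinical note (hypothetical).",
)
print(json.dumps(example, indent=2))
```

Representing each extracted condition as its own resource with an onset date is one way to keep longitudinal MCC patterns queryable per patient; how the project actually models these concepts is not specified in the summary.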