Project Summary
While over 80% of patients with Coronavirus Disease 2019 (COVID-19) experience only mild illness, mortality rates of 6.4-13.4% have been reported in vulnerable populations, including older adults and patients with multiple co-morbidities. Pharmacological treatments are primarily used for patients with moderate to severe disease. Optimal prescribing of drug therapy relies heavily on accurate risk stratification based on patient prognosis. Because COVID-19 can often cause rapid clinical deterioration, it is critical to have a prognostic tool that accurately predicts disease progression and adverse clinical outcomes, so that pharmacological treatments or other interventions can be initiated in a timely manner. In addition, during the COVID-19 pandemic, many healthcare facilities need to operate beyond regular capacity with limited resources, such as mechanical ventilators, therapeutic agents, and intensive care unit (ICU) beds. A reliable prognostic tool is essential for optimal decisions regarding medical disposition (e.g., home monitoring vs. admission) and resource allocation (e.g., ICU beds and mechanical ventilators). While there are seemingly abundant data on prognostic prediction for patients with COVID-19, two major knowledge gaps remain. First, all of the existing prediction models consider only factors measured at hospital admission and do not incorporate dynamic changes in biomarkers over time. These models thus have limited clinical applicability, since many of these biomarkers are measured repeatedly during a treatment course and clinicians need to know how these dynamic changes can inform medical decisions. Second, while medication use and the timing of its initiation are highly informative of disease severity, they were not used for prognostic prediction in prior models.
We aim to build a prospective prognostic modeling system based on near-real-time electronic health record (EHR) data from Mass General Brigham, a large care delivery network in Massachusetts that includes 2 tertiary and 11 secondary hospitals and >30 ambulatory centers. We have established the basic infrastructure and currently receive weekly data updates. The database currently has >14,000 confirmed cases of COVID-19 and is expanding at a rate of 500-1000 confirmed cases per week, allowing us to build prediction models with rich data input and to perform prospective validation. We will develop a dynamic prognostic tool that incorporates baseline characteristics, time-varying factors and their dynamic changes, and medication use and its timing to predict key clinical outcomes. Data accrued from March to August 2020 will be used for model derivation, and data from September to December 2020 will be used for prospective validation. In addition to the predictors reported in the literature, we will search for novel predictors by screening through the rich EHR data using TreeScan, a novel, validated, statistical tool adopted by the US Food and D...