Project Summary/Abstract
Non-alcoholic fatty liver disease (NAFLD) affects more than 80 million people in the United States and is implicated in up to 36% of liver-related deaths. While NAFLD is the fastest-growing cause of cirrhosis and liver-related complications, not all patients with NAFLD ultimately develop cirrhosis. Our ability to identify which patients are at highest risk is limited, which makes it challenging to direct intensive lifestyle intervention and pharmacologic therapy to the patients who need them most. The strongest predictor of incident cirrhosis is fibrosis stage, but existing fibrosis identifies only patients who have already progressed toward cirrhosis, and staging it requires advanced phenotyping such as biopsy or transient elastography, which are not universally available. Improved models of disease progression are therefore critically needed. This project focuses on two factors that may improve risk stratification of progression to cirrhosis: genetics and machine learning using electronic medical record (EMR) data. Heritability of liver fibrosis and cirrhosis is as high as 50%, and a number of genetic variants have been linked to risk of cirrhosis. The EMR is a rich but complex source of data used in clinical practice. When constructing models with such high-dimensional data, non-linear effects and interactions between predictors are common; machine learning algorithms may outperform the more commonly used logistic regression models in this respect. The overall goal of this project is to generate models that predict which patients with NAFLD are most likely to progress to cirrhosis by integrating genetics and EMR-based predictors with machine learning. The specific aims are (1) characterizing the effect of genetic risk factors on the rate of progression from NAFLD to cirrhosis, (2) training and validating machine learning models for incident cirrhosis based on EMR data, and (3) generating integrated models incorporating both EMR and genetic data.
To accomplish these aims, Dr. Chen will obtain further training in the processing of EMR data, the fundamentals of statistical genetics, and machine learning and predictive modeling. Dr. Chen’s long-term goal is to become a leading, independent investigator who generates models to predict outcomes in NAFLD and, eventually, to prioritize patients for treatment accordingly. An NIDDK K08 award will provide Dr. Chen with the necessary time and training to achieve his career goals and improve care for patients with NAFLD. Overall, this project will improve our ability to predict which patients with NAFLD are most likely to develop cirrhosis and will therefore enhance precision health by helping medical providers prioritize persons at highest risk for more intensive intervention.