PROJECT SUMMARY: Congenital heart disease is the most common congenital anomaly and affects approximately 1% of infants. Hypoplastic left heart syndrome (HLHS), a severe form of congenital heart disease in which the left ventricle is underdeveloped, has a 10-year mortality of 40%. Only 6% of HLHS patients have a genetic cause identified on exome sequencing, limiting the ability of patients to receive a diagnosis and potentially benefit from targeted treatments. There are two theoretical mechanisms for HLHS: a cardiomyocyte origin, in which a defect in cardiac muscle cells causes underdevelopment of the ventricle, or an endothelial origin, in which valve abnormalities attenuate flow through the left ventricle. Two known HLHS genes, RBFOX2 and NOTCH1, are primarily expressed in cardiomyocytes and cardiac endothelial cells, respectively, and provide an opportunity to study these mechanisms. Discovery of additional pathogenic HLHS variants could increase the proportion of diagnosed patients and improve our molecular understanding of cardiac development. Currently, most pathogenic variants identified by exome sequencing are loss-of-function variants that reduce gene expression. To test my hypothesis that missense and noncoding variants also contribute to HLHS by altering gene expression or activity, I propose to use machine learning on HLHS patient genome sequencing, three-dimensional protein structure, and enhancer assay data to identify new genetic contributors to HLHS. By completing these aims, I will advance my training in functional assays and machine learning to be best prepared for a career as an independent physician-scientist. My scientific goal is to identify new variants and loci that contribute to HLHS. First, in Aim 1, I will use machine learning to predict the pathogenicity of missense variants in RBFOX2 from HLHS patients. 
The accuracy of these predictions will be tested by genome editing of induced pluripotent stem cells to introduce the RBFOX2 missense variants, followed by assessment of RBFOX2 expression and function during cardiomyocyte differentiation. In Aim 2, NOTCH1 missense variants will be similarly assessed for pathogenicity during cardiac endothelial cell differentiation. Finally, in Aim 3, I will use massively parallel reporter assays to identify active cis-regulatory regions near RBFOX2- and NOTCH1-pathway genes, and then determine whether rare variants in HLHS patients within these regions cause gene dysregulation. I will use linear models and machine learning to determine which cardiac genomic annotations best predict enhancer activity, and use those annotations to identify additional candidate HLHS loci. Together, this proposal will apply machine learning to biological data in a way that builds on my background in developmental biology and develops new skills in computational and functional genomics. These results will contribute towards the long-term objective of understanding the molecular basis of heart development and human disease to...