PROJECT SUMMARY: Congenital heart disease (CHD) is the most common anomaly at birth, affecting 1% of infants. Damaging genic variants contribute significantly to CHD risk, but a likely genetic cause is identified in only 50% of patients. The genetic basis for the remaining half of CHD is unknown. The Gabriella Miller Kids First (GMKF) and TOPMed programs funded whole genome sequencing (WGS) to test our hypothesis that variants undetected by whole exome sequencing (WES) contribute to CHD. WGS from 1813 CHD trios (affected probands and parents) provides a unique opportunity to define additional coding and noncoding variants that convey CHD risk. First, coding variants in CHD sequencing data will be comprehensively analyzed. WGS improves detection of damaging coding variants that WES misses, including structural variants and variants outside WES capture regions. Therefore, in Aim 1, damaging structural, mosaic and single nucleotide variants will be identified in WGS data. Novel CHD genes with a burden of damaging coding variants in CHD compared to non-CHD cohorts will be identified. Second, integration of CHD cardiac tissue gene expression with WGS data will prioritize noncoding variants likely to impact developmental gene regulation. Aim 2a assesses the potential contribution to CHD of rare noncoding variants adjacent to cardiac expression quantitative trait loci (eQTLs). In a parallel approach, Aim 2b will leverage 430 human cardiac developmental functional genomic annotations, including those ascertained from human induced pluripotent stem cells throughout differentiation into cardiomyocytes. The human cardiac epigenetic landscape may be better suited to defining the genetic mechanisms of human CHD, which is typically dominant, whereas mouse CHD is typically a recessive phenotype.
Available annotations include histone methylation and acetylation states, as well as chromatin accessibility (ATAC-seq), chromosome conformation (Hi-C), and RNA expression. A neural network will be trained on CHD eQTL variants to identify a subset of annotations that separates eQTL from non-eQTL loci. Prioritized functional annotations will be used to calculate a per-base regulatory score across the genome (EpiCard), and score thresholds will be tested for variant burden in the CHD cohort. Finally, Aim 3 addresses the role of common genetic variants in CHD risk and phenotypic variance. Leveraging the power of the trio structure, common variants over-transmitted to CHD probands will be identified. Over-transmitted loci will then be assessed for association with CHD in a case-control study in a second CHD cohort. Functional modeling of prioritized genes, variants and loci is essential; committed collaborators are already engaged in preliminary studies. Together, this proposal will employ innovative computational approaches to prioritize variants and loci associated with CHD. These results will contribute towards the long-term objective of understanding th...