Project Summary / Abstract
A central problem in biology is to understand how genomic variation affects genome function to influence phenotypes. Key challenges and opportunities lie in linking genomic variants to phenotypes, human health, and disease. Because it is not feasible to experimentally probe all genomic variants of interest in all contexts, improved computational methods are needed to accurately predict the impact of unknown genomic variants. The aim of this research proposal is to gain a mechanistic understanding of functional genomic interactions and ultimately to develop computational approaches to model and predict relationships among variation, functional elements, genome function, and phenotype.
Two recently acquired key assets will be used to infer distal functional interactions among DNA elements: i) 3D genomics data and ii) multiple genome alignments. High-resolution contact mapping experiments (Hi-C and similar methods) have shown that the structural ensembles of chromosomes are fluid and yet specific to cell type and phase of life 1. These ensembles of partially organized structures bring sections of DNA separated by large genomic distances into close spatial proximity and play an important role in controlling gene transcription 2,3. By measuring the frequency of physical contacts among DNA elements, DNA-DNA proximity ligation assays offer insight into the existence of functional interactions among those elements, even when the nature of the interaction is unknown. In the last few years, there has been an explosion of activity directed toward assembling the genomes of many species 35–37. Hundreds of newly assembled end-to-end genomes constitute a dataset of transformative importance for studying the general operating principles of genomes across the tree of life using evolutionary information.
This proposal aims to combine data from proximity ligation assays with coevolutionary information extracted from multiple genome alignments to infer the network of functional interactions among DNA elements. The computational approach will be based on Direct Coupling Analysis (DCA) 29–32 and other machine learning methods. The PI has previously employed DCA to study genome architecture 33 as well as in other contexts 34, and has already made important contributions to the field of 3D genomics.