PROJECT SUMMARY/ABSTRACT

DNA microarray is a routine clinical pediatric diagnostic test used to identify genomic copy number variants (CNVs) that could cause autism, developmental delay, and multiple congenital anomalies. The test is also used frequently in the prenatal setting to find genomic causes of fetal anomalies detected by ultrasound, or to predict potential postnatal phenotypes. Current clinical guidelines for interpretation of CNVs focus solely on the characteristics of genes contained within the CNV breakpoints. However, recent studies of chromatin architecture, utilizing Hi-C or related techniques, have demonstrated that CNVs can also disrupt the structure of topologically associated domains (TADs). TADs are “neighborhoods” of physical DNA interactions that serve several functions, including the prevention of ectopic gene-enhancer interactions. TAD disruption can lead to pathogenic alterations in the transcription of genes outside the CNV region, and these alterations can ultimately cause disease. The central hypothesis of this proposal is that, by focusing only on genes within the CNV region, clinical interpretation of DNA microarrays ignores critical genomic information, ultimately leading to missed diagnoses for patients. To address this issue, we have recently developed the free-to-use software ClinTAD (www.clintad.com; J Hum Genet (2019)) to assist in the interpretation of CNVs while taking potential TAD disruption into account. To our knowledge, this is the first software of its kind to attempt to integrate TADs into clinical CNV interpretation. While ClinTAD v1.0 is currently available as a decision support tool to assist in clinical practice, it is limited in both its ease of use and its predictive power.
Further enhancing the utility of ClinTAD motivates the two Aims of this proposal: 1) We aim to optimize ClinTAD as both a clinical decision support tool and a research tool by allowing incorporation of TAD boundaries from different datasets, enabling an API for analysis of large case cohorts, adding interpretation tools for regions of homozygosity found on SNP array, and allowing for creation of a de-identified database to which users can upload cases in which pathogenicity is suspected on the basis of TAD disruption. 2) We aim to improve the predictive power of ClinTAD by using machine learning to identify the features most predictive of pathogenicity in large, publicly available CNV cohorts, as well as by incorporating a recently described convolutional neural network-based model that can predict TAD disruption as a function of CNV breakpoints. In this proposal we aim to make ClinTAD the premier tool for the interpretation of CNVs in the context of TAD disruption. Our long-term goal is to build a collaborative network of users that will enable us to identify the patients with the highest probability of having clinical phenotypes caused by TAD disruption. Such a unique patient cohort could then form the...