Project Summary/Abstract

The ability of transcription factors to achieve specificity in the nucleus is critical for proper gene regulation. Mis-targeted protein binding can disrupt an organism's ability to maintain homeostasis and result in disease states. While sequence-specific transcription factors ostensibly derive their binding specificity from a preference for specific DNA patterns (motifs), multiple confounding variables such as epigenetic state, co-factor binding partners, and chromatin accessibility make the reality far more complicated. This research program combines a high-throughput (robotic) biochemical genomic approach with machine learning algorithms to identify the rules and mechanisms that govern the binding of proteins to the genome. We have previously developed multiple high-resolution genomic assays (e.g., ChIP-exo, PB-exo, WhIP-exo) that quantify genome-wide binding of proteins to DNA with varying levels of regulatory features present. We demonstrated the utility of these assays for understanding the native binding preferences of general regulatory factors in the yeast model organism. The next stage of this research will apply these assays to human transcription factors in ultra-high throughput, using a liquid-handling robotic system, to identify the mechanisms underlying transcription factor sequence specificity. The first major direction will determine the ability of purified transcription factors to bind naked DNA genome-wide (PB-exo) and examine how the epigenetic status of the DNA changes the detected binding of a protein. By using genomic DNA sourced from different cell states, we will precisely characterize the effect of cell state-specific DNA methylation on protein binding at base-pair resolution. We will also apply AI/ML algorithms that we have developed to cross-validate biological discoveries and generate new testable hypotheses.
An orthogonal and complementary approach will apply the WhIP-exo assay to examine transcription factor binding specificity, again on naked DNA, but this time in the context of various cellular extracts. In addition to uncovering the effects of cell state-specific co-factors on protein binding specificity, we will incorporate ChIP-mass spectrometry to identify the co-factors that complex with transcription factors of interest when they are bound to DNA. These assays will provide downstream testable hypotheses regarding which protein co-factors are responsible for modulating binding specificity. This research program will yield a detailed understanding of the features that regulate binding for hundreds of transcription factors, along with the identities of the protein co-factors that modulate binding.