# Dissecting natural variation in transcription factor - DNA interactions

> **NIH R35** · NEW YORK UNIVERSITY · 2022 · $390,795

## Abstract

Gains or losses of transcription factor binding at specific locations in the genome have been linked to a wide
range of human diseases. Despite our knowledge about the determinants of transcription factor-DNA
interaction, it is still challenging to accurately predict changes in transcription factor binding due to genetic and
epigenetic variants in the genome. Several critical gaps remain in our understanding of the integration of
sequence and non-sequence information on endogenous genomic DNA that gives rise to the genome-wide
binding patterns of transcription factors. Our long-term vision is to shed light on how genome and epigenome
variation, which leads to variation in the genome-wide targets of transcription factors, affects the regulatory
networks of the cell, and the gene expression programs that give rise to phenotypic diversity.

Our previous study characterized the genome-wide binding locations of more than 500 transcription factors
in Arabidopsis thaliana on the reference genome. Our integrative computational analysis revealed the features
of endogenous genome context, consisting of sequence motif, DNA shape, and 5-methylcytosine modification
of genomic DNA, that play a role in determining the binding landscape of transcription factors of major
structural families. To further study the variability of these binding sites, driven by native genome and
epigenome variation, we generated genome-wide, base-resolution maps of 5-methylcytosine, an epigenomic
mark on DNA, in a collection of over 1,000 world-wide, natural strains (accessions) of A. thaliana,
complementing the efforts to catalog genome sequence variation in these accessions. Guided by the diversity
in the genome and epigenome, a wealth of phenotypic data, and preliminary results suggesting transcription
factor binding variation in these accessions, our goals for the next five years are to address three major
challenges in understanding natural variation in transcription factor binding: 1) to determine the genome-wide
transcription factor binding variation across multiple accession genomes; 2) to characterize the effect of
transcription factor coding variants on their genome-wide binding specificities and target genes; and 3) to
investigate how natural variation of protein-protein interactions alters target genes and genome-wide binding
specificities for interacting transcription factors. All three projects will use computational modeling to evaluate
the contributions from features in the binding site environment.

Our proposed experiments and computational models will make a broad impact by characterizing
transcription factor binding variation and understanding the role played by sequence and non-sequence
features of endogenous genomic DNA. Our results will shed light on the fundamental principles underlying the
regulatory functions of genome and epigenome variation, empowering the discovery and prediction of
regulatory variants and their molecular mechanism...

## Key facts

- **NIH application ID:** 10399604
- **Project number:** 5R35GM138143-03
- **Recipient organization:** NEW YORK UNIVERSITY
- **Principal Investigator:** Shao-shan Carol Huang
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $390,795
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2020-07-01 → 2025-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10399604

## Citation

> US National Institutes of Health, RePORTER application 10399604, Dissecting natural variation in transcription factor - DNA interactions (5R35GM138143-03). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10399604. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
