Sequence models of genome regulatory architecture in 3D

NIH RePORTER · NIH · DP2 · $273,211

Abstract

Project Summary: Interpreting individual genome sequences and the consequences of sequence variation is critical for studying the genetic basis of disease and for advancing precision medicine. Genomic sequence is the basis of multiple, highly intertwined levels of genome regulation, including chromatin protein binding, 3D genome architecture, and gene transcription. Decoding these regulatory functions directly from sequence will provide a computational platform for scalable prediction of variant effects and for interrogating sequence function at base-pair resolution with “in silico mutagenesis”. Progress has been made in decoding regulatory genomic sequence, including through the development of deep learning sequence models. However, the sequence basis of complex phenomena such as transcriptional regulation will not be fully resolved without accounting for genome structure and long-range 3D sequence context. Genome structures at multiple spatial scales, including promoter-enhancer interactions, transcriptional condensates, topologically associating domains, chromatin compartments, and nuclear bodies, can strongly affect transcriptional regulation. With data and techniques that have only now become sufficient to tackle this challenge, we will study these phenomena and trace complex regulatory outputs back to their sequence dependencies by developing sequence models of 3D genome regulatory architecture. We will develop a computational framework of deep learning sequence models capable of modeling multiscale 3D genome interactions and integrating long-range sequence information, in order to comprehensively interpret the regulatory functions of genome sequence.
The proposed work will open up new possibilities for interpreting and applying the structural and transcriptional impacts of sequence variation, including asking how genetic factors such as large structural variants affect gene expression by remodeling genome structural organization in healthy and disease states.

Key facts

NIH application ID: 11014636
Project number: 4DP2GM146336-02
Recipient: UT SOUTHWESTERN MEDICAL CENTER
Principal Investigator: Jian Zhou
Activity code: DP2
Funding institute: NIH
Fiscal year: 2024
Award amount: $273,211
Award type: 4N
Project period: 2021-09-23 → 2025-09-12