Uncovering Nodal signaling and transcription factor interactions in somitic mesoderm development using single-cell deep learning methods

NIH RePORTER · NIH · F30 · $51,149

Abstract

Major gaps remain in our knowledge of how transcription factors (TFs) interact to bind target cis-regulatory elements (CREs) and dictate gene expression during development. There are ~1600 TFs in vertebrates, so a traditional genetic screen of pairwise TF knockouts would require >2.5 million experiments. Even with high-throughput methods, this is not experimentally feasible. I will build novel computational tools and deep neural networks using multiplexed high-throughput single-cell Assay for Transposase-Accessible Chromatin sequencing (scATAC-seq) data from zebrafish throughout development. These deep neural networks will be used for in silico experiments that model CRE interactions to learn the cell-type-specific regulatory syntax of T-box proteins during development. The TF-TF interactions predicted by these in silico experiments will then be tested with targeted CRISPR-Cas9 mutagenesis followed by phenotype profiling with in situ hybridization and high-throughput, low-cost scATAC-seq and scRNA-seq.
In Aim 1, I will make a genome-wide cis-regulatory map of cell-type-specific gene regulation in zebrafish to uncover the role of Nodal signaling in somitic mesoderm development. In zebrafish, mutations in Nodal, a ligand for TGF-beta receptor proteins, cause a phenotype in which trunk somitic mesoderm aberrantly remains undifferentiated while tail somitic mesoderm differentiates correctly. The mechanisms driving the differences between these somites are unknown. To resolve this, I will generate single-cell time series from wild-type and Nodal-deficient embryos across the continuum of zebrafish development using multiplexed high-throughput scATAC-seq and scRNA-seq. Computationally linking these data will yield a comprehensive reference of zebrafish CRE and transcriptional development and a valuable resource for all zebrafish biologists. By improving the software package Cicero to include flexible Poisson lognormal network models, we can achieve the resolution necessary to find novel cell-type-specific differences in enhancer-promoter links during development and perturbation.
In Aim 2, I will train and validate a deep neural network model to predict pairs of transcription factors that interact to activate cell-type-specific gene programs. I will use these data and computational tools to perform in silico experiments to learn the cell-type-specific regulatory syntax of T-box TFs during development. After performing in silico experiments with this neural network, I will rank candidate TF-TF interactions and test them using high-throughput methods for targeted CRISPR-Cas9 mutagenesis to knock out TFs. I will apply this method to uncover the cis-regulatory syntax that gives T-box family transcription factors their DNA locus specificity.
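
The scale argument in the abstract (roughly 1600 TFs implying more than 2.5 million pairwise knockout experiments) can be checked with a quick calculation. The sketch below is illustrative only: the abstract does not state which counting convention it uses, so the function name `pairwise_knockout_counts` and both conventions shown are assumptions. Under these assumptions, the quoted figure matches the full 1600 × 1600 grid of TF combinations, while counting each unordered pair of distinct TFs once gives about half that number.

```python
from math import comb


def pairwise_knockout_counts(n_tfs: int) -> dict:
    """Count pairwise TF knockout experiments under two conventions.

    Hypothetical helper for illustration; not part of the funded project.
    """
    return {
        # Each unordered pair of distinct TFs counted once: n choose 2.
        "unordered_pairs": comb(n_tfs, 2),
        # Full n x n grid of TF combinations, including both orderings.
        "full_grid": n_tfs * n_tfs,
    }


counts = pairwise_knockout_counts(1600)
print(counts["unordered_pairs"])  # 1279200
print(counts["full_grid"])        # 2560000
```

Under either convention the number of experiments exceeds one million, which is consistent with the abstract's point that an exhaustive pairwise screen is not feasible.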

Key facts

NIH application ID
11013782
Project number
5F30HD113217-02
Recipient
UNIVERSITY OF WASHINGTON
Principal Investigator
Andrew Carter Mullen
Activity code
F30
Funding institute
NIH
Fiscal year
2024
Award amount
$51,149
Award type
5
Project period
2023-08-16 → 2026-12-31