# Sequence-based Machine Learning for Inference of Dynamic Cell State Gene Network Models

> **NIH R01** · JOHNS HOPKINS UNIVERSITY · 2024 · $461,554

## Abstract

Most disease-associated GWAS variants have relatively modest effects on expression in reporter or
CRISPR perturbation assays. In addition, enhancer disruption in vivo often has surprisingly weak
phenotypic consequences. We hypothesize that a critical missing element is our lack of quantitative
models of how multiple TFs interact at an enhancer, and how multiple enhancers interact at a locus to
respond to perturbations in a nonlinear way through altered gene network activity. Predicting the impact
of genomic variation thus requires quantitative modeling of how one variant's impact depends on other
variants through their combined effect on altered cellular regulatory state. The central aim of this
proposal is to develop computational methods to infer quantitative models of these combinatorial
interactions by training on temporally-resolved measurements of gene activity, enhancer activity, and
core cell fate-regulating transcription factor (TF) activity across cell state transitions in early human
development. Our preliminary studies show that while promoter knockdown has robust effects on target
gene expression, individual enhancer knockdown is often weaker and affects temporal transition
dynamics, but not the final steady state. We show that gene network models based on sequence-based
machine learning are consistent with these observations. We propose improvements to our sequence-based
models to develop kinetic rate equation and stochastic simulation gene network models to predict
the variable and often temporal effects of enhancer perturbation. We will generate high time resolution
ATAC, H3K27ac, and scRNA-seq data to train these models, and validate the models' predictions
of network response with CRISPRi in a native genomic context. We will first focus on our embryonic stem cell
to definitive endoderm (ESC-DE) system, and we will then develop methods to generalize
application of these focused models to larger ENCODE regulatory datasets. Our work will enable a
quantitative understanding of how the altered activity of regulatory elements affects the stability and
dynamics of the gene regulatory networks within which the element operates, and how these elements play a role in
controlling developmentally important and disease-relevant cell state transitions.
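
The contrast the abstract draws between promoter and enhancer knockdown can be illustrated with a toy kinetic rate equation. The sketch below is not the project's model. It assumes a single target gene whose production rate saturates in the combined enhancer input, `dx/dt = promoter * A / (K + A) - decay * x`, where `A(t) = TF(t) * sum(enhancer_weights)` and `TF(t)` rises logistically during the transition. The functional form, every parameter value, and the names `simulate` and `half_time` are assumptions made for illustration only.

```python
import math


def simulate(enhancer_weights, promoter_strength=1.0, k_half=1.0,
             decay=1.0, tf_max=100.0, t_mid=10.0, tau=1.0,
             t_end=30.0, dt=0.01):
    """Euler-integrate dx/dt = promoter * A / (K + A) - decay * x.

    A(t) = TF(t) * sum(enhancer_weights) is the combined enhancer input.
    TF(t) is a hypothetical logistic rise in TF activity (an assumption).
    Returns (times, expression) as lists of floats.
    """
    total_weight = sum(enhancer_weights)
    times, expression = [0.0], [0.0]
    x = 0.0
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        t = step * dt
        tf = tf_max / (1.0 + math.exp(-(t - t_mid) / tau))
        a = tf * total_weight
        # Saturating production: insensitive to A once A >> k_half.
        production = promoter_strength * a / (k_half + a)
        x += dt * (production - decay * x)
        times.append((step + 1) * dt)
        expression.append(x)
    return times, expression


def half_time(times, expression):
    """First time at which expression reaches half of its own final value."""
    target = 0.5 * expression[-1]
    for t, x in zip(times, expression):
        if x >= target:
            return t
    return times[-1]


# Hypothetical conditions: two equal enhancers, one removed, or a weaker promoter.
conditions = {
    "wild type": dict(enhancer_weights=[0.5, 0.5]),
    "enhancer knockdown": dict(enhancer_weights=[0.5]),
    "promoter knockdown": dict(enhancer_weights=[0.5, 0.5],
                               promoter_strength=0.5),
}

results = {}
for name, kwargs in conditions.items():
    times, expr = simulate(**kwargs)
    results[name] = (expr[-1], half_time(times, expr))
    print(f"{name:20s} final={results[name][0]:.3f} "
          f"t_half={results[name][1]:.2f}")
```

Under these assumptions, removing one enhancer delays the time to half-maximal expression while leaving the final level almost unchanged, because production is saturated once TF activity is high. Scaling the promoter instead lowers the final level without shifting the timing. A stochastic counterpart, for example a Gillespie simulation with the same propensities, would be the natural next step, but it is not shown here.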

## Key facts

- **NIH application ID:** 10829913
- **Project number:** 5R01HG012367-03
- **Recipient organization:** JOHNS HOPKINS UNIVERSITY
- **Principal Investigator:** Michael A Beer
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $461,554
- **Award type:** 5
- **Project period:** 2022-07-15 → 2026-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10829913

## Citation

> US National Institutes of Health, RePORTER application 10829913, Sequence-based Machine Learning for Inference of Dynamic Cell State Gene Network Models (5R01HG012367-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10829913. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
