Learn Systems Biology Equations From Snapshot Single Cell Genomic Data

NIH RePORTER · NIH · R01 · $318,000

Abstract

Understanding how cells respond to environmental changes is a fundamental task in systems biology and has profound biomedical implications. Mathematical modeling of small network motifs using dynamical systems theory has been successful in providing mechanistic insight and guidance, but generalization to a genome-wide, intertwined gene regulatory network is challenging. Single cell genomics approaches have emerged as powerful tools for studying cellular processes, but the destructive nature of most single cell techniques makes it infeasible to extract dynamical information about cellular processes. In addition, a number of grand challenges impede further development of the field, such as trajectory inference, the effect of various sources of error on data analysis, and validating and benchmarking tools for single cell measurements and analyses. The goal of this proposed research is to tackle these challenges by integrating dynamical systems modeling into single cell genomics analyses. The proposed research builds on recent advances in the single cell genomics field showing that one can extract both the transcriptome (x) and an estimate of RNA velocity (i.e., the instantaneous time derivative of the transcriptome, dx/dt) from single cell genomics data. We further developed a unified theoretical framework that allows estimating velocity information from various types of single cell data, and a machine learning based computational pipeline for reconstructing systems biology equations for genome-wide regulatory networks, together with a software package, dynamo, released to the community. This integration of single cell genomics analyses and systems biology modeling provides quantitative mechanistic and dynamical information. We propose to further develop our package and computational framework to address several limitations in our published work.
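To make the idea of learning equations from snapshot pairs (x, dx/dt) concrete, the following is a minimal sketch, not the dynamo pipeline. It assumes a purely linear vector field dx/dt = A x + b on synthetic two-gene data and fits it by ordinary least squares. The function name `fit_linear_vector_field`, the matrix `A_true`, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_vector_field(X, V):
    """Least-squares fit of V ≈ X @ A.T + b (illustrative stand-in, not dynamo's method).

    X: (n_cells, n_genes) expression states.
    V: (n_cells, n_genes) velocity estimates for the same cells.
    Returns A (n_genes, n_genes), which plays the role of a Jacobian, and b (n_genes,).
    """
    n_cells = X.shape[0]
    # Append a column of ones so the intercept b is fitted together with A.
    design = np.hstack([X, np.ones((n_cells, 1))])
    coef, *_ = np.linalg.lstsq(design, V, rcond=None)
    A = coef[:-1].T
    b = coef[-1]
    return A, b

# Synthetic "snapshot" data: each cell contributes one (x, dx/dt) pair.
rng = np.random.default_rng(0)
A_true = np.array([[-1.0, 0.5],
                   [-0.8, -0.3]])   # hypothetical regulatory couplings
b_true = np.array([0.2, 1.0])       # hypothetical basal production rates
X = rng.uniform(0.0, 5.0, size=(500, 2))
V = X @ A_true.T + b_true + 0.01 * rng.normal(size=(500, 2))

A_hat, b_hat = fit_linear_vector_field(X, V)
```

In this toy setting the entry `A_hat[i, j]` estimates how gene j influences the rate of change of gene i. Real regulatory dynamics are nonlinear, so a linear fit of this kind would at best be a local approximation.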
In Aim 1, we will first develop dynamo to interface with other single cell analysis and dynamics modeling packages, and expand the types of single cell data that can be analyzed. Then we will develop and test a discrete dynamical model for full stochastic cellular dynamics based on the graph representation of discrete vector fields. In Aim 2, we will first develop a systematic pipeline that integrates multimodal data (e.g., ATAC-seq, DNA sequencing, and binding site analyses) with dynamo to identify the genetic codes for the combinatorial function of transcription factors, the so-called composite elements in genetics. Eukaryotic cells use combinations of a finite number of transcription factors to generate a large number of different target gene regulation patterns. Cracking the genetic code at the genome-wide level is fundamental to cell biology but remains challenging despite extensive efforts. Then we will expand the pipeline to reconstruct biology-informed systems biology models of genome-wide gene regulation. We will evaluate the in silico predictions from the model against several Perturb-seq datasets.
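As a rough illustration of what a graph representation of a discrete vector field can look like, the sketch below turns per-cell velocities into a cell-to-cell transition matrix on a nearest-neighbor graph. It uses a common heuristic (cosine similarity between a cell's velocity and the displacement to each neighbor, passed through a softmax) and is not the discrete stochastic model proposed in Aim 1. The function name `velocity_transition_matrix` and the parameters `k` and `scale` are assumptions made for this example.

```python
import numpy as np

def velocity_transition_matrix(X, V, k=5, scale=10.0):
    """Row-stochastic transition matrix on a k-nearest-neighbor graph (illustrative heuristic).

    X: (n_cells, n_dims) cell states; V: (n_cells, n_dims) velocities.
    A neighbor gets more weight when the displacement toward it aligns with the velocity.
    """
    n = X.shape[0]
    diff = X[None, :, :] - X[:, None, :]      # diff[i, j] = x_j - x_i
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)            # exclude self-transitions
    P = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]        # indices of the k nearest cells
        disp = diff[i, nbrs]
        norms = np.linalg.norm(disp, axis=1) * np.linalg.norm(V[i]) + 1e-12
        cos = disp @ V[i] / norms             # alignment of velocity with each edge
        w = np.exp(scale * cos)
        P[i, nbrs] = w / w.sum()              # normalize so each row sums to 1
    return P

# Toy example: ten cells on a line, all with velocity pointing in +x.
X = np.column_stack([np.arange(10.0), np.zeros(10)])
V = np.tile([1.0, 0.0], (10, 1))
P = velocity_transition_matrix(X, V, k=2)
```

For an interior cell in this toy example, almost all transition probability goes to the neighbor ahead of it, so a random walk on `P` drifts along the direction of the velocity field.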

Key facts

NIH application ID
10929427
Project number
5R01GM148525-02
Recipient
UNIVERSITY OF PITTSBURGH AT PITTSBURGH
Principal Investigator
Jianhua Xing
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$318,000
Award type
5
Project period
2023-09-15 → 2027-06-30