Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data

NIH RePORTER · NIH · R01 · $304,449

Abstract

Project Description DMS/NIGMS 2: Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data

A Significance

A.1 Importance of the Problem to Be Addressed

Single-cell RNA-sequencing (scRNA-seq) technologies have facilitated biological discoveries that were impossible with bulk RNA-seq, such as identifying new gene regulatory activities and cell types at the single-cell level. However, to translate the fundamental biological knowledge advanced by scRNA-seq into improved disease diagnosis, treatment, and prevention, new methods are required to comparatively study the molecular differences between normal and pathological cells/tissues, and between control and case/treatment groups. Although identification of differentially expressed genes across two sample groups has been extensively studied, the vast majority of existing methods for identifying gene regulatory networks (GRNs) and cell types have so far focused on scRNA-seq data generated under a single experimental condition. In principle, these methods can be applied to one experimental condition at a time, after which post hoc comparisons can be made to find the differences caused by experimental interventions. However, compared to joint modeling approaches, this two-step procedure is deemed less efficient and more susceptible to false discoveries because uncertainty is not properly propagated from the first step to the second. Moreover, most scRNA-seq network models are correlative in nature and do not infer causal gene regulatory relationships. There is, therefore, a critical need to develop new models for identifying the effects of experimental interventions on causal gene regulation and cell composition by jointly modeling scRNA-seq data across experimental groups.
In the absence of such tools, mechanistically understanding gene regulation and cell differentiation, and fully realizing the translational value of scRNA-seq studies, will likely remain difficult.

A.2 Rigor of Prior Research

Aim 1. Many existing scRNA-seq network approaches adapt standard association measures to zero-inflated scRNA-seq data, e.g., Pearson correlation [1] and mutual information [2]. A common limitation of these methods is that they quantify only marginal dependencies, which are susceptible to spurious indirect associations [3]. Graphical models, which deal with conditional associations, are powerful alternatives to marginal association measures. Numerous general-purpose methods have been proposed [4, 5], including developments for non-Gaussian data [6–9]. Specifically for scRNA-seq data, two undirected graphical models, including Co-I Cai's work [10, 11], were recently proposed based on neighborhood selection; these, however, do not infer causal gene regulation. To identify causal relationships, several alternative methods [12, 13] were developed. However, these methods either ignore the count nature of scRNA-seq data, require a known pseudotime (whic...
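
The contrast drawn under Aim 1 between marginal and conditional association can be illustrated with a small simulation. This is a minimal sketch and not the method proposed in the project. It assumes a hypothetical three-gene chain (gene_a drives gene_b, which drives gene_c) with arbitrary edge weights, and it uses Gaussian noise for simplicity, so it ignores the count and zero-inflated nature of real scRNA-seq data. The helper name partial_correlations is also introduced here for illustration only.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlations computed from the inverse sample covariance.

    Entry (i, j) measures the association between variables i and j
    after conditioning on all remaining variables.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


rng = np.random.default_rng(0)
n_cells = 5000

# Hypothetical chain: gene_a -> gene_b -> gene_c (weights are assumptions).
gene_a = rng.normal(size=n_cells)
gene_b = 0.8 * gene_a + rng.normal(size=n_cells)
gene_c = 0.8 * gene_b + rng.normal(size=n_cells)
data = np.column_stack([gene_a, gene_b, gene_c])

# Marginal (Pearson) correlation links gene_a and gene_c even though
# there is no direct edge between them in the simulated chain.
marginal = np.corrcoef(data, rowvar=False)

# Conditioning on gene_b removes that indirect association.
partial = partial_correlations(data)

print("marginal corr(gene_a, gene_c):", round(float(marginal[0, 2]), 3))
print("partial  corr(gene_a, gene_c | gene_b):", round(float(partial[0, 2]), 3))
```

In this simulated chain the marginal correlation between gene_a and gene_c is clearly nonzero, while the partial correlation given gene_b is close to zero. This is the kind of spurious indirect association that conditional (graphical model) approaches are meant to exclude. Neither quantity establishes the direction of regulation, which is the gap the causal methods discussed above aim to fill.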

Key facts

NIH application ID: 10592720
Project number: 1R01GM148974-01
Recipient: TEXAS A&M UNIVERSITY
Principal Investigator: Yang Ni
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $304,449
Award type: 1
Project period: 2022-09-21 → 2026-08-31