Scalable Bayesian Network analysis of multimodal FACS and SUMOylation data, with generalization to other big mixed biological datasets

NIH RePORTER · NIH · R01 · $252,110

Abstract

Bayesian, or Belief, Network (BN) modeling is a powerful tool that is emerging as one of the principal methods for analysis, exploration and visualization of multimodal (aka mixed, or heterogeneous) “big” biological data. We have previously developed comprehensive BN algorithms and a software package aimed at heterogeneous big biological data analysis. In recent years we have applied them to a range of biological research domains and datasets (including chromatin interaction, tRNA evolution, genetic epidemiology and metabolomics, cancer epidemiology and single cell thymopoiesis data); work on three more projects (inferring immune signaling networks using FACS data, genome-wide SUMOylation, Alzheimer's genomic analysis) is currently in progress. In the course of this work we have identified crucial “bottlenecks” that need to be addressed, on the methodological level, to make BN analysis universally usable in our general context (that is, big biological data containing large numbers of variables of different types). These issues (scalability of the BN reconstruction process, handling of mixed data types, and interpretation, evaluation and comparison of the resulting network models) have not yet been adequately addressed in the field, which limits the usability of the otherwise very powerful and elegant BN approach. Consequently, the primary goal of this project is to develop novel BN analysis algorithms with emphasis on (a) scalability, (b) handling of mixed data types, and (c) interpretation and evaluation of the resulting networks.
We are particularly interested in BN analysis of the quantitative flow cytometry (FACS) data generated as part of the ongoing City of Hope cancer immunogenetics research projects, as this type of data exemplifies BN modeling challenges, and any advances in algorithm and software development would be generalizable to most instances of big biological data. We will subsequently apply BN analysis to the SUMOylation and chromatin interaction genomic data (also generated as part of ongoing collaborative City of Hope research projects), to further test generalizability and to produce additional biological results.

Key facts

NIH application ID: 10048110
Project number: 1R01LM013138-01A1
Recipient: BECKMAN RESEARCH INSTITUTE/CITY OF HOPE
Principal Investigator: Andrei Rodin
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $252,110
Award type: 1
Project period: 2020-07-01 → 2023-03-31