Interpretable Bayesian Non-linear statistical learning models for multi-omics data integration

NIH RePORTER · NIH · R35 · $374,672

Abstract

Project Summary: Recent technological advances have enabled the production of vast amounts of diverse multi-omics data (e.g., genomics, epigenomics, proteomics, transcriptomics) for complex diseases such as cancer, cardiovascular diseases, and neurodegenerative disorders. Integrating multi-omics data for these heterogeneous diseases can help unravel the underlying biological mechanisms across multiple omics levels, improve the prediction of clinical outcomes, and transform medicine. At the same time, it poses a significant challenge: identifying important biomarkers among a very large number of heterogeneous molecular data points (i.e., hundreds of thousands). We will develop and apply novel and powerful Bayesian statistical learning methods that capture both linear and nonlinear relationships in multi-omics data. The methods will be used to identify i) important predictive pathways and their corresponding important molecules; ii) clinically meaningful molecular disease subtypes; and iii) predictive and prognostic biomarkers that contribute to the joint association (or regulatory networks) between omics data types. The proposed methods will be applied to multiple publicly available datasets, such as The Cancer Genome Atlas, dbGaP, and Genotype-Tissue Expression, and to non-public datasets obtained from our collaborators. We will develop robust, computationally efficient, and user-friendly software for applying our methods and make it available free of charge.

Key facts

NIH application ID: 10931642
Project number: 5R35GM150537-02
Recipient: UNIVERSITY OF MINNESOTA
Principal Investigator: Thierry Chekouo Tekougang
Activity code: R35
Funding institute: NIH
Fiscal year: 2024
Award amount: $374,672
Award type: 5
Project period: 2023-09-20 → 2028-07-31