Machine Learning for Integrative Modeling of the Immune System in Clinical Settings

NIH RePORTER · NIH · R35 · $19,217

Abstract

In response to an immunological challenge, immune cells act in concert, forming complex and dense networks. A deep understanding of these immune responses is often the first step in developing immune therapies and diagnostic tests. Multivariate modeling algorithms can simultaneously consider all measured aspects of the immune system but require prohibitively large cohort sizes as technological advancements increase the number of measurements (the “curse of dimensionality”). To address this, we propose a series of studies to develop machine learning algorithms for comprehensive profiling of the immune system in clinical settings. In particular, for analysis of the immune system at the single-cell level, we will leverage the stochastic nature of clustering algorithms to produce a robust pipeline for prediction of clinical outcomes. Next, we introduce the immunological Elastic-Net (iEN) algorithm, which addresses both the curse of dimensionality and reproducibility by integrating prior immunological knowledge into the models. The cellular systems that govern immunity act through symbiotic interactions with multiple interconnected biological systems. The simultaneous interrogation of these systems with suitable technologies can reveal otherwise unrecognized crosstalk. In collaboration with several leading laboratories, we have produced multiomics datasets (including analysis of the genome, proteome, microbiome, and metabolome) in synchronized groups of patients. Using these coordinated datasets, we will evaluate several algorithms for combining multiple biological modalities while accounting for the intrinsic characteristics of each assay, to reveal biological crosstalk across various systems and increase combined predictive power.
Importantly, numerous population-level factors (including medical history, environmental, and socioeconomic factors) significantly impact the immune system, and studies focused on homogeneous patient populations often lack generalizability to other populations. To address this, we will develop machine learning strategies to integrate population-level factors directly into our immunological data. These models will objectively define subpopulations of patients and enable flexibility in the coefficients of the models (and hence, the importance of the various biological measurements) in each group. This research program will be executed using data from several biorepositories focused on various diseases. This approach will ensure generalizability of our work to previously unseen datasets and increase the long-term impact of our findings. Throughout the proposal, a major area of focus is the development of visualization and model-reduction strategies that lay the foundation for interpretation of complex models. The machine learning algorithms developed will be readily applicable to a broad range of multiomics and multicohort studies and will b...

Key facts

NIH application ID: 10727034
Project number: 3R35GM138353-04S1
Recipient: STANFORD UNIVERSITY
Principal Investigator: Nima Aghaeepour
Activity code: R35
Funding institute: NIH
Fiscal year: 2023
Award amount: $19,217
Award type: 3
Project period: 2020-09-05 → 2025-06-30