Bayesian multivariate 3D spatial modeling for microbiome image analysis

NIH RePORTER · NIH · R01 · $551,098

Abstract

Bacteria play critical roles in human health, both beneficial and harmful. Within biofilm communities, one species may attack, protect, or provide nutrients for neighboring species. These interactions determine the community's net effects. Understanding how biofilm affects health therefore requires clarifying how these communities are organized. To begin to meet this need, we developed an imaging technique, Combinatorial Labeling and Spectral Imaging Fluorescence in Situ Hybridization (CLASI-FISH), which shows where each taxon's cells lie relative to one another and to host cells. Yet the complex, three-dimensional (3D) architecture of biofilm is poorly captured by commonly used measures, such as intercellular distances or global biofilm volume for one or two taxa. Here, we propose to extend log-Gaussian Cox process (LGCP) models to describe and test hypotheses about human biofilm architecture, a novel application. Computational burden limits existing LGCP models for geostatistical data to datasets with thousands of observations, so they cannot be applied to biofilm image data, which typically contain millions of observations. In preliminary work on two-dimensional (2D) biofilm images, we have successfully scaled up multivariate LGCPs for six taxa. Estimated pairwise cross-correlation functions differ between univariate analyses, which ignore other taxa's locations, and multivariate analyses, which leverage the taxa's joint spatial distribution. We propose statistical innovations to address challenges raised by, but not unique to, 3D biofilm images. Comparing biofilm across sample groups defined experimentally or by exposure history requires integrating data across subjects' images that lack true spatial correspondence. Further, 3D spatial analyses have not been applied to multivariate data with millions of observations.
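
For readers unfamiliar with the model class named above, a multivariate LGCP in its standard textbook form can be sketched as follows. The notation (K taxa, point patterns N_k, latent fields Z_k, baseline mu_k, cross-covariance C_kl) is an assumption made here for illustration and is not taken from the proposal itself.

```latex
\begin{aligned}
N_k \mid \Lambda_k &\sim \text{Poisson process on } D \subset \mathbb{R}^3 \text{ with intensity } \Lambda_k, \\
\log \Lambda_k(s) &= \mu_k + Z_k(s), \qquad k = 1, \dots, K, \\
(Z_1, \dots, Z_K) &\sim \text{zero-mean Gaussian process with } \operatorname{Cov}\{Z_k(s), Z_l(s + h)\} = C_{kl}(h), \\
g_{kl}(h) &= \exp\{C_{kl}(h)\}.
\end{aligned}
```

Under this stationary form, the cross pair correlation function g_kl(h) is one common summary of how taxa k and l are arranged at spatial lag h: values above 1 suggest co-location and values below 1 suggest segregation. Because the fields Z_1, ..., Z_K are modeled jointly, estimates of C_kl can differ from those obtained by fitting each taxon or pair on its own.
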
The goal of this proposal is therefore to build a Bayesian multivariate 3D LGCP that incorporates different images, thereby allowing for non-spatial covariate factors, by applying a separate coordinate system to each image. This proposal has three parts: (a) the development of novel multivariate 3D spatial analysis methods (aims 1-3), (b) evaluation of a hypothesis regarding the spatial structure of the human tongue microbiome (aim 4), and (c) software development and dissemination, based on best practices (aim 5). The interdisciplinary team has deep expertise and experience in developing Bayesian high-dimensional multivariate analysis methods. The core innovation proposed is to integrate non-spatial covariates with multivariate spatial data across 3D images lacking a common coordinate system. Sample accessibility and prior biological knowledge make the oral cavity the best starting point to develop a flexible modeling framework that will allow testing of hypotheses regarding microbial interactions and associations with host characteristics. This is a fundamental shift in how such images will be analyzed, potentially providing new insight into the role of ora...
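
The idea of giving each image its own coordinate system while sharing non-spatial covariate effects can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the project's method: it only simulates data (no Bayesian inference), discretizes space into unit voxels, uses an exponential covariance kernel, and correlates taxa through a mixing matrix. The function `simulate_image` and the parameters `beta`, `gamma`, `A`, and `length_scale` are hypothetical names introduced here.

```python
import numpy as np


def simulate_image(shape, beta, gamma, x, A, length_scale, rng):
    """Simulate per-voxel counts for K taxa in one image (illustrative only).

    shape: voxel grid (nx, ny, nz) in the image's own local coordinates.
    beta: (K,) baseline log-intensities shared across images.
    gamma: (K,) effects of the image-level, non-spatial covariate x.
    A: (K, K) mixing matrix; A @ A.T is the cross-taxon covariance at lag 0.
    """
    # Voxel centres in local coordinates; no alignment to other images is used.
    axes = [np.arange(n) + 0.5 for n in shape]
    coords = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

    # Exponential covariance between voxel centres, with jitter for stability.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = np.exp(-dists / length_scale) + 1e-8 * np.eye(len(coords))
    chol = np.linalg.cholesky(cov)

    # K independent latent fields, then mixed so that taxa are correlated.
    K = len(beta)
    W = chol @ rng.standard_normal((len(coords), K))  # (n_voxels, K)
    Z = W @ A.T                                       # (n_voxels, K)

    # Log-intensity: shared baseline + covariate effect + image-specific field.
    log_lam = beta + gamma * x + Z
    counts = rng.poisson(np.exp(log_lam))             # unit voxel volume assumed
    return counts.reshape(*shape, K)


rng = np.random.default_rng(0)
beta = np.array([0.5, 0.0])
gamma = np.array([0.3, -0.3])
A = np.array([[1.0, 0.0], [0.6, 0.8]])

# Two images with different grid sizes and different covariate values.
specs = [((6, 6, 3), 0.0), ((5, 7, 4), 1.0)]
images = [simulate_image(s, beta, gamma, x, A, 2.0, rng) for s, x in specs]
```

Each image draws its own latent field on its own grid, so nothing requires voxel (i, j, k) in one image to correspond to the same voxel in another. Only `beta`, `gamma`, `A`, and `length_scale` are shared, which is what would let group-level comparisons be made across images in a model of this general shape.
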

Key facts

NIH application ID: 10401247
Project number: 5R01GM126257-02
Recipient: BRIGHAM AND WOMEN'S HOSPITAL
Principal Investigator: KYU HA LEE
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $551,098
Award type: 5
Project period: 2021-05-04 → 2025-02-28