# New data science approaches to visualize and understand the impact of the microbiome on risk of graft-versus-host disease

> **NIH R01** · UNIVERSITY OF TX MD ANDERSON CAN CTR · 2024 · $243,000

## Abstract

Allogeneic stem cell transplantation is a life-saving therapy for a variety of blood disorders, but its use is limited by a high rate of serious side effects, including the development of graft-versus-host disease (GVHD). The gut microbiome, or the composition of microorganisms populating the digestive tract, plays a key role in triggering this inflammatory response, and there is an urgent need to analyze patient microbiome profiles to both predict and mitigate risk of GVHD. However, microbiome data pose a number of statistical challenges not addressed by existing methods due to high dimensionality, heterogeneity across subjects, and complex phylogenetic relationships. In this proposal, we develop new data science approaches to make sense of microbiome data, providing insight that can guide the development of future interventions aimed at reducing GVHD incidence. We will develop accurate and efficient methods for microbiome data analysis and make them available in user-friendly formats. We focus on the development of novel methods for visualization and prediction using microbiome data, as detailed in the following specific aims:

Specific Aim 1: To develop and evaluate advanced tools for visualization of microbiome data. The high dimensionality and unique structure of microbiome data present challenges to effective data visualization. In this aim, we will develop approaches for both unsupervised and supervised visualization of microbiome data, along with an RShiny app and QIIME2 plug-in that will make these tools accessible to both clinicians and bioinformaticians. The methods and software resulting from this aim will provide robust approaches to enable researchers to better visualize global microbiome heterogeneity across their study population, enhancing data exploration and identification of potential confounding factors or outliers.

Specific Aim 2: To develop predictive modeling approaches for binary and survival outcomes. In this aim, we will focus on selection of predictive microbiome features in the context of regression. We will carry out key advances enabling the effective application of sparse modeling to predict GVHD risk: novel statistical approaches to handle binary and time-to-event outcomes, including those with competing risks, and computationally efficient implementations, to be made freely available as both an R package and RShiny application.

Specific Aim 3: To develop methods for understanding the impact of rare features. Current microbiome profiling methods allow for very fine resolution of the strains present in each sample. In this aim, we propose two methods to understand the impact of rare features. We will first develop a method to provide insight into kernel association results, by obtaining estimated effect sizes for individual microbiome features. We will then develop an approach for nonparametric clustering of the regression coefficients, which allows flexible aggregation of the observed rare features.

Successful ...

## Key facts

- **NIH application ID:** 10778625
- **Project number:** 5R01HL158796-03
- **Recipient organization:** UNIVERSITY OF TX MD ANDERSON CAN CTR
- **Principal Investigator:** Christine B Peterson
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $243,000
- **Award type:** 5
- **Project period:** 2022-04-01 → 2026-02-28

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10778625

## Citation

> US National Institutes of Health, RePORTER application 10778625, New data science approaches to visualize and understand the impact of the microbiome on risk of graft-versus-host disease (5R01HL158796-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10778625. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
