Novel Statistical Methods for Analyzing Complex Microbiome Data

NIH RePORTER · NIH · R01 · $309,367

Abstract

It is imperative to elucidate the roles that different microbes play in human health and disease. However, microbiome data from 16S rRNA gene or shotgun metagenomic sequencing studies have unique and complex features, including high dimensionality, sparsity, overdispersion, compositionality, and experimental bias. Existing statistical methods for hypothesis testing often fail to account fully for these features and thus tend to yield false-positive results. The goal of this application is to develop robust and flexible statistical methods that perform well in the presence of all of these data complexities, allow testing of various hypotheses (e.g., differential abundance, dynamic changes, mediation effects), and accommodate a wide range of datasets (e.g., continuous or discrete traits of interest, longitudinal data). To these ends, we propose the following specific aims: (Aim 1) to develop a new framework for compositional analysis of differential abundance; (Aim 2) to develop methods for controlling the Monte Carlo error rate in resampling-based multiple-hypothesis testing; (Aim 3) to develop methods for analyzing longitudinal data; (Aim 4) to develop a new framework for mediation analysis of the microbiome; and (Aim 5) to develop and support a user-friendly software program implementing the methods developed in Aims 1-4. We will evaluate these methods using extensive simulation studies and multiple datasets from real microbiome studies at Emory University in which we are actively involved.

Key facts

NIH application ID
10595015
Project number
5R01GM141074-03
Recipient
EMORY UNIVERSITY
Principal Investigator
Yijuan Hu
Activity code
R01
Funding institute
NIH
Fiscal year
2023
Award amount
$309,367
Award type
5
Project period
2021-06-01 → 2025-03-31