# Statistical methods for integrative analysis of multiple microbiome datasets

> **NIH R21** · JOHNS HOPKINS UNIVERSITY · 2022 · $203,111

## Abstract

Recent research has highlighted the importance of human-associated microbiota in many diseases and health conditions. However, in many areas, results are often inconsistent across studies due to limited sample sizes, heterogeneous study populations (e.g., different race, gender, age), and technical variability (e.g., experimental/analysis pipelines). For example, in HIV studies there is increasing evidence suggesting that gut dysbiosis contributes to HIV-associated inflammation. However, there is still a lack of consensus on its characteristics, such as whether HIV infection increases or decreases the microbial biodiversity in the gut and which taxa differ between HIV+ and HIV-. Integrative analysis, which aggregates information from multiple studies to increase the sample sizes and boost power, is necessary to move the field forward toward consistent and reproducible discoveries with the potential of suggesting prophylactic and therapeutic interventions. This, however, poses serious statistical challenges due to the differential biases and measurement error between studies.

The objective of this proposal is to develop and validate statistical methods for integrative analysis of multiple microbiome datasets that are potentially generated using different laboratory and pre-processing procedures. We will use study-specific characteristics, such as study populations and laboratory and pre-processing pipelines, and develop novel statistical models for characterizing changes in microbial alpha (within-sample) diversity, beta (between-sample) diversity, and abundances (Aim 1). We will analyze the data from the Microbiome Quality Control project, a large community effort that sequenced the same set of samples through multiple pipelines and was designed to identify technical variables that impact microbiome sequencing data, and use this as a basis to determine how best to use the information in the proposed methods (Aim 2).

We will apply the proposed methods to the HIV microbiome re-analysis project, in which we have compiled all available 16S rRNA gene sequencing data for the gut microbiome in HIV for a comprehensive evaluation. We will also apply our proposed methods to the microbiome data collected from multiple cohorts of the Environmental influences on Child Health Outcomes (ECHO) program to investigate the role of the microbiome in the health of children and adolescents. We expect that the proposed methods will have broad impact on almost all areas of microbiome research and provide a foundation for analyzing 16S rRNA sequencing data.
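
The abstract refers to alpha (within-sample) and beta (between-sample) diversity without defining a specific metric. As an illustration only, the sketch below computes two commonly used examples, the Shannon index and the Bray-Curtis dissimilarity on relative abundances. The choice of these two metrics, the function names, and the toy counts are assumptions made here for illustration; they are not taken from the proposal and do not represent its proposed methods.

```python
import math


def shannon_index(counts):
    """Shannon index (natural log) of one sample's taxon counts: an alpha diversity measure."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples: a beta diversity measure.

    Counts are first converted to relative abundances, so the result is
    half the sum of absolute differences and lies between 0 and 1.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same taxa in the same order")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("sample has no reads")
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(counts_a, counts_b)
    )


# Hypothetical toy counts for three taxa in two samples (illustrative only).
sample_1 = [10, 10, 0]
sample_2 = [0, 0, 20]
alpha_1 = shannon_index(sample_1)
beta_12 = bray_curtis(sample_1, sample_2)
```

Both quantities depend on sequencing depth and on how taxa are defined by each pipeline, which is one reason values from different studies are not directly comparable without the kind of adjustment the proposal aims to develop.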

## Key facts

- **NIH application ID:** 10380772
- **Project number:** 5R21AI154236-02
- **Recipient organization:** JOHNS HOPKINS UNIVERSITY
- **Principal Investigator:** Ni Zhao
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $203,111
- **Award type:** 5
- **Project period:** 2021-04-01 → 2024-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10380772

## Citation

> US National Institutes of Health, RePORTER application 10380772, Statistical methods for integrative analysis of multiple microbiome datasets (5R21AI154236-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10380772. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
