# Statistical Methods for Microbiome and Metagenomics

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2022 · $460,654

## Abstract

The broad, long-term objective of this project is the development of novel statistical methods and computational tools for statistical and probabilistic modeling of human microbiome and shotgun metagenomic data, motivated by important biological questions and experiments. As we move to the next phase of microbiome research, it has become increasingly evident that the lack of methods suitable for analyzing such large-scale microbiome data has emerged as a bottleneck to effectively understanding the functions and dynamics of microbiota. There is a pressing need to develop statistical and computational methods for large-scale shotgun metagenomic data analysis in order to accelerate innovations in microbiome data science. This project aims to narrow this gap by developing new statistical models, novel inference procedures, and fast computational algorithms. The specific aims of the current project focus on two important aspects of microbiome data analysis: (1) developing new statistical methods and fast computational algorithms for phylogenomics-based analysis of metagenomic sequencing data in large-scale human microbiome studies; (2) developing statistical methods and inference procedures for quantifying and comparing the potential energy landscape and stability of microbial communities. Under each of these two broad aims, several related statistical methods will be developed to address the key questions of how to perform phylogenomics-based microbiome analysis and how to quantify and link microbial community stability to disease risk and progression. These problems are all motivated by the PI's close collaborations with Penn investigators on metagenomic studies of Crohn disease, childhood obesity, and disease progression among patients with chronic kidney disease (CKD). Specifically, this project will develop methods for phylogenomics-based association analysis using a set of universal marker genes, phylogenetic-Ising models and change-point phylogenetic-Ising models for assessing the microbial community energy landscape and stability, and time-invariant Ising models for understanding consensus taxon-taxon interactions based on longitudinal microbiome studies. The new methods can be applied to both 16S rRNA and shotgun metagenomic sequencing data and will ideally facilitate the identification of microbial composition, community stability, and microbial networks underlying various complex human diseases and biological processes. The project will also investigate the robustness, power, and efficiency of these methods and compare them with existing methods. Finally, this project will develop practical and feasible computer programs for the implementation of the proposed methods, and for the evaluation of the performance of these methods through extensive simulations and analysis of various ongoing microbiome studies through the PI's collaborations with Penn physicians and biologists. All programs developed under this grant and detail...

## Key facts

- **NIH application ID:** 10520964
- **Project number:** 2R01GM123056-05
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Hongzhe Lee
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $460,654
- **Award type:** 2
- **Project period:** 2017-09-15 → 2026-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10520964

## Citation

> US National Institutes of Health, RePORTER application 10520964, Statistical Methods for Microbiome and Metagenomics (2R01GM123056-05). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10520964. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
