Statistical Methods for Microbiome and Metagenomics

NIH RePORTER · NIH · R01 · $448,996

Abstract

The broad, long-term objective of this project is the development of novel statistical methods and computational tools for statistical and probabilistic modeling of human microbiome and shotgun metagenomic data, motivated by important biological questions and experiments. As we move to the next phase of microbiome research, it has become increasingly evident that the lack of methods suitable for analyzing such large-scale microbiome data has emerged as a bottleneck to effectively understanding the functions and dynamics of microbiota. There is a pressing need to develop statistical and computational methods for large-scale shotgun metagenomics data analysis in order to accelerate innovations in microbiome data science. This project aims to narrow this gap by developing new statistical models, novel inference procedures, and fast computational algorithms. The specific aims of the current project focus on two important aspects of microbiome data analysis: (1) developing new statistical methods and fast computational algorithms for phylogenomics-based analysis of metagenomic sequencing data in large-scale human microbiome studies; (2) developing statistical methods and inference procedures for quantifying and comparing the potential energy landscape and stability of microbial communities. Under each of these two broad aims, several related statistical methods will be developed to address the key questions of how to perform phylogenomics-based microbiome analysis and how to quantify and link microbial community stability to disease risk and progression. These problems are all motivated by the PI's close collaborations with Penn investigators on metagenomic studies of Crohn disease, childhood obesity, and disease progression among patients with chronic kidney disease (CKD).
Specifically, this project will develop methods for phylogenomics-based association analysis using a set of universal marker genes, phylogenetic-Ising models and change-point phylogenetic-Ising models for assessing the microbial community energy landscape and stability, and time-invariant Ising models for understanding consensus taxon-taxon interactions based on longitudinal microbiome studies. The new methods can be applied to both 16S rRNA and shotgun metagenomic sequencing data and will ideally facilitate the identification of microbial composition, community stability, and microbial networks underlying various complex human diseases and biological processes. The project will also investigate the robustness, power, and efficiency of these methods and compare them with existing methods. Finally, this project will develop practical and feasible computer programs for implementing the proposed methods and for evaluating their performance through extensive simulations and analysis of various ongoing microbiome studies through the PI's collaborations with Penn physicians and biologists. All programs developed under this grant and detail...

Key facts

NIH application ID
10911170
Project number
5R01GM123056-07
Recipient
UNIVERSITY OF PENNSYLVANIA
Principal Investigator
Hongzhe Lee
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$448,996
Award type
5
Project period
2017-09-15 → 2026-08-31