Carbohydrate enzyme gene clusters in human gut microbiome

NIH RePORTER · NIH · R01 · $302,120

Abstract

Hippocrates said ~2,400 years ago: “Let food be thy medicine and medicine be thy food.” It is now well known that the health effects of food are largely due to the “diet-microbiota-host” interactions that happen in the human gut. In particular, microbial degradation of carbohydrates produces a variety of metabolites that have a profound impact on human health. The PI, a bioinformatics researcher in the Nebraska Food for Health Center, has long-term interests that include: (i) developing specialized computational tools for better functional annotation of food-digesting microbial genomes and metagenomes, and (ii) characterizing enzymes and other genetic elements that connect microbes, diets, and human health. The objective of this R01 project is to develop a suite of bioinformatics tools for functional annotation of carbohydrate-active enzymes (CAZymes) and CAZyme gene clusters (CGCs) in the human gut microbiome. The PI has over 10 years of experience in CAZyme bioinformatics tool development and maintains a well-recognized CAZyme annotation database and web server called dbCAN (http://bcb.unl.edu/dbCAN2). This project aims to further dbCAN development to address fundamental personalized nutrition questions: (i) is a gut microbe able to utilize a specific type of glycan? (ii) can a person carrying certain gut microbes respond to an individualized diet (e.g., prebiotics: dietary compounds that are beneficial to human health)? To address these questions, new CAZyme annotation tools must be able to predict the carbohydrate substrates of CAZymes. Recent research has found that CAZyme-encoding genes are often co-localized with each other and with other genes (e.g., those encoding sugar transporters, regulators, and signaling proteins) in bacterial genomes to form CGCs (also known as polysaccharide utilization loci, or PULs). 
Thus, the premise of the new tool development is that the gene membership (or functional domain composition) of a CGC can be used to predict its carbohydrate substrates (e.g., xylans, pectins, glucans). The innovation is that machine learning approaches will be used to analyze a large number of experimentally characterized PULs curated from the literature, and the extracted sequence features will be used to build effective classifiers to predict and classify CGCs in new genomes and metagenomes. The expected outcome is a set of novel, user-friendly, open-source computer programs, databases, and web servers that allow automated CGC identification and substrate prediction. The significance is that the new tools will facilitate the experimental characterization of more PULs and their carbohydrate substrates in the human gut microbiome (and in other carbohydrate-rich environments). Therefore, this project will contribute computational solutions to personalized nutrition research, for example by analyzing a person's gut microbiome to predict whether that person can respond to diets cont...
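The idea that a cluster's gene membership can indicate its substrate can be illustrated with a minimal sketch. It assumes each CGC is reduced to a set of gene or domain labels and uses a simple k-nearest-neighbor vote over Jaccard similarity as a stand-in classifier; the abstract does not say which machine learning methods the project uses. The training clusters, the names `TOY_PULS` and `predict_substrate`, and the choice of `k` are all hypothetical and are not taken from dbCAN or any curated PUL collection.

```python
from collections import Counter

# Hypothetical toy training set: (labels of genes/domains in the cluster, substrate).
# Illustrative only; not curated data.
TOY_PULS = [
    ({"GH10", "GH43", "SusC", "SusD"}, "xylan"),
    ({"GH10", "GH67", "CE1", "SusC"}, "xylan"),
    ({"GH28", "PL1", "CE8", "SusD"}, "pectin"),
    ({"PL1", "GH105", "CE12", "SusC"}, "pectin"),
    ({"GH16", "GH3", "SusC", "SusD"}, "glucan"),
]


def jaccard(a, b):
    """Shared labels divided by all labels seen in either cluster."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_substrate(cgc_labels, labeled_puls, k=3):
    """Majority vote among the k labeled clusters most similar to the query."""
    ranked = sorted(
        labeled_puls,
        key=lambda pul: jaccard(cgc_labels, pul[0]),
        reverse=True,
    )
    votes = Counter(substrate for _, substrate in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    query = {"GH10", "GH43", "CE1", "SusC"}  # hypothetical unannotated CGC
    print(predict_substrate(query, TOY_PULS))
```

A set-based representation ignores gene order and copy number; a real classifier would likely need richer sequence features than this sketch uses.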

Key facts

NIH application ID: 10099567
Project number: 1R01GM140370-01
Recipient: UNIVERSITY OF NEBRASKA LINCOLN
Principal Investigator: Yanbin Yin
Activity code: R01
Funding institute: NIH
Fiscal year: 2021
Award amount: $302,120
Award type: 1
Project period: 2021-05-01 → 2025-02-28