# Structure-based functional annotation of microbial genomes

> **NIH R01** · UNIVERSITY OF MICHIGAN AT ANN ARBOR · 2020 · $722,354

## Abstract

Given the recent explosion in the number of sequenced genomes and the relative lack of functional information
on their contents, annotating the biological functions of all proteins across different genomes represents a major
challenge to modern molecular and computational biology. The problem of genome annotation is particularly
acute for bacteria; a vast range of commensal and pathogenic bacterial species impact human health, and only
computational approaches, when appropriately combined with carefully targeted biochemical experiments, can
provide the reliable, high-throughput annotations necessary to understand their physiology. The current
approach to computational function prediction relies mainly on annotation transfer from known proteins of similar
sequence, which becomes increasingly unreliable as the level of homology drops. Recently, significant
progress has been achieved in protein 3D structure prediction as witnessed by the community-wide blind testing
experiments, and current state-of-the-art methods can construct correct protein folds for the majority of genome
sequences without using close homologous templates. Building on the hypothesis that biological function is more
directly associated with 3D structure than sequence, this proposal aims to initiate a paradigm shift from protein
structure prediction to structure-based function annotations. Combining expertise from computational biology,
microbiology, and structural biology, the PIs will systematically examine the potential and scope of how
computational structure models from cutting-edge modeling methods can help provide reliable high-throughput
annotations of bacterial genomes, with a particular focus on the difficult targets that cannot be addressed by the
existing sequence homology-based approaches.

This project is designed to develop and test several cutting-edge approaches for protein function prediction using
low-resolution (but correctly folded) models from structure prediction. The specific aims include the
development of novel structure-based methods for modeling protein-ligand binding sites and enzyme and
gene ontologies. The modeling methods and results will be tested in a set of carefully designed experiments,
including high-throughput chemical screening and detailed structural biology-based characterization. At all
stages, iterative prediction-to-experiment-to-refinement loops will link the experiments and the
computational annotations to guide development of the functional modeling methods. The project
will focus on the E. coli K12 strain, for which >10% of the genome remains unannotated despite a
long history of use as a model organism, but the long-term goal is to build a novel and robust framework
that can serve as a resource for reliable function annotation of other microbial genomes. Compared
with current sequence-based approaches, the success of the structure-based pipelines could pot...

## Key facts

- **NIH application ID:** 9976447
- **Project number:** 5R01AI134678-03
- **Recipient organization:** UNIVERSITY OF MICHIGAN AT ANN ARBOR
- **Principal Investigator:** Yang Zhang
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $722,354
- **Award type:** 5
- **Project period:** 2018-08-01 → 2022-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9976447

## Citation

> US National Institutes of Health, RePORTER application 9976447, Structure-based functional annotation of microbial genomes (5R01AI134678-03). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9976447. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
