# Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes

> **NIH R01** · FLORIDA STATE UNIVERSITY · 2020 · $308,756

## Abstract

Despite rapid progress in structural bioinformatics, our current toolbox lacks a rigorous, unifying mathematical and statistical framework for the analysis, classification, and organization of individual biomolecules and groups of biomolecules. We have recently developed such a framework, based on elastic shape analysis (ESA), for the comparison of protein and RNA structures. Under this framework, the formal geodesic distance between any two protein/RNA structures can be computed rapidly. Probability distributions can also be built for families of protein/RNA structures and used to classify structures in a principled way through statistical hypothesis testing. In addition, sequence information can be naturally incorporated so that structures can be compared in the joint sequence-structure space. We have also developed novel algorithms for matching and analyzing protein surfaces. We propose to develop these methodologies significantly further for important applications in structural biology, including the study of chromosome structures by combining 3D structure and sequence-level information.
The proposed research will make significant contributions in the following areas: (1) This proposal will fill an important gap in structural biology: the lack of a rigorous mathematical and statistical framework for biomolecular structure comparison; (2) Our proposed unifying framework will allow natural incorporation of sequence information for structure comparison; (3) Our approach can uncover distinct clusters at the deepest level of the current classification scheme (i.e., SCOP family), enabling a finer classification of biomolecular structures. Preliminary results indicate that by using carefully measured structural similarity, we will obtain representative sets of proteins of higher quality than those produced by current sequence-similarity-based methods; (4) The probabilistic models designed for protein/RNA backbone structures and surfaces will capture the flexible nature of protein structures through the use of ensembles of conformations, while maintaining high computational efficiency. These models will also enable effective characterization of family-specific variations among proteins, an important task that none of the existing methods handles well; (5) Protein/RNA structures will be organized in network-based data structures using probabilistic approaches. This new organization will effectively integrate sequence, backbone structure, and surface information, facilitating the discovery of novel insights; and (6) These new developments will be rapidly generalized for studying chromosome structures.
This proposed research will allow development of tools that will also be applicable in other areas of shape analysis, including medical image analysis, computer vision, and pattern recognition. Our work will help to increase the communication between the field of protein structure analysis and the field of shape analysis, and will stimulate more cross-over d...
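The geodesic distance mentioned in the abstract can be illustrated with a minimal sketch based on the square-root velocity function (SRVF) representation that is commonly used in elastic shape analysis. This is not the project's implementation: the function names (`srvf`, `best_rotation`, `elastic_distance_fixed_param`) are hypothetical, both curves are assumed to be sampled with the same number of points, and the optimization over reparametrizations that full ESA performs is deliberately omitted, so only translation, scale, and rotation are factored out.

```python
import numpy as np


def _integrate(values, t):
    """Trapezoidal rule for samples `values` taken at parameters `t`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(t)))


def srvf(curve):
    """Unit-norm square-root velocity function of an (n, 3) curve.

    q(t) = c'(t) / sqrt(|c'(t)|). Differentiation removes translation,
    and scaling q to unit L2 norm removes overall size.
    """
    curve = np.asarray(curve, dtype=float)
    t = np.linspace(0.0, 1.0, curve.shape[0])
    velocity = np.gradient(curve, t, axis=0)
    speed = np.linalg.norm(velocity, axis=1)
    q = velocity / np.sqrt(np.maximum(speed, 1e-12))[:, None]
    norm = np.sqrt(_integrate(np.sum(q * q, axis=1), t))
    return q / norm, t


def best_rotation(q1, q2):
    """Rotation R (Kabsch/SVD) maximizing the inner product <q1, R q2>."""
    h = q2.T @ q1
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def elastic_distance_fixed_param(curve1, curve2):
    """Arc length between two SRVFs on the unit sphere after rotation.

    Assumption: no search over reparametrizations, so this is only an
    upper bound on the full elastic geodesic distance.
    """
    q1, t = srvf(curve1)
    q2, _ = srvf(curve2)
    q2_aligned = q2 @ best_rotation(q1, q2).T
    inner = _integrate(np.sum(q1 * q2_aligned, axis=1), t)
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))
```

Applied to two backbone traces given as `(n, 3)` coordinate arrays, for example C-alpha positions resampled to a common length, the function returns a value between 0 and π, and a rotated, translated, and rescaled copy of the same curve yields a distance near zero.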

## Key facts

- **NIH application ID:** 9963294
- **Project number:** 5R01GM126558-04
- **Recipient organization:** FLORIDA STATE UNIVERSITY
- **Principal Investigator:** Jinfeng Zhang
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $308,756
- **Award type:** 5
- **Project period:** 2017-07-01 → 2022-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9963294

## Citation

> US National Institutes of Health, RePORTER application 9963294, Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes (5R01GM126558-04). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/9963294. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
