# Computational analysis of proteins

> **NIH R35** · UT SOUTHWESTERN MEDICAL CENTER · 2021 · $314,051

## Abstract

We develop computational methods for the analysis of proteins and use them to study
evolution and predict protein spatial structures and functions. Our major recent advances
are: 1) statistically sound similarity search approaches based on matching two
sequence alignments, augmented with known relationships between proteins in a
database (COMPADRE); 2) a multiple sequence alignment program that is accurate for
very distant sequences (PROMALS3D); 3) a comprehensive evolutionary classification of
protein domains with known spatial structures (ECOD), a database that is updated
weekly and catalogues the most distant evolutionary connections between proteins; 4)
a sequencing, assembly and annotation pipeline for highly heterozygous genomes that
resulted in several dozen butterfly genomes sequenced by our group; 5) service to
the scientific community as assessors in several structure prediction (CASP) and
genome interpretation (CAGI) challenges. During the next 5 years, we would like to
capitalize on our progress in the analysis of proteins and organisms and explore new
research directions offered by developing technologies. Our work will be structured
along five major interconnected threads. 1) We will continue developing a
comprehensive evolutionary classification of proteins. We have the strongest track
record and most extensive experience in this direction and are uniquely positioned to
make a lasting impact. 2) We will develop computational methods to find distant protein
homologs and multiply align them. This work will build upon software we have been
developing for almost two decades. These new methods will be used for protein
classification, and expert analysis of protein families will offer fresh ideas for further
methods development. 3) We will build an atlas of human mutations and rationalize their
effects on proteins and human health. This project will rely upon our expertise in
structure prediction, alignment and evolutionary connections between proteins, and will
derive power from our dozens of collaborators, many of whom are clinicians who are
dealing with interpretation of mutation effects. 4) At the organismal level, we will tackle
the link between genotype and phenotype and a series of evolutionary and population
biology questions using butterflies as model organisms. Many of these features are
linked to proteins and their spatial structures. Integration of molecular biophysics
techniques with organismal and evolutionary biology is innovative and promises to
advance both fields. 5) We will continue collaborations with experimentalists to test our
methods and help them with experimental design. All five threads are tied together by their
connection to computational analysis of proteins, which is the main strength of our
group.

## Key facts

- **NIH application ID:** 10146419
- **Project number:** 5R35GM127390-04
- **Recipient organization:** UT SOUTHWESTERN MEDICAL CENTER
- **Principal Investigator:** Nick V. Grishin
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $314,051
- **Award type:** 5
- **Project period:** 2018-05-01 → 2023-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10146419

## Citation

> US National Institutes of Health, RePORTER application 10146419, Computational analysis of proteins (5R35GM127390-04). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10146419. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
