# Data-driven, evolution-based design of proteins

> **NIH R01** · UNIVERSITY OF CHICAGO · 2021 · $316,858

## Abstract

Project Summary:
Evolution builds proteins with a remarkable combination of characteristics. They can fold spontaneously and
carry out difficult chemical reactions, but also are robust to perturbation and able to adapt as conditions of fitness
fluctuate. In recent years, sequence-based statistical models have provided specific models for how all these
properties are encoded in the amino acid sequence of proteins. Here, we propose a data-driven, evolution-based
design (EBD) process that, with the developments outlined here, can address several basic problems in protein
mechanism and evolution. We will unify and optimize approaches for EBD and then apply it (1) to quantify the
functional sequence space of a protein family, (2) to parse the constraints on paralogs and orthologs of a protein
family, and (3) to understand how substrate specificity in an enzyme can adapt through a process of stepwise
variation and selection. The work is extensively supported by preliminary data, and is enabled by new
technologies for statistical inference, gene synthesis, and high-throughput functional assays, both in vitro and in
vivo. The outcomes will be a unified computational framework for sequence-based statistical inference and a
serious test of the power of emerging evolution-based protein design approaches to understand and engineer
protein molecules.

## Key facts

- **NIH application ID:** 10185231
- **Project number:** 1R01GM141697-01
- **Recipient organization:** UNIVERSITY OF CHICAGO
- **Principal Investigator:** RAMA RANGANATHAN
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $316,858
- **Award type:** 1
- **Project period:** 2021-08-01 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10185231

## Citation

> US National Institutes of Health, RePORTER application 10185231, Data-driven, evolution-based design of proteins (1R01GM141697-01). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10185231. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
