# Deep characterization of the sequence space and evolutionary trajectories of reconstructed ancestral proteins - Resubmission 01

> **NIH R01** · UNIVERSITY OF CHICAGO · 2020 · $319,238

## Abstract

We propose the first comprehensive characterization of sequence space around an ancestral protein. This
work will 1) characterize the effects on function of all possible mutations and pairs of mutations across the
protein's entire length and of all possible combinations of mutations at a key subset of sites, 2) illuminate how
the distribution of function through this multidimensional sequence space would have affected the processes
of protein evolution (a key goal in molecular evolution), and 3) quantify the complete set of main-effect and
epistatic genetic determinants of DNA specificity in a transcription factor and elucidate their biochemical
causes – an important goal for protein biochemistry and molecular gene regulation. We use the steroid
hormone receptor DNA-binding domain as an ideal model system, because it is of great biomedical
importance; it is experimentally and phylogenetically tractable; and its specificity for DNA targets diversified
through a well-understood evolutionary process, with a known set of historical mutations and biophysical
mechanisms. The proposed work will reveal why this history occurred relative to the many other mutational
trajectories the protein could have taken as it evolved its new specificity. With the map of sequence space in
hand, we will then apply locus-specific, replicated experimental evolution to the ancestral protein, placing it
under strong selection to explore sequence space and evolve the same novel specificity that it acquired during
historical evolution. By identifying commonalities and differences among the historical trajectory,
experimental evolution trajectories, and the many other possible pathways through sequence space, we will
gain fundamental insight into the roles of contingency and determinism in evolution and illuminate
underlying mechanistic factors that caused those phenomena. Specific questions include: how many ways
were there to evolve the derived DNA specificity, and how many were accessible under selection and drift?
Did the historical outcome evolve because it was the optimal genotype, because it was the best or only
accessible genotype, or simply due to chance? If more optimal genotypes exist, what prevented the evolving
protein from reaching them? To what extent must new specificities evolve through promiscuous
intermediates, and how many mutations does it take to evolve a new specificity? We will also characterize
sequence space and experimental evolutionary trajectories around ancient receptors that existed at different
times during history; this will reveal how the protein's evolvability and robustness fluctuated over
evolutionary time due to epistatically acting mutations. Finally, by fully characterizing the main and epistatic
genetic determinants of the protein's DNA specificity, we will identify common biophysical mechanisms that
underlie DNA recognition, contributing to an important goal in molecular biology, biochemistry, cell biology,
and development. The methods...

## Key facts

- **NIH application ID:** 9901582
- **Project number:** 5R01GM121931-04
- **Recipient organization:** UNIVERSITY OF CHICAGO
- **Principal Investigator:** Joseph W Thornton
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $319,238
- **Award type:** 5
- **Project period:** 2017-04-04 → 2021-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9901582

## Citation

> US National Institutes of Health, RePORTER application 9901582, Deep characterization of the sequence space and evolutionary trajectories of reconstructed ancestral proteins - Resubmission 01 (5R01GM121931-04). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/9901582. Licensed CC0.

