Consensus and Covariance Proteins: Stability, Cooperativity, Function, & Design

NIH RePORTER · NIH · R01 · $365,638

Abstract

With the exponential increase in available protein sequences, multiple sequence alignments (MSAs) have become a statistically powerful source of information for the analysis and design of proteins. For example, consensus design, in which the most frequent residue is selected at each position of an MSA, has been found to generate folded, functional, stabilized proteins. At the same time, covariance among pairs of residues at different positions has proven valuable for predicting protein structures, and is a major component of the recent successes of deep-learning methods such as AlphaFold. Despite this power, pairwise residue covariance has seen limited use in protein design. Moreover, it is not presently known which properties of proteins (for example, folding, stability, binding, and catalysis) are affected by the forces that contribute to covariance.
The proposed research will combine consensus design with covariance. Starting from well-behaved consensus proteins that we designed in the previous funding cycle, we will use two complementary methods to design proteins with varying amounts of covariance and consensus information. The first uses a statistical thermodynamic "Potts" formalism to determine coupling biases between residue pairs and to separate them from single-site biases. This separation allows us to adjust the amount of covariance information in our designs. The second method uses singular value decomposition (SVD) to transform an MSA into a set of coordinates that separate consensus from covariance. Within this space, sequences fall into well-defined clusters that share conservation and covariance patterns. We will use the coordinate values of these clusters to design sequences with specific patterns of covariance. Designed proteins will be produced in the lab, and their stabilities, binding affinities, and enzyme activities will be determined.
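As a rough illustration of the two kinds of MSA statistics discussed above, the sketch below computes a consensus sequence (most frequent residue per column) and a simple pairwise covariation score. It is not the proposal's method: mutual information is used here only as a stand-in for pair covariance, not the Potts formalism or the SVD approach, and the function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def consensus_sequence(msa):
    """Return the most frequent residue at each column of an aligned MSA."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    consensus = []
    for column in zip(*msa):
        counts = Counter(column)
        # Break ties alphabetically so the result is deterministic.
        best = min(counts, key=lambda res: (-counts[res], res))
        consensus.append(best)
    return "".join(consensus)


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of the MSA.

    Assumption: used here as a simple proxy for pairwise covariance;
    it does not separate pair couplings from single-site biases.
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    return sum(
        (c / n) * log2(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )


# Hypothetical toy alignment: columns 1 and 2 co-vary (K with L, R with V).
toy_msa = ["AKLE", "AKLD", "AKLE", "ARVE", "ARVD"]

print(consensus_sequence(toy_msa))  # AKLE
print(mutual_information(toy_msa, 1, 2))  # coupled columns: high score
print(mutual_information(toy_msa, 1, 3))  # weakly related columns: low score
```

In this toy alignment the consensus residue is chosen independently at each position, while the pair score picks out the two columns that change together, which is the kind of information a consensus-only design ignores.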
By projecting Potts designs into SVD space, we will refine the Potts designs and gain insights into the specific pair correlations that position each SVD cluster. We will also project extant sequences with known specificities into SVD space to predict functional features of clusters, which will be tested experimentally. To identify specific consensus and covariance sequence elements that contribute to stability and activity patterns, we will make single- and multisite point substitutions that are found in our consensus, Potts, and SVD designs. These substitutions will focus on the non-additivity of consensus stabilization, which was suggested by results from the previous funding cycle and is likely to be related to covariance. These mutagenesis studies will also better define the striking stability and activity differences we have seen in preliminary Potts designs. Overall, the proposed research will better define the roles of covariance in the various properties of proteins, and will lead to new tools for more precise protein design. Fur...

Key facts

NIH application ID: 10534973
Project number: 2R01GM068462-17
Recipient: JOHNS HOPKINS UNIVERSITY
Principal Investigator: DOUGLAS E. BARRICK
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $365,638
Award type: 2
Project period: 2005-03-01 → 2026-08-31