Deep Topological Sampling of Protein Structures

NIH RePORTER · NIH · R01 · $306,511

Abstract

Project Summary. Most proteins are symmetric oligomeric complexes. Despite their prevalence and biomedical importance, such complexes are vastly underrepresented in the PDB, and determining their structures presents daunting challenges for NMR structural biologists. In particular, simulated annealing (SA), a widely used technique for structure determination of homo-oligomers, is vulnerable to significant structural errors. Due to assignment ambiguity, SA converges to local minima rather than to the optimal structure or structural ensemble indicated by the data. Fold Operator Theory overcomes these errors, using a systematic search algorithm shown to identify biologically important assignments and structures that SA does not find. For example, the published NMR and crystal structures of the enzyme Diacylglycerol Kinase (DAGK) have very different topologies. Our systematic search techniques not only showed that both published folds are supported by the NMR data, but also found a novel fold that satisfies the data better than either published fold.

We propose to develop novel algorithms and software enabling global and systematic search for NMR structure determination, building on our preliminary results showing that our methods can solve problems where traditional stochastic NMR methods struggle. These new tools will dramatically increase the accuracy of NMR structure determination with assignment ambiguity, which unavoidably arises for higher-order symmetric homo-oligomers. The proposed Deep Topological Sampling (DTS) has two primary modules: Fold Operator Theory (FOT) and DISCO (which we recently used to solve the structure of a membrane-associated MPER homo-trimer designed to probe immunogenic responses to the HIV-1 viral coat protein gp41).

Aim 1: We will implement a general FOT in software, to compute all the protein folds consistent with the NMR data. FOT will search globally over folds, and avoid being trapped in local minima, to find all satisfying structures.

Aim 2: We will develop our DISCO algorithm to search within each viable fold generated by FOT to find all feasible low-energy structures. DISCO and FOT will exploit novel geometric and topological algorithms to perform automated assignments accurately and efficiently, thus alleviating the most time-consuming and potentially error-prone step in multimeric structure determination.

Aim 3: We will apply our FOT/DTS software (developed in Aims 1-2) prospectively to important systems. (A) We will perform experiments to determine the true functional structure DAGK adopts in its native environment. (B) We will use our methods to determine the structure of a larger HIV-1 membrane-associated pre-fusion gp41 trimer construct exposing transient, intermediate epitopes that bind broadly neutralizing antibodies, but are structurally invisible in larger laboratory constructs. (C) We will solve the hemifusion intermediate structures of the antigenic, symmetric homo-oligomeric domains of the Zika v...

Key facts

NIH application ID: 9986769
Project number: 5R01GM118543-04
Recipient: DUKE UNIVERSITY
Principal Investigator: Bruce R. Donald
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $306,511
Award type: 5
Project period: 2017-09-18 → 2022-07-31