# Deep Topological Sampling of Protein Structures

> **NIH R01** · DUKE UNIVERSITY · 2020 · $306,511

## Abstract

Project Summary. Most proteins are symmetric oligomeric complexes. Despite their prevalence and
biomedical importance, such complexes are vastly underrepresented in the PDB, and determining their
structures presents daunting challenges for NMR structural biologists. In particular, simulated annealing (SA),
a widely used technique for structure determination of homo-oligomers, is vulnerable to significant structural
errors. Due to assignment ambiguity, SA converges to local minima rather than to the optimal structure or
structural ensemble indicated by the data. Fold Operator Theory overcomes these errors, using a systematic
search algorithm shown to identify biologically important assignments and structures that SA does not find. For
example, the published NMR and crystal structures of the enzyme Diacylglycerol Kinase (DAGK) have very
different topologies. Our systematic search techniques not only showed that both published folds are
supported by the NMR data, but also found a novel fold that satisfies the data better than either published fold.

We propose to develop novel algorithms and software enabling global and systematic search for NMR
structure determination, building on our preliminary results showing that our methods can solve problems
where traditional stochastic NMR methods struggle. These new tools will dramatically increase the accuracy of
NMR structure determination with assignment ambiguity, which unavoidably arises for higher-order symmetric
homo-oligomers. The proposed Deep Topological Sampling (DTS) has two primary modules: Fold Operator
Theory (FOT) and DISCO (which we recently used to solve the structure of a membrane-associated MPER
homo-trimer designed to probe immunogenic responses to the HIV-1 viral coat protein gp41).

Aim 1: We will implement a general FOT in software, to compute all the protein folds consistent with the
NMR data. FOT will search globally over folds, and avoid being trapped in local minima, to find all satisfying
structures.

Aim 2: We will develop our DISCO algorithm to search within each viable fold generated by FOT to
find all feasible low-energy structures. DISCO and FOT will exploit novel geometric and topological algorithms
to perform automated assignments accurately and efficiently, thus alleviating the most time-consuming and
potentially error-prone step in multimeric structure determination.

Aim 3: We will apply our FOT/DTS software (developed in Aims 1-2) prospectively to important systems.
(A) We will perform experiments to determine the true functional structure DAGK adopts in its native
environment. (B) We will use our methods to determine the structure of a larger HIV-1 membrane-associated
pre-fusion gp41 trimer construct exposing transient, intermediate epitopes that bind broadly neutralizing
antibodies, but are structurally invisible in larger laboratory constructs. (C) We will solve the hemifusion
intermediate structures of the antigenic, symmetric homo-oligomeric domains of the Zika v...

## Key facts

- **NIH application ID:** 9986769
- **Project number:** 5R01GM118543-04
- **Recipient organization:** DUKE UNIVERSITY
- **Principal Investigator:** Bruce R. Donald
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $306,511
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2017-09-18 → 2022-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9986769

## Citation

> US National Institutes of Health, RePORTER application 9986769, Deep Topological Sampling of Protein Structures (5R01GM118543-04). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/9986769. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
