High accuracy computational methods for biomolecular nuclear magnetic resonance spectroscopy

NIH RePORTER · NIH · U01 · $150,000

Abstract

Nuclear magnetic resonance (NMR) spectroscopy is one of the most important condensed phase probes of the composition, structure, and dynamics of biomolecules and bio-organic species. NMR observables such as chemical shifts and spin-spin splittings can be measured to very high accuracy. Because they are sensitive to biological functional groups, detailed geometries, and chemical environments, they allow the prediction of solution phase protein structures and the identification or verification of the structures of chemical compounds in the crystalline phase. The connection to structure, while true in principle, is nevertheless sometimes difficult to reveal in practice through direct assignment of the spectrum. Simulation methods that accurately predict spectral observables from structure are a key goal for NMR spectral assignment. Such methods are even more crucial for the inverse problem of realizing high quality NMR structures of folded proteins from spectra, and as powerful restraints for determining the structural ensembles of intrinsically disordered proteins (IDPs). Existing approaches to this problem typically rely on semi-empirical heuristics, and while they have achieved considerable success, they also have limitations that significantly degrade the quality of structural prediction. In this equipment supplement we propose to acquire a dedicated compute cluster for high throughput calculations with wavefunction-based QM methods that we have developed for chemical shifts and that offer improved accuracy over DFT. The cluster will be used to populate databases of protein and small molecule drug relevant data for machine learning methods we have developed under NIH support.
With such data, machine learning and deep networks will determine a quantitative relationship between structure and computed NMR observables, and the resulting data science driven methods will be tested on the refinement of folded proteins and small molecule drug prediction.

Key facts

NIH application ID: 10145510
Project number: 3U01GM121667-04S1
Recipient: UNIVERSITY OF CALIFORNIA BERKELEY
Principal Investigator: Martin Paul Head-Gordon
Activity code: U01
Funding institute: NIH
Fiscal year: 2020
Award amount: $150,000
Award type: 3
Project period: 2017-02-01 → 2022-09-30