Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design

NIH RePORTER · NIH · R01 · $96,003

Abstract

PROJECT SUMMARY/ABSTRACT The study of biomolecular interactions and the design of new therapeutics require accurate physical models of the atomistic interactions between small molecules and biological macromolecules. Over the last few decades, molecular mechanics force fields have demonstrated the potential that physical models hold for quantitative biophysical modeling and predictive molecular design. However, a significant technology gap exists in our ability to build force fields that achieve high accuracy, can be systematically improved in a statistically robust manner, can be extended to new areas of chemistry, can model post-translational and covalent modifications, can quantify systematic errors in predictions, and can be broadly applied across high-performance software packages. In this project, we will bridge this technology gap to enable new generations of accurate quantitative biomolecular modeling and (bio)molecular design for chemical biology and drug discovery. In Aim 1, we will produce a modern, open infrastructure to enable practitioners to rapidly and conveniently construct and employ accurate and statistically robust physical force fields via automated machine learning methods. In Aim 2, we will construct open, machine-readable experimental and quantum chemical datasets that will accelerate next-generation force field development. In Aim 3, we will develop statistically robust Bayesian inference techniques to enable the automated construction of type assignment schemes that avoid overfitting and the selection of physical functional forms statistically justified by the data. This approach will also allow the systematic error in predicted properties arising from uncertainty in parameters or functional form choices (generally the dominant source of error) to be quantified with little added expense.
In Aim 4, we will integrate and apply this infrastructure to produce open, transferable, self-consistent force fields that achieve high accuracy and broad coverage for modeling small molecule interactions with biomolecules (including unnatural amino or nucleic acids and covalent modifications by organic molecules), with the ultimate goal of covering all major biomolecules. This research is significant in that the technology developed in this project has the potential to radically transform the study of biomolecular phenomena by providing highly accurate force fields with exceptionally broad chemical coverage via fully consistent parameterization of organic (bio)molecules. In addition, we will produce new tools to automate force field creation and tailoring to specific problem domains, quantify the systematic error in predictions, and identify new data for improving force field accuracy. This will greatly improve our ability to study diverse biophysical processes at the molecular level, and to rationally design new small-molecule, protein, and nucleic acid therapeutics. This supplement to the original R01 is to purchase a small GPU clu...

Key facts

NIH application ID: 10157034
Project number: 3R01GM132386-01A1S1
Recipient: UNIVERSITY OF COLORADO
Principal Investigator: Michael R Shirts
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $96,003
Award type: 3
Project period: 2020-03-01 → 2024-02-29