# Advancing predictive physical modeling through focused development of model systems to drive new modeling innovations

> **NIH R01** · UNIVERSITY OF CALIFORNIA-IRVINE · 2020 · $235,500

## Abstract

This work seeks to advance quantitative methods for biomolecular design, especially for predicting
biomolecular interactions, via a focused series of community blind prediction challenges. Physical methods for
predicting binding free energies, or “free energy methods”, are poised to dramatically reshape early-stage drug
discovery, and are already finding applications in pharmaceutical lead optimization. However, performance is
unreliable, the domain of applicability is limited, and failures in pharmaceutical applications are often hard to
understand and fix. On the other hand, these methods can now typically predict a variety of simple physical
properties such as solvation free energies or relative solubilities, though there is still clear room for
improvement in accuracy. In recent years, competitions and crowdsourcing have proven an effective model for
driving innovations in diverse fields. In our field, blind prediction challenges have played a key role in driving
innovations in prediction of physical properties and binding, especially in the form of the SAMPL series of
challenges. Here, we will continue and extend SAMPL prediction challenges to include new physical
properties, more complicated host-guest binding data, and application to biomolecular systems.

Carefully selected systems and novel experimental data will provide challenges of gradually increasing
complexity, ranging from systems that are now tractable to those that are marginally out of reach of
today's methods but still slightly simpler than those covered by the Drug Design Data Resource (D3R) series of
challenges on existing pharmaceutical data. We will work with D3R to run blind challenges on the data we
generate and to ensure it is designed to maximally benefit the field.

In our original proposal, Aim 4 focused on using data generated in a SAMPL series of challenges, applying
proven crowdsourcing-based techniques to drive the development of new methods and new understanding of
the strengths and weaknesses of existing techniques. Here, we extend this work by building out software
infrastructure for a fully automated component of these challenges, where workflow components can be
deposited in a common registry and then linked together to automate participation in SAMPL challenges. This
solves several key problems at once and will allow innovations resulting from the SAMPL challenges to have
much greater impact on the community and to spread much more rapidly to a wide variety of applications.
Users of software employed in the SAMPL challenges number in the thousands to tens of thousands, so this
will have far-reaching implications for the predictive modeling community.

## Key facts

- **NIH application ID:** 10165354
- **Project number:** 3R01GM124270-03S1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA-IRVINE
- **Principal Investigator:** David Lowell Mobley
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $235,500
- **Award type:** 3
- **Project period:** 2018-09-10 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10165354

## Citation

> US National Institutes of Health, RePORTER application 10165354, Advancing predictive physical modeling through focused development of model systems to drive new modeling innovations (3R01GM124270-03S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10165354. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
