# Accelerating drug discovery via ML-guided iterative design and optimization

> **NIH R35** · UNIVERSITY OF CALIFORNIA-IRVINE · 2023 · $415,824

## Abstract

The Mobley laboratory focuses on developing and using computational tools to dramatically accelerate pharmaceutical drug discovery. We focus on the interface between methods and applications, and invest in assessing and improving computational methods as well as applying methods directly in discovery. We take an open approach (open science, open source software, open data), making our work a community resource, including our FreeSolv database of solvation free energies, the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) series of blind challenges, our Lead Optimization Mapper (LOMAP) tool for automation of binding calculations, and the Open Force Field and Open Free Energy projects. Tools and methods we have contributed to are now broadly used in drug discovery research, including in pharma.

Our overall vision is to make modeling a tool which plays a key role guiding drug discovery research, reducing costs, time and trial and error. In particular, we want researchers – ranging from medicinal chemists to structural biologists as well as experts in computation – to routinely input their latest results and ideas into their computer at the end of the work day, and return to work to find prioritized next steps for their research. For example, in the lead optimization process, one might input the latest assay results as well as ideas for new compounds which could be screened next, and on returning to work in the morning, find ideas ranked by affinity for the target, potential off-target effects and predicted solubility/oral availability. Results might also include additional synthetically accessible compounds not originally considered. If predictions were accurate, this pipeline would dramatically accelerate discovery; thus, we seek to make workflows like this a reality via our science and engineering efforts.

In our next five years, we plan to develop an increasingly automated iterative pipeline for iterative library design, compound screening, and optimization. With an experimental partner, we use computation to design promising DNA-encoded compound libraries, computationally analyze screening results, then design models to recommend additional compound rounds for screening and further iterations of the cycle. When combinatorial screening leads to promising enough compounds, we shift to compound optimization, employing active learning in combination with free energy methods and machine learning to prioritize compounds for synthesis and, when possible, for purchase from compound libraries like Enamine, with assay results guiding additional cycles. Results from this work feed back into improving our models and guide early stage drug discovery projects.

Our focus involves both pipeline development and actual discovery. While we are developing methods that can be applied to any therapeutic area or target when coupled with experimental work, we will also focus on antibacterial discovery, a particular interest for...

## Key facts

- **NIH application ID:** 10552325
- **Project number:** 1R35GM148236-01
- **Recipient organization:** UNIVERSITY OF CALIFORNIA-IRVINE
- **Principal Investigator:** David Lowell Mobley
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $415,824
- **Award type:** 1
- **Project period:** 2023-03-01 → 2028-02-29

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10552325

## Citation

> US National Institutes of Health, RePORTER application 10552325, Accelerating drug discovery via ML-guided iterative design and optimization (1R35GM148236-01). Retrieved via AI Analytics 2026-05-26 from https://api.ai-analytics.org/grant/nih/10552325. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*