Accelerating drug discovery via ML-guided iterative design and optimization

NIH RePORTER · NIH · R35 · $415,824

Abstract

The Mobley laboratory focuses on developing and using computational tools to dramatically accelerate pharmaceutical drug discovery. We focus on the interface between methods and applications, and invest in assessing and improving computational methods as well as applying methods directly in discovery. We take an open approach (open science, open source software, open data), making our work a community resource, including our FreeSolv database of solvation free energies, the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) series of blind challenges, our Lead Optimization Mapper (LOMAP) tool for automation of binding calculations, and the Open Force Field and Open Free Energy projects. Tools and methods we have contributed to are now broadly used in drug discovery research, including in pharma. Our overall vision is to make modeling a tool that plays a key role in guiding drug discovery research, reducing cost, time, and trial and error. In particular, we want researchers – ranging from medicinal chemists to structural biologists as well as experts in computation – to routinely input their latest results and ideas into their computer at the end of the work day, and return to work to find prioritized next steps for their research. For example, in the lead optimization process, one might input the latest assay results as well as ideas for new compounds that could be screened next, and on returning to work in the morning, find ideas ranked by affinity for the target, potential off-target effects, and predicted solubility/oral availability. Results might also include additional synthetically accessible compounds not originally considered. If predictions were accurate, this pipeline would dramatically accelerate discovery; thus, we seek to make workflows like this a reality via our science and engineering efforts.
In our next five years, we plan to develop an increasingly automated pipeline for iterative library design, compound screening, and optimization. With an experimental partner, we use computation to design promising DNA-encoded compound libraries, computationally analyze screening results, then design models to recommend additional rounds of compounds for screening and further iterations of the cycle. When combinatorial screening leads to promising enough compounds, we shift to compound optimization, employing active learning in combination with free energy methods and machine learning to prioritize compounds for synthesis and, when possible, for purchase from compound libraries like Enamine, with assay results guiding additional cycles. Results from this work feed back into improving our models and guide early stage drug discovery projects. Our focus involves both pipeline development and actual discovery. While we are developing methods that can be applied to any therapeutic area or target when coupled with experimental work, we will also focus on antibacterial discovery, a particular interest for...
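
As a rough illustration of the active learning cycle described in the abstract, the loop can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the laboratory's pipeline: each compound is reduced to one hypothetical numeric descriptor, `toy_oracle` stands in for an expensive evaluation (an assay or a free energy calculation), a nearest-neighbor average stands in for the machine learning model, and the greedy "pick the top predictions" rule is only one of many possible acquisition strategies. All names here are illustrative.

```python
from typing import Callable, Dict, List


def knn_predict(x: float, labeled: Dict[float, float], k: int = 2) -> float:
    """Surrogate model (assumption): mean score of the k nearest labeled compounds."""
    nearest = sorted(labeled, key=lambda seen: abs(seen - x))[:k]
    return sum(labeled[n] for n in nearest) / len(nearest)


def active_learning(
    pool: List[float],
    oracle: Callable[[float], float],
    initial: List[float],
    rounds: int = 3,
    batch_size: int = 2,
) -> Dict[float, float]:
    """Repeatedly rank unlabeled compounds with the surrogate and label the top batch."""
    # Start from a small set of compounds that have already been evaluated.
    labeled = {x: oracle(x) for x in initial}
    for _ in range(rounds):
        unlabeled = [x for x in pool if x not in labeled]
        if not unlabeled:
            break
        # Greedy acquisition (assumption): highest predicted score first.
        ranked = sorted(
            unlabeled, key=lambda x: knn_predict(x, labeled), reverse=True
        )
        # "Run the experiment" on the selected batch and add the results.
        for x in ranked[:batch_size]:
            labeled[x] = oracle(x)
    return labeled


def toy_oracle(x: float) -> float:
    """Hypothetical hidden score that peaks at descriptor 0.6; higher is better."""
    return -((x - 0.6) ** 2)


# Hypothetical pool of 11 candidate compounds, described by one number each.
pool = [i / 10 for i in range(11)]
results = active_learning(pool, toy_oracle, initial=[0.0, 1.0])
best = max(results, key=results.get)
print(f"evaluated {len(results)} of {len(pool)} compounds; best descriptor: {best}")
```

In a real campaign the oracle call is the costly step, so the loop evaluates only a fraction of the pool; a purely greedy rule can also miss good regions, which is why practical schemes usually mix in some exploration.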

Key facts

NIH application ID: 10552325
Project number: 1R35GM148236-01
Recipient: UNIVERSITY OF CALIFORNIA-IRVINE
Principal Investigator: David Lowell Mobley
Activity code: R35
Funding institute: NIH
Fiscal year: 2023
Award amount: $415,824
Award type: 1
Project period: 2023-03-01 → 2028-02-29