Improving Similarity Scoring in Drug Discovery: Solving by Solvating

NIH RePORTER · NIH · R43 · $306,872

Abstract

Identifying molecules with similar shape and characteristics to known ligands has proven useful in several areas of drug discovery and development, from ligand-based virtual screening (LBVS) to scaffold-hopping. The underlying assumption is that compounds that occupy a similar volume with similar chemical groups will have similar activity at the target protein, because they form similar protein-ligand interactions. However, in the aqueous in vivo environment, protein-water-ligand interactions are equally important, with water bridges, water networks, and water displacement playing a critical role in the binding of many compounds. To our knowledge, no 3D ligand matching method explicitly accounts for these important waters as potential ligand space. This leads to false negatives during shape matching, as molecules that appear dissimilar in vacuo may in fact behave similarly in the binding pocket once water is accounted for. We aim to change that by creating the first program that considers these water molecules when comparing ligands in 3D, factoring them in when scoring similarity. To do so, we will adapt our previously developed algorithm (WATGEN), which predicts water positions in the unbound (“empty”) protein and in the protein-ligand complex, together with our calculation of ligand-driven water displacement in protein-ligand complexes. To extend this work, waters relevant to shape matching will be identified using a combination of machine learning (ML) and empirical algorithms based on the 9,000+ solvated structures in our previous study, each with corresponding displacement calculations. This step will calculate the “replaceability” and “displaceability” of WATGEN-predicted waters, indicating how they should be represented physically and chemically for shape matching. We will then write code to automatically create “hybrid” ligands through addition (or removal) of atoms based on the solvation representation determined above.
Finally, we will validate our new solvation-informed 3D shape matching methodology by comparing it with our current waterless methodology in two settings that rely on ligand-based similarity scoring: 1) evolution of a first-generation sulfonylurea (tolbutamide) to more advanced drugs like glyburide using our AI-driven Drug Design platform, and 2) LBVS using unmodified and solvation-informed tolbutamide as reference structures for screening the WuXi GalaXi “off-the-shelf” virtual library. The compounds most similar to each reference will be purchased and assayed for glucose-stimulated insulin secretion activity. We predict that in each of these experiments, the new methodology will outperform our current waterless methodology. These features will be integrated into our existing shape matching algorithm within the ADMET Predictor platform, which is freely available to academic researchers, leading to improvements in drug discovery and optimization.
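The idea of scoring a “hybrid” ligand can be illustrated with a minimal sketch. It is not the WATGEN or ADMET Predictor implementation: it assumes a generic first-order Gaussian overlap model for shape similarity, a single assumed Gaussian width (`ALPHA`) for every site, and a hypothetical per-water flag (`replaceable`) standing in for the replaceability classification described above. All names and coordinates are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A ligand atom or a water pseudo-atom, as a 3D point in angstroms."""
    x: float
    y: float
    z: float


ALPHA = 0.8  # assumed Gaussian width (1/A^2), identical for every site


def pair_overlap(a: Site, b: Site) -> float:
    """Overlap integral of two equal-width spherical Gaussians."""
    d2 = (a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2
    return (math.pi / (2 * ALPHA)) ** 1.5 * math.exp(-0.5 * ALPHA * d2)


def overlap(a_sites, b_sites) -> float:
    """First-order overlap volume: sum over all site pairs."""
    return sum(pair_overlap(a, b) for a in a_sites for b in b_sites)


def shape_tanimoto(a_sites, b_sites) -> float:
    """Shape Tanimoto: V_AB / (V_AA + V_BB - V_AB)."""
    v_ab = overlap(a_sites, b_sites)
    return v_ab / (overlap(a_sites, a_sites) + overlap(b_sites, b_sites) - v_ab)


def hybrid_ligand(ligand, waters, replaceable):
    """Append waters flagged as replaceable (hypothetical flag) as pseudo-atoms."""
    return list(ligand) + [w for w, keep in zip(waters, replaceable) if keep]


# Toy data: a three-atom reference with one bridging water beyond its end,
# and a four-atom candidate whose last atom sits where that water would be.
reference = [Site(0.0, 0.0, 0.0), Site(1.5, 0.0, 0.0), Site(3.0, 0.0, 0.0)]
waters = [Site(4.5, 0.0, 0.0)]
candidate = reference + [Site(4.5, 0.0, 0.0)]

plain_score = shape_tanimoto(reference, candidate)
hybrid_score = shape_tanimoto(hybrid_ligand(reference, waters, [True]), candidate)
print(f"waterless: {plain_score:.3f}  solvation-informed: {hybrid_score:.3f}")
```

In this toy case the candidate fills the space that the water occupies next to the reference, so the hybrid reference scores higher than the waterless one. A real method would also need alignment, atom typing, and a treatment of displaceable waters, none of which is modeled here.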

Key facts

NIH application ID: 11003420
Project number: 1R43GM156103-01
Recipient: SIMULATIONS PLUS, INC.
Principal Investigator: JEREMY O JONES
Activity code: R43
Funding institute: NIH
Fiscal year: 2024
Award amount: $306,872
Award type: 1
Project period: 2024-09-10 → 2025-08-31