A Web-Based Automatic Virtual Screening System

NIH RePORTER · NIH · R01 · $339,150

Abstract

A long-term goal is to bring small molecules to biologists and chemical biologists by developing easy-to-use tools and libraries that rapidly identify reagents. A second goal is to use these libraries and tools to predict biological activity for key compound classes, advancing the science and demonstrating proof of concept. The tools introduced by this research program have become central to virtual screening. The ZINC database is the most widely used compound library in the field, while our DUD and DUD-E benchmarks are ubiquitous in virtual screening. Recently, our development of ultra-large libraries has been embraced by the field. The Similarity Ensemble Approach (SEA) brings chemoinformatic target prediction to a large community, and we have used it to predict drug off-targets, their side effects, and the activities of supposedly inert molecules. Here we extend both projects, further developing community libraries and tools in Aim 1 and applying them to the prediction of biological activities in Aim 2. The specific aims are:

Aim 1. New tools to bring chemistry to biology. An exciting result of the last period was the introduction of ultra-large libraries. While an accessible library of >20 billion molecules has expanded our horizons, the two-component reactions from which these molecules derive are inevitably limiting. We will A. develop a “chemistry commons” of more elaborate virtual molecules available from academic labs, testing them in Aim 2; B. expand the chemistry available for covalent docking to develop new community-accessible libraries of selective electrophiles for covalent inhibitor discovery; C. optimize the widely used DUD-E benchmarks, introducing new subsets to address the biases that they certainly still retain; and D. integrate into ZINC methods that enable similarity searches for analogs in sublinear time.

Aim 2. Libraries of high-value compounds, and their activities. We will A. test the utility of the more elaborate virtual libraries from Aim 1 where they are experimentally tested; B. test the new covalent electrophile libraries in docking campaigns against the SARS-2-relevant proteases 3CLPro and TMPRSS2; C. expand our interest in target discovery by chemoinformatics, focusing on compounds that are widely used in biology because they are inactive: drug excipients and Generally Recognized As Safe (GRAS) food additives; and D. ask whether GRAS molecules have on-target pharmacology, as we found with drug excipients, testing our predictions experimentally. Although these goals are ambitious, extensive preliminary results support their feasibility.

Key facts

NIH application ID: 10818367
Project number: 5R01GM071896-19
Recipient: UNIVERSITY OF CALIFORNIA, SAN FRANCISCO
Principal Investigator: John J. Irwin
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $339,150
Award type: 5
Project period: 2004-08-01 → 2026-04-30