# Binding-Site Modeling with Multiple-Instance Machine-Learning

> **NIH R01** · UNIVERSITY OF CALIFORNIA, SAN FRANCISCO · 2020 · $310,202

## Abstract

This proposal is entitled “Binding-Site Modeling with Multiple-Instance Machine-Learning.” A number of interrelated computational methods for making predictions about the biological behavior of small molecules have been the subject of development within the Jain Laboratory for over twenty years. These share a common strategy that considers molecular interactions at their surface interface, where proteins and ligands actually interact. These methods yield measurements of similarity between small molecules or between protein binding pockets. They also yield measurements of the complementarity of a small molecule to a protein binding site (the molecular docking problem). A generalization of these concepts makes possible the construction of a virtual binding site for quantitative activity prediction purely from data about the biological activities of a set of small molecules.

The goals of the proposed work include further improving the accuracy and breadth of applicability of the binding site modeling approach. The primary application of the approach is to guide optimization of leads within medicinal chemistry projects, and to quantify potential off-target effects during pre-clinical drug discovery.

A critical focus of the work will be in data and software dissemination, in order to accelerate the efficient development of targeted therapies. In addition to methods development, the proposed work will involve broad application of these state-of-the-art predictive modeling methods. The proposed work will proceed with the collaborative input of our pharmaceutical industry colleagues, who have specialized knowledge and data sets that are vital for cutting-edge work in computer-aided drug design.

The expected results include more efficient lead optimization (fewer compounds to reach desired biological parameters), truly effective scaffold replacement (to move away from a molecular series with biological limitations), and improved computational predictions of off-target effects during pre-clinical drug design.

## Key facts

- **NIH application ID:** 9904662
- **Project number:** 5R01GM101689-08
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN FRANCISCO
- **Principal Investigator:** AJAY N JAIN
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $310,202
- **Award type:** 5
- **Project period:** 2013-01-01 → 2021-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9904662

## Citation

> US National Institutes of Health, RePORTER application 9904662, Binding-Site Modeling with Multiple-Instance Machine-Learning (5R01GM101689-08). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9904662. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
