BindingDB: An Open Knowledgebase of Protein-Small Molecule Interactions

NIH RePORTER · NIH · R24 · $603,560

Abstract

Small organic molecules that bind specific proteins are among the most effective tools that physicians have to treat disease and that researchers have to probe living systems. Such small molecules, also known as ligands, can act in many ways, such as by blocking a protein from working, by activating a protein, or by causing the protein to be broken down by normal cellular processes. In fact, most medications are ligands, and researchers in universities, government labs, and pharmaceutical companies are constantly at work seeking new ones as drugs and biological probes. These ongoing efforts generate a continuous flow of information about which small molecules bind which proteins, and how tightly. This information is useful not only within the specific project that generated it, but also for many other applications, such as helping researchers identify probe molecules for their own work, serving as benchmarks for computational chemists creating software designed to predict ligand-protein binding, and training and testing machine-learning tools for drug design. However, scientists generating this information typically release it in scientific articles or patents, where other researchers cannot easily find or access it. The core purpose of this project is to further develop the BindingDB Knowledgebase, dramatically expanding the availability of protein-ligand binding information and connecting this information to other areas of knowledge in order to make it as broadly useful as possible. This will be accomplished by using a combination of automated and human methods to carry out fast, accurate extraction of large volumes of data from scientific articles and patents.
These data will be rendered in machine-readable format, linked with related data such as information on protein structure and function, and made publicly available in open-source format via the searchable BindingDB website, which also allows data to be downloaded in quantity for offline use. The information in BindingDB will be managed according to high community standards for findability, accessibility, interoperability, and reusability (FAIR), and the project will meet the high CoreTrustSeal standards and obtain certification for reliability and long-term preservation. In addition, steps will be taken to maximize the usability and integration of this information, such as by making it available as a public dataset in emerging cloud resources and by creating links from online journal articles and patents to the data extracted from them in BindingDB.

Key facts

NIH application ID
10706457
Project number
5R24GM144232-02
Recipient
UNIVERSITY OF CALIFORNIA, SAN DIEGO
Principal Investigator
MICHAEL K. GILSON
Activity code
R24
Funding institute
NIH
Fiscal year
2023
Award amount
$603,560
Award type
5
Project period
2022-09-20 → 2027-08-31