# AI-powered chemical proteomics for drug discovery targeting orphan proteins

> **NIH R01** · HUNTER COLLEGE · 2024 · $373,702

## Abstract

Genome-Wide Association Studies, whole-genome sequencing, and high-throughput techniques have
generated vast amounts of diverse omics data. However, these sets of data have not yet been fully explored to
improve the effectiveness and efficiency of drug discovery. Only 5-10% of druggable proteins are targeted by
approved drugs. The undrugged orphan proteins, whose endogenous and exogenous ligands are unknown, are
potential targets for as-yet incurable diseases. Furthermore, there is a knowledge gap in linking drug-target
binding affinities to clinical outcomes. We know little about whether the target is activated or inhibited by the binder (i.e.,
functional activity: agonist vs. antagonist). To date, few experimental and computational tools can determine
genome-wide protein-ligand interactions (PLIs) for orphan proteins and ligand-induced functional activities
(LIFAs) for both orphan proteins and the majority of well-studied proteins. Existing machine learning techniques are
mostly unsuccessful in predicting the ligands of orphan proteins due to an out-of-distribution (OOD) problem, i.e.,
they cannot reliably predict the function of an unseen protein if it is significantly different from the proteins in the
training data in terms of sequence and structure. Commonly used computational tools for structure-based drug
design, such as protein-ligand docking/scoring and Molecular Dynamics simulations, are neither scalable nor
particularly reliable. As a result, we only have a limited capability of compound screening for orphan proteins.
This proposal seeks to develop and experimentally validate innovative methods for predicting genome-wide PLIs
and LIFAs to address the aforementioned challenges. Building on our successful proof-of-concept studies and our
close multidisciplinary collaborations between experimental and computational laboratories, we will develop a
novel computational framework to model drug actions at multiple scales by integrating big data from chemical and
structural genomics and developing innovative deep learning algorithms. Specifically, we will develop a
structure-enhanced deep learning framework to reliably and accurately predict protein-ligand interactions for orphan
proteins on a genome scale. We will integrate functional genomics with chemical genomics to predict
ligand-induced functional activity. We will apply the methods developed to design and experimentally test inhibitors of
the orphan anti-cancer target AVIL and dual antagonists of dopamine receptors for opioid use disorder (OUD). The
proposed research offers an innovative concept, methodology, and translational applications. Completing this
research will fill a critical knowledge gap in understanding drug actions in a biological system and significantly
impact drug discovery for complex diseases, many of which lack effective and safe treatments. The developed
methodology and platform will not only immediately impact the NIH’s “Illuminating the Druggable Genome”
Program but also has potential...

## Key facts

- **NIH application ID:** 10932853
- **Project number:** 5R01GM122845-06
- **Recipient organization:** HUNTER COLLEGE
- **Principal Investigator:** Lei Xie
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $373,702
- **Award type:** 5
- **Project period:** 2017-08-15 → 2027-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10932853

## Citation

> US National Institutes of Health, RePORTER application 10932853, AI-powered chemical proteomics for drug discovery targeting orphan proteins (5R01GM122845-06). Retrieved via AI Analytics 2026-05-26 from https://api.ai-analytics.org/grant/nih/10932853. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
