# PROTEAN-CR: Proteomics Toolkit for Ensemble Analysis in Cancer Research

> **NIH U01** · RICE UNIVERSITY · 2021 · $402,077

## Abstract

Understanding protein–ligand molecular interactions is fundamental to understanding the role of proteins in
complex diseases such as cancer. For instance, there is growing interest in predicting the binding modes of
peptide-based ligands (e.g., cyclic and phosphorylated peptides) to inhibit or induce targeted degradation of
high-profile cancer targets. Another promising example is the identification of tumor-associated antigens for cancer
immunotherapy applications. Both examples involve very specific molecular interactions, provide opportunities
for computer-aided design of better cancer treatments, and highlight the need for structural analyses in cancer
research. They also require new methods that account for the flexibility and variability of the protein receptors
involved in these molecular interactions. The objective of this project is to develop an integrated approach to the
structural modeling and analysis of protein–ligand interactions in cancer research that will be implemented in
the proteomics toolkit PROTEAN-CR. The proposed toolkit will adopt a data-science approach to the problem
by introducing approaches for data acquisition and aggregation, as well as algorithmic advances for handling
receptor flexibility and for modeling driver mutations, drug-resistance polymorphisms, and post-translational
modifications. PROTEAN-CR will streamline running structural analyses at scale while providing meaningful data
analytics. The long-term goal of our research is to fully integrate three-dimensional structural information about
proteins and ligands and structural analysis into cancer research. The PIs will work with collaborators to target
a wide range of users, from experimentalists with little to no programming experience, to advanced users who
are comfortable scripting large-scale analyses and integrating the toolkit with their own computational pipeline.
The central hypothesis is that a unified data-science-inspired approach can be used to address major challenges
in structural analysis of protein–ligand interactions in cancer research at scale. The first aim will incorporate
protein flexibility in docking studies for cancer research. Specific workflows will be used to generate ensembles of
protein conformations (receptor flexibility) and innovative machine learning methods will be implemented aiming
at a better scoring of protein–ligand complexes. The second aim will focus on including cancer variability into
structural analysis. We aim to fill the gap that exists between available data on cancer variants and the structural
analysis of ensembles of tumor-associated mutations and protein modifications. Finally, the third aim will focus on
customization, interpretability and scalability, where user-friendly methods will be deployed to manage ensembles
of protein–ligand complexes. PROTEAN-CR will be developed focusing on specific cancer-related projects, and
with a broad network of collaborators, enabling the design, implementation and evol...

## Key facts

- **NIH application ID:** 10188196
- **Project number:** 1U01CA258512-01
- **Recipient organization:** RICE UNIVERSITY
- **Principal Investigator:** Lydia E. Kavraki
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $402,077
- **Award type:** 1
- **Project period:** 2021-05-01 → 2024-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10188196

## Citation

> US National Institutes of Health, RePORTER application 10188196, PROTEAN-CR: Proteomics Toolkit for Ensemble Analysis in Cancer Research (1U01CA258512-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10188196. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*