# Proteogenomic translator for cancer biomarker discovery towards precision medicine

> **NIH U24** · ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI · 2024 · $787,902

## Abstract

The goal of our PGDAC is to improve our understanding of the proteogenomic complexity of tumors. Towards
this goal, our First Aim is to apply multi-omics and network-based system learning to reveal causative
molecular regulatory relationships contributing to a variety of phenotypes in cancer using CPTAC
proteogenomic data. We will start with rigorous preprocessing and quality control using a pipeline tailored to
MS-based proteomics data to detect and correct batch effects, outliers, sample labeling errors, as well as to
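impute missing values (Aim 1.1). We will then utilize novel statistical tools to jointly model ≥6 types of omics
data to systematically characterize the functional impact of DNA alterations (such as DNA mutations, CNA, and
methylation) (Aim 1.2). Such cis-/trans-regulatory networks will help us to elucidate how protein or pathway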
methylations) (Aim 1.2). Such cis-/trans-regulatory networks will help us to elucidate how protein or pathway
activities are shaped by genomic alterations in tumor cells. We will also construct protein/PTM co-expression
networks based on global-, phospho-, glyco- and other PTM-proteomics data (Aim 1.3). When constructing
these networks, we will use and create advanced computational tools to effectively borrow information from
literature, publicly available open databases, and transcriptome profiles. Moreover, we will study cell type
composition from bulk tissue using novel multi-omics deconvolution analyses, and identify immune subtypes
with distinct immune activation or evasion mechanisms (Aim 1.4). Furthermore, we will perform comprehensive
investigation of kinase and transcription factor activities by leveraging publicly available data extracted and
processed from many regulatory network databases (Aim 1.5). Together, Aims 1.2-1.5 will contribute to a large
collection of functionally related protein/PTM sets, co-expression network modules, immune signatures, as well
as kinase/TF activity scores. These features and feature-sets will then be tested for their associations with
disease phenotypes (Aim 1.6). For all analysis tasks in Aim 1, we will derive an integrated view of
commonalities and differences across multiple tumor types via Pan-Cancer analyses. Our Second Aim is to
further develop methods, software, and web-based tools to optimize the data analyses of our PGDAC. We will
develop novel statistical/computational tools; implement these methods as computationally efficient and
user-friendly software; and construct an integrated data analysis pipeline (Aim 2.1). We also plan to develop a set of
web-based services for querying, visualizing, and interpreting analysis results from CPTAC studies (Aim 2.2).
Our Third Aim is to nominate novel protein-based cancer biomarkers and drug targets for further investigation
by targeted proteomics assays. We will first apply machine-learning-based prediction models to features and
feature-sets from Aim 1 to identify protein biomarkers that predict disease outcome, treatment responses, and
therapeutically distinct disease subtypes (Aim 3.1). We will also query disea...
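
The co-expression network step (Aim 1.3) can be illustrated with a minimal sketch. The abstract does not name the statistical method the project uses, so the approach below (pairwise Pearson correlation over jointly observed samples, followed by a fixed threshold) is a simple stand-in chosen for illustration only. The protein names, the toy abundance values, and the `min_abs_r` cutoff are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation using only samples where both values are observed.

    Missing measurements are represented as None. Returns None when fewer
    than three paired observations exist or when either vector is constant.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def coexpression_edges(abundance, min_abs_r=0.8):
    """Return (protein_a, protein_b, r) for every pair with |r| >= min_abs_r.

    `abundance` maps a protein name to its per-sample abundance list.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    for p, q in combinations(sorted(abundance), 2):
        r = pearson(abundance[p], abundance[q])
        if r is not None and abs(r) >= min_abs_r:
            edges.append((p, q, r))
    return edges


# Hypothetical toy data: three proteins measured in five samples,
# with one missing value (None) for PROT_A.
toy = {
    "PROT_A": [1.0, 2.0, 3.0, 4.0, None],
    "PROT_B": [2.1, 3.9, 6.2, 8.0, 9.5],
    "PROT_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}

edges = coexpression_edges(toy)
for p, q, r in edges:
    print(f"{p} -- {q}  r={r:.3f}")
```

In this toy input only the PROT_A/PROT_B pair passes the cutoff. A real analysis would add the batch correction and imputation described in Aim 1.1 and would likely replace the hard threshold with a model-based network estimate.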

## Key facts

- **NIH application ID:** 10832668
- **Project number:** 5U24CA271114-03
- **Recipient organization:** ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
- **Principal Investigator:** Avi Ma'ayan
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $787,902
- **Award type:** 5
- **Project period:** 2022-07-01 → 2027-04-30
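
Several of the fields above are encoded in the project number itself. The sketch below splits `5U24CA271114-03` into its parts, following the standard NIH layout of application type, activity code, institute code, serial number, and support year. The regular expression and the function name `parse_project_number` are illustrative assumptions, not part of any NIH tool, and the pattern may not cover every suffix variant.

```python
import re

# Assumed layout: 1-digit application type, 3-character activity code,
# 2-letter institute/center code, 6-digit serial number, "-", 2-digit
# support year, then an optional suffix (for example a supplement marker).
PROJECT_NUMBER = re.compile(
    r"^(?P<application_type>\d)"
    r"(?P<activity_code>[A-Z][A-Z0-9]{2})"
    r"(?P<institute_code>[A-Z]{2})"
    r"(?P<serial_number>\d{6})"
    r"-(?P<support_year>\d{2})"
    r"(?P<suffix>[A-Z0-9]*)$"
)


def parse_project_number(project_number):
    """Split an NIH project number into named components.

    Raises ValueError if the string does not match the assumed layout.
    """
    match = PROJECT_NUMBER.match(project_number.strip())
    if match is None:
        raise ValueError(f"unrecognized project number: {project_number!r}")
    return match.groupdict()


parts = parse_project_number("5U24CA271114-03")
for name, value in parts.items():
    print(f"{name}: {value!r}")
```

Here the leading `5` matches the award type listed above (a non-competing continuation), `U24` is the activity code, `CA` is the code for the National Cancer Institute, and `03` marks the third year of support.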

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10832668

## Citation

> US National Institutes of Health, RePORTER application 10832668, Proteogenomic translator for cancer biomarker discovery towards precision medicine (5U24CA271114-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10832668. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*