Project Summary

Cancer claimed over 600,000 lives in the United States in 2020. A better understanding of the mechanisms underlying cancer progression has led to early detection strategies and novel treatment modalities that have contributed to the decrease in cancer-related deaths observed over the past few decades. Yet cancer remains a deadly disease, and there is an acute need to identify new cancer vulnerabilities. Doing so will require exploring understudied aspects of cancer, which in turn requires the development of novel technologies. One such understudied aspect is the extracellular matrix (ECM), a complex meshwork of proteins that provides architectural support and biochemical signals critical for the cellular functions required for tumor progression. Overcoming the technical challenges posed by largely insoluble ECM proteins, we previously devised a proteomic pipeline specifically geared towards ECM proteins and showed that the tumor ECM is composed of 200+ distinct proteins. We further identified ECM signatures predictive of patient outcome and novel ECM proteins that play functional roles in cancer progression. The ECM thus represents an important reservoir of potential prognostic biomarkers and therapeutic targets. However, the ECM has many more secrets to reveal. For example, ECM proteins exist in various isoforms and are extensively post-translationally modified, yet we do not know which proteoforms are present in the tumor ECM. The structure of ECM proteins and the architecture of the ECM meshwork are key to mediating function, yet very little is known about ECM protein folding and its impact on protein function. Since proteomics relies on the generation of peptides from proteins via proteolysis and on protein identification via database search, we propose that enhancing these steps will provide a more complete picture of the cancer ECM and significantly advance cancer research.
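To illustrate how the choice of protease can limit or extend sequence coverage, the following is a minimal sketch of an in-silico digestion, not the prediction model proposed in this project. It rests on several assumptions made purely for illustration: the cleavage rules are simplified (trypsin cuts after K or R unless followed by P, Glu-C cuts after E, no missed cleavages), the detectable peptide length window (`min_len`, `max_len`) is hypothetical, and the sequence `TOY_PROTEIN` is invented rather than taken from a real ECM protein.

```python
import re

# Simplified cleavage rules, assumed for illustration only.
# Each pattern is a zero-width match at the position where the enzyme cuts.
RULES = {
    "trypsin": re.compile(r"(?<=[KR])(?!P)"),  # after K/R, not before P
    "gluc": re.compile(r"(?<=E)"),             # after E (simplified)
}


def digest(sequence, enzyme):
    """Return (start, peptide) tuples for a complete digestion (no missed cleavages)."""
    # Ignore a cut site at the very end of the sequence: it yields no new peptide.
    sites = [m.start() for m in RULES[enzyme].finditer(sequence)
             if m.start() < len(sequence)]
    bounds = [0] + sites + [len(sequence)]
    return [(a, sequence[a:b]) for a, b in zip(bounds, bounds[1:])]


def coverage(sequence, enzymes, min_len=6, max_len=30):
    """Fraction of residues found in at least one peptide within the length window.

    min_len and max_len are hypothetical detectability limits, not measured values.
    """
    covered = set()
    for enzyme in enzymes:
        for start, peptide in digest(sequence, enzyme):
            if min_len <= len(peptide) <= max_len:
                covered.update(range(start, start + len(peptide)))
    return len(covered) / len(sequence)


# Invented toy sequence, not a real ECM protein.
TOY_PROTEIN = "AAAAAAAKAARAAAAAEAAAAAAK"

if __name__ == "__main__":
    for combo in (["trypsin"], ["gluc"], ["trypsin", "gluc"]):
        print(combo, round(coverage(TOY_PROTEIN, combo), 3))
```

In this toy case, the short tryptic peptide falls below the assumed length window and its residues are lost, whereas a second enzyme with a different specificity recovers them. This is the kind of gap that combining cleavage conditions is meant to close.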
Here, we propose to use in-silico modeling to define the optimal cleavage conditions to achieve near-complete coverage of ECM protein sequences (Aim 1). Standard proteomic protocols rely on protein denaturation prior to digestion, yet we know that many ECM functions are governed by its architecture. We thus propose to perform native ECM digestion to gain insights into the structure of individual proteins and the secondary and tertiary structures of the ECM meshwork (Aim 2). To facilitate ECM research, we previously developed a searchable database, MatrisomeDB, that compiles ECM proteomic datasets. Here, we propose to enhance the content and functionalities of MatrisomeDB to include our new prediction model and a new tool to visualize sequence coverage on 3D models of ECM proteins predicted by Google’s AlphaFold (Aim 3). Our technology, offering substantial improvements over conventional proteomic approaches, targets the unmet technical need to profile, with deep coverage and high sensitivity, the protein composition of the tumor ECM. When...