A proteoform-centric informatics platform for targeted top-down characterization and quantitation

NIH RePORTER · NIH · R43 · $251,961

Abstract

Proteoforms are the driving force of biological processes. Accurate identification, characterization, and quantitation of proteoforms are therefore essential for understanding protein roles in human disease. Top-down proteomics focuses on the direct analysis of proteoforms, providing the specificity needed to uncover the precise molecular participants in biological phenomena. The landscape of proteomics has been shifting from digestion-based methods to intact protein analyses, with top-down proteomics gaining a foothold in important biomedical and pharmaceutical studies. Software solutions dedicated to targeted top-down proteomics are needed now more than ever to propel the next wave of innovation and top-down technology adoption. In this proposed research, Proteinaceous will further develop a new bioinformatic tool named Proteoform Finder to provide the mass spectrometry community with a proteoform-centric software solution for targeted top-down proteomics. Proteoform Finder will present proteoform data through proteoform family network diagrams. A proteoform funnel will be implemented in Proteoform Finder for importing proteoform families from discovery top-down search results or creating them through an intuitive graphical user interface. The nodes of the proteoform family network will represent individual proteoforms, with each proteoform linked to an underlying repository of characterization and quantitative data. Handling the data in this way will allow seamless comparisons across quantitative studies, as well as merging of fragmentation maps and spectra to produce optimal characterization coverage of proteoforms. Proteoform Finder will have a robust quantitation pipeline for quantifying proteoforms of interest. In addition, characterization studies can be carried out with any available fragmentation data.
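The proteoform family network described above can be pictured as a small graph whose nodes carry their own quantitative and fragmentation records. The following is only a minimal sketch under stated assumptions: the class names, fields, and the coverage definition (covered backbone cleavage sites divided by sequence length minus one) are hypothetical illustrations and are not taken from Proteoform Finder.

```python
from dataclasses import dataclass, field


@dataclass
class Proteoform:
    """One node of a family network (hypothetical structure)."""
    name: str
    length: int                                   # number of residues
    quant: dict = field(default_factory=dict)     # study name -> abundance
    cleavages: set = field(default_factory=set)   # covered backbone sites


@dataclass
class ProteoformFamily:
    """Nodes are proteoforms; edges are labeled mass-shift relationships."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)     # (name_a, name_b, label)

    def add(self, proteoform):
        self.nodes[proteoform.name] = proteoform

    def link(self, a, b, label):
        # Both endpoints must already be in the family.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both proteoforms must be added before linking")
        self.edges.append((a, b, label))

    def neighbors(self, name):
        # Edges are treated as undirected.
        out = [b for a, b, _ in self.edges if a == name]
        out += [a for a, b, _ in self.edges if b == name]
        return out

    def record_quant(self, name, study, abundance):
        self.nodes[name].quant[study] = abundance

    def merge_fragments(self, name, cleavage_sites):
        # Merging fragmentation maps is modeled as a set union of sites.
        self.nodes[name].cleavages |= set(cleavage_sites)

    def coverage(self, name):
        node = self.nodes[name]
        return len(node.cleavages) / (node.length - 1)


# Usage: two related proteoforms, with fragment maps merged from two runs.
family = ProteoformFamily()
family.add(Proteoform("P1", length=10))
family.add(Proteoform("P1+Ac", length=10))
family.link("P1", "P1+Ac", "acetylation")
family.merge_fragments("P1", {1, 2, 3})
family.merge_fragments("P1", {3, 4})
family.record_quant("P1", "study_A", 1.2e6)
family.record_quant("P1", "study_B", 0.9e6)
```

Keeping the quantitative and fragmentation records on the node itself is one way to make cross-study comparison and coverage merging a per-proteoform lookup; a real implementation would likely back these fields with a database rather than in-memory dictionaries.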
The quantitative engine in Proteoform Finder is built on an isotopic fitting algorithm that matches theoretical isotopic distributions, calculated from proteoform chemical formulas, to the experimental data. In contrast to “averagine”-based isotopic fitting methods, using the exact formula for the proteoform reduces deviations between theoretical and observed distributions, while also eliminating the off-by-one errors often observed during protein analysis with non-targeted mass determination algorithms. Our characterization algorithms will offer a high-end user experience for matching fragment ions to proteoforms, along with a host of fragmentation validation features. Altogether, Proteoform Finder will be a comprehensive targeted software package for analyzing top-down mass spectrometry quantitation and characterization data that will assist researchers across pharma, academia, and clinical settings in making important proteoform discoveries.
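To illustrate the general idea of formula-based isotopic fitting, the sketch below computes an aggregated isotopic distribution from an elemental formula by repeated convolution and scores it against observed peak intensities. It is not the Proteoform Finder algorithm, whose details the abstract does not give. The function names, the cosine-similarity score, the least-squares scale factor, and the example formula are all assumptions made for illustration, and the abundance table holds approximate natural abundances binned by nominal mass offset.

```python
from math import sqrt

# Approximate natural isotope abundances, indexed by extra-neutron count
# relative to the lightest isotope (assumed values for illustration).
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}


def convolve(a, b, max_len):
    """Polynomial product of two distributions, truncated to max_len peaks."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def element_power(dist, n, max_len):
    """Distribution for n atoms of one element (exponentiation by squaring)."""
    result, base = [1.0], list(dist)
    while n > 0:
        if n & 1:
            result = convolve(result, base, max_len)
        base = convolve(base, base, max_len)
        n >>= 1
    return result


def isotopic_distribution(formula, n_peaks=20):
    """Relative peak heights by nominal mass offset, normalized to sum to 1."""
    dist = [1.0]
    for element, count in formula.items():
        part = element_power(ABUNDANCE[element], count, n_peaks)
        dist = convolve(dist, part, n_peaks)
    total = sum(dist)
    return [p / total for p in dist]


def fit_score(theoretical, observed):
    """Return (least-squares scale factor, cosine similarity)."""
    dot = sum(t * o for t, o in zip(theoretical, observed))
    tt = sum(t * t for t in theoretical)
    oo = sum(o * o for o in observed)
    if tt == 0 or oo == 0:
        raise ValueError("distributions must contain nonzero intensity")
    return dot / tt, dot / sqrt(tt * oo)


# Hypothetical formula for a small protein of roughly 8.5 kDa.
formula = {"C": 378, "H": 629, "N": 105, "O": 118, "S": 1}
theory = isotopic_distribution(formula)

aligned = [3.0 * p for p in theory]      # ideal observation, arbitrary scale
shifted = [0.0] + aligned[:-1]           # same envelope, off by one peak

scale, aligned_cos = fit_score(theory, aligned)
_, shifted_cos = fit_score(theory, shifted)
```

Because the theoretical envelope is pinned to a known formula, an observed envelope displaced by one isotopic peak scores lower than the aligned one, which is the intuition behind avoiding off-by-one assignments. The scale factor returned by the fit is one possible proxy for abundance.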

Key facts

NIH application ID
10325492
Project number
1R43GM142386-01A1
Recipient
PROTEINACEOUS, INC.
Principal Investigator
Kenneth Durbin
Activity code
R43
Funding institute
NIH
Fiscal year
2021
Award amount
$251,961
Award type
1
Project period
2021-09-25 → 2022-10-24