PROJECT SUMMARY
The primary goal of this study is to construct predictive models (classifiers) of pulmonary sarcoidosis and of progressive (P) vs. non-progressive (NP) disease that will ultimately serve to improve outcomes in pulmonary sarcoidosis. We have assembled a unique investigative team with expertise in proteomics, immunology, genomics, sarcoidosis clinical care, bioinformatics, and statistics. Sarcoidosis is a diagnostically challenging, immune-mediated systemic disease. It causes significant morbidity and mortality, primarily through progressive pulmonary disease, yet the factors that drive pulmonary disease and distinguish P from NP disease are unknown. The strategies to treat pulmonary sarcoidosis, including the triggers to initiate treatment, are non-specific; treatment usually relies on suppressing the immune system with corticosteroids and is associated with considerable side effects. Transcriptional changes in the lung and blood have defined a signature of P disease in cross-sectional studies. Because proteins are the main effectors of cellular function, and their alterations disrupt biologic systems and lead to disease, they are a logical source of biomarkers. Our preliminary data from bronchoalveolar lavage fluid and cells demonstrate significant proteome-wide alterations in pulmonary sarcoidosis vs. controls and in P vs. NP disease. We hypothesize that effective markers of disease, and those distinguishing P from NP disease, will reflect biological processes active in disease and progression. Secondarily, by characterizing cellular proteins, global phosphorylation events, and cell-specific RNA expression, we will define known proteins/genes/pathways, such as PI3K/Akt/mTOR and other serine-threonine kinase signaling mechanisms, as well as novel pathogenic proteins/genes, such as those involved in endocytic and aryl hydrocarbon receptor signaling, which will have implications for mechanism and therapy.
We will use high-resolution mass spectrometry (MS), advanced bioinformatics, and computational tools in well-phenotyped sarcoidosis patients. In Aim 1, we will determine a disease-specific classifier for diagnosing sarcoidosis, using a Discovery Cohort of sarcoidosis cases and diseased and healthy controls (already recruited) for development and a Validation Cohort of sarcoidosis cases and controls (recruited for this study) to verify and optimize classifier performance. In Aim 2, we will identify a protein classifier of P vs. NP disease using the same approach as in Aim 1. In Aim 3, we will use a novel single-cell RNA-sequencing approach, CITE-seq, to identify transcription from specific cells and integrate it with protein changes, including examination of global phosphorylation events, to identify kinase signaling and discover cell-specific cellular proteins/genes associated with disease and progression in a subset of our Validation Cohort. At the end of this study, we will have defined diagnostic biomarkers of...