ABSTRACT

Sarcoidosis is a systemic granulomatous disease with striking heterogeneity of clinical course and increasing mortality and morbidity rates. Diagnosis is difficult and predicting disease outcomes is virtually impossible. While it is known that sarcoidosis likely involves host genetic susceptibility and a dysregulated immune response to any number of environmental factors, the mechanisms by which granulomas form and the determinants of severity and disease manifestations remain elusive. Our team has been at the forefront of genetics and transcriptomics in sarcoidosis, including the identification of genes underlying susceptibility, severity, and ancestry- and organ-specific effects; the first and only single-cell RNA sequencing study of sarcoidosis; and the first genome-wide association study (GWAS) of pulmonary fibrosis in patients of African ancestry (AA). These findings provide the foundation for further dissection of the cellular mechanisms that lead to systemic immune dysregulation and tissue-specific inflammatory responses. Our current proposal will advance the field by filling critical gaps in sarcoidosis research, including the lack of biomarkers for sarcoidosis diagnosis and prognosis and the absence of mechanistic connections between genetics, genomics, and immune profiles in the periphery and in affected organs. We will exploit the immunological, genomic, and bioinformatic expertise of our team to apply an innovative, iterative, and integrative multi-omic approach to discover molecular signatures characterizing sarcoidosis and its disease burden. Specifically, in Aim 1, we will use circulating protein levels from over 200 sarcoidosis patients of AA and European ancestry (EA) and 200 controls to a) develop predictive models of sarcoidosis susceptibility and disease burden via a novel application of machine learning, and b) define unique immune cell subsets based on cell-specific expression of cytokines of interest, and thus identify novel candidate genes and pathways.
In Aim 2, we will identify the candidate genes best suited for functional and drug target studies by integrating data collected across multiple biological systems (including genetics, transcriptomics, and proteomics) and from multiple sample sources (including circulating immune cells and bronchoalveolar lavage). Our work will be greatly facilitated by the extensive genetic, genomic, and clinical data available to us from our own cohorts (>2,900 EA and >3,000 AA) and the TOPMed Consortium (of which we are members). This application includes a novel approach to integrating data across multiple biological systems, with strong preliminary data to support the rigor and success of our proposal.