PROJECT SUMMARY
Many valuable datasets have been generated by NIH Common Fund programs, including large RNA-sequencing datasets from multiple tissues and species as well as increasingly available high-quality mass spectrometry-based proteomics data. Although proteomics offers unique insight into various pathophysiological processes, many proteomics datasets currently remain smaller in scale than their RNA-seq counterparts, and it remains to be investigated how best to integrate proteomics and RNA-seq data to maximize their utility. The goal of this pilot project is to assess the feasibility of integrating multi-omics data to generate new hypotheses about cross-tissue physiology. We will perform three integrated tasks during the funding period. First, we will integrate data across tissues, species, and omics data types in two NIH Common Fund projects, namely GTEx v8 and MoTrPAC Release v1.0, with the aid of transfer learning methods. Our goal is to learn a low-dimensional representation of gene expression structure across human tissues that can be applied to other datasets to facilitate integrative analysis. Second, we will apply an RNA-sequencing-guided proteomics pipeline and software that we recently developed to extract hidden peptide information from Common Fund proteomics data, including isoforms and post-translational modifications. We will then evaluate the feasibility and utility of integrating transcriptomics and proteomics data to examine multi-tissue correlations in gene expression. Lastly, we will use this data analysis pipeline to predict cross-tissue communication patterns from transcriptomics and proteomics data, then perform limited experimental validation of the computational findings using human induced pluripotent stem cell (iPSC)-derived cells.
If successful, we envision that the results will inform data integration strategies that further increase the utility of large-scale proteomics and RNA-seq data in the public domain, and will generate testable hypotheses on gene co-expression across multiple tissues to guide future work.