PROJECT SUMMARY/ABSTRACT

Chronic obstructive pulmonary disease (COPD) is the leading cause of respiratory mortality in the United States. COPD is highly heterogeneous, and some COPD therapies are applied only to specific, clinically defined subtypes. With the advent of multiple high-throughput biological assays and machine learning approaches, data-driven subtypes are increasingly being recognized. We hypothesize that such subtypes exist in COPD and that they can be identified using an integrative, multi-'omic approach. To test this hypothesis, we first propose to complement existing RNA and whole genome sequencing data in the well-phenotyped COPDGene study with peripheral blood microRNA sequencing. We will study the relationship of microRNA to genetic variation and gene expression in COPD. Next, we will apply a patient-based network similarity method to these three data types to identify COPD molecular subtypes. Finally, we will associate these subtypes with important clinical phenotypes and outcomes and validate them in an independent subset of subjects. Our analysis targets a key clinical problem in COPD management and will allow the mentee to become an independent investigator who applies bioinformatic and machine learning methods to genomic data in respiratory diseases.