Interpretable graphical models for large multi-modal COPD data (R01HL159805)

NIH RePORTER · NIH · R01 · $501,802

Abstract

One of the most important tasks in today’s era of precision medicine is to understand the mechanisms and factors affecting the development of clinical outcomes. Another is to develop interpretable, predictive models for those outcomes. In recent years, machine learning methods such as deep learning and random forests have dominated predictive modeling, fueled by the unprecedented volume of data generated in some research areas. However, the interpretability of these methods is not straightforward, and their accuracy decreases when only small- to medium-sized training datasets are available. Furthermore, their predictive models do not uncover the complex web of interactions among the other variables in the dataset, which is essential for fully understanding disease mechanisms. Also, most such methods are not well suited to accommodating mixed data types (e.g., continuous, discrete) in the same dataset. Probabilistic graphical models (PGMs) offer a promising alternative to classical machine learning methods because they are flexible and versatile. They can both identify the direct (causal) relations between variables, pointing to disease mechanisms, and build predictive models over diverse data, with good results even on smaller training datasets. They have been used for classification, biomarker selection, identification of modifiable risk factors of an outcome, and mechanistic studies of perturbations of disease networks. In previous years we extended the PGM theoretical framework to the analysis of mixed continuous and discrete variables, with or without unmeasured confounders, and we can now evaluate and incorporate prior information in mixed data graph learning.
We have successfully applied these methods to diverse clinically important problems, including malignancy prediction of undetermined lung nodules, identification of microbiome and other factors affecting pneumonia, and selection of SNP biomarkers for combination treatment of cancer patients. Our objective is to develop novel interpretable methods for the analysis of any-type data and use them to address clinically relevant questions in chronic obstructive pulmonary disease (COPD), an important chronic lung disease. The methods will be evaluated on synthetic and real data, including parallel datasets with genomic, genetic, imaging and clinical COPD data. Our central aim is to identify factors and mechanisms of disease progression using different modalities of patient data. The deliverables will be (1) new PGM approaches for integrative analysis of any-type data; (2) a new, fully documented software package (in R, Python) that can be incorporated in other pipelines; (3) a new web portal to disseminate our methodologies to non-computer-savvy COPD researchers; and (4) results on the pathogenesis and predictive features of COPD. This cross-disciplinary team project is ...

Key facts

NIH application ID: 10689574
Project number: 7R01HL159805-06
Recipient: UNIVERSITY OF FLORIDA
Principal Investigator: PANAGIOTIS V BENOS
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $501,802
Award type: 7
Project period: 2021-07-25 → 2025-06-30