# Interpretable graphical models for large multi-modal COPD data

> **NIH R01** · UNIVERSITY OF PITTSBURGH AT PITTSBURGH · 2021 · $559,193

## Abstract

One of the most important tasks in today’s era of precision medicine is to understand the mechanisms and the
factors affecting the development of clinical outcomes. Another important task is to develop interpretable,
predictive models for outcomes. In recent years, many machine learning methods have dominated the task of
predictive modeling, including deep learning, random forests and others. They are fueled by the unprecedented
volume of data that have been generated in some research areas. However, the interpretability of these methods
is not straightforward and their accuracy decreases when only small- to medium-sized training datasets are
available. Furthermore, their predictive models do not uncover the complex web of interactions between other
variables in the dataset, which is essential for fully understanding disease mechanisms. Also, most such methods
are not well suited to accommodate mixed data types (e.g., continuous, discrete) in the same dataset.
Probabilistic graphical models (PGMs) offer a promising alternative to classical machine learning methods,
because they are flexible and versatile. They can both identify the direct (causal) relations between variables,
pointing to disease mechanisms, and build predictive models over diverse data, with good results even with
smaller training datasets. They have been used for classification, biomarker selection, identification of modifiable
risk factors of an outcome, and mechanistic studies of perturbations of disease networks. In previous years,
we extended the PGM theoretical framework to the analysis of mixed continuous and discrete variables, with or
without unmeasured confounders; and we can now evaluate and incorporate prior information in mixed data
graph learning. We successfully applied those methods to diverse clinically important problems, including
malignancy prediction of undetermined lung nodules, identification of microbiome and other factors affecting
pneumonia, and selection of SNP biomarkers for combination treatment of cancer patients.
Our objective is to develop novel interpretable methods for analysis of any-type data and use them to address
clinically relevant questions in COPD, an important chronic lung disease. Method evaluation will be done on
synthetic and real data, including parallel datasets with genomic, genetic, imaging and clinical COPD data. Our
central aim is to identify factors of disease mechanisms of progression using different modalities of patient data.
The deliverables will be (1) new PGM approaches for integrative analysis of any-type data; (2) a new, fully
documented software package (in R, Python) that can be incorporated in other pipelines; (3) a new web portal
to disseminate our methodologies to non-computer-savvy COPD researchers; (4) results on the pathogenesis
and predictive features of chronic obstructive pulmonary disease (COPD). This cross-disciplinary team project
is ...

## Key facts

- **NIH application ID:** 10301433
- **Project number:** 9R01HL159805-05A1
- **Recipient organization:** UNIVERSITY OF PITTSBURGH AT PITTSBURGH
- **Principal Investigator:** PANAGIOTIS V BENOS
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $559,193
- **Award type:** 9
- **Project period:** 2015-07-01 → 2022-06-15

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10301433

## Citation

> US National Institutes of Health, RePORTER application 10301433, Interpretable graphical models for large multi-modal COPD data (9R01HL159805-05A1). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10301433. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*