# PANDA-MSD: Predictive Analytics via Networked Distributed Algorithms for Multi-System Diseases

> **NIH U01** · UNIVERSITY OF PENNSYLVANIA · 2022 · $1,212,233

## Abstract

This proposal seeks support to develop novel data integration methods using electronic health records (EHR)
from multiple CTSA hubs to create predictive models of multi-system diseases. The proposed project directly
addresses the areas of emphasis in PAR-19-099 to “engage new collaborators in pre-existing collaborations to
solve a translational science problem no one hub can solve alone”.
Research gap: The overarching goal of this proposal is to develop the Predictive Analytics via Networked
Distributed Algorithms (PANDA) framework, which will enable accurate risk prediction to help healthcare
providers reach accurate diagnoses earlier. Our proposed methods directly address two major barriers: 1) lack
of predictive models for multi-system conditions; 2) lack of algorithms that effectively combine data from
multiple sites in a privacy-preserving and communication-efficient fashion.
In this proposal, we will develop and evaluate the PANDA framework using two prototypic multi-system
conditions, with different levels of prevalence: granulomatosis with polyangiitis (GPA, a type of vasculitis,
prevalence of 74 per million) and psoriatic arthritis (PsA, prevalence of 1,500 per million), with the expectation that the
approach will be readily applicable to other diseases. These two conditions are particularly well-suited to the
development of our predictive methods given the commonly encountered delays in diagnosis that can range
from months to years. These delays may be associated with high morbidity and early mortality. We have
three Specific Aims:

- **Aim 1.** Develop predictive models for granulomatosis with polyangiitis and psoriatic arthritis, and data
  integration algorithms to enable secure and efficient data sharing among multiple institutions.
- **Aim 2.** Test the predictive models from Aim 1 using aggregated data (not individual patient data, IPD) from a
  separate set of CTSA sites to validate the data integration methodology.
- **Aim 3.** Develop a “toolbox” of resources through which the PANDA processes of algorithm generation
  and data aggregation can be easily shared with and adopted for use by all CTSAs and others.

The success of this project will lead to novel analytic tools for facilitating efficient and privacy-preserving data
sharing and collaborative risk predictions across CTSA sites. These PANDA tools for assisting clinical diagnoses
and interventions should then be studied through pragmatic trials to evaluate their potential to decrease
diagnostic delays and alter patients’ health trajectories. This project is highly feasible and
is potentially transformative for both data science and clinical medicine.
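
The abstract describes combining data from multiple sites using aggregated data rather than IPD, but it does not specify the algorithms. As a rough illustration of that general idea only, and not the PANDA method itself, the sketch below fits a logistic regression in which each site returns just a summed gradient and a row count to a coordinator. The function names, learning rate, and toy data are all hypothetical.

```python
import math

def site_gradient(beta, rows):
    """Run locally at one site: summed log-likelihood gradient and row count.

    Only these aggregates would leave the site, never the patient-level rows.
    Each row is (features, label), with features[0] == 1.0 as the intercept.
    """
    grad = [0.0] * len(beta)
    for x, y in rows:
        z = sum(b * xi for b, xi in zip(beta, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
        for j, xi in enumerate(x):
            grad[j] += (y - p) * xi
    return grad, len(rows)

def fit_distributed(sites, n_features, lr=0.5, n_rounds=2000):
    """Coordinator loop: sum the per-site gradients, take one ascent step."""
    beta = [0.0] * n_features
    for _ in range(n_rounds):
        total = [0.0] * n_features
        n = 0
        for rows in sites:  # in practice, one network round trip per site
            g, m = site_gradient(beta, rows)
            total = [t + gj for t, gj in zip(total, g)]
            n += m
        beta = [b + lr * t / n for b, t in zip(beta, total)]
    return beta

# Hypothetical toy data: (intercept, one risk feature) -> diagnosis label.
site_a = [((1.0, 0.0), 0), ((1.0, 1.0), 0), ((1.0, 2.0), 1)]
site_b = [((1.0, 1.0), 1), ((1.0, 3.0), 1), ((1.0, 0.5), 0), ((1.0, 2.5), 0)]

beta_dist = fit_distributed([site_a, site_b], n_features=2)
beta_pool = fit_distributed([site_a + site_b], n_features=2)  # pooled reference
print(beta_dist, beta_pool)
```

Because the log-likelihood gradient is a sum over patients, adding the per-site gradients reproduces the pooled-data gradient, so the two fits agree up to floating-point rounding. This simple scheme needs one communication round per iteration and offers no formal privacy guarantee, which is where communication-efficient and privacy-preserving methods of the kind the proposal targets would differ.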

## Key facts

- **NIH application ID:** 10368562
- **Project number:** 1U01TR003709-01A1
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Jiang Bian
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $1,212,233
- **Award type:** 1
- **Project period:** 2022-08-05 → 2026-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10368562

## Citation

> US National Institutes of Health, RePORTER application 10368562, PANDA-MSD: Predictive Analytics via Networked Distributed Algorithms for Multi-System Diseases (1U01TR003709-01A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10368562. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
