PANDA-MSD: Predictive Analytics via Networked Distributed Algorithms for Multi-System Diseases

NIH RePORTER · NIH · U01 · $1,160,870

Abstract

Project Summary

This proposal seeks support to develop novel data integration methods that use electronic health records (EHR) from multiple CTSA hubs to create predictive models of multi-system diseases. The proposed project directly addresses the areas of emphasis in PAR-19-099 to “engage new collaborators in pre-existing collaborations to solve a translational science problem no one hub can solve alone”.

Research gap: The overarching goal of this proposal is to develop the Predictive Analytics via Networked Distributed Algorithms (PANDA) framework, which will enable accurate risk prediction to help healthcare providers reach accurate diagnoses earlier. Our proposed methods directly address two major barriers: 1) lack of predictive models for multi-system conditions; 2) lack of algorithms that effectively combine data from multiple sites in a privacy-preserving and communication-efficient fashion.

In this proposal, we will develop and evaluate the PANDA framework using two prototypic multi-system conditions with different levels of prevalence: granulomatosis with polyangiitis (GPA, a type of vasculitis, prevalence of 74 per million) and psoriatic arthritis (PsA, prevalence of 1,500 per million), with the expectation that the approach will be readily applicable to other diseases. These two conditions are particularly well-suited to the development of our predictive methods because diagnosis is commonly delayed by months to years. These delays may be associated with high morbidity and early mortality.

We have three Specific Aims:

Aim 1. Develop predictive models for granulomatosis with polyangiitis and psoriatic arthritis, and data integration algorithms to enable secure and efficient data sharing among multiple institutions.

Aim 2. Test the predictive models from Aim 1 using aggregated data (not individual patient data, IPD) from a separate set of CTSA sites to validate the data integration methodology.

Aim 3. Develop a “toolbox” of resources through which the PANDA processes of algorithm generation and data aggregation can be easily shared with and adopted for use by all CTSAs and others.

The success of this project will lead to novel analytic tools for facilitating efficient and privacy-preserving data sharing and collaborative risk predictions across CTSA sites. These PANDA tools, intended to assist clinical diagnosis and intervention, should then be studied through pragmatic trials to evaluate their potential to decrease diagnostic delays and alter patients’ health trajectories. This project is highly feasible and is potentially transformative for both data science and clinical medicine.

Key facts

NIH application ID: 10872123
Project number: 5U01TR003709-03
Recipient: UNIVERSITY OF PENNSYLVANIA
Principal Investigator: Jiang Bian
Activity code: U01
Funding institute: NIH
Fiscal year: 2024
Award amount: $1,160,870
Award type: 5
Project period: 2022-08-05 → 2026-05-31