# Decentralized differentially-private methods for dynamic data release and analysis

> **NIH R01** · UNIVERSITY OF CALIFORNIA, SAN DIEGO · 2022 · $647,096

## Abstract

Large data sets are important in the development and evaluation of artificial intelligence (AI) and
statistical learning models to predict morbidity, mortality, and other important health outcomes.
Healthcare institutions are stewards of their patients’ data, and want to contribute to the
development, evaluation, and utilization of predictive analytics tools. However, they also know
that simple “de-identification” per HIPAA rules is not sufficient to protect patient privacy.
Additionally, other factors such as protection of market share, lack of control over who uses
shared data for what purposes, and concerns about patients’ reactions to having their data shared
without explicit consent make initiatives such as certain registries and centralized repositories
difficult to implement. We have shown that it is possible to decompose algorithms so that they
can run on data that stays at each healthcare center, thus mitigating the concerns about control
and potential misuse. In the first phase of this project, we concentrated on demonstrating the
accuracy and performance of these algorithms for the study of chronic diseases in which (1)
acquisition of new knowledge about the condition is slow (i.e., the disease is well understood, so
scientific discoveries are not being published at a rapid pace); and (2) the incidence and
presentation of the disease do not vary dramatically from place to place, and from person to
person. In this competitive renewal, we propose to develop decentralized predictive models that
meet all requirements for chronic diseases, with methods that also apply to rapidly evolving
acute conditions such as COVID-19. We propose new approaches to deal with sites that may be
missing certain patient profiles or certain variables but can still participate in model learning,
evaluation and implementation. These new AI algorithms will permit supervised and unsupervised
learning across institutions, using data from multiple modalities (e.g., imaging, genomes,
laboratory tests), and will allow privacy-protecting record linkage. We will test these algorithms
and approaches in data from three highly diverse medical centers across the US: Emory
University in Atlanta, University of Texas Health Science Center at Houston, and University of
California, San Diego.
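
The idea of decomposing an algorithm so that data stays at each center can be illustrated with a minimal sketch. The abstract does not specify the project's algorithms, so this example assumes a logistic regression model as a stand-in. The function names (`local_gradient`, `aggregate`) and the toy site data are hypothetical. Each site computes a gradient on its own rows, and only that gradient vector is sent to a coordinator. Differential-privacy noise and record linkage are not shown.

```python
import math

def local_gradient(rows, weights):
    """Sum of logistic-loss gradients over one site's rows.

    Only the returned vector would leave the site, never the rows themselves.
    """
    grad = [0.0] * len(weights)
    for features, label in rows:
        z = sum(w * x for w, x in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
        for j, x in enumerate(features):
            grad[j] += (p - label) * x
    return grad

def aggregate(gradients):
    """Coordinator step: element-wise sum of the per-site gradient vectors."""
    return [sum(parts) for parts in zip(*gradients)]

# Hypothetical toy data: (features with a leading bias term, binary outcome).
site_a = [([1.0, 0.5], 1), ([1.0, -1.2], 0)]
site_b = [([1.0, 2.0], 1), ([1.0, 0.1], 0), ([1.0, -0.7], 0)]

weights = [0.0, 0.0]

# Decentralized: each site works on its own rows, then gradients are summed.
decentralized = aggregate([
    local_gradient(site_a, weights),
    local_gradient(site_b, weights),
])

# Centralized reference: the same gradient computed on pooled rows.
pooled = local_gradient(site_a + site_b, weights)
```

Because the logistic-loss gradient is a sum over patient rows, the aggregated per-site gradients match the pooled gradient up to floating-point rounding. This is why, in this simplified setting, a gradient-based fit does not need the rows to be centralized.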

## Key facts

- **NIH application ID:** 10367349
- **Project number:** 9R01LM013712-05A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN DIEGO
- **Principal Investigator:** Xiaoqian Jiang
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $647,096
- **Award type:** 9
- **Project period:** 2022-03-01 → 2022-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10367349

## Citation

> US National Institutes of Health, RePORTER application 10367349, Decentralized differentially-private methods for dynamic data release and analysis (9R01LM013712-05A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10367349. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
