# Personalizing AAV Management by Leveraging Big Data: Targeting Complication Clusters

> **NIH R03** · MASSACHUSETTS GENERAL HOSPITAL · 2021 · $105,843

## Abstract

ANCA-associated vasculitis (AAV) is a small vessel vasculitis associated with disease- and treatment-related
complications that contribute to reduced quality of life and excess mortality compared to the general
population. As rates of flare and mortality improve with contemporary treatments, attention is increasingly
shifting to complications (e.g., renal failure, infection, cardiovascular disease) as clinically relevant
and patient-oriented outcomes. However, our understanding of how best to address and prevent complications
is limited because they are typically studied in isolation, within a “single disease framework.” We do not
understand how complications tend to co-occur in individuals as complication clusters. Moreover, with several
available treatment options for AAV, comparative effectiveness studies using real-world experience data and
relevant outcomes like complication clusters are needed to guide treatment decisions in a manner that
personalizes care, improves quality of life, and reduces mortality. However, we do not have the methods to
accurately and efficiently assemble an AAV cohort using state-of-the-art algorithms that leverage
heterogeneous claims and electronic health record (EHR) data. The aims of this proposal are to (1) apply
advanced clinical informatics methods (i.e., machine learning and natural language processing) to identify AAV
cases in big data to assemble a large cohort and (2) determine complication clusters in an AAV cohort by
applying latent transition analysis. To achieve these aims, we will leverage methodologic expertise developed
through collaborations established during the PI’s K23 and use a novel data source that includes EHR data
linked to Medicare and Medicaid claims. The PI’s team has previously demonstrated that unstructured (i.e.,
free-text) EHR data can be used to study topics mentioned in clinical notes of AAV patients and that keywords
in these notes can help identify AAV patients, but neither machine learning nor sophisticated natural language
processing has previously been used to identify AAV cases. In addition, our prior work has examined AAV
complications in isolation (e.g., renal disease, cardiovascular disease), but here we seek to identify phenotypes
of complications (complication clusters) that tend to co-occur in patients, how patients transition between
clusters over time, and what factors predict a person’s membership in a complication cluster. The major goal of
this proposal is to build further preliminary data in preparation for an R01 application over the next 24 months.
The planned R01 will focus on comparative effectiveness studies in AAV using cohorts assembled in big data
and clinically relevant, patient-oriented outcomes, like complication clusters. The results of these studies can
then be used as inputs in simulation models built during the PI’s K23 to guide optimal patient-oriented treatment
decisions. Ultimately, the goal of this research program is to i...

## Key facts

- **NIH application ID:** 10198103
- **Project number:** 1R03AR078938-01
- **Recipient organization:** MASSACHUSETTS GENERAL HOSPITAL
- **Principal Investigator:** Zachary Scott Wallace
- **Activity code:** R03
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $105,843
- **Award type:** 1
- **Project period:** 2021-04-01 → 2023-03-31
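
Several of the fields above (award type, activity code) are also encoded in the project number itself. The sketch below shows one way to split `1R03AR078938-01` into its parts, assuming the commonly documented NIH layout of application type, activity code, two-letter institute code, six-digit serial number, and two-digit support year. The function name and dictionary keys are hypothetical and chosen for illustration; they are not part of any NIH API.

```python
import re

# Assumed layout of an NIH project number:
#   type (1 digit) + activity code (3 chars) + institute code (2 letters)
#   + serial number (6 digits) + "-" + support year (2 digits) + optional suffix
PROJECT_NUMBER = re.compile(
    r"^(?P<award_type>\d)"
    r"(?P<activity_code>[A-Z][A-Z0-9]{2})"
    r"(?P<institute_code>[A-Z]{2})"
    r"(?P<serial>\d{6})"
    r"-(?P<support_year>\d{2})"
    r"(?P<suffix>[A-Z0-9]*)$"
)


def parse_project_number(project_number: str) -> dict:
    """Split a project number into named parts (hypothetical helper)."""
    match = PROJECT_NUMBER.match(project_number.strip())
    if match is None:
        raise ValueError(f"Unrecognized project number: {project_number!r}")
    return match.groupdict()


fields = parse_project_number("1R03AR078938-01")
for name, value in fields.items():
    print(f"{name}: {value!r}")
```

For this record, the parsed award type and activity code should agree with the **Award type** and **Activity code** entries listed above, which makes the helper usable as a simple consistency check on the record.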

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10198103

## Citation

> US National Institutes of Health, RePORTER application 10198103, Personalizing AAV Management by Leveraging Big Data: Targeting Complication Clusters (1R03AR078938-01). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/10198103. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
