# Building a Substance Use Data Commons for Public Health Informatics

> **NIH R01** · UNIVERSITY OF WISCONSIN-MADISON · 2021 · $310,617

## Abstract

PROJECT SUMMARY
Substance misuse comprises a complex set of conditions, often associated with comorbidities and social factors,
that are the root cause of misuse and can lead to poor outcomes. Opioid misuse, non-opioid illicit use, and
alcohol misuse can also lead to repeated encounters with hospital emergency departments or first responders.
Although substance use disorders are a leading cause of repeat hospital visits, our fragmented data systems do
not generate comprehensive information on the scope and character of this poorly treated condition that would
allow providers to improve and monitor the quality of care. Crucial social and behavioral determinants strongly
linked with substance use (e.g., pre-hospital behavioral events) are not readily available to health systems, but
they are important data that can be used to better train artificial intelligence/machine learning (AI/ML) models. In
principle, hospitals are well-positioned to address these challenges. In practice, these opportunities are
frequently missed given the fragmented structure and design of current data systems. Many patients living with
substance misuse visit a specific hospital for the first time after an overdose or a related medical complication of
drug use, such as infection or trauma. Substance use-related conditions are among the top reasons for repeat
visits to the hospital. This supplement will expand on the existing work of the parent R01, which is focused on
clinical informatics, by building an AI/ML-ready public health informatics Substance Use Data Commons and sharing
a novel, all-inclusive prediction model that will help guide clinical interventions and regional health policy.
In this supplement, we aim to foster an academic-public-private collaboration to build a data ecosystem
that will, for the first time, harmonize data across a Wisconsin regional hospital, pre-hospital agencies such as
fire departments, and public health agencies. We will build a cohort of patients with substance misuse whose linked data are engineered as
an AI/ML-ready data commons. During our one-year timeline, we will train and test an AI/ML model that can
prioritize those at the highest risk for poor outcomes and uncover important biases in our data sources with input
from health equity experts. The supplement has three goals: (1) build a
Substance Misuse Data Commons across a major hospital system and Wisconsin agencies; (2) develop and
validate a machine learning tool for substance use-related health outcomes; and (3) examine model performance
across health disparity groups (racial/ethnic groups as well as neighborhoods). Access to combined data from
hospitals, public health agencies, and first responder agencies could provide a comprehensive data resource
that would allow us to reliably identify, risk stratify, and prioritize care for some of Wisconsin's most vulnerable
residents through AI/ML modeling.

## Key facts

- **NIH application ID:** 10411763
- **Project number:** 3R01DA051464-02S1
- **Recipient organization:** UNIVERSITY OF WISCONSIN-MADISON
- **Principal Investigator:** Majid Afshar
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $310,617
- **Award type:** 3 (supplement)
- **Project period:** 2020-09-30 → 2025-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10411763

## Citation

> US National Institutes of Health, RePORTER application 10411763, Building a Substance Use Data Commons for Public Health Informatics (3R01DA051464-02S1). Retrieved via AI Analytics 2026-06-01 from https://api.ai-analytics.org/grant/nih/10411763. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
