# Data Driven Strategies for Substance Misuse Identification in Hospitalized Patients

> **NIH R01** · UNIVERSITY OF WISCONSIN-MADISON · 2022 · $717,699

## Abstract

PROJECT SUMMARY
 The rate of substance use-related hospital visits in the US continues to increase, and now outpaces
visits for heart disease and respiratory failure. The prevalence of substance misuse (nonmedical use of opioids
and/or benzodiazepines, illicit drugs, and/or alcohol) in hospitalized patients is estimated to be 15%-25% and
far exceeds the prevalence in the general population. With over 35 million hospitalized patients per year, tens
of millions of patients are not screened for substance misuse during their stay. Despite the recommendation for
self-report questionnaires (single-question universal screens, Alcohol Use Disorders Identification Test
[AUDIT], Drug Abuse Screening Test [DAST]), screening rates remain low in hospitals. Current screening
methods are resource-intensive, so a comprehensive and automated approach to substance misuse screening
that will augment current clinical workflow would therefore be of great utility.
 With the advent of Meaningful Use in the electronic health record (EHR), efficiency for substance misuse
detection may be improved by leveraging data collected during usual care. Documentation of substance use is
common and occurs in 97% of provider admission notes, but their free text format renders them difficult to
mine and analyze. Natural Language Processing (NLP) and machine learning are subfields of artificial
intelligence (AI) that can analyze text data in the EHR to identify substance misuse. Modern
NLP has fused with machine learning, which focuses on learning from data. In particular, the
most powerful NLP methods rely on supervised learning, a type of machine learning that takes advantage of
current reference standards to make predictions about unseen cases.
 In our earlier version of an NLP and machine learning tool, our opioid and alcohol misuse classifiers
successfully used data from clinical notes collected in the first 24 hours of hospital admission to reach a
sensitivity and specificity above 75% for detecting alcohol or opioid misuse. We will improve the performance
of our baseline, individual NLP single-substance classifiers for alcohol and opioid misuse by implementing
multi-label and multi-task machine learning methods. These methods will take advantage of information shared
across different types of substance misuse and better capture the state of a patient within a single model. The
resulting classifier will be capable of jointly inferring all types of substance misuse (alcohol misuse, opioid
misuse, and non-opioid illicit drug misuse), including polysubstance use, and can cater to each individual patient’s
substance use treatment needs.
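
As an illustration of the multi-label framing described above, the following is a minimal sketch and not the project's implementation. It assumes each hospitalization is represented as a binary indicator vector over the three misuse types named in the abstract, so polysubstance use is simply more than one positive label, and it computes sensitivity and specificity separately for each label. The label names, their order, and all function names are hypothetical choices made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label names and order, chosen only for this sketch.
LABELS: Tuple[str, ...] = ("alcohol", "opioid", "non_opioid_illicit")


def encode(substances: Sequence[str]) -> List[int]:
    """Turn a collection of misuse types into a multi-label indicator vector."""
    unknown = set(substances) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in substances else 0 for label in LABELS]


def is_polysubstance(vector: Sequence[int]) -> bool:
    """True when more than one misuse label is positive for one hospitalization."""
    return sum(vector) > 1


def per_label_sens_spec(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, Tuple[float, float]]:
    """Compute (sensitivity, specificity) independently for each label."""
    results: Dict[str, Tuple[float, float]] = {}
    for j, label in enumerate(LABELS):
        tp = fn = tn = fp = 0
        for truth, pred in zip(y_true, y_pred):
            if truth[j] == 1:
                if pred[j] == 1:
                    tp += 1
                else:
                    fn += 1
            else:
                if pred[j] == 1:
                    fp += 1
                else:
                    tn += 1
        # Return NaN when a label has no positives (or no negatives) to score.
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = (sens, spec)
    return results
```

Scoring each label separately keeps the evaluation comparable to the single-substance baselines, while the shared vector lets one model output all three decisions at once.
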
 We aim to train and test our substance misuse classifiers at Rush in a retrospective dataset of over
35,000 hospitalizations that have been manually screened with the universal screen, AUDIT, and DAST. The
top performing classifier will then be tested prospectively to: (1) externally validate its screening performance in
a ho...

## Key facts

- **NIH application ID:** 10455043
- **Project number:** 5R01DA051464-03
- **Recipient organization:** UNIVERSITY OF WISCONSIN-MADISON
- **Principal Investigator:** Majid Afshar
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $717,699
- **Award type:** 5
- **Project period:** 2020-09-30 → 2025-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10455043

## Citation

> US National Institutes of Health, RePORTER application 10455043, Data Driven Strategies for Substance Misuse Identification in Hospitalized Patients (5R01DA051464-03). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10455043. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*