# Data-Driven Methods to Identify Social Determinants of Health

> **NIH VA I01** · RALPH H JOHNSON VA MEDICAL CENTER · 2024 · —

## Abstract

Background: There is increased attention on social determinants of health (SDOH) as a result of
empirical evidence showing that the patient’s social background is associated with their health
behaviors and clinical outcomes. Now more than ever, health care systems (HCS) are being held
accountable for addressing social factors. Improving the quality of health care among racial and ethnic
minorities is a top priority for the VA.

Significance/Impact: Ideally, identifying and documenting a patient’s social background would be
followed by referral to services that address the SDOH that are most likely to reduce compliance with
recommendations for disease prevention, treatment, and management. However, SDOH such as
education, income, social isolation, and financial strain are rarely documented during routine care visits.
A more systematic approach that leverages health information technology is needed to improve the
efficiency and effectiveness of identifying social determinants among patients in the VA so that more
targeted approaches are used to address these risk factors in the patients’ communities. A better
understanding of SDOH within the electronic health record (EHR) is needed in order to improve
population health management and processes for referring patients to social services.

Innovation: The first step to developing a more robust data-driven strategy for identifying social
phenotypes among patients is to understand the extent to which SDOH are being documented in the
EHR. Natural language processing (NLP) is one strategy to automatically extract those data from
clinical notes in the EHR into a structured format that can be used to examine the quality of health care
and facilitate the development and implementation of quality improvement strategies. However, NLP
approaches alone are not sufficient to improve the quality of health care for Veteran racial/ethnic
minorities. This is because poor quality communication between patients and providers and greater
distrust in the health care system among minorities may limit discussion of these factors. Novel deep
learning approaches have not been fully leveraged in the identification of patients at risk for adverse
SDOH. Moreover, there is a lack of empirical data on the concordance between patient self-reported
SDOH and those extracted using NLP. Even less is known about the value associated with obtaining
and documenting SDOH on patient outcomes. Therefore, we propose to develop a multilevel health
informatics approach for identifying social phenotypes among primary care patients based on
documentation of SDOH in the EHR as part of the following:

Specific Aims:

- Aim 1: Use deep learning strategies to identify social phenotypes among diabetes patients based on documentation of SDOH in the EHR.
- Aim 2: Examine the concordance between risk factors for SDOH identified using NLP and patient self-report.
- Aim 3: Conduct a study to evaluate the effects of documenting SDOH on patient outcomes.

Methodology: ...
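
Aim 2 concerns concordance between SDOH risk factors identified by NLP and those reported by patients. The abstract does not state which agreement statistic the project uses, so the following is only an illustrative sketch, assuming a single binary SDOH indicator per patient and using Cohen's kappa as one common choice. The function name `cohens_kappa` and the example labels are hypothetical and do not come from the project.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(nlp_flags: Sequence[int], self_report: Sequence[int]) -> float:
    """Cohen's kappa for two paired label sequences (hypothetical helper)."""
    if len(nlp_flags) != len(self_report) or not nlp_flags:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(nlp_flags)
    # Observed agreement: share of patients where both sources give the same label.
    observed = sum(a == b for a, b in zip(nlp_flags, self_report)) / n
    # Chance agreement expected from each source's marginal label frequencies.
    nlp_counts = Counter(nlp_flags)
    self_counts = Counter(self_report)
    expected = sum(nlp_counts[k] * self_counts[k] for k in nlp_counts) / (n * n)
    if expected == 1.0:
        # Both sources give the same single label for everyone; kappa is
        # undefined here, and this sketch treats it as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up example data: 1 = financial strain flagged, 0 = not flagged.
nlp = [1, 0, 1, 1, 0, 0, 1, 0]
survey = [1, 0, 0, 1, 0, 1, 1, 0]
kappa = cohens_kappa(nlp, survey)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when an SDOH indicator is rare and two sources would agree on "not flagged" most of the time regardless of extraction quality.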

## Key facts

- **NIH application ID:** 10739723
- **Project number:** 5I01HX003379-03
- **Recipient organization:** RALPH H JOHNSON VA MEDICAL CENTER
- **Principal Investigator:** Lewis James Frey
- **Activity code:** I01
- **Funding institute:** VA
- **Fiscal year:** 2024
- **Award amount:** —
- **Award type:** 5
- **Project period:** 2021-10-01 → 2025-09-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10739723

## Citation

> US National Institutes of Health, RePORTER application 10739723, Data-Driven Methods to Identify Social Determinants of Health (5I01HX003379-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10739723. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
