# Integration of electronic medical records and neighborhood contextual indicators into machine learning strategies for identifying pregnant individuals at risk of depression in underserved communities

> **NIH R21** · UNIVERSITY OF ILLINOIS AT CHICAGO · 2023 · $419,357

## Abstract

The goal of this proposal is to optimize the use of computational methods, such as machine learning (ML)
models, applied to electronic medical records (EMRs) to predict depression during pregnancy and the first year
postpartum (perinatal depression, PND) in Minoritized Women of Color. Most ML models forecast postpartum
depression (PPD) based on EMRs from middle-class Non-Hispanic White individuals. However, our results show
that Non-Hispanic Black Women (NHBW) have higher rates of depression (23% versus the 12% US average)
and depression during early pregnancy in NHBW is far more common than PPD. Here, we propose to optimize
the application of ML models to PND in three key ways. First, we will use bias-mitigation approaches to limit what
is called model prediction performance bias, defined as disparate model prediction outcomes with respect
to certain socio-demographic variables, such as race/ethnicity or age. Second, we will develop ML models that can
offer interpretable outcomes and provide insights for clinical interventions. ML models are often “black boxes”,
making it difficult to know the direction and magnitude of variables associated with the model outcome. Third,
current EMR-based ML models to predict PND rarely include community social determinants of health (SDoH).
SDoH both at the individual-level (e.g., racial minority, poverty) and at the neighborhood-level (e.g., violence,
access to care) have been linked with increased risk of PND. NHBW are disproportionately affected by the
negative health impacts of SDoH, including higher risk of PND and preterm birth. Despite their importance, SDoH
have not been considered in assessing risk of PND using ML models, particularly among Minoritized Women of
Color who experience a disproportionate burden of social and economic hardship. This limits the model prediction
performance in women who are exposed to higher contextual risks. We hypothesize that interpretable ML
models trained on sufficient numbers of EMRs from Minoritized Women of Color and that integrate
neighborhood-level contextual factors (a proxy for community-level stressors) can substantially improve the
prediction of PND in women at higher risk. We aim to establish a robust and interpretable ML framework that
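combines individual- and community-level SDoH to predict PND for Minoritized Women of Color who have
rarely been represented in data modeling. Our long-term vision is to integrate our interpretable ML model into routine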
clinical care for early detection, diagnosis, and treatment of PND. We will capitalize on large urban OB/GYN
clinics (>70,000 patients) primarily serving Minoritized Women of Color (50% NHBW, 30% Latinas) living in the
Chicago area. Neighborhood contextualized information will be obtained from the US Census Bureau and the
Chicago Health Atlas. In Aim 1, we will develop interpretable ML models to predict PND in at-risk women using
EMRs. In Aim 2, we will also incorporate neighborhood-level SDoH ...
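
The abstract defines model prediction performance bias as a disparate prediction outcome across socio-demographic groups. One way to make that definition concrete is to compute a performance metric separately for each group and report the gap between groups. The following is a minimal sketch, not the investigators' method: the choice of recall as the metric, the function names `recall_by_group` and `recall_gap`, the group labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Return recall (true positive rate) for each group label.

    y_true, y_pred: sequences of 0/1 labels (1 = depression case).
    groups: sequence of group labels aligned with the predictions.
    Groups with no positive cases are omitted from the result.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}


def recall_gap(y_true, y_pred, groups):
    """Largest difference in recall between any two groups."""
    rates = recall_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data, not drawn from the study.
y_true = [1, 1, 0, 1, 1, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(recall_by_group(y_true, y_pred, groups))
print(recall_gap(y_true, y_pred, groups))
```

In this toy example the model identifies two of three true cases in group A but only one of three in group B, so a gap of this kind is what a bias-mitigation step would aim to shrink. Other metrics (for example, false positive rate or calibration) could be compared across groups in the same way.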

## Key facts

- **NIH application ID:** 10741143
- **Project number:** 1R21HD110779-01A1
- **Recipient organization:** UNIVERSITY OF ILLINOIS AT CHICAGO
- **Principal Investigator:** YANG DAI
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $419,357
- **Award type:** 1
- **Project period:** 2023-09-19 → 2025-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10741143

## Citation

> US National Institutes of Health, RePORTER application 10741143, Integration of electronic medical records and neighborhood contextual indicators into machine learning strategies for identifying pregnant individuals at risk of depression in underserved communities (1R21HD110779-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10741143. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
