# Predicting who will fracture: Exploration of machine learning in the observational Women's Health Initiative Study dataset.

> **NIH R21** · UNIVERSITY OF CALIFORNIA LOS ANGELES · 2022 · $168,874

## Abstract

Half of all postmenopausal women will experience an osteoporosis-related fracture in their remaining lifetimes.
As these fractures can lead to disability, loss of independence, and death, it is important to identify who is at risk
for early intervention and mitigation. While clinical guidelines support routine osteoporosis screening for women
aged ≥65 years, only selective screening is recommended for younger postmenopausal women aged 50-64
based on the use of risk assessment tools (e.g., OST, FRAX, SCORE). However, we have shown that these
tools – which were not specifically developed for women in this age group – do not differentiate well between
women who do and do not have osteoporosis (based on bone mineral density, BMD) and/or subsequent fracture.
The objective of this project is to explore machine learning (ML) to improve osteoporosis risk assessment in
young postmenopausal women. Prior ML-based analyses of osteoporosis and related fractures exist but draw on
non-American populations and/or are of limited size. We will use the large Women's Health Initiative (WHI) Study
(>160,000 individuals from the United States) to develop, validate, and compare different machine learning
approaches (random forests; logistic regression; dynamic belief network, DBN) for younger postmenopausal
women. ML models will be constructed and assessed for two tasks: 1) predicting fracture risk in women aged
50-64 (Aim 1); and 2) predicting osteoporosis (per BMD; Aim 2). In each case, we will build ML models using
existing risk factors from current tools, as well as add additional variables collected from the WHI to identify new
features that may improve predictive power. We will also assess the value of temporal modeling by building DBNs,
using an individual's past observations to guide predictions. We will compute technical performance metrics
(e.g., sensitivity, specificity, positive predictive value) and conduct error analyses to contrast which (sub)groups
each model (in)correctly identifies. We will also perform sensitivity analyses to ascertain the impact of different
variables on the robustness of the model's predictions. Lastly, we plan to externally validate (Aim 3) the models
from Aims 1 & 2 using electronic health record (EHR) datasets from UCLA and UCSF, investigating the degree
of transportability. Successful execution of this R21 will: 1) develop and test different ML models predicting major
osteoporotic fracture and osteoporosis in US women; 2) identify potential additional variables that inform the risk
of these conditions; and 3) provide insight into areas where such ML models may be improved through stratification
and/or future methodological approaches. The results from this R21 will serve as a baseline for a broader
R01 to develop more effective predictive models for fracture and osteoporotic risk.
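
The performance metrics named in the abstract (sensitivity, specificity, positive predictive value) and the subgroup error analysis can be illustrated with a minimal sketch. This is not the project's code: the function names `binary_metrics` and `metrics_by_group`, the age-band labels, and the toy labels and predictions are all hypothetical and invented for illustration. They are not WHI data or results.

```python
import math
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and PPV for 0/1 labels and predictions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # fracture predicted and observed
        elif truth == 0 and pred == 1:
            fp += 1  # fracture predicted but not observed
        elif truth == 0 and pred == 0:
            tn += 1  # no fracture predicted, none observed
        elif truth == 1 and pred == 0:
            fn += 1  # fracture observed but missed

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
        "ppv": ratio(tp, tp + fp),          # TP / (TP + FP)
    }


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately within each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for truth, pred, group in zip(y_true, y_pred, groups):
        buckets[group][0].append(truth)
        buckets[group][1].append(pred)
    return {g: binary_metrics(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy, made-up example (1 = fracture, 0 = no fracture); NOT study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
groups = ["50-57"] * 4 + ["58-64"] * 4  # hypothetical age bands

overall = binary_metrics(y_true, y_pred)
stratified = metrics_by_group(y_true, y_pred, groups)
print(overall)
print(stratified)
```

Comparing `stratified` across models is one simple way to see which subgroups a given model handles well or poorly. A metric comes back as NaN when a subgroup contains no cases that would make it defined, for example sensitivity in a subgroup with no observed fractures.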

## Key facts

- **NIH application ID:** 10370048
- **Project number:** 1R21AR078905-01A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA LOS ANGELES
- **Principal Investigator:** ALEX BUI
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $168,874
- **Award type:** 1
- **Project period:** 2022-09-21 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10370048

## Citation

> US National Institutes of Health, RePORTER application 10370048, Predicting who will fracture: Exploration of machine learning in the observational Women's Health Initiative Study dataset. (1R21AR078905-01A1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10370048. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
