# Time series clustering to identify and translate time-varying multipollutant exposures for health studies

> **NIH F31** · UNIVERSITY OF SOUTHERN CALIFORNIA · 2023 · $47,694

## Abstract

Air pollution exposure is a universal concern linked to a wide range of adverse health outcomes. Ambient air
pollution is a complex environmental exposure arising from numerous different sources and varies over time;
however, many air pollution health effects studies fail to consider more than a single pollutant at a time and rely
on an exposure that has been averaged over time. Recent advancements in statistical methodologies for
multicollinear exposures have resulted in an increased number of studies on human health impacts of multipollutant
mixtures, but these methodologies still often result in hard-to-interpret effect estimates and do not extend to
repeated measures of exposure. Thus, there is a need to further improve mixtures methodologies to be able to
investigate time-varying exposures and have interpretable exposure effect estimates.

The overall goal of this study is to improve methodologies for the study of air pollution mixtures
by using a two-stage time series clustering approach. Initial work focuses on supplementing current
literature by extending clustering methodologies to the interpretable analysis of time series data. This
developmental work will provide a strong foundation for later application to identify and translate multipollutant
diurnal exposure profiles. In Aim 1, I will identify the optimal number of ending clusters by extending current
methods on static data and evaluating their performance on time series data. Identification of optimal cluster
number is nontrivial without external information (e.g., a key) and current methods fail to provide evidence of
positive (or negative) performance for time series data. In Aim 2, I will extend the linear statistical model to
appropriately translate multivariate clustering methods to studies on health effects of pollutant mixtures.
Exposures grouped by clusters are themselves visually intuitive but would be improved by adding interpretive
distances between features of the representative cluster center and individual cluster members. The time
series clustering methodology will be demonstrated in two applications: (Aim 3a) to identify typical
multipollutant diurnal profiles in Southern California, and (Aim 3b) to evaluate their associations with exhaled
nitric oxide (FeNO) in the Southern California Children’s Health Study. Hourly monitoring data for particulate
matter <2.5 µm (PM2.5) and <10 µm (PM10), nitrogen dioxide (NO2), and ozone (O3) are used to identify typical
diurnal ambient air pollution exposures and relate them to pediatric health.

This work will improve current mixtures methods and provide new tools for the study of time-varying
exposures. The analysis of time-varying exposures is of increasing import with the growing amounts of data in
response to recent technological advances. Time-varying mixtures are present in many places (e.g., air, soil)
and development of applicable methodologies would benefit public health and regulatory decisions...
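
As a rough illustration of the ideas in the abstract (clustering daily profiles, choosing the number of clusters, and measuring how far each day sits from its cluster center), the following is a minimal sketch under stated assumptions. It is not the project's two-stage method: it uses plain k-means with a silhouette score on synthetic single-pollutant data, and every name in it (`make_profile`, `kmeans`, `silhouette`, the peak hours, the noise level) is hypothetical.

```python
import math
import random


def make_profile(peaks, noise, rng):
    """Synthetic 24-hour profile (assumption): Gaussian bumps at peak hours plus noise."""
    return [
        sum(math.exp(-((h - p) ** 2) / 8.0) for p in peaks) + rng.gauss(0, noise)
        for h in range(24)
    ]


def dist(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(data, k, rng, iters=50):
    """Plain k-means (Lloyd's algorithm) with random initial centers."""
    centers = rng.sample(data, k)
    labels = [0] * len(data)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(x, centers[j])) for x in data]
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def silhouette(data, labels):
    """Mean silhouette width; points in singleton clusters score 0."""
    scores = []
    for i, x in enumerate(data):
        by_cluster = {}
        for j, y in enumerate(data):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(x, y))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


rng = random.Random(0)
# Two synthetic day types: twin rush-hour peaks versus a single afternoon peak.
days = [make_profile([8, 18], 0.05, rng) for _ in range(20)] + [
    make_profile([14], 0.05, rng) for _ in range(20)
]

# For each candidate k, keep the best of several random restarts.
results = {}
for k in range(2, 6):
    runs = [kmeans(days, k, rng) for _ in range(5)]
    results[k] = max(
        ((silhouette(days, labels), labels, centers) for labels, centers in runs),
        key=lambda r: r[0],
    )

best_k = max(results, key=lambda k: results[k][0])
_, labels, centers = results[best_k]

# Distance of each day from its representative cluster center.
distances = [dist(x, centers[lab]) for x, lab in zip(days, labels)]
print(best_k, round(max(distances), 3))
```

In a multipollutant setting, one option (again an assumption, not something stated in the abstract) would be to concatenate the standardized hourly series for each pollutant into a single vector per day before clustering. The per-day `distances` could then enter a health model alongside the cluster labels.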

## Key facts

- **NIH application ID:** 10749341
- **Project number:** 1F31ES035618-01
- **Recipient organization:** UNIVERSITY OF SOUTHERN CALIFORNIA
- **Principal Investigator:** Brittney Marian
- **Activity code:** F31
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $47,694
- **Award type:** 1
- **Project period:** 2024-01-01 → 2025-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10749341

## Citation

> US National Institutes of Health, RePORTER application 10749341, Time series clustering to identify and translate time-varying multipollutant exposures for health studies (1F31ES035618-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10749341. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
