# Puerto Rico Testsite for Exploring Contamination Threats (PROTECT) - Admin Supplement

> **NIH P42** · NORTHEASTERN UNIVERSITY · 2021 · $2

## Abstract

Project Summary

Evidence suggests that exposure to Superfund chemicals contributes to adverse pregnancy outcomes (APOs), including
preterm birth (PTB). Rates of PTB and infant mortality in Puerto Rico (PR) are among the highest of all US states and
territories. There are 18 Superfund sites in PR, and evidence of contamination of the drinking water is extensive. Moreover,
extreme weather events (hurricanes, flooding) may result in elevated exposures to Superfund chemicals. The PROTECT
center has brought together researchers from Northeastern University, the University of Puerto Rico, the University of Georgia,
and the University of Michigan to provide much-needed understanding of the relationships and mechanisms by which
exposure to suspect chemicals contributes to APOs, and to develop new methods to reduce the risk of exposure in PR and beyond.
To do this, PROTECT uses a source-to-outcome structure, integrating epidemiological, toxicological, fate and transport,
and remediation studies, a unified sampling infrastructure, a centralized indexed data repository, and a sophisticated data
management system.

Since its inception in 2010, PROTECT has built detailed and extensive datasets on environmental conditions and the prenatal
conditions of pregnant mothers (exposure, socioeconomic, and health data; about 2,400 data points per participant),
yielding a rich dataset collected from a cohort of over 2,000 expectant mothers and their children. The PROTECT
Data Management and Analytics Core (DMAC) manages data centrally for the entire Center, while providing analytics
support for evaluating both quantitative and qualitative data. The DMAC provides a comprehensive set of Data Dictionaries for this
dataset and has developed protocols for the proper handling of sensitive data.

The PROTECT database has the potential to help unlock the relationships that tie environmental factors to preterm
birth outcomes. The dataset is primed to leverage powerful AI/ML toolsets to help identify and establish these relationships.
But before we can start leveraging these tools, specific challenges must be overcome to prepare the data. These
challenges include the degree of missing data in our dataset and the inherent class imbalance in our data. This project will
address these issues head-on, drawing on feedback and experience gained by holding a series of hackathons that explore
these datasets. The result should be a suite of datasets ready for AI/ML tools and the delivery of a number of
new open-source toolsets for addressing missingness and imbalance in data.

## Key facts

- **NIH application ID:** 10411854
- **Project number:** 3P42ES017198-11S3
- **Recipient organization:** NORTHEASTERN UNIVERSITY
- **Principal Investigator:** Akram N Alshawabkeh
- **Activity code:** P42
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $2
- **Award type:** 3
- **Project period:** 2021-09-15 → 2022-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10411854

## Citation

> US National Institutes of Health, RePORTER application 10411854, Puerto Rico Testsite for Exploring Contamination Threats (PROTECT) - Admin Supplement (3P42ES017198-11S3). Retrieved via AI Analytics 2026-06-01 from https://api.ai-analytics.org/grant/nih/10411854. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
