# A Framework for Automated and Reproducible Geomarker Curation and Computation at Scale

> **NIH R01** · CINCINNATI CHILDRENS HOSP MED CTR · 2020 · $337,875

## Abstract

Environmental exposures and community characteristics, including air pollution, greenspace, crime, and indices
of community deprivation, are powerful predictors of health. These data, termed “geomarkers”, have until recently
had limited availability. Democratization of “big spatial data” and advances in geoinformatics now allow
unprecedented access, paving the way for the expansion of “precision medicine” into “precision public health”.
Automated linkage of these newly available data to existing studies and electronic health record (EHR) databases,
however, is often difficult due to data heterogeneity with respect to spatiotemporal resolution and extent, annotation,
storage, formats, retrieval methods, and computational strategies. This hinders data accessibility and
interoperability. Because precise geolocation is considered protected health information (PHI), research regulations
designed to protect the identities of study participants often present obstacles to sharing data and using
third-party tools. Thus, although use of these data is rapidly expanding, their application in research and clinical
care is hampered by limited access to proper expertise and reliance on inefficient manual data curation. We are critically
missing the methods and tools to make reproducible geomarker assessment widely accessible and easily used
by biomedical researchers. To address this limitation, our overall objective is to develop a curated and standardized
library that researchers can use for efficient, automated, and reproducible linkage of geomarkers to their
own data, and a generalized framework to which exposure assessment scientists can contribute. We will create
an initial library based on software containerization and automated metadata collection and publishing, conduct
a test implementation within an existing electronic health record informatics data pipeline, and obtain feedback
from users and consumers. Importantly, this proposal will create a framework for geomarker tool development
that will allow any user to interact with the software in a consistent and user-friendly way. Development will take
place publicly such that anyone can provide feedback and modifications or additions to our toolset. The software,
framework, and library of containers will be free and open source. Our efforts will make geomarker data and
methods more findable, accessible, interoperable, and reusable (FAIR) as we seek to make our approach
a widely adopted, community-maintained, and sustainable resource to fuel the advancement of precision public
health.

## Key facts

- **NIH application ID:** 9986309
- **Project number:** 1R01LM013222-01A1
- **Recipient organization:** CINCINNATI CHILDRENS HOSP MED CTR
- **Principal Investigator:** Cole Brokamp
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $337,875
- **Award type:** 1
- **Project period:** 2020-08-01 → 2024-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9986309

## Citation

> US National Institutes of Health, RePORTER application 9986309, A Framework for Automated and Reproducible Geomarker Curation and Computation at Scale (1R01LM013222-01A1). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/9986309. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
