# PREMIERE: A PREdictive Model Index and Exchange REpository

> **NIH R01** · UNIVERSITY OF CALIFORNIA LOS ANGELES · 2022 · $291,548

## Abstract

The use of artificial intelligence (AI) continues to accelerate in biomedical and behavioral research, with the ultimate goals of informing and improving healthcare. While the technical advances thus far are significant, several concerns have recently been raised regarding the (unintended) consequences of such techniques. For example, problems related to dataset bias can manifest in multiple ways, including the use of non-representative populations; continued propagation of unrecognized system and process prejudices; and equitable access. Given the potential downstream harm, ethical, legal, and social issues (ELSI) must now be integrated alongside the use of data and AI in biomedical and behavioral research and care delivery. However, best practices for ELSI and ethical AI (ETAI) have yet to fully emerge, and there is no standard way of documenting ELSI/ETAI considerations in the development and use of predictive models.

Building on our platform, the PREdictive Model Index and Exchange REpository (PREMIERE), the goal of this 1-year R01 supplement is to develop (meta)data to document and share information around ELSI/ETAI, directly linking such information as part of a shared predictive model. To focus our efforts, we address the growing use of synthetic datasets to train and validate machine learning (ML) models. Synthetic datasets reflect the underlying statistical properties of actual real-world datasets and are promoted as a way of protecting private information while enhancing overall data availability (i.e., for training). The use of generative adversarial networks (GANs) is illustrative. But improperly simulated datasets can result in an algorithm learning incorrectly and/or exacerbating existing dataset biases, raising complex ELSI questions. Using these questions as a motivating use case, this supplement has three specific aims: 1) to examine the ethics of using synthetic datasets, namely through key informant interviews; 2) to establish guidance for a computational checklist for AI/ML and ELSI, leveraging a broad community of stakeholders; and 3) to develop and implement this checklist as part of PREMIERE, demonstrating how ELSI-related information is shared as part of an ML model by extending the Predictive Model Markup Language (PMML). To achieve these aims, we established a new collaboration between UCLA and Penn State University (PSU) to bring together interdisciplinary experts in AI/ML, biomedical informatics, law, ethics, communication, and healthcare. Together, we will plan a series of workshops that convene national experts who have already agreed to participate in this endeavor. The results of these meetings will be increased awareness around the use of synthetic datasets and their complexities; published recommendations around their use; and methods for documenting ELSI/ETAI in the context of predictive ML models.
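
As a rough illustration of the third aim, the sketch below shows one way checklist answers could travel inside a PMML file, using the standard PMML `Extension` element (with its `extender`, `name`, and `value` attributes). The abstract does not describe the project's actual schema, so the checklist field names, the `extender` label, and the helper `attach_elsi_metadata` are all hypothetical, and the sample document is a stripped-down fragment rather than a complete model.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"
ET.register_namespace("", PMML_NS)  # serialize without an "ns0:" prefix

# Stripped-down PMML fragment for illustration only (no model element).
SAMPLE = f"""<PMML xmlns="{PMML_NS}" version="4.4">
  <Header description="toy model"/>
  <DataDictionary numberOfFields="0"/>
</PMML>"""


def attach_elsi_metadata(pmml_root, checklist, extender="elsi-checklist-example"):
    """Add one PMML Extension element per checklist item under Header.

    `checklist` maps hypothetical field names to string values; the
    `extender` label is likewise an assumption, not a published vocabulary.
    """
    header = pmml_root.find(f"{{{PMML_NS}}}Header")
    if header is None:
        raise ValueError("PMML document has no Header element")
    for position, (name, value) in enumerate(checklist.items()):
        ext = ET.Element(
            f"{{{PMML_NS}}}Extension",
            {"extender": extender, "name": name, "value": str(value)},
        )
        # The PMML schema lists Extension before Header's other children,
        # so insert at the front while preserving checklist order.
        header.insert(position, ext)
    return pmml_root


# Hypothetical checklist answers about synthetic training data.
example_checklist = {
    "synthetic_data_used": "true",
    "synthetic_data_generator": "GAN",
}

root = attach_elsi_metadata(ET.fromstring(SAMPLE), example_checklist)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping the answers in `Extension` elements means a PMML consumer that does not know about the checklist can still load the model, while a repository that does know about it can index the fields.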

## Key facts

- **NIH application ID:** 10597854
- **Project number:** 3R01EB027650-03S1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA LOS ANGELES
- **Principal Investigator:** ALEX BUI
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $291,548
- **Award type:** 3
- **Project period:** 2019-09-15 → 2023-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10597854

## Citation

> US National Institutes of Health, RePORTER application 10597854, PREMIERE: A PREdictive Model Index and Exchange REpository (3R01EB027650-03S1). Retrieved via AI Analytics 2026-06-13 from https://api.ai-analytics.org/grant/nih/10597854. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
