An ethical framework-guided metric tool for assessing bias in EHR-based Big Data studies

NIH RePORTER · NIH · R01 · $267,578

Abstract

The emergence of Big Data health research has exponentially advanced the fields of medicine and public health but has also faced many ethical challenges. One of the most worrying but still under-researched ethical issues is the risk of potential biases in datasets (e.g., electronic health records [EHR] data) as well as in the data curation and acquisition cycles. Very few EHR data-based studies report bias in datasets, data acquisition, and/or mining as an indicator of research quality, for three reasons: the lack of a standardized measurement tool or metrics to assess bias; the scarcity of ethical frameworks to serve as a theoretical grounding; and limited effective interdisciplinary collaboration that engages ethics experts, professional data curators, data management experts, data repository administrators, healthcare workers, and state agencies in discussions addressing this ethical challenge. Since 2021, we have been funded by NIH (R01AI164947) to develop a machine learning-based predictive model of viral suppression among HIV patients based on EHR and other relevant data from multiple sources in South Carolina. One of the ethical challenges encountered by the parent project is how to assess the potential biases in the curation, acquisition, and processing of EHR data. In response to NOT-OD-22-065, titled “Administrative supplements for advancing the ethical development and use of AI/ML in biomedical and behavioral sciences”, we propose to develop, refine, and pilot test an ethical framework-guided metric tool for assessing bias in Big Data research using EHR datasets.
Specifically, we request support to: 1) conduct a literature/policy review and concept analysis to develop an ethical framework for unbiased and inclusive Big Data research; 2) create and modify a metric tool to assess potential biases in EHR data-based studies via in-depth interviews of key stakeholders of the parent project; and 3) refine and disseminate the metric tool through a community charrette workshop among interdisciplinary scholars (ethics experts and disciplinary experts) and key stakeholders (data curators, data management experts, and data repository administrators; healthcare workers; and HIV patients) and pilot test it in the parent project. The proposed study will advance our understanding of bias and equity issues in Big Data research and develop an ethical framework and a metric tool for assessing bias in EHR-based Big Data studies, thereby informing a more nuanced assessment and exploration of bias in practice for the ethical development of Big Data health research beyond the parent project. The bias metric tool can be reused in other Big Data studies as an assessment tool to detect and quantify biases, which may contribute to improving awareness and exploration of this critical ethical challenge. The ethical framework regarding bias challenges in Big Data research may provide insights and guidance for addressing bias issues in other types of Big Data beyond EHR.

Key facts

NIH application ID: 10599459
Project number: 3R01AI164947-02S2
Recipient: UNIVERSITY OF SOUTH CAROLINA AT COLUMBIA
Principal Investigator: Bankole Olatosi
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $267,578
Award type: 3
Project period: 2021-06-09 → 2026-05-31