Overcoming explainability and data availability barriers to broad application of ECG ML screening with a system-wide ECG dataset

NIH RePORTER · NIH · R21 · $115,500

Abstract

Cardiovascular disease remains a major source of mortality in the US, and detection and prevention remain lacking. Machine learning (ML) based detection of cardiovascular disease from raw 12-lead electrocardiogram (ECG) signals is a digital health technique (ML-ECG) that has shown remarkable potential to improve patient care. However, the best-developed ML tools require large datasets for training and lack explainability. A major long-term goal is to develop ML-ECG approaches that can be equitably deployed in clinical care to help detect rare but serious diseases across broad populations. The current objectives towards this long-term goal are to (1) implement self-supervised ML methods to train ML-ECG algorithms with smaller datasets and (2) develop novel ML-interpretability methods for ECG-classification algorithms. The central hypothesis is that application of unique ML-ECG approaches to this rich existing dataset can reduce the amount of data needed for training and development of clinically actionable algorithms, and can provide valuable and as yet unrecognized insight into physiologic links used in ML-ECG algorithms. The rationale for this project is that: (1) contrastive self-supervised training allows model development on smaller labeled datasets without significantly compromising accuracy; (2) explainability analyses through generative adversarial learning can fill gaps in understanding of how these algorithms are adding to clinical knowledge; and (3) an existing dataset of over 1.4 million clinical ECGs provides an ideal environment in which to apply these techniques. The central hypothesis will be tested by pursuing 2 specific aims: (1) Develop contrastive self-supervised training approaches to ECG-classification ML architectures, in order to reduce the necessary power for highly accurate predictions; and (2) Apply ML explainability analyses to ECG classification architectures to understand the features driving predictions.
Success in this project will advance ML-ECG science to allow its deployment not only in the detection of extremely common disease patterns, but also in the more difficult and consequential task of identifying rare but serious abnormalities. The results will also provide insight into how such algorithms work.
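
Illustrative sketch (not part of the funded abstract): the abstract does not say which contrastive objective the project uses. As one common example of contrastive self-supervised training, the snippet below computes a SimCLR-style NT-Xent loss over a small batch of paired embeddings, where each pair is assumed to come from two augmented views of the same ECG. The function names, the temperature value, and the toy embeddings are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are assumed to be embeddings of two augmented
    views of the same ECG (a positive pair). Every other embedding in the
    batch is treated as a negative. The temperature is an assumed default.
    """
    if len(view_a) != len(view_b) or not view_a:
        raise ValueError("view_a and view_b must be non-empty and equal length")

    n = len(view_a)
    embeddings = list(view_a) + list(view_b)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of sample i
        # Scaled similarities to every embedding except the anchor itself.
        logits = [
            cosine_similarity(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        ]
        pos_logit = (
            cosine_similarity(embeddings[i], embeddings[positive]) / temperature
        )
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


# Toy 2-D embeddings: in the aligned batch each pair points the same way.
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [[0.1, 0.9], [0.9, 0.1]]  # positives deliberately mismatched

aligned_loss = nt_xent_loss(aligned_a, aligned_b)
shuffled_loss = nt_xent_loss(aligned_a, shuffled_b)
print(aligned_loss < shuffled_loss)
```

In a real pipeline the embeddings would come from an ECG encoder network and the loss would be minimized by gradient descent on unlabeled recordings. A smaller labeled set would then be used only to fine-tune a classifier on top of the pretrained encoder.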

Key facts

NIH application ID: 10809212
Project number: 1R21HL172288-01
Recipient: UTAH STATE HIGHER EDUCATION SYSTEM--UNIVERSITY OF UTAH
Principal Investigator: BENJAMIN ADAM STEINBERG
Activity code: R21
Funding institute: NIH
Fiscal year: 2024
Award amount: $115,500
Award type: 1
Project period: 2023-12-20 → 2025-11-30