# Deep Learning-based Emulation Analysis: Methodological Developments and Case Studies

> **NIH R21** · YALE UNIVERSITY · 2023 · $125,625

## Abstract

To objectively quantify the relative effectiveness of drugs, devices, and treatment procedures on survival
outcomes of cardiovascular diseases (CVDs), rigorously designed and executed randomized clinical trials
(RCTs) remain the gold standard. However, for many problems, RCTs have either failed or are not feasible.
Fortunately, the rapid development of electronic medical record (EMR) and insurance claims databases makes it
possible to mine a large amount of observational data and efficiently complement RCTs. Among the available
observational data analysis techniques that aim to draw RCT-type conclusions, emulation has emerged as
especially attractive, given its trial-like architecture, interpretability, and scalability. It has been applied to CVDs
for over twenty years and led to many important findings.

This study has two aims. The first aim is to develop a deep learning (DL)-based emulation analysis
pipeline, methods, and software. Most of the existing emulation analyses are based on “classic” regression
techniques. Very recently, our group was the first to develop DL-based emulation analysis with application to
CVDs. Compared to regression, DL excels by having superior model fitting and flexibly accommodating
unspecified nonlinear effects. Building on our recent success, this project will advance the methodology
by developing cutting-edge DL-based emulation analysis with more effective estimation (with the
much-desired robustness property and significantly improved stability and interpretability), comprehensive and valid
inference (which is essential for drawing definitive conclusions on treatment effects but is missing from most DL
studies), and user-friendly software (to facilitate broad use). This methodological effort can substantially expand
the scope of emulation analysis, deep learning, causal inference, observational data analysis, and medical
record/insurance claims data analysis. The second aim is to conduct two case studies of high clinical significance.
The first case study evaluates the effect of implantable cardioverter defibrillators (ICDs) on all-cause
mortality in the elderly VA (Department of Veterans Affairs) population. The clinical trial intended to address
this problem failed because of low enrollment. As part of the VA CAUSAL Initiative, emulation was proposed as
a viable solution to “replace” the trial. The second case study evaluates the comparative efficacy of
rivaroxaban versus dabigatran on mortality among AF (atrial fibrillation) patients in the Medicare population, for
which an RCT is unlikely because both drugs are FDA-approved and already widely used. Beyond directly informing
clinical practice, research under this aim can also complement and advance the VA CAUSAL Initiative as well
as serve as a prototype for future applications of the proposed approach.

## Key facts

- **NIH application ID:** 10676303
- **Project number:** 5R21HL161691-02
- **Recipient organization:** YALE UNIVERSITY
- **Principal Investigator:** Shuangge Ma
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $125,625
- **Award type:** 5
- **Project period:** 2022-08-15 → 2025-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10676303

## Citation

> US National Institutes of Health, RePORTER application 10676303, Deep Learning-based Emulation Analysis: Methodological Developments and Case Studies (5R21HL161691-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10676303. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
