Statistical and Machine Learning Methods to Improve Dynamic Treatment Regimens Estimation Using Real World Data.

NIH RePORTER · NIH · R01 · $362,198

Abstract

Type 2 diabetes (T2D) is a global epidemic affecting approximately 462 million individuals worldwide. Current medical treatment guidelines rely largely on data from randomized controlled trials (RCTs) that study average effects, which is far from adequate for making individualized decisions for real world patients. This limitation is even worse for discovering dynamic treatment regimens (DTRs) in a heterogeneous population where treatment decisions are made over one or more stages of the disease course. This limitation can be partially addressed by supplementing RCT data with real world data (RWD), such as disease registries, prospective observational studies, surveys and electronic health records, to improve medical decision making. Despite the promise of combining RWD and RCT data, there are several significant challenges in method and algorithm development. These include lack of generalizability or practical utility of the findings from RCTs when applied to real world patients; bias due to unobserved confounders; and concern about long-term side effects/risks. This proposal aims to address each of these challenges. Specifically, in Aim 1, we address the generalizability issue by proposing a novel framework that uses evidence from RWD to improve learning DTRs in the trials. The framework uses RWD to select informative tailoring features, balance population distributions and improve statistical efficiency through doubly robust estimation. In Aim 2, to improve the practical utility of DTRs, we propose a robust method to first infer individual treatment choice/preference from RWD, then incorporate this estimated preference into learning DTRs using the trial data. The resulting DTRs are not only statistically valid but also compatible with patient/clinician preference in real world populations.
In Aim 3, to lessen the bias due to hidden confounders in RWD, we propose joint semiparametric models to combine the trial data with RWD; the models we propose allow different magnitudes of treatment effect sizes and control for possible bias due to hidden confounders in RWD. In Aim 4, to address the concern about long-term risks, we consider a general procedure for estimating DTRs that maximize efficacy outcomes while ensuring that long-term side effects associated with the recommended DTRs remain below a certain threshold. We then propose a novel simultaneous learning algorithm to estimate the optimal DTRs across all stages. For all four aims, we will provide rigorous assumptions and theoretical justifications using tools from concentration inequalities, statistical learning theory, empirical processes and semiparametric inference. We will conduct extensive simulation studies to evaluate the proposed approaches in a variety of settings, and compare their performance with off-the-shelf methods. We will apply the proposed methods to estimate DTRs for T2D using clinical trial data and RWD taken from electronic health re...

Key facts

NIH application ID
10847532
Project number
5R01GM124104-07
Recipient
UNIVERSITY OF MICHIGAN AT ANN ARBOR
Principal Investigator
Yuanjia Wang
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$362,198
Award type
5
Project period
2018-04-01 → 2027-03-31