Statistical and Machine Learning Methods to Improve Dynamic Treatment Regimens Estimation Using Real World Data.

NIH RePORTER · NIH · R01 · $362,198

Abstract

Type 2 diabetes (T2D) is a global epidemic affecting approximately 462 million individuals worldwide. Current medical treatment guidelines rely largely on data from randomized controlled trials (RCTs) that study average effects, which is far from adequate for making individualized decisions for real world patients. This limitation is even more severe when discovering dynamic treatment regimens (DTRs) in a heterogeneous population, where treatment decisions are made over one or more stages of the disease course. It can be partially addressed by supplementing RCT data with real world data (RWD), such as disease registries, prospective observational studies, surveys, and electronic health records, to improve medical decision making. Despite the promise of combining RWD and RCT data, there are several significant challenges in method and algorithm development. These include a lack of generalizability or practical utility of RCT findings when applied to real world patients; bias due to unobserved confounders; and concern about long-term side effects and risks. This proposal aims to address each of these challenges. Specifically, in Aim 1, we address the generalizability issue by proposing a novel framework that uses evidence from RWD to improve the learning of DTRs in the trials. The framework uses RWD to select informative tailoring features, balance population distributions, and improve statistical efficiency through doubly robust estimation. In Aim 2, to improve the practical utility of DTRs, we propose a robust method that first infers individual treatment choice or preference from RWD, then incorporates this estimated preference into learning DTRs from the trial data. The resulting DTRs are not only statistically valid but also compatible with patient and clinician preferences in real world populations.
In Aim 3, to lessen the bias due to hidden confounders in RWD, we propose joint semiparametric models that combine the trial data with RWD; these models allow treatment effect sizes to differ in magnitude between the two sources and control for possible bias due to hidden confounders in RWD. In Aim 4, to address the concern about long-term risks, we consider a general procedure for estimating DTRs that maximize efficacy outcomes while ensuring that long-term side effects associated with the recommended DTRs remain below a certain threshold. We then propose a novel simultaneous learning algorithm to estimate the optimal DTRs across all stages. For all four aims, we will provide rigorous assumptions and theoretical justifications using tools from concentration inequalities, statistical learning theory, empirical processes, and semiparametric inference. We will conduct extensive simulation studies to evaluate the performance of the proposed approaches in a variety of settings, and compare their performance with off-the-shelf methods. We will apply the proposed methods to estimate DTRs for T2D using clinical trial data and RWD taken from electronic health re...
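As a rough illustration of the doubly robust estimation mentioned under Aim 1, the Python sketch below computes an augmented inverse probability weighted (AIPW) estimate of the mean outcome under a single-stage treatment rule. It is not the framework proposed in this project: the simulated trial, the one-dimensional feature, the 0.5 randomization probability, and every function name here are assumptions made only for illustration.

```python
import random


def aipw_value(xs, treatments, outcomes, rule, propensity, outcome_model):
    """Doubly robust (AIPW) estimate of the mean outcome under `rule`.

    rule(x) returns 0 or 1, propensity(x) returns P(A = 1 | x), and
    outcome_model(x, a) returns a predicted outcome under treatment a.
    """
    total = 0.0
    for x, a, y in zip(xs, treatments, outcomes):
        d = rule(x)                          # treatment the rule recommends
        p1 = propensity(x)
        p_d = p1 if d == 1 else 1.0 - p1     # probability of receiving d
        m = outcome_model(x, d)              # model-based prediction
        # Residual correction, applied only when observed treatment matches d.
        correction = (y - m) / p_d if a == d else 0.0
        total += m + correction
    return total / len(xs)


def simulate_trial(n, seed=0):
    """Hypothetical randomized trial: treatment helps when x > 0, harms otherwise."""
    rng = random.Random(seed)
    xs, treatments, outcomes = [], [], []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        a = 1 if rng.random() < 0.5 else 0   # 1:1 randomization (assumed)
        y = x * (2 * a - 1) + rng.gauss(0.0, 0.1)
        xs.append(x)
        treatments.append(a)
        outcomes.append(y)
    return xs, treatments, outcomes


def tailored_rule(x):
    return 1 if x > 0 else 0


def treat_all(x):
    return 1


def known_propensity(x):
    return 0.5


def outcome_model(x, a):
    # Correctly specified here by construction; in practice it would be fitted.
    return x * (2 * a - 1)


xs, treatments, outcomes = simulate_trial(5000)
print("tailored rule:", aipw_value(xs, treatments, outcomes,
                                   tailored_rule, known_propensity, outcome_model))
print("treat everyone:", aipw_value(xs, treatments, outcomes,
                                    treat_all, known_propensity, outcome_model))
```

In this toy setting the tailored rule should score higher than treating everyone. The estimator stays consistent if either the propensity or the outcome model is correct, which is the sense of "doubly robust"; a multi-stage DTR would need this idea applied recursively across stages, which the sketch does not attempt.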

Key facts

NIH application ID: 10847532
Project number: 5R01GM124104-07
Recipient: UNIVERSITY OF MICHIGAN AT ANN ARBOR
Principal Investigator: Yuanjia Wang
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $362,198
Award type: 5
Project period: 2018-04-01 → 2027-03-31