Targeted Machine Learning to evaluate and optimize HIV prevention strategies in cluster randomized trials

Abstract

Globally, there were 1.3 million new HIV infections in 2023, despite expanded access to biomedical HIV prevention products with high efficacy. Implementation strategies are needed to expand the reach of HIV risk screening and to facilitate the use of biomedical prevention among persons with risk. These implementation strategies are often delivered at the group level or induce changes at the group level (e.g., health clinics or health systems). Cluster randomized trials (CRTs) are integral to evaluating and optimizing strategies deployed at the group level. CRTs provide an exciting opportunity to evaluate strategies aiming to improve both reach into the target population and health outcomes among persons reached. However, these CRTs create a complex missing data problem: the strategy improves outcomes directly and indirectly, yet outcomes are measured only among persons reached. While machine learning can facilitate adjustment for missing data in simpler CRT settings, new methods are needed to minimize bias arising from this common CRT setting. CRTs also provide an exciting opportunity for intervention optimization by evaluating for whom and in what context the strategy works best. However, existing methods to evaluate effect heterogeneity in CRTs are prone to false conclusions (i.e., Type-I and Type-II errors). While machine learning can facilitate data-driven evaluation of effect modification in individually randomized trials, CRTs present distinct challenges due to their small effective sample sizes. In this proposal, we will address these crucial gaps in the analysis of CRTs. To do so, we will develop, apply, and disseminate new Targeted Machine Learning Estimators (TMLEs) to minimize bias due to missing data and to facilitate data-driven evaluation of effect modification. TMLE combines formal causal modeling, statistical theory, and machine learning to improve the accuracy, precision, and relevance of our findings. This proposal has the

Key facts

NIH application ID: 11328733
Project number: 1R01MH140685-01A1
Recipient: UNIVERSITY OF CALIFORNIA BERKELEY
Principal Investigator: Laura B Balzer
Activity code: R01
Funding institute: MH
Fiscal year: 2026
Award amount: $759,816
Award type: 1
Project period: 2026-05-01 → 2031-01-31