# Next Generation Missing Data Methods in HIV Research

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2020 · $624,003

## Abstract

This proposal is largely motivated by our involvement with the Botswana Combination
Prevention Project (BCPP), an ongoing large-scale human immunodeficiency virus
(HIV) cluster-randomized prevention trial conducted in 30 communities across Botswana. As in
most HIV prevention studies, incomplete data on HIV status and nonresponse to queries about
sexual behavior are important challenges the study currently faces, with data likely missing not
at random and in complex patterns across individuals. Recognizing that existing statistical
methods for missing data are largely ill-suited to fully address this important problem in HIV
research, we propose to develop the next generation of missing data methods going well
beyond current theory of identification and inference. Specifically, we propose (1) to develop a
unified theory of identification bringing together recent developments in the theory of
identification based on causal graphs with recent identification results from the statistics
literature. This will allow us to establish conditions under which, in complex missing data settings
such as the BCPP, one can untangle features of the underlying population which may be of
scientific interest from features of the non-response process not necessarily of scientific
interest; (2) to build on (1) to develop corresponding inverse-probability-weighted and doubly
robust methods for statistical inference in the BCPP where data are likely to be missing not at
random and in complex patterns; (3) to develop novel semiparametric imputation methods that
rely solely on assumptions encoded in the nonresponse process, thus allowing the complete
data distribution in the BCPP to remain unscathed by the imputation process; (4) to develop
user-friendly software to facilitate widespread use of the methods developed in Aims 1-3, and to
apply and demonstrate their good performance in extensive simulation studies as well as in
answering scientific queries of primary interest in the BCPP.
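
To make the inverse-probability-weighting idea in Aim 2 concrete, the following is a minimal sketch under assumptions that are not part of the proposal. It uses simulated data, a single binary covariate `x`, and a response mechanism that depends only on `x` (missing at random), which is a much simpler setting than the missing-not-at-random patterns the proposal targets. All names, probabilities, and function signatures are hypothetical and chosen only for illustration.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, observed_y, r) rows. All probabilities are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0                 # binary covariate
        y = 1 if rng.random() < (0.4 if x else 0.1) else 0  # binary outcome
        r = 1 if rng.random() < (0.3 if x else 0.9) else 0  # response depends on x only
        rows.append((x, y if r else None, r))               # y is hidden when r == 0
    return rows


def complete_case_mean(rows):
    """Mean of y among respondents only; biased when response depends on x."""
    observed = [y for _, y, r in rows if r]
    return sum(observed) / len(observed)


def ipw_mean(rows):
    """Normalized (Hajek-style) inverse-probability-weighted mean of y.

    P(R = 1 | X = x) is estimated by the response frequency within each
    stratum of x. Assumes every stratum contains at least one respondent.
    """
    counts = {}
    for x, _, r in rows:
        n_x, r_x = counts.get(x, (0, 0))
        counts[x] = (n_x + 1, r_x + r)
    p_response = {x: r_x / n_x for x, (n_x, r_x) in counts.items()}

    numerator = sum(y / p_response[x] for x, y, r in rows if r)
    denominator = sum(1 / p_response[x] for x, _, r in rows if r)
    return numerator / denominator


if __name__ == "__main__":
    rows = simulate(50_000, seed=1)
    # Under the assumed probabilities the population mean is 0.5*0.4 + 0.5*0.1 = 0.25.
    print("complete-case mean:", complete_case_mean(rows))
    print("IPW mean:          ", ipw_mean(rows))
```

In this toy setup, respondents are drawn disproportionately from the low-prevalence stratum, so the complete-case mean understates the population mean, while reweighting each respondent by the inverse of the estimated response probability corrects for this. The correction is valid here only because response depends on the fully observed `x`; it does not carry over to data missing not at random without further identifying assumptions.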

## Key facts

- **NIH application ID:** 9846188
- **Project number:** 5R01AI127271-05
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Eric Joel Tchetgen Tchetgen
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $624,003
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2017-02-14 → 2022-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9846188

## Citation

> US National Institutes of Health, RePORTER application 9846188, Next Generation Missing Data Methods in HIV Research (5R01AI127271-05). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9846188. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
