Clinical Phenotyping for Prediction of Retention in HIV Care

NIH RePORTER · NIH · R21 · $452,376

Abstract

Retention in care is essential to HIV treatment and prevention, yet only half of people with HIV in the U.S. are retained in medical care. Improving retention is critical for ending the HIV epidemic in the U.S., but effective retention interventions are highly resource intensive. With diminishing resources for HIV care and increasing prevalence of HIV, better approaches are needed to help HIV care teams identify the patients most vulnerable to loss to follow-up (LTFU), who would benefit most from retention resources, before LTFU occurs. A predictive model of LTFU based on electronic health data has the potential to address this need, as it quantifies a specific patient’s risk of future disengagement from care based on their unique characteristics and can be automated to generate risk predictions in real time. Using data from an urban HIV clinic, we have developed a machine learning model to predict LTFU from HIV care using natural language processing (NLP) of the unstructured text of provider notes in the electronic medical record (EMR). The NLP model demonstrated good performance in detecting patients at risk for LTFU, with a positive predictive value (PPV) of 0.86, and identified word patterns associated with LTFU, such as “substance abuse” and “stigma,” thereby demonstrating good face validity. While our preliminary data reveal the potential of NLP-based machine learning models to predict future retention in care, several key issues need to be addressed before the model can be deployed for patient care. First, people with HIV (PWH) are a markedly heterogeneous population, and there may be sub-groups of patients (e.g., young Black men who have sex with men, cisgender women with childcare responsibilities, and people who inject drugs and are unstably housed) that differ drastically in the factors that predict LTFU.
Clinical phenotyping is an analytic method that can cluster patients within a heterogeneous population into different sub-groups based on profile similarities. Before a single model is deployed in a “one-size-fits-all” manner, it is crucial to better understand the performance of our NLP model on different clinical phenotypes of patients with HIV. Second, it is not known how the model would perform in a prospective, real-life setting. Finally, it is also unclear how a machine learning model would perform compared to provider intuition regarding patients’ risk for disengagement from care. This proposal seeks to address these issues through two specific aims. In Aim 1, we will determine the performance of the NLP predictive model of LTFU for different clinical phenotypes of people with HIV. In Aim 2, we will prospectively validate the model and compare results with care team intuition regarding risk for LTFU among people with HIV. As we move toward ending the HIV epidemic, results from this project will provide crucial information regarding the use of NLP and clinical phenotyping to predict loss to follow-up from HIV care an...

Key facts

NIH application ID: 10762595
Project number: 1R21MH134756-01
Recipient: UNIVERSITY OF CHICAGO
Principal Investigator: Anoop Mayampurath
Activity code: R21
Funding institute: NIH
Fiscal year: 2023
Award amount: $452,376
Award type: 1
Project period: 2023-09-05 → 2025-09-04