# Leveraging Big Data Science to Focus the HIV Response in Countries with Generalized HIV Epidemics

> **NIH R01** · JOHNS HOPKINS UNIVERSITY · 2022 · $773,307

## Abstract

The overarching goal of the proposed aims is to leverage novel methods with large and underutilized data sets
to evaluate the potential impact of increasingly specific HIV responses across generalized epidemic settings in
Sub-Saharan Africa (SSA) in reducing overall HIV incidence. This application is highly responsive to multiple
areas of interest in the recent Notice of Special Interest (NOSI): Harnessing Big Data to Halt HIV
(NOT-AI-21-054). Moreover, these aims align with current realities of the HIV pandemic. While overall incidence has steadily
declined over the last 15 years, over 1.5 million people newly acquired HIV in 2020, including one million people
across SSA. The risk for HIV is not evenly distributed anywhere in the world. And while specific key populations
are recognized to be at increased risk of HIV in many higher income settings, a general population construct is
often used to represent HIV epidemics across SSA. This construct typically negates proximal determinants of
HIV acquisition and transmission, including heightened transmission risks in the contexts of condomless sex
between men, sex work, and drug use, as well as infections among transgender people and incarcerated
populations.

We propose an ambitious set of aims that will leverage available HIV-related data for key populations as well as
auxiliary data, including social media, search patterns, spatial data, and socioeconomic and migration data. We
will assemble multiple data sources and integrate these data to build a comprehensive data warehouse to
estimate key population-specific indicators including HIV incidence and prevalence, population size,
engagement in the HIV treatment cascade, and structural determinants. These estimates, augmented by small
area estimation methods where data are sparse, will inform dynamic transmission models to estimate differential
risks of onward HIV transmission among key populations and to better address the needs of key populations
compared with general-population approaches. Finally, we will leverage very large and underutilized program
data for HIV testing, prevention, and treatment programs in partnership with implementing partners. Cameroon,
Kenya, Senegal, and South Africa will be used as exemplar countries because they have sufficient data and
willing governments, and because they represent common HIV epidemic typologies in their respective regions of SSA.

Aim 1: Build a flexible, comprehensive, and accessible data warehouse collating available HIV-related and
relevant auxiliary data for key populations from 2000 onward in SSA.

Aim 2: Employ small area estimation methods and spatial statistics using available direct and auxiliary data
to infer population size, prevalence, and engagement in the treatment cascade for key populations.

Aim 3: Characterize the transmission population attributable fraction for HIV among key populations in each
setting, incorporating differential risks of onward HIV transmission over multiple time horizons...

## Key facts

- **NIH application ID:** 10548465
- **Project number:** 1R01AI170249-01A1
- **Recipient organization:** JOHNS HOPKINS UNIVERSITY
- **Principal Investigator:** Stefan David Baral
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $773,307
- **Award type:** 1
- **Project period:** 2022-07-29 → 2026-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10548465

## Citation

> US National Institutes of Health, RePORTER application 10548465, Leveraging Big Data Science to Focus the HIV Response in Countries with Generalized HIV Epidemics (1R01AI170249-01A1). Retrieved via AI Analytics 2026-06-11 from https://api.ai-analytics.org/grant/nih/10548465. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
