# Big Data Analysis of HIV Risk and Epidemiology in Sub-Saharan Africa

> **NIH R01** · STANFORD UNIVERSITY · 2020 · $691,449

## Abstract

HIV is the largest single cause of death among adults in Sub-Saharan Africa, responsible for about a
third of all deaths among adults. One of the key paradigms for halting HIV in Sub-Saharan Africa relies
on identification of infected individuals and populations for delivery of biomedical and behavioral
interventions. However, by the end of 2015 less than half of HIV-infected individuals accessed
antiretroviral therapy (ART) despite expansion of eligibility and ongoing efforts to diagnose and initiate
treatment. A better understanding of the social, behavioral, environmental, and economic contexts that
influence HIV risk could improve the effectiveness and efficiency of programs that aim to identify and
target HIV-infected populations. In response to the program announcement for “Harnessing Big Data to
Halt HIV” (PA-15-273), the overall goal of this proposal is to develop new analytic tools for large-scale
data to predict risk of HIV infection and to generate hypotheses about new or under-recognized risk
factors in Sub-Saharan Africa. We plan four primary investigations: (1) Extract and harmonize all Sub-
Saharan African nationally representative Demographic and Health Surveys that include the HIV status
of over 600,000 men and women collected in 29 countries, and hundreds to thousands of associated
exposure variables; (2) Develop analytic tools based on LASSO and XWAS to predict HIV infection
status and generate hypotheses about social, behavioral, environmental, and economic risk factors; (3)
Identify HIV risk in multi-country, large-scale data and synthesize findings across Sub-Saharan
Africa; and (4) Develop a bioethics program to identify targets for new interventions and policies in a
culturally and ethically sound manner. The project will leverage big data and high-throughput
analytic methodology in the service of global HIV control. The outputs of this project include (1)
accessible software code for efficient exploration of robust correlates of HIV status derived from the
biggest collection of high-dimensional, harmonized, and nationally representative
household surveys in the world, (2) an extensive landscape of the social, environmental, behavioral,
and economic factors predictive of HIV infection among over 600K people tested for HIV in 29 Sub-
Saharan countries, and (3) an ethical framework to enable practical, relevant, and appropriate translation
and communication of findings. New models of HIV infection will facilitate identification of at-risk groups
and the development of interventions to halt the HIV epidemic in Sub-Saharan Africa.
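
The abstract names XWAS as one of the planned tools but does not define it or describe an implementation. As a rough illustration only, the sketch below assumes XWAS refers to an exposure-wide association scan: each binary exposure is tested against HIV status separately, and the resulting p-values are corrected for multiple testing with the Benjamini-Hochberg procedure. The function names, variable names, and the tiny synthetic dataset are all hypothetical; this is not the project's code, and it ignores survey weights, clustering, and covariate adjustment that a real Demographic and Health Surveys analysis would need.

```python
import math


def chi2_p_value_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected at false discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that passes its threshold
    return sorted(order[:cutoff])


def xwas_scan(exposures, outcome, alpha=0.05):
    """Test every binary exposure against a binary outcome.

    exposures: dict mapping exposure name -> list of 0/1 values
    outcome:   list of 0/1 values (hypothetically, HIV test result)
    Returns a dict of exposure name -> p-value for exposures that pass FDR control.
    """
    names = sorted(exposures)
    p_values = []
    for name in names:
        pairs = list(zip(exposures[name], outcome))
        a = sum(1 for x, y in pairs if x == 1 and y == 1)
        b = sum(1 for x, y in pairs if x == 1 and y == 0)
        c = sum(1 for x, y in pairs if x == 0 and y == 1)
        d = sum(1 for x, y in pairs if x == 0 and y == 0)
        p_values.append(chi2_p_value_2x2(a, b, c, d))
    return {names[i]: p_values[i] for i in benjamini_hochberg(p_values, alpha)}


# Synthetic, deterministic toy data (not real survey data).
n = 400
outcome = [1 if i % 4 == 0 else 0 for i in range(n)]
exposures = {
    # Constructed to be strongly associated with the outcome.
    "linked_exposure": [1 if (i % 4 == 0 or i % 8 == 1) else 0 for i in range(n)],
    # Constructed to be balanced across outcome groups.
    "noise_1": [(i // 4) % 2 for i in range(n)],
    "noise_2": [(i // 8) % 2 for i in range(n)],
}

hits = xwas_scan(exposures, outcome)
for name, p in hits.items():
    print(f"{name}: p = {p:.3g}")
```

Testing exposures one at a time keeps each test simple and makes the multiple-comparison burden explicit, which is why a false discovery rate correction is applied before anything is reported as a candidate risk factor. The LASSO step the abstract also mentions would instead fit all exposures jointly under an L1 penalty and is not shown here.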

## Key facts

- **NIH application ID:** 9892947
- **Project number:** 5R01AI127250-04
- **Recipient organization:** STANFORD UNIVERSITY
- **Principal Investigator:** Eran Bendavid
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $691,449
- **Award type:** 5
- **Project period:** 2017-04-01 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9892947

## Citation

> US National Institutes of Health, RePORTER application 9892947, Big Data Analysis of HIV Risk and Epidemiology in Sub-Saharan Africa (5R01AI127250-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9892947. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
