# Signature of profiling and staging the progression of TB from infection to disease.

> **NIH R21** · BOSTON UNIVERSITY MEDICAL CAMPUS · 2021 · $206,823

## Abstract

Tuberculosis (TB) is the leading cause of infectious disease mortality worldwide. Nearly one-third of the world's
population is infected with Mycobacterium tuberculosis (MTB). More than 10.4 million new cases of active TB
disease develop annually, leading to 1.4 million deaths due to the disease each year. Despite widespread
efforts to study the etiology of disease, the development and global introduction of an effective treatment
regimen, and sensitive diagnostics for identifying pulmonary TB disease, efforts to control this pandemic are
falling short, largely due to a lack of a clear understanding of the pathogenic progression from MTB infection to
active clinical disease.
In addition, existing gene expression studies have presented more than three dozen biomarkers to predict
TB-related outcomes such as identifying active TB disease, predicting risk of treatment failure, or predicting which
patients will progress to active TB disease. These have been developed and refined using multiple
technologies and using a diverse set of computational and machine learning prediction algorithms, but most
are focused on two-class comparisons (e.g., TB vs. latent TB infection [LTBI]).
We propose to compile and harmonize dozens of existing RNA-sequencing datasets for TB
outcomes. We will use these compiled data to develop a computational platform and interactive visualization
tools for profiling TB signatures across all existing datasets. We plan to use these curated data and software
platform to develop a more refined molecular map of progression from TB infection to active disease.
Consistent with recently presented models for TB disease development, we hypothesize that we will be able
to identify gene expression patterns associated with stages on the TB disease spectrum, including: uninfected
or eliminated infection, controlled or truly latent infection, future progressors or incipient disease, subclinical TB
disease, and active clinical TB disease. We believe that existing gene expression data and signatures will
allow us to identify distinct transcriptional profiles for each stage, and hence develop a multi-class machine
learning approach for classifying patients into their corresponding stage.
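
To make the multi-class idea concrete, the following is a minimal sketch and not the project's actual method. It assumes a simple nearest-centroid rule over per-sample signature scores; the function names (`fit_centroids`, `predict_stage`), the short stage labels, and all numeric values are hypothetical and invented purely for illustration.

```python
from math import dist
from statistics import fmean

# Short labels for the five stages named in the abstract (labels are an assumption).
STAGES = ["uninfected", "latent", "incipient", "subclinical", "active"]


def fit_centroids(profiles, labels):
    """Average the score vectors of each stage into one centroid per stage."""
    grouped = {}
    for vector, stage in zip(profiles, labels):
        grouped.setdefault(stage, []).append(vector)
    return {
        stage: [fmean(column) for column in zip(*vectors)]
        for stage, vectors in grouped.items()
    }


def predict_stage(centroids, vector):
    """Assign the stage whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda stage: dist(centroids[stage], vector))


# Made-up toy data: two hypothetical signature scores per sample, two samples per stage.
train_profiles = [
    [0.0, 0.0], [0.2, 0.0],  # uninfected
    [1.0, 0.0], [1.2, 0.2],  # latent
    [2.0, 1.0], [2.2, 1.2],  # incipient
    [3.0, 2.0], [3.2, 2.2],  # subclinical
    [4.0, 3.0], [4.2, 3.2],  # active
]
train_labels = [stage for stage in STAGES for _ in range(2)]

centroids = fit_centroids(train_profiles, train_labels)
print(predict_stage(centroids, [2.0, 1.2]))  # → incipient
```

Unlike a two-class comparison, a classifier of this shape returns one of five stage labels per patient. A real analysis would replace the toy scores with harmonized expression data and would likely use a more robust model.
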
Overall, this proposal contributes to the field by compiling existing gene expression data and developing a
holistic map of TB progression from infection to active disease. In addition, we will provide a curated dataset
and metadata in an accessible format for more than three dozen existing TB studies, and allow others to
access and explore these data through a user-friendly profiling platform.

## Key facts

- **NIH application ID:** 10214482
- **Project number:** 5R21AI154387-02
- **Recipient organization:** BOSTON UNIVERSITY MEDICAL CAMPUS
- **Principal Investigator:** William Evan Johnson
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $206,823
- **Award type:** 5
- **Project period:** 2020-07-10 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10214482

## Citation

> US National Institutes of Health, RePORTER application 10214482, Signature of profiling and staging the progression of TB from infection to disease. (5R21AI154387-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10214482. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
