A blind source separation approach for deconvolution of bulk transcriptional data leads to early detection of ATF cell-states in complex bacterial populations, in vitro and in vivo

NIH RePORTER · NIH · U19 · $610,131

Abstract

SUMMARY – PROJECT 3
Transient bacterial cell-states, including tolerance, persistence and hetero-resistance (HR), are harbingers of antibiotic treatment failure (ATF) and enablers of antibiotic resistance. Importantly, they are missed by all currently employed diagnostic assays and antibiotic susceptibility tests. Intriguingly, in the treatment of different types of cancer, physicians are often confronted with similar treatment failure issues. It turns out that such epigenetic cell-states create extended opportunities for high-level resistance mutations to emerge. Moreover, because the phenotype is transient, these cell-states can themselves directly drive the re-emergence of the (susceptible) population after drug pressure subsides. As these cell-states are increasingly recognized as drivers that sit at the root of treatment failure, new strategies are emerging to specifically identify, track and target them. To achieve such highly targeted treatment, approaches are being developed that map out the composition of complex cancer tissue, for instance through single-cell RNA-Seq (scRNA-Seq) or computational deconvolution of bulk RNA-Seq data. While scRNA-Seq on bacteria remains technically challenging, we found that, by modifying existing tools, specific bacterial cell-states can be identified in complex bacterial populations. However, the capabilities of current tools are limited, and implementing state-of-the-art machine learning algorithms offers much room for improvement. Moreover, ATF cell-states are poorly characterized, making it currently impossible to define them effectively. Herein, three aims are pursued to develop an approach that, based on bulk RNA-Seq data, dissects a complex bacterial population into its separate cell-states and calculates their frequencies and minimum inhibitory concentrations (MICs).
In Aim 1, a large and diverse temporal RNA-Seq dataset is generated by following a wide variety of strains and species while they are exposed to antibiotics and a subset of the population switches to an ATF cell-state. In Aim 2, a blind source separation algorithm is explored to design a state-of-the-art machine learning tool that deconvolves bulk RNA-Seq data from a complex bacterial population into the cell-states that make up the population and their frequencies. Moreover, by reconstituting each cell-state’s expression profile, we enable transcriptional entropy calculations and thereby cell-state-specific MIC predictions. In Aim 3, the approach is validated by retrospectively predicting the presence of ATF cell-states in patient samples. Finally, the model’s applicability is extended to bulk dual RNA-Seq data from host and bacterium, and validated on patient serum samples. This project therefore not only informs on how ATF cell-states develop and are maintained in a population, but also creates a path towards the development of diagnostics that can detect them in an active infection. Combined with the collateral sensitivities from Project 2 this could eventually enable linkin...
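To make the deconvolution idea in Aim 2 concrete, here is a minimal sketch of one common blind source separation technique, non-negative matrix factorization (NMF) with multiplicative updates. It is an illustration only and not the project's actual algorithm, which the abstract does not specify. The toy expression values, the two hypothetical states ("susceptible" and "tolerant"), and all function names are assumptions made for this example. Shannon entropy of a normalized profile is used as a simple stand-in; the project's transcriptional entropy measure may be defined differently. The code is pure Python for self-containment; a numerical library would normally be used.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W @ H (reconstruction error)."""
    wh = matmul(w, h)
    return math.sqrt(sum((v[i][j] - wh[i][j]) ** 2
                         for i in range(len(v)) for j in range(len(v[0]))))


def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Factor V (genes x samples) ~ W (genes x k) @ H (k x samples).

    Uses multiplicative updates for the Frobenius objective.
    W holds the recovered state profiles, H their per-sample weights.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_genes)]
    h = [[rng.random() + eps for _ in range(n_samples)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_samples)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n_genes)]
    return w, h


def state_fractions(w, h):
    """Share of each sample's total expression attributed to each state.

    This equals the cell fraction only if all states have the same total
    expression per cell, which is an assumption of this sketch.
    """
    state_totals = [sum(col) for col in zip(*w)]
    contrib = [[state_totals[a] * x for x in row] for a, row in enumerate(h)]
    sample_totals = [sum(col) for col in zip(*contrib)]
    return [[x / sample_totals[j] for j, x in enumerate(row)] for row in contrib]


def shannon_entropy(profile):
    """Shannon entropy (bits) of an expression profile normalized to sum 1."""
    total = sum(profile)
    return -sum((x / total) * math.log2(x / total) for x in profile if x > 0)


# Toy data (hypothetical): 4 genes, 2 states, 3 bulk samples.
signatures = [[10.0, 1.0], [1.0, 8.0], [5.0, 5.0], [1.0, 6.0]]  # genes x states
mixing = [[0.9, 0.5, 0.2], [0.1, 0.5, 0.8]]                    # states x samples
v = matmul(signatures, mixing)                                  # bulk profiles

w, h = nmf(v, k=2)
fractions = state_fractions(w, h)
entropies = [shannon_entropy(col) for col in zip(*w)]  # one value per state
```

Note that NMF recovers states only up to ordering and scaling, so matching a recovered column of `w` to a named cell-state would require reference profiles or marker genes.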

Key facts

NIH application ID: 10171121
Project number: 1U19AI158076-01
Recipient: BROAD INSTITUTE, INC.
Principal Investigator: Tim van Opijnen
Activity code: U19
Funding institute: NIH
Fiscal year: 2022
Award amount: $610,131
Award type: 1
Project period: 2022-09-12 → 2026-06-30