# Utilizing Bayesian modeling to improve mutational signature inference in large-scale datasets

> **NIH U01** · BOSTON UNIVERSITY MEDICAL CAMPUS · 2022 · $402,634

## Abstract

The goals of this proposal are to develop novel statistical methods, more accurate inference procedures, and
interactive software tools to perform mutational signature deconvolution in cancer samples. Mutational
signatures are patterns of co-occurring mutations that can reveal insights into a cancer's etiology and evolution.
Currently, non-negative matrix factorization (NMF) is the “gold standard” for mutational signature deconvolution.
However, NMF has several deficiencies. It cannot 1) predict signatures in new
samples, 2) perform joint learning of known and novel signatures at the same time, 3) alleviate problems from
signature “bleeding”, 4) cluster tumors into subgroups based on mutational signature profiles, or 5) characterize
uncertainty in model fit. In this proposal, we will develop novel Bayesian hierarchical models that overcome
the limitations of NMF. Furthermore, there is a lack of interactive software for mutational signature inference and
visualization for non-computational users. We will also develop an R/Shiny interface on top of our R package to
facilitate data preprocessing, inference, and visualization of large-scale datasets. This interface will have a cloud
backend to facilitate computationally intensive operations. Overall, this software will streamline mutational
signature analysis for non-computational researchers and will have the capability to interface with other projects
from the Informatics Technology for Cancer Research (ITCR) program. Finally, we will analyze a novel targeted
sequencing dataset from Chinese patients and perform a meta-analysis of all publicly available variants to
generate a novel reference set of mutational signatures for investigators to use in their own studies. Overall, our
tools will be of great interest to the cancer community, as they will provide greater insight into mutational signature
patterns and will be useful in clinical settings to reveal insights into cancer etiology.
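
For readers unfamiliar with the NMF baseline the abstract refers to, the following is a minimal sketch of signature deconvolution by NMF. It is not the project's method or software. It assumes the standard multiplicative-update algorithm for the squared-error objective, a toy count matrix with 6 mutation channels rather than the usual 96, and hypothetical names (`nmf`, `w_true`, `h_true`) chosen only for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Sum of squared differences between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor v (channels x samples) into w (channels x k) and h (k x samples).

    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(k)]
    errors = []
    for _ in range(n_iter):
        # Update exposures h with signatures w held fixed.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_cols)]
             for a in range(k)]
        # Update signatures w with exposures h held fixed.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n_rows)]
        errors.append(squared_error(v, w, h))
    # Rescale so each signature sums to 1 and the scale moves into the exposures.
    for a in range(k):
        total = sum(w[i][a] for i in range(n_rows)) or 1.0
        for i in range(n_rows):
            w[i][a] /= total
        h[a] = [x * total for x in h[a]]
    return w, h, errors


# Toy data (an assumption for illustration): 2 signatures over 6 channels,
# mixed into 5 samples with known exposures.
w_true = [[0.5, 0.0], [0.3, 0.0], [0.2, 0.1], [0.0, 0.2], [0.0, 0.3], [0.0, 0.4]]
h_true = [[100, 10, 50, 0, 80], [0, 90, 50, 120, 20]]
v = matmul(w_true, h_true)

w, h, errors = nmf(v, k=2)
print("reconstruction error, first vs last iteration:", errors[0], errors[-1])
```

The sketch returns only point estimates of the signatures and exposures, which illustrates limitation 5 in the abstract: plain NMF gives no measure of uncertainty in the fit.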

## Key facts

- **NIH application ID:** 10490301
- **Project number:** 5U01CA253500-02
- **Recipient organization:** BOSTON UNIVERSITY MEDICAL CAMPUS
- **Principal Investigator:** Joshua D Campbell
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $402,634
- **Award type:** 5
- **Project period:** 2021-09-17 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10490301

## Citation

> US National Institutes of Health, RePORTER application 10490301, Utilizing Bayesian modeling to improve mutational signature inference in large-scale datasets (5U01CA253500-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10490301. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
