The mutational signatures inferred from tumor genome sequences have the potential to provide a record of environmental exposure and can give clues about the etiology of carcinogenesis. However, for inferred signatures to be biologically meaningful, each signature must accurately represent the contribution of different mutation types in each mutagenic process. Heuristic algorithms based on non-negative matrix factorization (NMF) have primarily been used to discover mutational signatures, but these approaches are inflexible, non-robust, and require massive amounts of computation. The objective of the proposed project is to develop computationally efficient algorithms that, despite imperfect modeling assumptions, can discover biologically meaningful signatures. Aim 1 supports this objective by developing a new framework for scalable, easy-to-use, and accurate variational inference, a widely used approach to approximate Bayesian inference, that is applicable to mutational signature discovery models. Aim 2 develops statistical methods to extract biologically meaningful signatures from the inferences obtained using the proposed variational inference framework. The accuracy and statistical validity of the methods developed in Aims 1 and 2 are ensured through theoretical analysis and numerical experiments on synthetic and real data. Finally, Aim 3 improves upon the current understanding of mutational processes by (1) applying the methods developed in Aims 1 and 2 to a large Pan-Cancer dataset and (2) developing a novel model that allows for the structured incorporation of single-base substitutions, double-base substitutions, insertions, and deletions in each signature. The proposed work is well-positioned to replace heuristics used for discovering meaningful representations of data, and so to have long-term impact on how other genomic data types, such as single-cell RNA-seq, are analyzed.
This work is also directly relevant to the NIGMS, as it falls under “DNA and RNA metabolisms (repair)”: many mutational processes are related to aberrant DNA repair or to “clock-like” molecular mechanisms that are associated with aging, which can be observed in histologically normal-appearing tissue.