Project Summary/Abstract

Human cancer genomes show a complex pattern of mutations that, upon computational deconvolution, resolves into a systematic series of over 80 “Mutational Signatures” (patterns of mutations across all possible trinucleotide sequence contexts). Some signatures are common to many cancers (e.g., CG→TA at CpG sites), whereas others appear in single tumor types (e.g., signatures involving GC→TA mutations that are presumably associated with aflatoxin B1 (AFB1) exposure in hepatocellular carcinoma (HCC)). Current sequencing efforts sometimes provide hints, through the details of mutational signatures, as to the mechanistic etiology of cancer. The first two Aims of the present proposal take a bottom-up approach to determine which specific chemical insults to DNA cause mutational patterns that match mutational signatures in tumors. We have formulated a model that describes three variables that contribute to the complexity of mutational spectra: the likelihood that a DNA lesion will form in a particular context, the likelihood that it will evade repair, and the likelihood that it will miscode when traversed by a polymerase. Aim 1 uses a combination of synthetic and analytical chemistry to generate the adduct-formation spectrum of a host of DNA-damaging agents (AFB1, sterigmatocystin, N-methyl-N-nitrosourea, streptozotocin, temozolomide, chloroacetaldehyde and the oxidant SIN-1). A strategy involving heavy-isotope-containing, defined-sequence oligonucleotides will be used along with mass spectrometry to map the binding specificities of electrophiles derived from these agents. Aim 2 takes a two-pronged approach to define the biological effects of DNA damage from the studied agents. Since the DNA adducts of the compounds evaluated are known, we shall insert those adducts one at a time into defined-sequence oligonucleotides in all 16 possible 3-base contexts (i.e., 5’-NXN-3’, where X is the adduct and N is A, G, C or T).
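
As a rough illustration of the three-variable model and the 16-context design described above, the sketch below enumerates the 5’-NXN-3’ contexts and combines per-context likelihoods of lesion formation, repair evasion, and miscoding into a normalized predicted spectrum. It rests on assumptions that are not stated in the proposal: the three likelihoods are treated as independent and simply multiplied, the function names (`contexts`, `predicted_spectrum`) are hypothetical, and all numeric values are arbitrary placeholders, not measurements.

```python
from itertools import product

BASES = "ACGT"


def contexts(center="X"):
    """Return the 16 trinucleotide contexts 5'-NXN-3' around a central lesion X."""
    return [f"{five}{center}{three}" for five, three in product(BASES, repeat=2)]


def predicted_spectrum(p_form, p_evade, p_miscode):
    """Combine three per-context likelihoods into a normalized spectrum.

    ASSUMPTION (not from the proposal): the three factors are independent,
    so the relative mutation yield in a context is their product.
    """
    raw = {c: p_form[c] * p_evade[c] * p_miscode[c] for c in p_form}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("all contexts have zero predicted yield")
    return {c: value / total for c, value in raw.items()}


# Placeholder inputs: uniform likelihoods, except that the lesion is assumed
# (purely for illustration) to form twice as readily in the GXG context.
ctxs = contexts()
p_form = {c: 0.1 for c in ctxs}
p_form["GXG"] = 0.2
p_evade = {c: 0.5 for c in ctxs}
p_miscode = {c: 0.1 for c in ctxs}

spectrum = predicted_spectrum(p_form, p_evade, p_miscode)
for ctx in ctxs:
    print(ctx, round(spectrum[ctx], 4))
```

Under these placeholder inputs, the only non-uniform factor is lesion formation, so the GXG context receives twice the share of any other context. In practice, each of the three dictionaries would be filled from a different experiment, and the multiplicative form itself is something the data would need to support.
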
The oligonucleotides will be inserted into viral genomes and replicated in cells of defined repair or replication status, and the regions of the genome that contained the lesion will be characterized to define the type, amount, genetic requirements and sequence-context dependency of mutagenesis. The second part of Aim 2 uses a newly developed mouse embryo fibroblast (MEF) line to define, by Duplex Consensus Sequencing (DCS), the high-resolution mutational spectra (HRMS) of the agents under study. These data will be compared via informatics techniques (e.g., cosine similarity) to human cancer mutational signatures and to the DNA binding (Aim 1) and mutagenic properties of individual adducts. Lastly, a damaged nucleotide pool is an often-overlooked source of mutations. Aim 3 uses the MEF line of Aim 2 to probe the potential of a nucleotide pool damaged with products of oxidative stress (e.g., 8-oxoG and 5-chlorocytosine) to alter the HRMS of the MEF line. Further, we have custom-designed a pool mutagen (fKP1...