Whole-genome sequencing (WGS) has transformed our ability to track the spread of pathogens in healthcare settings. With the ability to identify patients linked by transmission has come the capacity to determine with high confidence the role of certain hospital locations, contaminated infrastructure, and colonized healthcare personnel in mediating the spread of infections in hospitals. Moreover, broad integration of genomic with clinical data has the potential to identify not just pathways of transmission, but also patient characteristics and hospital practices that influence organism-specific transmission rates. However, realizing the potential of WGS as a tool for precision infection prevention will require overcoming critical barriers. The most significant challenges stem from the role that epidemic lineages play in the overall antibiotic resistance epidemic. It has been shown that the majority of antibiotic resistance in healthcare settings is due to the importation and spread of epidemic lineages that have reached high prevalence in regional healthcare networks. Because a small number of strains are so prevalent, it is challenging even with WGS to determine whether two infected patients are linked by transmission within the hospital, or whether one or both patients acquired their infections during a previous community or healthcare exposure. The standard approach for discerning whether two patients are linked by transmission is to employ species-specific thresholds for the number of single nucleotide variants (SNVs) separating the two patients' isolates: above the threshold, the patients are concluded not to be linked by transmission, and below it, transmission is deemed likely. However, there is considerable evidence that applying these SNV thresholds can lead to both false-positive and false-negative transmission inferences.
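The threshold rule described above can be illustrated with a minimal sketch. It assumes each patient's isolate is available as an aligned sequence of equal length; the function names, the toy sequences, and the threshold of 2 SNVs are hypothetical and chosen only for illustration, not taken from any species-specific guideline.

```python
from __future__ import annotations

from itertools import combinations


def snv_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry an unambiguous,
    differing base (gaps and ambiguity codes such as N are skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def classify_pairs(
    isolates: dict[str, str], threshold: int
) -> dict[tuple[str, str], bool]:
    """Label each patient pair as putatively linked (True) when the pairwise
    SNV distance is at or below the threshold, and unlinked (False) otherwise."""
    return {
        (p, q): snv_distance(isolates[p], isolates[q]) <= threshold
        for p, q in combinations(sorted(isolates), 2)
    }


# Toy aligned sequences (hypothetical data, far shorter than a real genome).
isolates = {
    "patient_A": "ACGTACGTAC",
    "patient_B": "ACGTACGTAT",
    "patient_C": "TCGAACGGAT",
}

# Hypothetical threshold of 2 SNVs.
links = classify_pairs(isolates, threshold=2)
for pair, linked in links.items():
    print(pair, "linked" if linked else "not linked")
```

Because the rule reduces each pair to a single cutoff comparison, two patients who independently imported the same prevalent lineage can fall below the threshold, while a true transmission pair can exceed it.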
Sources of error include the difficulty of discriminating between transmission within the hospital and recent transmission at a connected healthcare facility, and higher-than-expected SNV differences between true transmission pairs due to mutation accumulation during long-term colonization. Here, we seek to develop, validate, and apply sampling, sequencing, and analysis strategies to enable accurate transmission inference in high-prevalence endemic settings. In Aim 1, we will build on preliminary data showing that we can group patients linked by transmission in an SNV-threshold-free manner, and evaluate several methods for detection of intra-facility transmission clusters. In Aim 2, we will develop and apply population sequencing strategies to comprehensively detect and track the spread of multiple strains between patients. In Aim 3, we will expand the analysis of population sequencing data to incorporate sharing of unfixed alleles into transmission inference. Lastly, we will apply our optimized genomic epidemiology toolkit to determine the relative contribution of importation, patient-to-patient transmission, environmental contamination, and intra-patient evolution to colonizatio...