The project develops tools to read expanded genetic alphabets that contain bases other than A, T, G, and C, and studies how natural biological systems interact with these unnatural DNA letters. This research contributes to transformative applications in nucleic acid biotechnology, and has the potential to improve diagnostic assays, lead to the discovery of novel therapeutics, and enhance biomanufacturing techniques. The project integrates these research activities with robust educational objectives: preparing a globally competitive workforce through workshops for industry professionals, fostering public engagement through hands-on community activities, and developing online resources for data science education. Graduate and undergraduate students participating in the project will gain valuable skills and mentorship experience, contributing to STEM workforce development. This research addresses key challenges in expanded genetic alphabets, focusing on three objectives. First, it aims to improve the generalizability and accuracy of next-generation sequencing for unnatural base pairs through deep learning. Second, it develops single-context sequencing models that enable high-accuracy measurements of polymerase replication fidelity for unnatural bases. With this new methodology, the project measures the replication fidelity of various polymerases for these unnatural bases in model in vitro systems such as PCR and LAMP. Lastly, the project investigates the biocompat