In high-performance computing (HPC), the growing complexity and shrinking size of hardware components make systems more vulnerable to "soft errors": temporary glitches that can disrupt calculations. Traditionally, these errors have been managed through hardware-based solutions such as redundancy, but these approaches consume significant energy, a major concern for modern processors. This project addresses the challenge of making HPC systems more resilient to soft errors without the high energy costs of traditional methods. It focuses on identifying and protecting the most vulnerable parts of a program, that is, the specific program states in which an error is most likely to cause problems. By doing this efficiently, the project aims to ensure that programs continue to function correctly even when errors occur. The broader benefits of this project include advancing the field of reliable computing, promoting energy-efficient technologies, and supporting education by making cutting-edge resilience techniques accessible to software developers and classrooms. Ultimately, this work contributes to more robust and efficient computing systems that can handle the increasing demands of modern technology, benefiting industry, education, and society as a whole.

This project aims to address the increasing vulnerability of HPC systems to transient hardware faults, or soft errors, which are exacerbated by larger system scales, advanced technology scaling, and lower operating voltag