Today's data science systems, ranging from batch jobs to interactive interfaces, are surprisingly fragile. Data scientists typically use dozens of libraries, but a single bug in any one of them can destroy hours or even days of computation, causing significant pain. This issue has been widely discussed in the data science community and in the academic literature. Yet no principled mechanism has been proposed to address it, which may puzzle database researchers, because existing databases implement checkpointing to periodically save changes in data for future recovery. Why haven't data science systems adopted checkpointing? What unique properties of data science systems make its adoption challenging? This project will answer these questions and bring checkpointing to data science systems with zero modifications to existing libraries and programs. If successful, this project can enable checkpointing, for the first time, in today's data science ecosystems. It will make possible recovery from crashes, execution “undos”, suspension of cloud resources without losing data, and more. This project first identifies a critical challenge: data science systems lack mechanisms for detecting changes in data, an essential prerequisite for checkpointing. Existing databases achieve this with centralized buffer pools. In contrast, data science systems intentionally omit centralized data spaces, allowing individual libraries to manage data using shared memory, GPUs, and remote machines for high performanc