Technological advances have made it possible to collect massive datasets in many scientific applications. A major challenge is to create algorithms that can analyze these datasets efficiently while also providing guarantees on the quality of the analysis. This project focuses on iterative algorithms that start with an initial guess and then refine it until they reach a solution of the desired quality. Prior work has made it possible to design efficient iterative algorithms with excellent performance guarantees, under the assumption that all of the data is processed synchronously, without any losses or errors. However, large datasets often need to be distributed across multiple servers, leading to asynchronous updates, partial losses, and errors. The main contribution of this project is a novel framework for the design and analysis of iterative algorithms that process very large datasets in a distributed fashion. This framework naturally captures many of the phenomena that arise in distributed data processing, and can be used to design strategies that are more efficient than requiring the servers to maintain perfect synchronization. Furthermore, this research will train undergraduate and graduate students to be experts on distributed processing of massive datasets. It will also lead to the creation of educational materials, such as tutorial articles, focused on the statistical analysis of massive datasets.

From a technical perspective, this project builds on the approximat