This project addresses theoretical challenges in high-dimensional probability, with a particular focus on those arising in data science. It aims to develop rigorous mathematical foundations for understanding the authenticity and privacy of synthetic data, tackling questions such as “What is artificial, mathematically?” and “How can we distinguish artificial data from real?” As a related aim, the project will broaden the reach of random matrix theory in data science by developing new geometric approaches to random matrices and random tensors. To establish a probabilistic framework for detecting synthetic data, the project will develop an adversarial classification model and characterize the regimes in which artificial data can be reliably identified. This analysis will draw on connections to high-dimensional Gaussian geometry and convexity. To develop a mathematical framework for private synthetic data, the project will explore metric-based characterizations of the privacy-accuracy tradeoff, grounded in the methodology of high-dimensional probability. Furthermore, the project will advance non-spectral random matrix theory by developing and applying high-dimensional probability methods to study approximation numbers, general operator norms, and norms of inverses of random matrices. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.