Modern scientific data sets (from single-cell RNA sequencing with tens of thousands of genes per patient, to galaxy-survey spectra with millions of stars, to user-item interaction matrices in online platforms) share two features: (i) ultra-high dimensionality and (ii) latent parameters that obey common structural laws (e.g., exchangeability, sparsity, or low-rank dependence). This project tackles both at once. It advances the statistical foundations for such problems by (1) providing a new theoretical framework for studying empirical Bayes methods, which learn the latent-parameter distribution directly from the data, in these complex models, and (2) developing unsupervised dimension-reduction techniques that embed high-dimensional observations into lower-dimensional representations while preserving the essential structure and relationships within the data. Together, these tools will transform ad hoc prior modeling into an objective, data-driven procedure and yield principled, scalable inference for large-scale applications. Further, collaborations with astronomers will ensure immediate scientific impact, and several of the research directions will shape the Ph.D. dissertations of multiple Columbia graduate students, fostering the next generation of data-science leaders.

The project integrates two tightly linked research thrusts. (a) Building on recent advances in nonparametric empirical Bayes, the PI will design flexible empirical Bayes estimators