Project Summary
A central question in the genetics of complex traits is how variation in DNA sequence leads to variation in phenotype. Recent technological advances in high-throughput phenotyping assays for model organisms, together with the establishment of large human biobanks and consortium databases, have provided opportunities to study the genotype-phenotype maps of complex traits in unprecedented detail. However, modeling and interpreting these data remains a major challenge because of the intrinsic high dimensionality of genotype space and the many ways in which causal genes can interact. My research program focuses on developing new theoretical frameworks and interpretable computational tools to analyze large genotype-phenotype datasets, with the goals of (1) accurately predicting phenotypes for novel genotypes and (2) providing biological insight into the genetic architecture of complex traits by identifying key genes, gene interactions, and pathways. The primary focus of my lab over the next five years is to develop new Bayesian and machine learning methods capable of modeling the full spectrum of genetic interactions, including pairwise as well as higher-order epistasis. Specifically, we are combining rigorous mathematical modeling with modern machine learning techniques to develop a suite of scalable, principled methods that achieve accurate phenotypic prediction and accelerate the discovery of novel genetic mechanisms. While proof-of-concept versions of many of the proposed methods display state-of-the-art performance, substantial work remains to scale the methods to larger genotype-phenotype datasets, test model performance on a wide range of complex traits and organisms, and interpret the results to gain biological insight.
In the coming years, we plan to build these methods into an integrated framework for analyzing complex genetic interactions, which will include computational pipelines for fitting accurate phenotypic prediction models, identifying gene interactions and pathways for experimental validation, and quantifying estimation uncertainty. We will also prioritize the development of user-friendly, GPU-accelerated software packages for all methods. Important applications of the proposed research directions include predicting disease risk in humans, elucidating the genetic mechanisms underlying economically and clinically important traits, and designing improved plant and animal breeding programs. The computational tools developed here will be broadly useful to geneticists, evolutionary biologists, and clinical biologists.