Project Summary/Abstract

Mendelian randomization (MR) is a widely applicable causal inference technique that makes it possible to estimate causal effects using only summary association statistics from genome-wide association studies (GWAS). In recent years, MR has moved from being relatively unknown to a common element of post-GWAS analysis. By facilitating causal inference without a randomized trial, MR makes it possible to rapidly and cheaply assess potential risk factors for human disease. However, most MR methods rely on strong, sometimes unrealistic assumptions. When assumptions are violated, MR will produce biased, misleading results. The goals of this proposal are to 1) develop robust MR statistical methods that address the most crucial problems that arise in analysis of real data sets and 2) develop accessible open-source software to guide a user through the practical challenges of performing MR. We focus on two shortcomings of existing MR methods. First, horizontal pleiotropy is a well-known source of bias in MR. State-of-the-art MR methods are more robust to some types of horizontal pleiotropy than traditional methods. However, there are some forms of horizontal pleiotropy that can only be accounted for by augmenting the analysis with data for confounding variables via multivariable MR (MVMR). Current MVMR methods can only accommodate a few additional variables, while many problems would be best addressed by including larger numbers of traits. In Aim 1, we develop an MVMR method that is computationally scalable and remains accurate when large numbers of traits are included. In Aim 2, we extend this work, developing a method to automatically identify variables that should be included in an MVMR analysis. This is particularly important for understanding the causal role of exposures that have been sparsely studied or have only recently become measurable. In Aim 3, we focus on the challenges posed by linkage disequilibrium (LD).
The majority of existing methods rely on LD-pruning variants to obtain an independent set, leading to a loss of valuable information. All current methods assume that LD is the same in the exposure and outcome GWAS. This assumption will not always hold, leading to errors that bias causal estimates. To address these problems, we develop an efficient screening tool to alert users when mismatched LD may be affecting the results and an LD-aware MR method that can accommodate different LD patterns in exposure and outcome. The methods developed in this proposal will be distributed in user-friendly open-source software. Because the goals of Aims 1-3 are complementary, in Aim 4 we will integrate these tools into an umbrella software package that guides users through the multiple choices involved in performing MR, from data selection and formatting to interpretation of results. The goal of this package is to address data considerations that are often ignored in methodological research, enabling investigators to obtain more ro...