Project Summary

The Human Leukocyte Antigen (HLA) region on human chromosome 6p21 is the most medically important region of the human genome. More than 100 infectious, autoimmune, and pharmacological disease phenotypes and cancers are associated with genetic variation in HLA. Nevertheless, despite nearly half a century of research on HLA and disease association, outstanding questions remain regarding the full extent of HLA-mediated impact on human health and disease. A major limitation of these studies is insufficient sample size for discovery and replication: the complexity of the HLA region often makes it impractical to undertake association studies in very large cohorts. To date, nearly all large-scale studies examining HLA variation in human health have relied on statistical imputation of HLA alleles from single nucleotide polymorphism (SNP) data rather than direct genotyping of these loci. Here, we propose to exploit pre-existing, high-quality HLA genotyping data collected by the National Marrow Donor Program (NMDP) to examine the impact of HLA variation on human health and immunity at unprecedented scale. We will collect self-reported health histories from more than 100,000 individuals, allowing examination of genotype-phenotype associations with high-resolution genotypes. Further, we will examine the relationship between HLA variation and antibody response to human cytomegalovirus (CMV) in more than 1,000,000 individuals. To provide context for the association studies, we will examine the relationship between HLA variation and the antigenic targets of antibodies in more than 1,000 healthy individuals.

In Specific Aim 1, we will build a very large dataset collected through the NMDP to identify potential HLA associations across numerous diseases and phenotypes in a Phenome-Wide Association Study (PheWAS).
Phenotype information for the PheWAS analysis will be obtained from self-report survey data covering approximately 130 conditions, diseases, and traits from subjects with HLA genotyping collected by NMDP.

In Specific Aim 2, we will determine the antigen specificity of antibodies in serum samples from healthy subjects, stratified by HLA genotype. We will use a programmable phage display assay comprising 744,000 peptides tiled across the human proteome, representing the entire human peptidome. Additionally, we will screen against a virome phage display (480,000 peptides) to determine specificity for viral antigens, and we will test binding of these antigens to HLA molecules. We will examine serum samples from more than 1,000 healthy donors.

In Specific Aim 3, we will address serostatus with respect to human cytomegalovirus (CMV) in a sample of more than 1,000,000 individuals. CMV is ubiquitous in all human populations, and infection can have profound effects on the immune system. This analysis will provide the first large-scale examination of the association of HLA va...