DESCRIPTION (provided by applicant): We will use whole-genome sequences (WGS) from a unique set of multigenerational Utah pedigrees to explore the causes of genetic variation and the consequences of this variation for disease. We will estimate the rates of mutation and mobile element retrotransposition in 42 three-generation pedigrees, each consisting of grandparents, parents, and large numbers of offspring (626 individuals in total). Using advanced methods to detect single nucleotide variants, structural variants, and mobile element insertions in WGS data, we will address fundamental questions about mutation and mobile element evolution: In a large, well-controlled set of families, what are the rates of mutation and retrotransposition? How are these events affected by paternal and maternal age? Is variation in these rates determined by genetic factors (e.g., DNA repair genes) that segregate in families? What is the role of genomic context (e.g., GC content, recombination) in generating de novo mutations and retrotranspositions?
In addition to addressing questions about the causes of genetic variation, we will address the consequences of variation by analyzing WGS in large, multigenerational Utah pedigrees in which there is a strong excess of specific inherited diseases. Under separate funding, we are obtaining WGS from at least 3,000 pedigree members as part of the Utah Genome Project (of which the PI is the Executive Director). These families, which are part of the eight-million-member Utah Population Database, provide important advantages for the genetic analysis of Mendelian and complex diseases because both genetic and environmental heterogeneity are greatly reduced. Furthermore, large pedigrees offer the potential to follow the transmission of rare variants detected in WGS across generations as they contribute to disease causation, including the causation of common, complex diseases. These pedigrees thus provide a powerful and unique resource for disease-gene identification. We have developed the VAAST, pVAAST, and Phevor algorithms for detecting and characterizing disease-causing genes in these families. In this project, we will further develop and modify these methods to address several key questions: What is the role of noncoding genetic variation in causing inherited disease? To what extent does structural variation, such as copy number variants and genomic rearrangements, contribute to inherited disease? How can existing methods be effectively adapted to identify the multiple variants that underlie susceptibility to common diseases?