The original Centre d’Etude du Polymorphisme Humain (CEPH) genomes have been used worldwide for over thirty years to map the human genome, understand genome biology, and identify genes associated with traits and disease (1; cited 489 times). They are part of the HapMap Project and the 1000 Genomes Project and are used by the National Institute of Standards and Technology. The majority of these genomes came from 44 large three-generation Utah families, each composed of 4 grandparents, 2 parents, and 7 to 17 children. We have generated whole-genome sequence (WGS) data from 603 of the original Utah CEPH blood samples, and these data are available to the research community. This resource continues to be used to understand recombination, human variation, and genome mutations in the germline and soma; to benchmark new bioinformatics tools; and to serve as well-characterized controls in genetic discovery. Because germline de novo genomic changes identified in the 2nd generation can be observed in the multiple offspring of the 3rd generation, false positives and somatic changes can be distinguished from germline mutations to establish a “truth” set of de novo variation. Many members of the 3rd generation now have adult offspring (generation 4). This project proposes to build upon this important resource by conducting WGS on DNA from 300 of these newly contacted research participants from the 4th generation. Adding these 300 genomes to the existing resource will nearly double the number of observed parent-to-child transmission events, yielding an expanded “truth” set of genetic variation. This will be a powerful resource to benchmark new tools, further our understanding of the recombination and mutation events that lead to disease, and distinguish disease-causing events from normal variation. The project has two aims. First, it will provide aligned reads (in CRAM format) and variant calls (in VCF format) for the genomes through dbGaP/AnVIL. Phenotype information to accompany these genomes is available through the University of Utah. Second, the project will discover and characterize multiple forms of genomic variation using the four-generation Utah CEPH genome sequence resource. Using best-practice methods, we will discover single-nucleotide variants (SNVs), insertions/deletions (INDELs), short tandem repeats (STRs), and structural variants across all 903 CEPH genomes (603 from generations 1-3 and 300 from generation 4). The resulting variant calls will be annotated according to whether they appear to be de novo mutations, segregating variants, or false positives.
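The pedigree-based reasoning behind the “truth” set can be illustrated with a minimal Python sketch. It is not the project's pipeline: the `Candidate` record, the `classify` and `prob_untransmitted` functions, and the label strings are all hypothetical, and the sketch assumes error-free carrier calls and a heterozygous, non-mosaic germline variant that is transmitted to each child with probability 1/2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical record for a variant called in a generation-2 individual."""
    in_g1_parents: List[bool]   # carrier status of the two generation-1 parents
    in_g3_children: List[bool]  # carrier status of each generation-3 child


def classify(c: Candidate) -> str:
    """Label a candidate using only presence/absence across three generations."""
    if any(c.in_g1_parents):
        # Seen in a grandparent: inherited, so it is segregating in the family.
        return "segregating"
    if any(c.in_g3_children):
        # Absent from both parents but passed on: consistent with germline de novo.
        return "germline_de_novo"
    # Absent from parents and never transmitted: somatic change or calling error.
    return "somatic_or_false_positive"


def prob_untransmitted(n_children: int) -> float:
    """Chance that a true heterozygous germline variant reaches none of n children
    (assumes independent transmission with probability 1/2 per child)."""
    return 0.5 ** n_children


if __name__ == "__main__":
    cand = Candidate(in_g1_parents=[False, False],
                     in_g3_children=[True, False, True, False, False, False, False])
    print(classify(cand))
    print(prob_untransmitted(7))
```

Under these assumptions, a true germline variant would go untransmitted to all 7 children of the smallest sibship less than 1% of the time, which is why large sibships make the absence of transmission informative.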