SUMMARY: Recurrent pregnancy loss (RPL) occurs in approximately 5% of clinically recognized pregnancy losses. The etiology of RPL is not well characterized: after known etiologies are excluded, approximately half of women with RPL still have no identifiable cause. That RPL is, by definition, recurrent suggests a strong genetic component; however, the genomic contributions to RPL are currently very poorly understood. Previous studies have typically been deficient in design, limited by small sample sizes, incomplete clinical phenotyping and/or the recruitment of singletons only. In this proposal, we put forward our plan to recruit 1000 rigorously phenotyped RPL trios, including trios from diverse and underrepresented backgrounds across the US, and to apply whole-genome sequencing (WGS) and sophisticated variant detection and interpretation methods developed by our labs to identify pathogenic and likely pathogenic variants for RPL. We will then perform comprehensive integrative data analyses to define the genetic basis of unexplained RPL and to map the genes and chromosomal regions that are absolutely required for human development and a successful pregnancy. Our variant interpretation pipeline includes cutting-edge approaches to map likely pathogenic noncoding and structural variants, which are rarely assessed in pregnancy loss studies. We will also perform a pilot RNA-seq study to assess the utility of this approach for gene discovery in the pregnancy loss setting. We will first look for recessive pathogenic variation, including compound heterozygosity, and then test models of de novo mosaicism, mitochondrial mutations, regulatory noncoding variation and overall mutational burden. From these combined analyses, we expect to uncover many variants in genes and chromosomal regions that are intolerant of functional variation, a set we define as the human intolerome.
We will build on our previous studies to map the intolerome by combining i) available data from all clinical studies to define the genetic etiology of unexplained pregnancy loss, including data generated in this proposal and in our prior work, ii) network-based approaches to prioritize variants and genes important for human development and pregnancy, iii) mouse (KOMP, DMDD/MGI) and cell line knockout studies, iv) rare and common disease sequencing studies, including the Centers for Mendelian Genomics (CMG), the Centers for Common Disease Genomics (CCDG) and the Pediatric Cardiac Genomics Consortium (PCGC), v) emerging human pangenome studies (HPP), and vi) population-scale biobank projects such as UK Biobank and All of Us. We will then confirm these predictions via collaborator-led functional studies and retrospective analyses of RPL first losses, siblings and grandparents. The sharing of early, unpublished data from the Yale CMG and HPP, enabled by our leadership in these projects, is a significant strength of what will be by far the largest and most comprehensive study of RPL performed to date. Our findings will take great ...