An ethnically diverse genomic reference resource for the human heavy and light chain immunoglobulin loci

NIH RePORTER · NIH · R24 · $386,405

Abstract

There is a fundamental gap in our understanding of how germline variation in immunoglobulin (IG) heavy (IGH) and light chain (IGK; IGL) loci in the human population impacts the development of the functional antibody (Ab) response in health and disease. However, there is a growing appreciation that IG polymorphism contributes to variability in the Ab repertoire, indicating that the integration of IG genetic data has the potential to inform our understanding of Ab function in various clinical contexts. A critical barrier to progress has been that existing genomic resources for IG loci are lacking and poorly represent diversity found across human populations. IG regions are structurally complex, consisting of large segmental duplications, and are among the most polymorphic in the genome, with large copy number variants (CNVs), elevated nucleotide diversity, and population-specific haplotype variants. These complexities have long made IG loci difficult to study at the genomic and population levels using standard high-throughput methods, with direct negative impacts on genetic disease association studies and, more recently, on the analysis of expressed Ab repertoire data. As a result, our knowledge of human IG germline diversity (particularly in non-Caucasians) and its contribution to disease lags far behind that of other well-studied immune loci. This highlights a direct need for publicly available, well-characterized IG haplotype references and accurate variant catalogues from diverse ethnic backgrounds to facilitate the design and integration of more accurate genotyping tools, analysis pipelines, and their interpretation. To meet this need, we have developed several robust approaches, which we will utilize here to establish critical community resources for the IG loci. We will first enumerate up to 16 novel IGH/K/L haplotype reference assemblies from an existing set of 8 fosmid libraries from individuals of African, Asian, and European descent.
We will also use a novel multi-haplotype-informed genotyping pipeline to profile IGH/K/L genetic variation in a cohort of 180 familial and unrelated individuals from these same three populations. This will represent the most comprehensive population survey of IG germline diversity, including descriptions of variable, diversity, joining, and constant gene variation, and locus-wide single nucleotide polymorphisms (SNPs) and CNVs, allowing for fine-scale assessment of variant imputation panels for disease association studies. Finally, to facilitate the utility of these data as long-term resources, all sequences, tools/methods, and analysis pipelines will be made publicly available. We will work with established databases to ensure all sequences are deposited in both raw and annotated form. This will include the integration of assemblies into future releases of the human genome reference for use by the genomics community, as well as updates to existing germline gene/allele databases critic...

Key facts

NIH application ID: 9955200
Project number: 5R24AI138963-03
Recipient: UNIVERSITY OF LOUISVILLE
Principal Investigator: Melissa Laird Smith
Activity code: R24
Funding institute: NIH
Fiscal year: 2020
Award amount: $386,405
Award type: 5
Project period: 2018-07-23 → 2022-06-30