# An ethnically diverse genomic reference resource for the human heavy and light chain immunoglobulin loci

> **NIH R24** · UNIVERSITY OF LOUISVILLE · 2020 · $386,405

## Abstract

There is a fundamental gap in our understanding of how germline variation in immunoglobulin (IG) heavy (IGH)
and light chain (IGK; IGL) loci in the human population impacts the development of the functional antibody (Ab)
response in health and disease. However, there is a growing appreciation that IG polymorphism contributes to
variability in the Ab repertoire, indicating that the integration of IG genetic data has the potential to inform our
understanding of Ab function in various clinical contexts. A critical barrier to progress has been that existing
genomic resources for IG loci are lacking and poorly represent diversity found across human populations. IG
regions are structurally complex, consisting of large segmental duplications, and are among the most
polymorphic in the genome, with large copy number variants (CNVs), elevated nucleotide diversity, and
population-specific haplotype variants. These complexities have long made IG loci difficult to study at the
genomic and population level using standard high-throughput methods, with direct negative impacts on genetic
disease association studies and more recently the analysis of expressed Ab repertoire data. As a result, our
knowledge of human IG germline diversity (particularly in non-Caucasians) and its contribution to disease lags
far behind that of other well-studied immune loci. This highlights a direct need for publicly available,
well-characterized IG haplotype references and accurate variant catalogues from diverse ethnic backgrounds to
facilitate the design and integration of more accurate genotyping tools and analysis pipelines, and the interpretation of their results.
To meet this need, we have developed several robust approaches, which we will utilize here to establish critical
community resources for the IG loci. We will first enumerate up to 16 novel IGH/K/L haplotype reference
assemblies from an existing set of 8 fosmid libraries from individuals of African, Asian, and European descent.
We will also use a novel multi-haplotype informed genotyping pipeline to profile IGH/K/L genetic variation in a
cohort of 180 familial and unrelated individuals from these same three populations. This will represent the most
comprehensive population survey of IG germline diversity, including descriptions of variable, diversity, joining,
and constant gene variation, and locus-wide single nucleotide polymorphisms (SNPs) and CNVs, allowing for
fine-scale assessment of variant imputation panels for disease association studies. Finally, to facilitate the
utility of these data as long-term resources, all sequences, tools/methods, and analysis pipelines will be made
publicly available. We will work with established databases to ensure all sequences are deposited in both raw
and annotated form. This will include the integration of assemblies into future releases of the human genome
reference for use by the genomics community, as well as updates to existing germline gene/allele databases
critic...

## Key facts

- **NIH application ID:** 9955200
- **Project number:** 5R24AI138963-03
- **Recipient organization:** UNIVERSITY OF LOUISVILLE
- **Principal Investigator:** Melissa Laird Smith
- **Activity code:** R24
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $386,405
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2018-07-23 → 2022-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9955200

## Citation

> US National Institutes of Health, RePORTER application 9955200, An ethnically diverse genomic reference resource for the human heavy and light chain immunoglobulin loci (5R24AI138963-03). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9955200. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
