Immune repertoire sequencing: error correction, analysis, and visualization on the cloud

NIH RePORTER · NIH · R43 · $217,313

Abstract

Project Summary: Antibodies are vital molecules produced by the adaptive immune system and are a critical component in identifying foreign agents for removal from an organism. Produced by B cells, antibodies have an enormously diverse set of possible compositions, created by somatic recombination and hypermutation processes that are specific to these types of immune cells. Because of this diversity, studying antibodies for the purpose of antibody discovery or disease characterization is a difficult task. Recently, next-generation sequencing (NGS) technologies have been successfully applied to study the diverse repertoire of antibodies produced by B cells. This technology has proven invaluable for understanding this component of the immune response at a new level of detail. Unfortunately, NGS-produced sequences contain errors introduced during the sequencing process. These errors can be mistaken for true sequence diversity and can confound downstream analysis and interpretation. Furthermore, structuring such deeply sequenced antibody repertoire data to answer questions about the immune response is non-trivial and computationally intensive, problems that few labs are well equipped to address. Our proposal seeks to break down barriers to entry for these repertoire sequencing assays by providing innovative informatics approaches to error correction and analysis, delivering results in an interactive cloud platform. Our service will be the first to offer support for species other than human and mouse, as well as support for transgenic animals, both of which are critical for many drug discovery companies.

Key facts

NIH application ID: 10010744
Project number: 1R43GM137688-01
Recipient: DIGITAL PROTEOMICS LLC
Principal Investigator: Natalie Castellana
Activity code: R43
Funding institute: NIH
Fiscal year: 2020
Award amount: $217,313
Award type: 1
Project period: 2020-04-01 → 2021-05-10