# Rapid response for pandemics: single cell sequencing and deep learning to predict antibody sequences against an emerging antigen

> **NIH R01** · KECK GRADUATE INST OF APPLIED LIFE SCIS · 2021 · $1,851,627

## Abstract

One of the “holy grails” in immunology is to be able to directly predict tight-binding variable chain antibody
sequences in silico against foreign or non-self “antigenic” proteins. Immunoglobulin chain rearrangement can
potentially encode approximately 10^16 different variants of antibody heavy and light chain sequences. However,
only a small fraction of the sequence space is generally accessed for evolving antibodies against foreign proteins.
The computational challenge is to go from a model of the structure of an antigen to predicting a set of antibody
chain sequences that can bind tightly to the antigen. If solved, it might be possible to move in less than 24 hours
from the first cryo-electron-microscopic structure of a novel viral protein to advance a set of potent antibody-like
molecular candidates for testing. Towards solving this problem, this project aims to develop a deep learning
architecture that will take as input thermodynamic, quantum mechanical (density functional), and local
structure-based network topographical features of the antigens and their cognate antibodies, and will output
their respective binding affinity constants.

We will design a generative adversarial network (GAN), which we think is uniquely suited for regression-based
ML approaches for the immune system, to discover associations between the epitope and the variable chain
features. This approach requires a large data stream of antigen and cognate antibody sequences, which until
recently was difficult to obtain. A recently described single B-cell receptor (BCR) specific tagging method coupled
with single cell deep sequencing (“linking B cell receptor to antigen specificity through sequencing” or
LIBRA-seq) can rapidly isolate and sequence the BCR variable chain coding regions that can bind with high
selectivity to antigenic epitopes.

Towards the specific project goals, in Task 1, LIBRA-seq will be used to rapidly identify and generate candidate
immunoglobulin coding sequences in response to specific linear and nonlinear epitopes (against controls),
chosen through computational/molecular modeling and prioritized with SARS-CoV-2 Spike protein epitopes (but
not restricted to these), injected into a mouse model, to generate large training sets; in Task 2, these training
sets, along with other data sets already available in public databases, will generate a series of structural features
(described above), which will be used to train the GAN; in Task 3, the predicted epitope-antibody interactions
will be validated by direct experiments with synthetic antibody and phage-display systems. Thus, the proposed
strategy applies foundational principles from evolutionary biology, genomics, structural chemistry, and computer
science to the solution of a general biological engineering problem.

Results from this project are expected to lay the foundations for a rigorously tested and fully automated
machine-learning system that could rapidly generate synthetic antibody candid...

## Key facts

- **NIH application ID:** 10274223
- **Project number:** 1R01AI169543-01
- **Recipient organization:** KECK GRADUATE INST OF APPLIED LIFE SCIS
- **Principal Investigator:** Jeniffer Bertha Hernandez
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $1,851,627
- **Award type:** 1
- **Project period:** 2021-09-16 → 2025-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10274223

## Citation

> US National Institutes of Health, RePORTER application 10274223, Rapid response for pandemics: single cell sequencing and deep learning to predict antibody sequences against an emerging antigen (1R01AI169543-01). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10274223. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
