# Learning a molecular shape space for the adaptive immune system

> **NIH R35** · UNIVERSITY OF WASHINGTON · 2024 · $367,139

## Abstract

Project Summary
 The adaptive immune system consists of highly diverse B- and T-cell receptors, which can recognize and
neutralize a multitude of diverse pathogens. Immune recognition relies on molecular interactions between
immune receptors and pathogens, which in turn are determined by the complementarity of their 3D structures and
amino acid compositions, i.e., their shapes. Immune shape space has been previously introduced as an
abstraction for such molecular recognition to explain how immune repertoires are organized to counter diverse
pathogens. However, the relationships between immune receptor sequence, shape, and specificity are very
difficult to quantify in practice. We propose to use recent advances in machine learning and the wealth of
molecular data to infer an effective shape space, grounded in the biophysics of protein interactions. The key is to
find a representation of proteins in general, and of immune receptors in particular, that reflects the relevant
biophysical properties that determine a protein receptor’s stability, function, and interaction with pathogens.
 Representation learning is a powerful technique in machine learning that uses large amounts of data to
infer a reduced representation. Since protein function is closely related to the 3D structure, we will develop novel
machine learning methods that use atomic coordinates of a protein structure as input and, through
transformations that respect the physical symmetries in the data, learn representations that reflect biophysical
properties of proteins and protein-protein interactions. We believe a key innovation in our approach is the
analysis of amino acid neighborhoods within 3D protein structures. The distribution of these neighborhoods will
reveal how they differ at the surface, in the bulk, and at functionally important regions such as catalytic sites.
The learned protein representation will enable us to characterize how specific compositions of amino acid
neighborhoods are the building blocks of protein structure and protein function. We will transfer the
representation of the protein universe to immune receptors to learn the immune shape space. The learned immune
shape space will enable us to address how affinity and specificity are encoded by immune receptors in different
cell types. We will study how the modular structure of immune receptors, with separate pathogen engaging and
framework regions, enables receptors to diversify and target a multitude of pathogens, without compromising
their stability. We will use the complementary aspect of shape recognition to predict the antigenic targets of the
immune receptors, and through collaborations, we will experimentally validate our predictions.
 Our approach opens a new path towards interpretable computational models of proteins and immune
receptors that describe how biological properties and biological function emerge from protein subunits.
Additionally, the inferred molecular representations can be used as a generative mode...
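
The abstract's idea of describing amino acid neighborhoods from atomic coordinates, in a way that respects physical symmetries, can be illustrated with a toy example. The sketch below is not the project's method: it swaps the learned, symmetry-respecting transformations for a hand-built histogram of neighbor distances, which is unchanged when the structure is rotated or translated. The function names (`neighborhood`, `invariant_descriptor`, `rigid_move`), the radius, the bin count, and the coordinates are all hypothetical choices made for illustration.

```python
import math

def neighborhood(coords, center_idx, radius):
    """Indices of atoms within `radius` of the center atom (center excluded)."""
    c = coords[center_idx]
    return [i for i, p in enumerate(coords)
            if i != center_idx and math.dist(p, c) <= radius]

def invariant_descriptor(coords, center_idx, radius, n_bins=4):
    """Histogram of center-to-neighbor distances.

    Distances do not change under rotation or translation, so the
    histogram is the same for any rigid placement of the structure.
    """
    c = coords[center_idx]
    hist = [0] * n_bins
    for i in neighborhood(coords, center_idx, radius):
        d = math.dist(coords[i], c)
        b = min(int(d / radius * n_bins), n_bins - 1)
        hist[b] += 1
    return hist

def rigid_move(coords, theta, shift):
    """Rotate all points about the z-axis by `theta`, then translate by `shift`."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    moved = []
    for x, y, z in coords:
        moved.append((x * cos_t - y * sin_t + shift[0],
                      x * sin_t + y * cos_t + shift[1],
                      z + shift[2]))
    return moved

# Made-up coordinates: atom 0 is the center, the last atom lies outside the radius.
toy = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.5, 0.0),
       (0.0, 0.0, 3.5), (6.0, 6.0, 6.0)]
moved = rigid_move(toy, theta=0.7, shift=(10.0, -3.0, 2.0))

print(invariant_descriptor(toy, 0, radius=4.0))
print(invariant_descriptor(moved, 0, radius=4.0))
```

Both calls print the same histogram, since only distances to the center enter the descriptor. A distance histogram discards angular information, which is one reason a learned representation of the kind the abstract proposes would be needed in practice.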

## Key facts

- **NIH application ID:** 10865002
- **Project number:** 5R35GM142795-04
- **Recipient organization:** UNIVERSITY OF WASHINGTON
- **Principal Investigator:** Armita Nourmohammad
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $367,139
- **Award type:** 5
- **Project period:** 2021-08-15 → 2026-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10865002

## Citation

> US National Institutes of Health, RePORTER application 10865002, Learning a molecular shape space for the adaptive immune system (5R35GM142795-04). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10865002. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*