# CAREER: Structure-Aware Protein Language Models for Decoding Immune Recognition

> **NSF 01002627DB NSF RESEARCH & RELATED ACTIVIT** · Icahn School of Medicine at Mount Sinai (NY) · $522,190

## Abstract

The immune system protects the body by recognizing abnormal cells and infectious threats, yet the molecular rules that allow immune cells to distinguish dangerous targets from healthy cells remain poorly understood. This project will develop new computational approaches to better understand immune recognition, a fundamental problem in biology with broad relevance to health, cancer immunotherapy, autoimmunity, and future biomedical discovery. The project will also advance computing by developing new methods to model biological systems using structure-aware protein language models. In addition to the research activities, the project will support a month-long summer program in New York City that introduces high school students to immunology, data science, and machine learning through hands-on projects and mentorship. Openly shared software, educational materials, and datasets will help broaden access to computational biology and strengthen the future scientific workforce.

This project will develop structure-aware protein language models to decode how T cell receptors recognize peptide antigens presented by major histocompatibility complex molecules. The research will generate high-confidence three-dimensional models of paired T cell receptors, represent local structural environments as symbolic tokens, and integrate sequence, structure, and spatial information in a transformer architecture trained with masked multimodal prediction and contrastive learning. The resulting repre
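
As an illustration of the contrastive-learning component mentioned above, the following is a minimal sketch of an InfoNCE-style loss that pulls paired sequence and structure embeddings together. The abstract does not specify the objective, so the choice of InfoNCE, the function name `info_nce`, the `temperature` value, and the toy embeddings are all assumptions made for illustration, not details of the funded work.

```python
import math


def info_nce(seq_embs, struct_embs, temperature=0.1):
    """Mean InfoNCE loss; row i of seq_embs is the positive pair of row i of struct_embs.

    Hypothetical helper: the award abstract does not name a specific loss.
    """
    def normalize(v):
        # Scale to unit length so dot products are cosine similarities.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in seq_embs]
    b = [normalize(v) for v in struct_embs]

    losses = []
    for i, ai in enumerate(a):
        # Similarity of sequence embedding i to every structure embedding.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the target.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (made up): matched pairs should score a lower loss
# than the same embeddings with the pairing swapped.
seq = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(seq, matched)
loss_swapped = info_nce(seq, swapped)
```

In this sketch the loss is small when each sequence embedding is most similar to its own structure embedding and grows when the pairing is wrong, which is the behavior a contrastive objective is meant to encourage.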

## Key facts

- **NSF award ID:** 2542232
- **Awardee organization:** Icahn School of Medicine at Mount Sinai (NY)
- **SAM.gov UEI:** C8H9CNG1VBD9
- **PI:** Diego Chowell
- **Primary program:** 01002627DB NSF RESEARCH & RELATED ACTIVIT
- **All programs:** Artificial Intelligence (AI), CAREER-Faculty Erly Career Dev, COMPUTATIONAL BIOLOGY, Biotechnology
- **Estimated total:** $522,190
- **Funds obligated:** $204,412
- **Transaction type:** Continuing Grant
- **Period:** 07/01/2026 → 06/30/2031

## Primary source

NSF Award Search: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2542232

## Citation

> US National Science Foundation, Award 2542232, CAREER: Structure-Aware Protein Language Models for Decoding Immune Recognition. Retrieved via AI Analytics 2026-06-08 from https://api.ai-analytics.org/grant/nsf/2542232. Licensed CC0.

---

*[NSF Awards dataset](/datasets/nsf-awards) · CC0 1.0*
