# Antigen-independent prediction and biomarker identification of cancer-specific T cells

> **NIH R01** · UT SOUTHWESTERN MEDICAL CENTER · 2021 · $374,864

## Abstract

Cancer immunotherapy has achieved remarkable clinical success in treating late-stage tumors, yet the response
rates remain low and the side effects are often severe. Designing effective immunotherapies relies on accurate
identification of tumor-reactive T cells. This is an extremely difficult task because 1) most cancer
antigens are unknown; 2) the majority of tumor-infiltrating T cells (TILs) do not recognize cancer cells; and
3) without known antigens, the only approach to acquire such T cells is to perform ex vivo expansion of TILs
stimulated by autologous cancer cells, which generates non-specific T cells and is infeasible for many patients.
Nonetheless, this strategy is widely adopted in current clinical trials for anti-cancer treatment, despite its
reduced therapeutic efficacy and unpredictable side effects of autoimmunity. Therefore, unbiased, antigen-
independent identification of tumor-reactive T cells, if possible, will be a major clinical priority as it will
significantly increase the efficiency and safety of T cell-based immunotherapies. Here we propose to achieve
this goal through the development of novel machine learning methods. Such an approach has not yet been
explored because the fundamental difference between cancer and non-cancer T cells lies in their T cell receptor
(TCR) sequences, and training data of cancer-specific TCRs is currently unavailable. To prepare for this task,
we have developed the software TRUST to extract the T cell antigen-binding CDR3 regions from bulk tumor
RNA-seq data, and the software iSMART to group these CDR3s into antigen-specific clusters. These tools
allowed us to develop a new rationale for producing large training sets of tumor-reactive TCRs, even without
knowing cancer antigens. In our preliminary analysis, we observed that TCRs from the training data can be
matched to tumor antigens that bind to HLA-A*02:01 and elicit an immune response in vivo. The cancer-specific
CDR3 amino acid sequences also show significantly different biochemical features from non-cancer ones,
based on which we further developed the software DeepCAT to demonstrate the feasibility of de novo prediction of
cancer TCRs. These exciting results highlight the importance of developing better computational methods to
track tumor-reactive T cells for clinical applications. Accordingly, we propose the following Specific Aims: In
Aim 1, we will deliver a new machine learning method for accurate classification of tumor-reactive T cells using
the CDR3 sequences. In Aim 2, we will derive a set of biomarkers for the cancer-specific T cells for fast and
accurate flow sorting of these T cells from TILs. In Aim 3, we will perform single-cell sequencing and functional
validation of cancer-specific T cells using a humanized animal model to validate the predicted genes, and to
produce a prioritized list of promising targets for cancer diagnosis, prognosis and therapy development. These
Aims will be accomplished with the great...

## Key facts

- **NIH application ID:** 10248560
- **Project number:** 5R01CA245318-02
- **Recipient organization:** UT SOUTHWESTERN MEDICAL CENTER
- **Principal Investigator:** Bo Li
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $374,864
- **Award type:** 5
- **Project period:** 2020-09-01 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10248560

## Citation

> US National Institutes of Health, RePORTER application 10248560, Antigen-independent prediction and biomarker identification of cancer-specific T cells (5R01CA245318-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10248560. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
