# Song - Proj 3

> **NIH P20** · DARTMOUTH COLLEGE · 2024 · $238,197

## Abstract

PROJECT SUMMARY

T cells play important roles in our immune system by recognizing various antigens, including viruses, through
their diverse T-cell receptors (TCRs). The collective set of TCRs in a person is called the TCR repertoire. One
of the key steps in understanding the TCR repertoire is to identify the binding specificity of each TCR, which
may provide rich insights into the donor’s immune history and potential. However, only a limited number of
antigens and their cognate receptors can be profiled in an experiment for specificity discoveries. In the
research team’s previous work, they demonstrated the ability to extract microbiome and TCR repertoire
information from sequencing data. With these methods, each sample can provide a glimpse of microbiome and
TCR repertoire interactions, suggesting that it is possible to associate the TCRs with their binding targets by
inspecting a large volume of samples. Aim 1 will leverage the resources provided by the CQB cores to
generate essential datasets to investigate how well RNA-seq data can represent the microbiome and TCR
repertoire data. In order to efficiently process vast amounts of raw sequencing data sets, the team will develop
novel computational methods that can significantly reduce the computational overhead. Additionally, they will
extend these methods to work on a broader range of sequencing platforms, such as Oxford Nanopore long-read
data, to incorporate more samples in the study. Aim 2 will apply these methods to obtain the microbiome
and TCR repertoire data from publicly available RNA-seq samples, curate the resources into databases, and
develop computational and statistical tools to annotate the binding specificities of TCRs toward microbiomes.
The tools and resources generated from this project will be disseminated via open-source software and CQB
cores, as well as a user-friendly suite of packages that researchers can use to process RNA-seq samples and
annotate the specificities of TCRs found in their data. The specificity annotation method will enable biologists
to directly identify disease-related TCRs, and leverage the information to track the dynamics of the immune
system or develop TCR-based treatment strategies.

RELEVANCE

T cells can trigger immune reactions upon recognizing various antigens through their diverse TCRs. These
receptors exhibit high specificity to their binding targets and encode valuable health information, such as
bacterial or viral infection history. In this project, we propose an approach to predict the receptor’s binding
specificity by jointly exploring the microbiome and TCR repertoire information from RNA-seq samples. The
TCR specificity annotation procedure will be a valuable method in disease studies to discover crucial TCRs
triggering the immune response, and researchers can utilize the identified TCRs in designing T-cell-based
treatment.

## Key facts

- **NIH application ID:** 10852732
- **Project number:** 2P20GM130454-06
- **Recipient organization:** DARTMOUTH COLLEGE
- **Principal Investigator:** Li Song
- **Activity code:** P20
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $238,197
- **Award type:** 2
- **Project period:** 2019-08-01 → 2029-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10852732

## Citation

> US National Institutes of Health, RePORTER application 10852732, Song - Proj 3 (2P20GM130454-06). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10852732. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
