PROJECT SUMMARY
T cells play important roles in the immune system by recognizing various antigens, including viruses, through their diverse T-cell receptors (TCRs). The collective set of TCRs in a person is called the TCR repertoire. A key step in understanding the TCR repertoire is identifying the binding specificity of each TCR, which can provide rich insights into the donor’s immune history and potential. However, only a limited number of antigens and their cognate receptors can be profiled for specificity in a single experiment. In previous work, the research team demonstrated the ability to extract microbiome and TCR repertoire information from sequencing data. With these methods, each sample provides a glimpse of the interactions between the microbiome and the TCR repertoire, suggesting that TCRs can be associated with their binding targets by inspecting a large volume of samples. Aim 1 will leverage the resources provided by the CQB cores to generate essential datasets for investigating how well RNA-seq data represent the microbiome and the TCR repertoire. To process vast amounts of raw sequencing data efficiently, the team will develop novel computational methods that significantly reduce the computational overhead. Additionally, they will extend these methods to a broader range of sequencing platforms, such as Oxford Nanopore long-read data, to incorporate more samples in the study. Aim 2 will apply these methods to obtain microbiome and TCR repertoire data from publicly available RNA-seq samples, curate the resources into databases, and develop computational and statistical tools to annotate the binding specificities of TCRs toward the microbiome.
The tools and resources generated by this project will be disseminated via open-source software and the CQB cores, including a user-friendly suite of packages that researchers can use to process RNA-seq samples and annotate the specificities of TCRs found in their data. The specificity annotation method will enable biologists to directly identify disease-related TCRs and to use that information to track the dynamics of the immune system or develop TCR-based treatment strategies.

RELEVANCE
T cells can trigger immune reactions upon recognizing various antigens through their diverse TCRs. These receptors exhibit high specificity to their binding targets and encode valuable health information, such as a history of bacterial or viral infection. In this project, we propose an approach to predict a receptor’s binding specificity by jointly exploring the microbiome and TCR repertoire information in RNA-seq samples. The TCR specificity annotation procedure will be a valuable method in disease studies for discovering the crucial TCRs that trigger the immune response, and researchers can use the identified TCRs in designing T-cell-based treatments.