AI Transfer of the MHC-I Immunopeptidome from Human to Dog

NIH RePORTER · NIH · R01 · $158,559

Abstract

This application is being submitted in response to the Notice of Special Interest (NOSI) NOT-CA-24-029, as a supplement to our parent project R01CA252713, entitled “Canine MHC-I genotyping and tumor specific neoantigen determination”. The supplement aims to use artificial intelligence (AI), specifically a recently emerged large language model, to identify human and canine major histocompatibility complex class I (MHC-I) alleles that are peptide-presentation equivalents, and then to transfer the >700,000 human immunopeptidome data points to the corresponding dog MHC-I alleles. To achieve this, we will expand our collaboration to include Dr. Dajiang Zhu, a leading scientist in applying AI to biomedical research, and perform the following supplemental studies. 1) We will use advanced statistical models to group the ~170 human MHC-I alleles that have immunopeptidome data into supertypes based on their peptide-binding specificity. 2) We will train a locally hosted Llama model, a cutting-edge open-source large language model from Meta, to discover an MHC-I pseudosequence that can distinguish these supertypes. 3) We will use the MHC-I pseudosequence to identify equivalent human and canine alleles, and transfer the human immunopeptidome dataset to each corresponding canine allele. 4) We will validate our models by using our newly established mass spectrometry protocol to characterize the immunopeptidome of selected canine alleles. The proposed supplement will significantly enhance Aims 2a and 2b of the parent R01 by increasing the number of experimentally characterized canine alleles from 30 to up to 200 (Aim 2a) and the number of allele-specific antigen models from 30 to up to 200 (Aim 2b), accelerating canine tumor-specific neoantigen discovery.
Importantly, our study will provide a powerful and innovative strategy by which many other mammalian models with little empirical immunopeptidome data can effectively borrow the human data, potentially saving millions of dollars and enhancing their translational value, which is the goal of the OMF.
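
Step 3 of the abstract (matching canine alleles to human alleles by pseudosequence and transferring peptide data) can be illustrated with a minimal sketch. This is not the project's method: the abstract describes a pseudosequence learned with a Llama model, whereas the sketch substitutes plain per-position identity as a stand-in similarity measure. The function names, the `min_identity` threshold, and all allele names, pseudosequences, and peptides are hypothetical placeholders.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length pseudosequences match."""
    if len(a) != len(b):
        raise ValueError("pseudosequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def transfer_peptides(human_pseudo, dog_pseudo, human_peptides, min_identity=0.9):
    """For each dog allele, find the closest human allele and copy its peptides.

    Dog alleles whose best match falls below min_identity (an assumed cutoff)
    receive no transferred data.
    """
    transferred = {}
    for dog_allele, dog_seq in dog_pseudo.items():
        # Closest human allele under the stand-in identity measure.
        best = max(human_pseudo, key=lambda h: identity(human_pseudo[h], dog_seq))
        score = identity(human_pseudo[best], dog_seq)
        if score >= min_identity:
            transferred[dog_allele] = (best, score, list(human_peptides[best]))
    return transferred


# Toy placeholder data: not real alleles, pseudosequences, or peptides.
human_pseudo = {"human_A": "YFAMYGEKVA", "human_B": "YYSEYRNICT"}
dog_pseudo = {"dog_X": "YFAMYGEKVA", "dog_Y": "HHHHHHHHHH"}
human_peptides = {"human_A": ["PEPTIDEAA", "PEPTIDEBB"], "human_B": ["PEPTIDECC"]}

result = transfer_peptides(human_pseudo, dog_pseudo, human_peptides)
```

In this toy run, `dog_X` matches `human_A` exactly and inherits its peptide list, while `dog_Y` has no sufficiently similar human allele and is left without transferred data. A real pipeline would presumably need the mass spectrometry validation described in step 4 before trusting any such transfer.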

Key facts

NIH application ID: 11074925
Project number: 3R01CA252713-04S1
Recipient: UNIVERSITY OF GEORGIA
Principal Investigator: William Hildebrand
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $158,559
Award type: 3
Project period: 2021-05-11 → 2026-04-30