Structure Driven CD8+ T Cell Cross-Reactivity Predictions

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

CD8+ T cell cross-reactivity, in which a single TCR recognizes multiple peptide–MHC (pMHC) complexes, plays a key role in antiviral immunity but also contributes to autoimmunity and off-target toxicity. Because TCR sequences are highly diverse and rarely available, prediction methods that rely on TCR information have limited applicability. In this thesis, we assess whether cross-reactivity can be predicted using only peptide–MHC features. We compiled 6,834 peptide pairs from 33 experimental studies and extracted sequence-based, physicochemical, and structural descriptors that characterize peptide similarity and pMHC presentation. Feature analyses showed that sequence similarity and physicochemical properties at central, TCR-facing residues are the strongest correlates of cross-reactivity. To account for distinct biological modes of cross-reactivity, we developed an ensemble machine-learning framework consisting of two models trained separately on peptide pairs differing by one residue or by multiple residues. This approach improved generalization and reduced dataset-driven biases, achieving an F1-score of 0.71 with high recall. These results demonstrate that meaningful cross-reactivity signals are encoded at the pMHC level and can be captured without TCR information, providing a scalable foundation for applications in vaccine development, immunotherapy safety assessment, and antigen discovery.
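
The two-model ensemble described in the abstract can be pictured as a simple router: count how many positions differ between the two peptides, then send the pair to the model trained on that regime. The following is a minimal sketch under stated assumptions, not the thesis implementation. The function names, the `PairModel` signature, the handling of unequal peptide lengths, and the constant-score placeholder models are all hypothetical, and the peptide strings are illustrative inputs rather than entries from the thesis dataset. The `f1_score` helper shows how the reported metric is defined for binary labels.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical signature for a trained pair classifier:
# (peptide_a, peptide_b) -> cross-reactivity score in [0, 1].
PairModel = Callable[[str, str], float]


def count_mismatches(a: str, b: str) -> int:
    """Count differing positions; a length difference adds to the count (assumption)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def predict_pair(
    a: str, b: str, single_model: PairModel, multi_model: PairModel
) -> Tuple[str, float]:
    """Route a peptide pair to the model that matches its mismatch count."""
    if count_mismatches(a, b) <= 1:
        return "single", single_model(a, b)
    return "multi", multi_model(a, b)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Placeholder models with constant scores; real models would be trained classifiers.
def toy_single(a: str, b: str) -> float:
    return 0.8


def toy_multi(a: str, b: str) -> float:
    return 0.3


if __name__ == "__main__":
    # One substitution: handled by the single-residue model.
    print(predict_pair("GILGFVFTL", "GILGAVFTL", toy_single, toy_multi))
    # Three substitutions: handled by the multi-residue model.
    print(predict_pair("GILGFVFTL", "GLLGAVFTV", toy_single, toy_multi))
```

Keeping the routing rule separate from the models means each regime can use its own features and decision threshold, which is one plausible way such a split could reduce bias from a dataset dominated by single-substitution pairs.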

Keywords

Citation