Structure Driven CD8+ T Cell Cross-Reactivity Predictions
Document Type: Master Thesis
License: CC-BY-NC-ND
Abstract
CD8+ T cell cross-reactivity, in which a single TCR recognizes multiple
peptide–MHC (pMHC) complexes, plays a key role in antiviral immunity but also
contributes to autoimmunity and off-target toxicity. Because TCR sequences are
rarely available and highly diverse, prediction methods that rely on TCR information
have limited applicability. In this thesis, we assess whether cross-reactivity can be
predicted using only peptide–MHC features. We compiled 6,834 peptide pairs from 33
experimental studies and extracted sequence-based, physicochemical, and structural
descriptors that characterize peptide similarity and pMHC presentation. Feature
analyses showed that sequence similarity and physicochemical properties at central,
TCR-facing residues are the strongest correlates of cross-reactivity. To account for
distinct biological modes of cross-reactivity, we developed an ensemble machine-learning
framework consisting of two models trained separately on peptide pairs differing
by one residue or by multiple residues. This approach improved generalization and
reduced dataset-driven biases, achieving an F1-score of 0.71 with high recall. These
results demonstrate that meaningful cross-reactivity signals are encoded at the pMHC
level and can be captured without TCR information, providing a scalable foundation
for applications in vaccine development, immunotherapy safety assessment, and
antigen discovery.
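
The two-model design described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the equal-length restriction, and the choice of a "central" window (dropping two residues at each end) are assumptions made only for this example, and the two models are placeholders for any trained classifiers.

```python
from typing import Callable

# A "model" here is any callable that scores a peptide pair (hypothetical interface).
PairModel = Callable[[str, str], object]


def hamming_distance(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes peptides of equal length")
    return sum(x != y for x, y in zip(a, b))


def central_identity(a: str, b: str, flank: int = 2) -> float:
    """Fraction of identical residues in the central window.

    The window drops `flank` residues at each end; the value 2 is an
    assumption for illustration, not a definition taken from the thesis.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes peptides of equal length")
    core_a = a[flank:len(a) - flank]
    core_b = b[flank:len(b) - flank]
    if not core_a:
        raise ValueError("peptide too short for the chosen flank")
    return sum(x == y for x, y in zip(core_a, core_b)) / len(core_a)


def route_pair(a: str, b: str, single_model: PairModel, multi_model: PairModel):
    """Send a pair to the single-mismatch or the multi-mismatch model."""
    d = hamming_distance(a, b)
    if d == 0:
        raise ValueError("identical peptides are not a cross-reactivity pair")
    model = single_model if d == 1 else multi_model
    return model(a, b)
```

For example, `route_pair("NLVPMVATV", "NLVAMVATV", m1, m2)` would call `m1`, because the two 9-mers differ at a single position, while a pair differing at several positions would be passed to `m2`.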