Leveraging deep learning to learn optimal features of antibody recognition

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

AI-guided protein engineering is revolutionizing the development of vaccines and immune therapies. However, most vaccine development still relies on conventional methods that demand significant experimental effort and time. In this thesis, we present a pipeline that employs protein language models (PLMs), a recent class of deep neural networks, to optimize interactions between proteins and antibodies, enabling the precise proposal of mutations that modulate immunogenicity. The pipeline integrates experimental data on the SARS-CoV-2 receptor binding domain (RBD) obtained by yeast surface display. Using a labeling method, protein sequences were classified into two groups based on their antibody binding affinity: “binders” and “non-binders”. These two datasets were used to train multiple PLMs following two strategies: pre-training models from scratch on the experimental data, and fine-tuning previously pre-trained models on the same data. This design enabled us to investigate the impact of different architectures and data on the diversity and specificity of the proposed mutations. The results showed that each group of models (“binder” and “non-binder”) identified, at highly consistent positions and with similar frequencies, mutations that could modulate immunogenicity, although the proposed amino acids varied with the model architecture. In the next steps, these PLMs will be applied to the RBD sequences of different SARS-CoV-2 strains (Wuhan, Omicron, Delta, etc.) to identify specific mutations that optimize their interaction with antibodies. The suggested mutations will be evaluated computationally for structural and biophysical feasibility and then validated experimentally by ELISA, cross-reactivity, and neutralization tests to measure their effectiveness in modulating the immune response.
This AI-driven mutation pipeline could provide a powerful tool to strengthen the public health response to rapidly mutating pathogens, such as HIV or influenza, by accelerating the development of next-generation immunotherapies and improving their precision and efficacy.
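
The labeling step and the per-position comparison of proposed mutations described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the affinity cutoff, the function names, and the toy sequences are all hypothetical, and it assumes substitution-only proposals of the same length as the reference.

```python
from collections import Counter


def label_by_affinity(records, threshold):
    """Split (sequence, affinity) pairs into binders and non-binders.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    binders = [seq for seq, score in records if score >= threshold]
    non_binders = [seq for seq, score in records if score < threshold]
    return binders, non_binders


def mutated_position_frequencies(reference, proposals):
    """Fraction of proposed sequences that differ from `reference` at each
    1-based position. Assumes substitutions only (equal sequence lengths)."""
    counts = Counter()
    for seq in proposals:
        if len(seq) != len(reference):
            raise ValueError("sequences must match the reference length")
        for pos, (ref_aa, new_aa) in enumerate(zip(reference, seq), start=1):
            if ref_aa != new_aa:
                counts[pos] += 1
    total = len(proposals)
    if total == 0:
        return {}
    return {pos: n / total for pos, n in sorted(counts.items())}


# Toy data, made up for illustration (not real RBD measurements).
records = [("NITNLC", 0.9), ("NITALC", 0.2), ("NKTNLC", 0.7)]
binders, non_binders = label_by_affinity(records, threshold=0.5)

# Sequences proposed by a hypothetical model, compared against a reference.
proposed = ["NKTNLC", "NKTALC", "NITALC"]
frequencies = mutated_position_frequencies("NITNLC", proposed)
print(binders, non_binders, frequencies)
```

Computing such frequency profiles separately for "binder" and "non-binder" models would be one simple way to check whether the two groups propose mutations at the same positions.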

Keywords

Deep learning, PLMs, Vaccines, Immunology
