Leveraging the Transferability Of Structural Graph Features for GNN Pre-training

Publication date

DOI

Document Type

Master Thesis

Collections

Open Access

License

CC-BY-NC-ND

Abstract

Graph Neural Networks (GNNs) have shown remarkable success in modeling relational data across various domains. However, training a GNN from scratch for each new task or dataset is computationally expensive and typically requires large amounts of labeled data, which are often scarce. This thesis explores strategies for pre-training GNNs, focusing on two questions: how to use common topological features during pre-training to enhance transferability, and how to leverage new, previously unseen features in the downstream task to improve a model’s performance. The central hypothesis is that common and easily obtainable topological features, such as degree, PageRank, eigenvector centrality, and clustering coefficients, can be used to build generalizable latent representations, which can then be combined with the new features available in the downstream task. We investigate methods for encoding these common features during pre-training and for combining them with downstream features, with the aim of improving performance on downstream tasks in domains where data is limited. The work proposes two frameworks for topology-based pre-training and evaluates how effective the pre-training is on the downstream task. Our findings demonstrate that using topological graph features in the pre-training process increases a model’s performance on the downstream task. Moreover, our experiments show that adding topological features to a model greatly increases its performance.
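To make the four topological features named in the abstract concrete, the following is a minimal sketch of how they could be computed per node for a small undirected graph. It is not the thesis's implementation: the function name `topological_features`, the edge-list input format, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration. In practice a graph library such as NetworkX would normally be used instead of hand-written power iterations.

```python
def topological_features(edges, num_nodes, damping=0.85, iters=100):
    """Per-node [degree, PageRank, eigenvector centrality, clustering coeff.]
    for an undirected, unweighted graph given as an edge list.
    Hypothetical helper for illustration only."""
    # Adjacency sets; self-loops and duplicate edges are ignored.
    adj = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = [len(nbrs) for nbrs in adj]

    # PageRank by power iteration; isolated nodes spread their mass uniformly.
    pr = [1.0 / num_nodes] * num_nodes
    for _ in range(iters):
        dangling = sum(pr[u] for u in range(num_nodes) if degree[u] == 0)
        base = (1.0 - damping) / num_nodes + damping * dangling / num_nodes
        pr = [base + damping * sum(pr[v] / degree[v] for v in adj[u])
              for u in range(num_nodes)]

    # Eigenvector centrality by power iteration on (A + I), L2-normalised.
    # Adding the identity keeps the same leading eigenvector but avoids
    # oscillation on bipartite graphs.
    ec = [1.0] * num_nodes
    for _ in range(iters):
        nxt = [ec[u] + sum(ec[v] for v in adj[u]) for u in range(num_nodes)]
        norm = sum(x * x for x in nxt) ** 0.5 or 1.0
        ec = [x / norm for x in nxt]

    # Local clustering coefficient: fraction of neighbour pairs that are linked.
    clustering = []
    for u in range(num_nodes):
        k = degree[u]
        if k < 2:
            clustering.append(0.0)
            continue
        links = sum(1 for v in adj[u] for w in adj[u] if v < w and w in adj[v])
        clustering.append(2.0 * links / (k * (k - 1)))

    return [[float(degree[u]), pr[u], ec[u], clustering[u]]
            for u in range(num_nodes)]


# Example graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
features = topological_features([(0, 1), (1, 2), (0, 2), (2, 3)], num_nodes=4)
```

Each row of `features` is a four-dimensional vector that depends only on graph structure, which is what allows the same feature set to be computed on any graph regardless of domain. How the thesis encodes these values before feeding them to a GNN is not stated in the abstract, so no encoding step is shown here.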

Keywords

GNN; pre-training; graph; graphs; Graph Neural Network; topology; topological features; AI; Artificial Intelligence; degree; PageRank; centrality; clustering coefficients; latent representations; transferability; downstream tasks; limited labeled data; model performance; graph-based learning; feature encoding; domain adaptation

Citation