Enhancing Graph-Based Multi-Agent Reinforcement Learning for Strategic and Diverse Coordination

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

Reinforcement Learning enables agents to learn complex behaviour through interaction with and feedback from their environment. In cooperative Multi-Agent Reinforcement Learning, multiple agents must learn to coordinate effectively within a shared environment, aligning their strategies and adapting to dynamic situations. Recently, Multi-Agent Reinforcement Learning combined with Graph Neural Networks has demonstrated significant potential in complex, cooperative environments by enabling agents to learn from the relational structure between them. However, a common issue is the emergence of homogeneous behaviour among agents, which limits their ability to coordinate strategically and adapt to a dynamic environment. This research investigates how diversity can be encouraged in graph-based Multi-Agent Reinforcement Learning to improve strategic coordination. We compare several architectural and training choices using diversity measures and proxies. To limit the effects of homogeneity, we propose a decentralized version of GAPPO (Graph-Attention Proximal Policy Optimisation). We additionally contribute a novel perturbation mechanism, which perturbs the attention mechanism in GAPPO to encourage exploration and the discovery of alternative coordination strategies. The experiments are based on Multiagent Particle Environment's Simple Tag and Google Research Football's 3 vs 1 with keeper scenario. In Simple Tag, the perturbation mechanism seems to destabilize already effective strategies, leading to reduced performance. This is likely because Simple Tag is naturally diverse and relatively simple to coordinate in, which allows optimal strategies to be found more quickly. In the more complex Google Research Football, attention perturbation results in more diverse behavioural patterns, suggesting that the benefits of perturbation are better leveraged in more complex coordination settings. These results suggest that while perturbing attention can influence coordination structure, its benefits are environment-dependent and require further investigation.

Keywords

Multi-Agent Reinforcement Learning, Graph-based, Coordination, Diversity
