Enhancing Graph-Based Multi-Agent Reinforcement Learning for Strategic and Diverse Coordination
Document Type
Master Thesis
License
CC-BY-NC-ND
Abstract
Reinforcement Learning enables agents to learn complex behaviour
through interaction with and feedback from their environment. In cooperative
Multi-Agent Reinforcement Learning, multiple agents must learn to
coordinate effectively within a shared environment, aligning their strategies
and adapting to dynamic situations. Recently, Multi-Agent Reinforcement
Learning combined with Graph Neural Networks has demonstrated significant
potential in complex, cooperative environments by enabling agents to
learn from the relational structure between them. However, a common issue
is the emergence of homogeneous behaviour among agents. This limits
their ability to strategically coordinate and adapt to a dynamic environment.
This research investigates how diversity can be encouraged in graph-based
Multi-Agent Reinforcement Learning to improve strategic coordination. We
compare several architectural and training choices using diversity measures
and proxies. To limit the effects of homogeneity, we propose a decentralized
version of GAPPO (Graph-Attention Proximal Policy Optimisation). We
additionally contribute a novel perturbation mechanism, which perturbs the
attention mechanism in GAPPO, to encourage exploration and discover alternative
coordination strategies. The experiments are based on the Multi-Agent
Particle Environment's Simple Tag and Google Research Football's 3 vs 1
with keeper scenario. In Simple Tag, the perturbation mechanism seems to
destabilize already effective strategies, leading to reduced performance, likely
because Simple Tag is relatively simple to coordinate and naturally diverse,
which allows optimal strategies to be found more quickly. In
the more complex Google Research Football, attention perturbation results
in more diverse behavioural patterns, suggesting that the benefits of perturbation
are better leveraged in more complex coordination settings. Overall, these
results indicate that while perturbing attention can influence coordination
structure, its benefits are environment-dependent and require further investigation.
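
The abstract does not specify how the attention mechanism is perturbed. As a rough illustration of the general idea only, the sketch below adds Gaussian noise to one agent's attention logits before the softmax, so that the agent occasionally attends to neighbours it would otherwise down-weight. The function names, the `noise_std` parameter, the choice of Gaussian noise, and the example logits are all assumptions made for this sketch and should not be read as the mechanism used in the thesis.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def perturbed_attention(logits, noise_std=0.5, rng=None):
    """Return attention weights after adding Gaussian noise to the logits.

    Hypothetical illustration: `noise_std` and the Gaussian noise model are
    assumptions of this sketch, not details taken from the thesis.
    """
    rng = rng or random.Random()
    noisy_logits = [x + rng.gauss(0.0, noise_std) for x in logits]
    return softmax(noisy_logits)


# Attention logits of one agent over three neighbouring agents (made-up values).
logits = [2.0, 0.5, -1.0]
clean = softmax(logits)
noisy = perturbed_attention(logits, noise_std=0.5, rng=random.Random(0))
```

With `noise_std=0.0` the function reduces to the unperturbed softmax, which makes it easy to compare perturbed and unperturbed behaviour under otherwise identical settings.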
Keywords
Multi-Agent Reinforcement Learning, Graph-based, Coordination, Diversity