Autonomous Lane Merging: A Comparison Between Reinforcement Learning Algorithms

Document Type

Bachelor Thesis

License

CC-BY-NC-ND

Abstract

Despite advances in self-driving cars, autonomous on-ramp merging on highways still poses difficulties. To address this merging problem, a simulation was set up in the Unity game engine, and an agent was trained with two state-of-the-art reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), using the Unity Machine Learning Agents Toolkit (ML-Agents). The two algorithms are compared with respect to training speed, performance, stability, and success rate. The robustness of the algorithms was tested by having the traffic (1) vary in speed, (2) vary in starting positions, and (3) switch lanes. The agent performed similarly with either algorithm, reaching a success rate of 95% with both PPO and SAC. Each algorithm showed its own advantages and disadvantages: PPO had a more stable performance and less variability in mean reward, while SAC was more sample efficient. The results suggest that reinforcement learning is an avenue worth pursuing on the way to fully autonomous driving. The results could still be improved through hyperparameter tuning, a more complex neural network setup, and a more realistic simulation, which would further demonstrate the advantage of reinforcement learning.
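
As an illustration of the comparison criteria named above, success rate and variability in reward could be computed from per-episode evaluation logs roughly as follows. This is a minimal sketch and not the evaluation code used in the thesis: the function name `summarize`, the `(reward, merged)` record layout, and all episode values are hypothetical placeholders.

```python
from statistics import mean, pstdev


def summarize(episodes):
    """Summarize evaluation episodes for one algorithm.

    `episodes` is a list of (reward, merged) tuples, where `merged` is True
    when the agent completed the merge. This record layout is an assumption
    made for illustration only.
    """
    if not episodes:
        raise ValueError("at least one episode is required")
    rewards = [reward for reward, _ in episodes]
    merges = [merged for _, merged in episodes]
    return {
        # Fraction of episodes that ended in a successful merge.
        "success_rate": sum(merges) / len(episodes),
        # Average episode reward.
        "mean_reward": mean(rewards),
        # Population standard deviation of episode reward (lower = steadier).
        "reward_std": pstdev(rewards),
    }


# Made-up placeholder episodes, NOT results from the thesis.
ppo_episodes = [(1.0, True), (0.8, True), (-1.0, False), (0.9, True)]
sac_episodes = [(1.0, True), (0.2, True), (-1.0, False), (1.0, True)]

ppo_summary = summarize(ppo_episodes)
sac_summary = summarize(sac_episodes)

for name, summary in (("PPO", ppo_summary), ("SAC", sac_summary)):
    print(name, summary)
```

Comparing the two summaries side by side mirrors the abstract's criteria: `success_rate` for the merge outcome and `reward_std` as one possible proxy for stability. Sample efficiency would additionally require the number of environment steps used during training, which this sketch does not model.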

Keywords

reinforcement learning; Proximal Policy Optimization; Soft Actor-Critic; autonomous car; lane merging; Unity
