Finding Eulerian Tours in Mazes Using a Memory-Augmented Fixed Policy Function

Lecture Notes in Networks and Systems (2023)

Abstract
This paper describes a simple memory-augmentation technique that employs tabular Q-learning to solve binary-cell-structured mazes whose exits are generated randomly at the start of each solution attempt. Standard tabular Q-learning can solve any maze with continuous learning; however, if learning is stopped and the policy is frozen, the agent cannot adapt to newly generated exits. To avoid using Recurrent Neural Networks (RNNs) for such memory-requiring tasks, we designed and implemented a simple external memory that records the agent's cell-visit history. This memory also expands the state representation, helping tabular Q-learning distinguish whether it is entering or exiting a maze corridor. Experiments on five maze problems of varying complexity are presented. The mazes have two or four predefined exits, one of which is randomly assigned at the start of each solution attempt. The results show that tabular Q-learning with a frozen policy can outperform standard deep-learning algorithms without incorporating RNNs into the model structure.
Keywords
mazes, Eulerian tours, memory-augmented
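A minimal sketch of the state-augmentation idea described in the abstract: tabular Q-learning over states that pair the agent's current cell with an external memory of visited cells, so a frozen policy can still distinguish entering from exiting a corridor. This is not the authors' implementation; the environment interface (reset()/step()), the frozenset encoding of the visit history, and all hyperparameters are illustrative assumptions.

```python
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def augmented_state(cell, visited):
    # Combine the current cell with a hashable summary of the visit history.
    # The history is the "external memory" that lets the policy tell apart
    # the forward and return passes through the same corridor cell.
    return (cell, frozenset(visited))

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning over memory-augmented states.

    `env` is an assumed maze environment with:
      reset() -> start cell (row, col)
      step(action) -> (next cell, reward, done)
    """
    Q = defaultdict(float)  # Q[(state, action_index)] -> value
    for _ in range(episodes):
        cell = env.reset()
        visited = {cell}                      # external memory for this attempt
        state = augmented_state(cell, visited)
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[(state, i)])
            cell, reward, done = env.step(ACTIONS[a])
            visited.add(cell)                 # update the visit-history memory
            next_state = augmented_state(cell, visited)
            best_next = max(Q[(next_state, i)] for i in range(len(ACTIONS)))
            # standard tabular Q-learning update on the augmented state
            Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
            state = next_state
    return Q
```

The frozenset of visited cells is a deliberately coarse stand-in for the paper's external memory; any hashable encoding of the visit history (e.g. a visit-count bitmask) would serve the same purpose of enlarging the state so the tabular policy is no longer ambiguous on revisited cells.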