Finding Eulerian Tours in Mazes Using a Memory-Augmented Fixed Policy Function

Lecture Notes in Networks and Systems (2023)

Abstract
This paper describes a simple memory-augmentation technique that employs tabular Q-learning to solve binary-cell-structured mazes whose exits are generated randomly at the start of each solution attempt. Standard tabular Q-learning can solve any maze with continuous learning; however, if learning is stopped and the policy is frozen, the agent cannot adapt to newly generated exits. To avoid using Recurrent Neural Networks (RNNs) for such memory-requiring tasks, we designed and implemented a simple external memory that records the agent's cell-visit history. This memory also expands the state representation, helping tabular Q-learning distinguish whether it is entering or exiting a maze corridor. Experiments on five maze problems of varying complexity are presented. The mazes have two or four predefined exits, one of which is randomly assigned at the start of each solution attempt. The results show that tabular Q-learning with a frozen policy can outperform standard deep-learning algorithms without incorporating RNNs into the model structure.
Keywords
mazes, Eulerian tours, memory-augmented
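A minimal sketch of the state-augmentation idea described in the abstract: tabular Q-learning over states that pair the agent's current cell with an external memory of visited cells, so a frozen policy can still distinguish entering from exiting a corridor. This is not the authors' implementation; the environment interface (reset()/step()), the frozenset encoding of the visit history, and all hyperparameters are illustrative assumptions.

```python
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def augmented_state(cell, visited):
    # Combine the current cell with a hashable summary of the visit history.
    # The history is the "external memory" that lets the policy tell apart
    # the forward and return passes through the same corridor cell.
    return (cell, frozenset(visited))

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning over memory-augmented states.

    `env` is an assumed maze environment with:
      reset() -> start cell (row, col)
      step(action) -> (next cell, reward, done)
    """
    Q = defaultdict(float)  # Q[(state, action_index)] -> value
    for _ in range(episodes):
        cell = env.reset()
        visited = {cell}                      # external memory for this attempt
        state = augmented_state(cell, visited)
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[(state, i)])
            cell, reward, done = env.step(ACTIONS[a])
            visited.add(cell)                 # update the visit-history memory
            next_state = augmented_state(cell, visited)
            best_next = max(Q[(next_state, i)] for i in range(len(ACTIONS)))
            # standard tabular Q-learning update on the augmented state
            Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
            state = next_state
    return Q
```

The frozenset of visited cells is a deliberately coarse stand-in for the paper's external memory; any hashable encoding of the visit history (e.g. a visit-count bitmask) would serve the same purpose of enlarging the state so the tabular policy is no longer ambiguous on revisited cells.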