A Value Function Basis for Nexting and Multi-step Prediction

Semantic Scholar (2021)

Abstract
Humans and animals continuously make short-term cumulative predictions about their sensory-input stream, an ability referred to by psychologists as nexting. This ability has been recreated in a mobile robot using modern reinforcement learning approaches, but in practice there are limitations on how many predictions we can learn. In this paper, we investigate inferring new predictions from a minimal set of learned General Value Functions. We show that linearly weighting such a collection of value function predictions enables us to also make accurate multi-step predictions about future outcomes, and provide a closed-form solution to estimate this linear weighting. We also show that a similar approach can produce accurate estimates of value functions which we did not explicitly train to predict.
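As a rough illustration of the linear-weighting idea, the sketch below fits weights that map a few discounted-sum predictions onto a k-step-ahead target. It makes several assumptions that are not from the paper: the value functions are replaced by exactly computed, truncated discounted sums of a synthetic sinusoidal signal rather than learned General Value Functions, the weights come from ordinary least squares (which may differ from the paper's closed-form solution), and the names `discounted_returns` and `fit_k_step_weights` are hypothetical.

```python
import numpy as np


def discounted_returns(x, gamma, horizon):
    """Truncated discounted sum of future signal values for every valid t.

    Stand-in for a learned value function (assumption: computed exactly here).
    Entry t equals sum_{j=0}^{horizon-1} gamma**j * x[t + 1 + j].
    """
    n = len(x) - horizon  # time steps that have a full look-ahead window
    powers = gamma ** np.arange(horizon)
    # Row t holds x[t+1], ..., x[t+horizon].
    windows = np.stack([x[t + 1 : t + 1 + horizon] for t in range(n)])
    return windows @ powers


def fit_k_step_weights(x, gammas, k, horizon):
    """Least-squares weights mapping the value-function basis to x[t + k].

    Requires 1 <= k <= horizon so every target lies inside the signal.
    """
    # One column per discount factor: the value-function basis.
    G = np.column_stack([discounted_returns(x, g, horizon) for g in gammas])
    # Target for row t is the signal value k steps ahead.
    y = x[k : k + G.shape[0]]
    # Closed-form least-squares solution (illustrative, not the paper's formula).
    w, *_ = np.linalg.lstsq(G, y, rcond=None)
    return w, G, y


# Synthetic signal (assumption): a single sinusoid.
t = np.arange(500)
x = np.sin(0.3 * t)

w, G, y = fit_k_step_weights(x, gammas=[0.5, 0.9], k=5, horizon=50)
max_err = np.max(np.abs(G @ w - y))
print("weights:", w)
print("max abs error of 5-step prediction:", max_err)
```

For a single sinusoid, two discounted sums with distinct discount factors already span the signal's two-dimensional space, so the 5-step prediction is recovered almost exactly. Real sensor streams and learned value functions would generally need a larger basis and would leave some residual error.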