A Value Function Basis for Nexting and Multi-step Prediction

Andrew Jacobsen, Vincent Liu, Roshan Shariff, Adam White, Martha White

Semantic Scholar (2021)

Abstract

Humans and animals continuously make short-term cumulative predictions about their sensory-input stream, an ability referred to by psychologists as nexting. This ability has been recreated in a mobile robot using modern reinforcement learning approaches, but in practice there are limitations on how many predictions we can learn. In this paper, we investigate inferring new predictions from a minimal set of learned General Value Functions. We show that linearly weighting such a collection of value function predictions enables us to also make accurate multi-step predictions about future outcomes, and provide a closed-form solution to estimate this linear weighting. We also show that a similar approach can produce accurate estimates of value functions which we did not explicitly train to predict.
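
As a rough illustration of the linear-weighting idea, the sketch below assumes each learned General Value Function predicts a discounted sum of the same signal with its own discount factor, so that a weighted sum of their predictions is itself a weighted sum of future signal values. Under that assumption, weights for a new target (for example, a value function with a discount factor that was never trained) can be found by ridge-regularized least squares. The function names, the `ridge` parameter, and the least-squares formulation are assumptions made for illustration; the abstract does not state the paper's actual closed-form solution.

```python
import numpy as np


def discounted_basis(gammas, horizon):
    """Return an (m, horizon) array whose row i holds gamma_i**k for k = 0..horizon-1.

    Hypothetical helper: row i is the weight that a value function with
    discount gamma_i places on the signal k steps into the future.
    """
    k = np.arange(horizon)
    return np.asarray(gammas, dtype=float)[:, None] ** k[None, :]


def solve_weights(gammas, target, ridge=1e-8):
    """Find w so that sum_i w[i] * gamma_i**k approximates target[k].

    Assumed formulation (not taken from the paper): ridge-regularized
    least squares, solved in closed form via the normal equations.
    """
    target = np.asarray(target, dtype=float)
    basis = discounted_basis(gammas, len(target))        # shape (m, K)
    gram = basis @ basis.T + ridge * np.eye(len(gammas))  # shape (m, m)
    return np.linalg.solve(gram, basis @ target)
```

For instance, `solve_weights([0.0, 0.5, 0.9], 0.7 ** np.arange(50))` would return weights that approximate a discount of 0.7 from three learned discounts; how close the approximation is depends on the chosen set of discounts.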
