Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying Partially Observable Environment
CoRR (2023)
Abstract
Optimal decision-making presents a significant challenge for autonomous
systems operating in uncertain, stochastic and time-varying environments.
Environmental variability over time can significantly impact the system's
optimal decision-making strategy for mission completion. To model such
environments, our work combines the previous notion of Time-Varying Markov
Decision Processes (TVMDP) with partial observability and introduces
Time-Varying Partially Observable Markov Decision Processes (TV-POMDP). We
propose a two-pronged approach to accurately estimate and plan within the
TV-POMDP: 1) Memory Prioritized State Estimation (MPSE), which leverages
weighted memory to provide more accurate time-varying transition estimates; and
2) an MPSE-integrated planning strategy that optimizes long-term rewards while
accounting for temporal constraints. We validate the proposed framework and
algorithms using simulations and hardware, with robots exploring a partially
observable, time-varying environment. Our results demonstrate superior
performance over standard methods, highlighting the framework's effectiveness
in stochastic, uncertain, time-varying domains.
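
The abstract does not say how MPSE weights its memory, so the following is only an illustrative sketch of the general idea of a weighted-memory transition estimate, not the authors' algorithm. It assumes an exponential recency weighting; the function name `weighted_transition_estimate`, the `decay` parameter, and the toy states and action are all hypothetical.

```python
from collections import defaultdict


def weighted_transition_estimate(memory, current_time, decay=0.9):
    """Estimate P(next_state | state, action) from remembered transitions.

    ASSUMPTION: exponential recency weighting is used purely for
    illustration; the source abstract does not specify the weighting.

    memory: iterable of (time, state, action, next_state) tuples.
    """
    weights = defaultdict(float)  # weight mass per (state, action, next_state)
    totals = defaultdict(float)   # weight mass per (state, action)
    for t, state, action, next_state in memory:
        w = decay ** (current_time - t)  # older observations count less
        weights[(state, action, next_state)] += w
        totals[(state, action)] += w
    # Normalize so the estimates for each (state, action) pair sum to 1.
    return {
        key: w / totals[(key[0], key[1])]
        for key, w in weights.items()
    }


# Toy memory: action "go" from "A" used to lead to "B", but recently leads to "C".
memory = [
    (0, "A", "go", "B"),
    (1, "A", "go", "B"),
    (2, "A", "go", "C"),
    (3, "A", "go", "C"),
]
estimate = weighted_transition_estimate(memory, current_time=3, decay=0.5)
print(estimate)
```

In this toy example an unweighted count would split the probability evenly between "B" and "C", whereas the recency-weighted estimate favors "C", which is the kind of behavior a time-varying transition model needs.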