GUBS criterion: Arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on Expected Utility Theory.

Artif. Intell.(2023)

引用 2|浏览8
暂无评分
摘要
Stochastic Shortest Path MDPs (SSP-MDPs) are used to model probabilistic sequential decision problems where the objective is to minimize the expected accumulated cost to goal. However, in the presence of dead-ends, the conventional criterion for SSP-MDPs, which minimizes the expected accumulated cost, can become ill-defined. Lexicographic criteria can solve this by preferring policies that reach the goal with the highest possible probability. Other criteria can instead make a trade-off between some cost measure and probability-to-goal. However, both of these approaches can lead to policies that might not represent the choice of a real decision-maker. In this work, we propose the GUBS criterion to address these problems. GUBS combines goal prioritization over histories with Expected Utility Theory and is the only criterion between all criteria analyzed that not only allows for a trade-off between a large accumulated cost and a small loss in probability-to-goal, but also guarantees arbitrary trade-offs that can be tuned from its parameters without previous knowledge of the problem being solved. We also propose eGUBS, which is a particular case of GUBS when the exponential utility function is used, and two algorithms for optimally solving these problems: eGUBS-VI, a VI-based algorithm; and eGUBS-AO*, a heuristic search algorithm. Results indicate that, when there is a good heuristic function available or when the state space is too large, eGUBS-AO* can perform better than eGUBS-VI by doing an efficient search. In other cases, eGUBS-VI's simpler approach might have better results.
更多
查看译文
关键词
Stochastic Shortest Path, Sequential decision making, Probabilistic planning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要