Maximizing the Conditional Expected Reward for Reaching the Goal.

TACAS(2017)

Citations: 12 | Views: 43
Abstract
The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes, where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.
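
The quantity described in the abstract can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $\sigma$ is a scheduler, $s$ the initial state, $F$ the set of target states, $G$ the set of states in the reachability condition, $\mathrm{Rew}_F$ the reward accumulated until $F$ is first reached, and $\mathbf{1}_{\Diamond G}$ the indicator of paths that reach $G$. The first line is the standard definition of a conditional expectation; the second takes the supremum over schedulers for which the condition has positive probability (any further restrictions on the admissible schedulers are not stated in the abstract and are omitted here).

```latex
\[
  \mathbb{E}^{\sigma}_{s}\bigl(\mathrm{Rew}_F \,\big|\, \Diamond G\bigr)
  \;=\;
  \frac{\mathbb{E}^{\sigma}_{s}\bigl(\mathrm{Rew}_F \cdot \mathbf{1}_{\Diamond G}\bigr)}
       {\Pr^{\sigma}_{s}(\Diamond G)},
  \qquad \Pr^{\sigma}_{s}(\Diamond G) > 0,
\]
\[
  \mathbb{E}^{\max}_{s}\bigl(\mathrm{Rew}_F \,\big|\, \Diamond G\bigr)
  \;=\;
  \sup_{\sigma \,:\, \Pr^{\sigma}_{s}(\Diamond G) > 0}
  \mathbb{E}^{\sigma}_{s}\bigl(\mathrm{Rew}_F \,\big|\, \Diamond G\bigr).
\]
```

In this notation, the threshold problem of items (ii) and (iii) asks whether the second quantity exceeds a given rational bound.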