Learning Algorithms for Verification of Markov Decision Processes
arxiv(2024)
摘要
We present a general framework for applying learning algorithms and
heuristical guidance to the verification of Markov decision processes (MDPs).
The primary goal of our techniques is to improve performance by avoiding an
exhaustive exploration of the state space, instead focussing on particularly
relevant areas of the system, guided by heuristics. Our work builds on the
previous results of Brázdil et al., significantly extending it as well as
refining several details and fixing errors.
The presented framework focuses on probabilistic reachability, which is a
core problem in verification, and is instantiated in two distinct scenarios.
The first assumes that full knowledge of the MDP is available, in particular
precise transition probabilities. It performs a heuristic-driven partial
exploration of the model, yielding precise lower and upper bounds on the
required probability. The second tackles the case where we may only sample the
MDP without knowing the exact transition dynamics. Here, we obtain
probabilistic guarantees, again in terms of both the lower and upper bounds,
which provides efficient stopping criteria for the approximation. In
particular, the latter is an extension of statistical model-checking (SMC) for
unbounded properties in MDPs. In contrast to other related approaches, we do
not restrict our attention to time-bounded (finite-horizon) or discounted
properties, nor assume any particular structural properties of the MDP.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要