Extensive-Form Game Solving via Blackwell Approachability on Treeplexes
CoRR(2024)
摘要
In this paper, we introduce the first algorithmic framework for Blackwell
approachability on the sequence-form polytope, the class of convex polytopes
capturing the strategies of players in extensive-form games (EFGs). This leads
to a new class of regret-minimization algorithms that are stepsize-invariant,
in the same sense as the Regret Matching and Regret Matching^+ algorithms for
the simplex. Our modular framework can be combined with any existing regret
minimizer over cones to compute a Nash equilibrium in two-player zero-sum EFGs
with perfect recall, through the self-play framework. Leveraging predictive
online mirror descent, we introduce Predictive Treeplex Blackwell^+
(PTB^+), and show a O(1/√(T)) convergence rate to Nash equilibrium in
self-play. We then show how to stabilize PTB^+ with a stepsize, resulting in
an algorithm with a state-of-the-art O(1/T) convergence rate. We provide an
extensive set of experiments to compare our framework with several algorithmic
benchmarks, including CFR^+ and its predictive variant, and we highlight
interesting connections between practical performance and the
stepsize-dependence or stepsize-invariance properties of classical algorithms.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要