EFDO: Solving Extensive-Form Games Based On Double Oracle

2022 4th International Conference on Data Intelligence and Security (ICDIS), 2022

Abstract
Although algorithms such as counterfactual regret minimization (CFR) have proved effective in small games, their demands on computing and storage resources limit their application to large extensive-form games (EFGs). We propose a new algorithm, extensive-form double oracle (EFDO), which builds on double oracle to solve extensive-form games. EFDO repeatedly solves a smaller game via CFR instead of the original one, and is guaranteed to converge to an approximate Nash equilibrium. We also introduce extensive-form deep double oracle (EFD2O), which replaces the best-response solver with deep reinforcement learning. In vanilla Leduc poker, we show that EFDO works well with different CFR algorithms and converges faster than policy space response oracles (PSRO) and extensive-form fictitious play. On a modified Leduc poker game, EFDO reaches an approximate Nash equilibrium in 1–2 orders of magnitude fewer iterations than different CFR algorithms. EFD2O also performs better than neural fictitious self-play (NFSP) and PSRO on the modified Leduc poker.
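The double oracle loop summarized above can be illustrated on a much simpler problem. The sketch below is not the authors' EFDO: it assumes a two-player zero-sum matrix game rather than an extensive-form game, uses plain regret matching (the normal-form building block of CFR) as the solver for the smaller game, and uses exact enumeration as the best-response oracle. All function and parameter names (`regret_matching`, `double_oracle`, `eps`, `iters`) are hypothetical.

```python
def _strategy_from_regrets(regrets):
    # Regret matching: play in proportion to positive regret, uniform if none.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in pos]


def regret_matching(payoff, iters=20000):
    """Approximately solve a zero-sum matrix game by self-play.

    payoff[i][j] is the row player's payoff; the column player gets the negative.
    Returns the average strategies of both players.
    """
    n, m = len(payoff), len(payoff[0])
    reg_r, reg_c = [0.0] * n, [0.0] * m
    sum_r, sum_c = [0.0] * n, [0.0] * m
    for _ in range(iters):
        x = _strategy_from_regrets(reg_r)
        y = _strategy_from_regrets(reg_c)
        # Value of each pure action against the opponent's current mix.
        u_r = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        u_c = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        v_r = sum(x[i] * u_r[i] for i in range(n))
        v_c = sum(y[j] * u_c[j] for j in range(m))
        for i in range(n):
            reg_r[i] += u_r[i] - v_r
            sum_r[i] += x[i]
        for j in range(m):
            reg_c[j] += u_c[j] - v_c
            sum_c[j] += y[j]
    return [s / iters for s in sum_r], [s / iters for s in sum_c]


def double_oracle(payoff, eps=1e-2, iters=20000):
    """Double oracle on a matrix game (illustrative sketch only).

    Returns (row strategy, column strategy, exploitability gap) in the full game.
    """
    n, m = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # restricted action sets, grown by best responses
    while True:
        # 1. Solve the smaller game restricted to the current action sets.
        sub = [[payoff[i][j] for j in cols] for i in rows]
        xs, ys = regret_matching(sub, iters)
        # 2. Lift the restricted solution to the full action space.
        x, y = [0.0] * n, [0.0] * m
        for k, i in enumerate(rows):
            x[i] = xs[k]
        for k, j in enumerate(cols):
            y[j] = ys[k]
        # 3. Compute best responses in the full game.
        row_vals = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_vals = [sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        br_r = max(range(n), key=row_vals.__getitem__)
        br_c = min(range(m), key=col_vals.__getitem__)
        gap = row_vals[br_r] - col_vals[br_c]
        # 4. Expand the restricted game with any new best responses.
        grew = False
        if br_r not in rows:
            rows.append(br_r)
            grew = True
        if br_c not in cols:
            cols.append(br_c)
            grew = True
        # Stop when the gap is small, or when no new action can be added
        # (any remaining gap then comes from the inner solver's accuracy).
        if gap <= eps or not grew:
            return x, y, gap


# Rock-paper-scissors, from the row player's point of view.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
row_strategy, col_strategy, gap = double_oracle(RPS)
```

In this toy setting every action ends up in the restricted game, so nothing is saved; the benefit the abstract describes depends on the restricted game staying much smaller than the original one.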
Keywords
game theory, extensive-form games, double oracle, deep reinforcement learning