EFDO: Solving Extensive-Form Games Based On Double Oracle

Song Qin, Lei Zhang, Shuhan Qi, Jiajia Zhang, Huale Li, Xuan Wang, Jing Xiao

2022 4th International Conference on Data Intelligence and Security (ICDIS) (2022)

Abstract

Although algorithms such as counterfactual regret minimization (CFR) have proved effective in small games, their demands on computing and storage resources limit their application to large extensive-form games (EFGs). We propose a new algorithm, extensive-form double oracle (EFDO), which applies the double oracle method to extensive-form games. EFDO repeatedly uses CFR to solve a smaller game instead of the original one and is guaranteed to converge to an approximate Nash equilibrium. We also introduce extensive-form deep double oracle (EFD2O), which replaces the best-response solver with deep reinforcement learning. On vanilla Leduc poker, we show that EFDO works well with different CFR algorithms and converges faster than policy space response oracles (PSRO) and extensive-form fictitious play. On a modified Leduc poker game, EFDO reaches an approximate Nash equilibrium in 1–2 orders of magnitude fewer iterations than different CFR algorithms. EFD2O also outperforms neural fictitious self-play (NFSP) and PSRO on the modified Leduc poker game.
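
To make the double oracle loop described in the abstract concrete, the following is a minimal sketch on a zero-sum matrix game, not the authors' EFDO. It assumes a normal-form payoff matrix rather than an extensive-form game, and it uses plain regret-matching self-play as a stand-in for CFR when solving the smaller game. The function names (`solve_restricted`, `double_oracle`) and all parameters are hypothetical and chosen only for illustration.

```python
def _positive_normalize(regrets):
    """Regret matching: play in proportion to positive regret, else uniformly."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in pos]


def solve_restricted(payoff, rows, cols, iters=2000):
    """Approximately solve the restricted zero-sum game by regret-matching
    self-play (an assumed stand-in for CFR). Returns average strategies."""
    n, m = len(rows), len(cols)
    reg_r, reg_c = [0.0] * n, [0.0] * m
    sum_r, sum_c = [0.0] * n, [0.0] * m
    for _ in range(iters):
        x, y = _positive_normalize(reg_r), _positive_normalize(reg_c)
        # Expected payoff of each pure action against the opponent's current mix.
        u_r = [sum(payoff[rows[i]][cols[j]] * y[j] for j in range(m)) for i in range(n)]
        u_c = [-sum(payoff[rows[i]][cols[j]] * x[i] for i in range(n)) for j in range(m)]
        v_r = sum(x[i] * u_r[i] for i in range(n))
        v_c = sum(y[j] * u_c[j] for j in range(m))
        for i in range(n):
            reg_r[i] += u_r[i] - v_r
            sum_r[i] += x[i]
        for j in range(m):
            reg_c[j] += u_c[j] - v_c
            sum_c[j] += y[j]
    return [s / iters for s in sum_r], [s / iters for s in sum_c]


def double_oracle(payoff, eps=1e-2, max_rounds=100):
    """Grow restricted action sets with full-game best responses until the
    best-response gap (exploitability of the restricted solution) is <= eps.
    payoff[i][j] is the row player's payoff; the column player receives -payoff[i][j]."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # arbitrary initial restricted game
    x, y, gap = [1.0], [1.0], float("inf")
    for _ in range(max_rounds):
        x, y = solve_restricted(payoff, rows, cols)
        # Best-response oracles search the FULL action sets.
        row_vals = [sum(payoff[i][cols[j]] * y[j] for j in range(len(cols)))
                    for i in range(n_rows)]
        col_vals = [sum(payoff[rows[i]][j] * x[i] for i in range(len(rows)))
                    for j in range(n_cols)]
        br_row = max(range(n_rows), key=row_vals.__getitem__)
        br_col = min(range(n_cols), key=col_vals.__getitem__)
        gap = row_vals[br_row] - col_vals[br_col]
        if gap <= eps:
            break
        added = False
        if br_row not in rows:
            rows.append(br_row)
            added = True
        if br_col not in cols:
            cols.append(br_col)
            added = True
        if not added:  # restricted solver was too inexact to make progress
            break
    return rows, x, cols, y, gap
```

Calling `double_oracle` on a rock-paper-scissors payoff matrix such as `[[0, -1, 1], [1, 0, -1], [-1, 1, 0]]` grows both restricted sets to all three actions. On games with dominated actions the restricted sets can stay smaller than the full game, which is the intuition behind solving a smaller game in place of the original one.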

Keywords

game theory, extensive-form games, double oracle, deep reinforcement learning
