Deriving An Optimally Deceptive Policy In Two-Player Iterated Games

ACC (2016)

Abstract
We formulate the problem of determining an optimally deceptive strategy in a repeated game framework. Two players are engaged in repeated play. During an initial time period, Player 1 may deceptively train his opponent to expect a specific strategy, and the opponent computes a best response to it. Player 1 chooses this deceptive strategy to maximize his long-run payoff during actual game play, taking into consideration not only his real payoff but also the cost of deception. We formulate the deception problem as a nonlinear optimization problem and show how a genetic algorithm can be used to compute an optimally deceptive play. In particular, we show how the cost of deception can lead to strategies that blend a target strategy (policy) and an optimally deceptive one.
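The setup described in the abstract can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2x2 payoff matrices `A` and `B`, the quadratic deception cost, the `cost_weight` parameter, and the simple elitist genetic search. Player 1 trains the opponent to believe he plays row 0 with probability `p`, the opponent best-responds to that belief, and Player 1 then plays his target strategy while paying a cost that grows with the gap between `p` and the target.

```python
import random

# Hypothetical 2x2 bimatrix game (assumed for illustration, not from the paper).
A = [[2.0, 4.0], [1.0, 3.0]]   # Player 1 payoffs: rows are his actions
B = [[1.0, 0.0], [0.0, 2.0]]   # Player 2 payoffs: columns are her actions
TARGET_P = 1.0                 # Player 1's target strategy: always play row 0


def best_response(p):
    """Column Player 2 picks if she believes Player 1 plays row 0 with prob. p."""
    expected = [p * B[0][j] + (1 - p) * B[1][j] for j in range(2)]
    return 0 if expected[0] >= expected[1] else 1


def objective(p, cost_weight):
    """Player 1's payoff in actual play minus an assumed quadratic deception cost."""
    j = best_response(p)
    real_payoff = TARGET_P * A[0][j] + (1 - TARGET_P) * A[1][j]
    return real_payoff - cost_weight * (p - TARGET_P) ** 2


def genetic_search(cost_weight, pop_size=40, generations=60, seed=0):
    """Search over the trained belief p with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    # Include the honest strategy so the result is never worse than honesty.
    population = [TARGET_P] + [rng.random() for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda p: objective(p, cost_weight), reverse=True)
        parents = population[: pop_size // 2]           # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.05)  # blend crossover + mutation
            children.append(min(1.0, max(0.0, child)))    # keep a valid probability
        population = parents + children
    return max(population, key=lambda p: objective(p, cost_weight))


if __name__ == "__main__":
    for w in (1.0, 20.0):
        p = genetic_search(w)
        print(f"cost_weight={w}: trained p={p:.3f}, opponent plays column {best_response(p)}")
```

In this toy game, a small `cost_weight` makes it worthwhile to train the opponent toward a belief that flips her best response, while a large `cost_weight` leaves the honest target strategy as the best choice. This mirrors, in a very simplified way, the trade-off between real payoff and deception cost that the abstract describes.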
Keywords
optimally deceptive policy, two-player iterated games, repeated game framework, nonlinear optimization problem, genetic algorithm