Learning and Efficiency in Games with Dynamic Population.
SODA '16: Symposium on Discrete Algorithms Arlington Virginia January, 2016(2016)
摘要
We study the quality of outcomes in repeated games when the population of players is dynamically changing, and where participants use learning algorithms to adapt to the dynamic environment. Price of anarchy has originally been introduced to study the Nash equilibria of one-shot games. Many games studied in computer science, such as packet routing or ad-auctions, are played repeatedly. Given the computational hardness of Nash equilibria, an attractive alternative in repeated game settings is that players use no-regret learning algorithms. The price of total anarchy considers the quality of such learning outcomes, assuming a steady environment and player population, which is rarely the case in online settings.
In this paper we analyze efficiency of repeated games in dynamically changing environments. An important trait of learning behavior is its versatility to changing environments, assuming that the learning method used is adaptive, i.e., doesn't rely too heavily on experience from the distant past. We show that, in large classes of games, if players choose their strategies in a way that guarantees low adaptive regret, high social welfare is ensured, even under very frequent changes.
A main technical tool for our analysis is the existence of a solution to the welfare maximization problem that is both close to optimal and relatively stable over time. Such a solution serves as a benchmark in the efficiency analysis of learning outcomes. We show that such a stable and close to optimal solution exists for many problems, even in cases when the exact optimal solution can be very unstable. We further show that a sufficient condition on the existence of stable outcomes is the existence of a differentially private algorithm for the welfare maximization problem. Hence, we draw a strong connection between differential privacy and high efficiency of learning outcomes in frequently changing repeated games.
We demonstrate our techniques by focusing on two classes of games as examples: independent item auctions and congestion games. In both applications we show that adaptive learning guarantees high social welfare even with surprisingly high churn in the player population.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络