2009.32: Stability of learning dynamics in two-agent, imperfect-information games

2009.32: John M. Butterworth and Jonathan L. Shapiro (2009) Stability of learning dynamics in two-agent, imperfect-information games. In: FOGA 09, January 9 - 11, 2009, Orlando, Florida USA.

DOI: 10.1145/1527125.1527143

Abstract

One issue in multi-agent co-adaptive learning concerns convergence. When two (or more) agents play a game with different information and different payoffs, the general behaviour tends to be oscillation around a Nash equilibrium. Several algorithms have been proposed to force convergence to mixed-strategy Nash equilibria in imperfect-information games when the agents are aware of their opponent's strategy. We consider the effect on one such algorithm, the lagging anchor algorithm, when each agent must also infer the gradient information from observations, in the infinitesimal time-step limit. Use of an estimated gradient, either by opponent modelling or stochastic gradient ascent, destabilises the algorithm in a region of parameter space. There are two phases of behaviour. If the rate of estimation is low, the Nash equilibrium becomes unstable in the mean. If the rate is high, the Nash equilibrium is an attractive fixed point in the mean, but the uncertainty acts as narrow-band coloured noise, which causes dampened oscillations.
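The convergence issue described above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the game is matching pennies, uses a simplified anchor-style update (each strategy follows its payoff gradient plus a pull towards a slowly moving anchor, and the anchor drifts towards the strategy), and gives each agent the exact gradient rather than the estimated gradient the paper analyses. The names `simulate`, `eta`, `dt`, `p_anchor` and `q_anchor` are hypothetical.

```python
def simulate(eta, steps, dt=0.01, p0=0.7, q0=0.4):
    """Euler-integrate gradient dynamics for matching pennies.

    p, q: probabilities that player 1 / player 2 play heads.
    eta:  anchor coupling strength (eta = 0 gives plain gradient ascent).
    Assumed illustrative update rule, not the paper's exact formulation.
    """
    p, q = p0, q0
    p_anchor, q_anchor = p0, q0  # anchors start at the initial strategies

    def clip(x):
        # Keep each probability inside [0, 1].
        return min(1.0, max(0.0, x))

    for _ in range(steps):
        # Exact payoff gradients for u1 = (2p - 1)(2q - 1) and u2 = -u1.
        grad_p = 2.0 * (2.0 * q - 1.0)
        grad_q = -2.0 * (2.0 * p - 1.0)

        # Strategies follow the gradient and are pulled towards their anchors.
        new_p = p + dt * (grad_p + eta * (p_anchor - p))
        new_q = q + dt * (grad_q + eta * (q_anchor - q))

        # Anchors lag behind, drifting towards the current strategies.
        new_p_anchor = p_anchor + dt * eta * (p - p_anchor)
        new_q_anchor = q_anchor + dt * eta * (q - q_anchor)

        # Simultaneous update: every new value uses only the old state.
        p, q = clip(new_p), clip(new_q)
        p_anchor, q_anchor = new_p_anchor, new_q_anchor

    return p, q


if __name__ == "__main__":
    print("with anchor:   ", simulate(eta=1.0, steps=3000))
    print("without anchor:", simulate(eta=0.0, steps=500))
```

With `eta = 0` the strategies circle the mixed equilibrium at (0.5, 0.5) and, under Euler stepping, slowly spiral away from it. With `eta > 0` the anchor term damps the rotation and the strategies settle at the equilibrium. Replacing `grad_p` and `grad_q` with noisy estimates would be the natural starting point for exploring the destabilisation the abstract describes.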

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Reinforcement learning, game theory, learning in games, CICADA
Subjects: MSC 2000 > 68 Computer science; MSC 2000 > 91 Game theory, economics, social and behavioral sciences
MIMS number: 2009.32
Deposited By: Dr Jonathan Shapiro
Deposited On: 28 April 2009