### On Nash Equilibria in Stochastic Games

**Authors:**

Chatterjee, Krishnendu

Jurdzinski, Marcin

Majumdar, Rupak
**Technical Report Identifier:** CSD-03-1281
**October 2003**

CSD-03-1281.pdf

CSD-03-1281.ps

**Abstract:** We study infinite stochastic games played by *n*-players on a finite graph with goals given by sets of infinite traces. The games are stochastic (each player simultaneously and independently chooses an action at each round, and the next state is determined by a probability distribution depending on the current state and the chosen actions), infinite (the game continues for an infinite number of rounds), nonzero sum (the players' goals are not necessarily conflicting), and undiscounted. We show that if each player has a reachability objective, that is, if the goal for each player *i* is to visit some subset *Ri* of the states, then there exists an epsilon-Nash equilibrium in memoryless strategies. We study the complexity of finding such Nash equilibria. Given an *n*-player reachability game, and a vector of values (*v1* ... *vn*), we show it is NP-hard to determine if there exists a memoryless epsilon-Nash equilibrium where each player gets payoff at least *vi*. On the other hand, for every fixed epsilon, the value can be epsilon-approximated in FNP.

We study two important special cases of the general problem. First, we study *n*-player turn-based probabilistic games, where at each state atmost one player has a nontrivial choice of moves. For turn-based probabilistic games, we show the existence of epsilon-Nash Equilibria in pure strategies for all games where the goal of each player is a Borel set of infinite traces. We also derive the existence of pure exact Nash equilibria for *n*-player turn-based games where each player has an omega-regular objective.

Then we study the two player case and show that already for two-player games exact Nash equilibria may not exist. Our techniques for the general case also yield NP coNP epsilon-approximation algorithms for zero-sum reachability games, improving the previously known EXPTIME bound.