On Evolutionarily Stable States and Nash Equilibria that Are Not Characterised by the Folk Theorem

In evolutionary game theory, evolutionarily stable states are characterised by the folk theorem because exact solutions to the replicator equation are difficult to obtain. It is generally assumed that the folk theorem, which is the fundamental theory for non-cooperative games, defines all Nash equilibria in infinitely repeated games. Here, we prove that Nash equilibria that are not characterised by the folk theorem do exist. By adopting specific reactive strategies, a group of players can be better off by coordinating their actions in repeated games. We call it a level-k equilibrium when a group of players coordinate their actions and they have no incentive to deviate from their strategies simultaneously. The existence and stability of the level-k equilibrium in general games is discussed. This study shows that the set of evolutionarily stable states has greater cardinality than has been considered in many evolutionary games.


Introduction
A population is considered to be in an evolutionarily stable state if its genetic composition is restored by selection after a disturbance [1]. In evolutionary game theory, an evolutionarily stable state has a close relationship with the concept of Nash equilibrium (NE) and the folk theorem for infinitely repeated games [2,3]. The folk theorem, which is the fundamental theory of non-cooperative repeated games, states that any feasible payoff profile that strictly dominates the minimax profile is a Nash equilibrium profile in an infinitely repeated game [4,5]. An evolutionarily stable state must be a refinement of NE in the corresponding evolutionary game.
The folk theorem has been intensively studied for decades. Variants of it have been developed to take into account factors such as indefinite iteration, incomplete information and discount rate [6][7][8][9][10]. It is generally assumed that the folk theorem characterises all NE in an infinitely repeated game. However, as we show in this paper, there exist NE that are neglected by classical game theory. We first consider a new variant of the prisoner's dilemma (PD).
We extend PD to a three-player zero-sum game by adding an extra player, the police, whose payoff is the negative sum of the payoffs of the two prisoners.
Suppose, for simplicity, that the police choose between two options, L and R, which lead to two PDs between X and Y with different payoff values, as shown in Fig. 1.

Figure 1: Payoff matrices of a three-player zero-sum game, showing the payoffs of the three players X, Y and Z, one matrix for the case where player Z chooses L and one for the case where player Z chooses R. X and Y choose between two options, D and C, whilst Z chooses between L and R.
The dominant strategies for the three players are D, D and L respectively, and the corresponding payoffs are (-3, -3, 6). This strategy profile is the unique NE of the stage game.
In an infinitely repeated version of this game, X and Y can be better off by choosing (C, C) whatever Z chooses. The profile in which the three players choose (C, C, L) in every round should be a NE, since mutual cooperation is a NE in each PD according to the folk theorem. Note that the minimax payoffs for the three players are -3, -3 and 0 respectively. Z receives only its minimax payoff in this equilibrium, which means that the payoff profile does not strictly dominate the minimax payoff profile. Thus, this equilibrium is not characterised by the folk theorem.
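The figures quoted above can be checked on a small numerical sketch. The payoff matrices of Fig. 1 are not reproduced in this text, so the values in the table PD below are hypothetical: they are chosen only to be consistent with the quoted stage-game NE payoffs (-3, -3, 6) and the quoted minimax payoffs -3, -3 and 0. The helper names are illustrative, and the minimax values are computed over pure strategies only.

```python
from itertools import product

# HYPOTHETICAL payoffs (x, y) of the two prisoners for each choice of Z.
# Z's payoff is -(x + y) because the game is zero-sum.
PD = {
    "L": {("C", "C"): (0, 0), ("C", "D"): (-4, 1),
          ("D", "C"): (1, -4), ("D", "D"): (-3, -3)},
    "R": {("C", "C"): (1, 1), ("C", "D"): (-3, 2),
          ("D", "C"): (2, -3), ("D", "D"): (-2, -2)},
}
ACTIONS = (("C", "D"), ("C", "D"), ("L", "R"))  # action sets of X, Y, Z


def payoffs(profile):
    """Return the payoffs (X, Y, Z) of a pure action profile."""
    x_act, y_act, z_act = profile
    x, y = PD[z_act][(x_act, y_act)]
    return (x, y, -(x + y))


def deviate(profile, player, action):
    """Copy of profile in which one player switches to another action."""
    changed = list(profile)
    changed[player] = action
    return tuple(changed)


def pure_nash_equilibria():
    """Profiles from which no single player gains by deviating."""
    return [
        profile
        for profile in product(*ACTIONS)
        if all(
            payoffs(profile)[i] >= payoffs(deviate(profile, i, a))[i]
            for i in range(3)
            for a in ACTIONS[i]
        )
    ]


def pure_minimax(player):
    """Lowest best-reply payoff the other two players can force on player."""
    others = [i for i in range(3) if i != player]

    def best_reply_value(other_actions):
        profile = [None, None, None]
        for i, a in zip(others, other_actions):
            profile[i] = a
        return max(
            payoffs(deviate(profile, player, own))[player]
            for own in ACTIONS[player]
        )

    return min(
        best_reply_value(other_actions)
        for other_actions in product(*(ACTIONS[i] for i in others))
    )


print(pure_nash_equilibria())                # stage-game NE
print([pure_minimax(i) for i in range(3)])   # minimax payoffs of X, Y, Z
print(payoffs(("C", "C", "L")))              # payoffs under (C, C, L)
```

With these assumed values, Z's payoff under (C, C, L) equals its minimax payoff, which is the feature the argument above relies on.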
The strategies available to players in a repeated game include not only simple aggregations of pure or mixed strategies over a sequence of stage games, but also reactive strategies, in which a player chooses an action in response to other players' previous actions. From Tit for tat, Grim trigger, Pavlov and Group strategies [11] to the more recent zero-determinant strategies [12][13][14], a number of reactive strategies have been developed and investigated in evolutionary game theory. Reactive strategies are the reason why a payoff profile that is not a NE in the stage game can be a NE in an infinitely repeated game.
Coordination among a group of players can be formed and maintained when specific reactive strategies are adopted by those players, which leads to equilibrium that does not exist in one-shot games.

Reactive Strategies
Consider an infinitely repeated n-player game in which $N = \{1, \dots, n\}$ is the player set, and $\{s_i\}$ and $\{u_i\}$ are the strategy set and the payoff set respectively. The iterations of the game are counted by $t$, starting from $t = 0$.
Each player $i$ has a pure action space $A_i$. Let $(a_i^0, a_i^1, \dots)$ be the sequence of actions chosen by player $i$, and let $(a_{-i}^0, \dots, a_{-i}^{t-1})$ denote the past choices made, before round $t$, by all players other than $i$.
A player's strategy is reactive if it is a function of other players' past actions [13].
Player $i$'s strategy, $s_i$, is a reactive strategy when there is a function $f_i$ such that $s_i^t = f_i(a_{-i}^0, \dots, a_{-i}^{t-1})$ for every $t \ge 1$. The strategy in the first stage game, $s_i^0$, is either a pure strategy or a mixed strategy.
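As a minimal sketch of this definition, the following code writes two of the reactive strategies named in the Introduction, Tit for tat and Grim trigger, as functions of the other player's past actions in a two-player iterated prisoner's dilemma. The function names and the choice of C as the first-round action are illustrative assumptions, not part of the original text.

```python
from typing import Callable, List, Tuple

Action = str  # "C" (cooperate) or "D" (defect)
Strategy = Callable[[List[Action]], Action]


def tit_for_tat(opponent_history: List[Action]) -> Action:
    """Cooperate first, then copy the opponent's previous action."""
    return opponent_history[-1] if opponent_history else "C"


def grim_trigger(opponent_history: List[Action]) -> Action:
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def always_defect(opponent_history: List[Action]) -> Action:
    """Non-reactive benchmark: the action ignores the history."""
    return "D"


def play(strategy_x: Strategy, strategy_y: Strategy,
         rounds: int) -> List[Tuple[Action, Action]]:
    """Run the repeated game; each player sees only the other's past actions."""
    history_x: List[Action] = []
    history_y: List[Action] = []
    for _ in range(rounds):
        x = strategy_x(history_y)
        y = strategy_y(history_x)
        history_x.append(x)
        history_y.append(y)
    return list(zip(history_x, history_y))


print(play(grim_trigger, always_defect, 3))
print(play(tit_for_tat, tit_for_tat, 3))
```

In both reactive strategies the first-round action is a pure strategy, and every later action depends only on what the other player has done so far.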
Obviously, reactive strategies do not exist in one-shot games, since there are no past actions to respond to. In a one-shot game with two NEs that the players cannot tell apart in advance, the probability that any NE is achieved is 0.5 no matter what pure or mixed strategy the players choose.
Unless some reactive strategies are adopted, the probability of convergence to any NE remains 0.5 in a repeated version of this game.
Reactive strategies provide a way of coordination among a group of players in repeated games. In a repeated game with multiple Nash equilibria, for example, convergence to a Nash equilibrium can be guaranteed only if the players adopt specific reactive strategies.
There are two pure-strategy NEs, (L, R) and (R, L), in the coordination game shown in Fig. 2. The two players do not have any a priori knowledge about which NE strategy profile to choose unless they can communicate with each other before the game. In an infinitely repeated game, however, coordination between X and Y can be achieved with probability approaching 1 if the two players adopt suitable reactive strategies.
Reactive strategies also provide a way of maintaining coordination among a group of players. Grim trigger, for example, is a reactive strategy by which the players in the iterated prisoner's dilemma maintain mutual cooperation. In a repeated game there exists a set of trigger strategies by which coordination among a group of players can be enforced. Once a group of players have coordinated their actions, they switch to a trigger strategy under which each player will choose the minimax strategy if any other player in the group deviates from the coordination strategy.
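To illustrate how reactive strategies can produce coordination in the game of Fig. 2, the sketch below uses one hypothetical rule (an assumption for illustration, not necessarily the strategies intended in the original): each player randomises uniformly until the previous round was one of the two NEs, and from then on repeats their own previous action. Under this rule each uncoordinated round reaches an NE with probability 0.5, so the probability of having coordinated by round t is 1 - 0.5^(t+1).

```python
import random

ACTIONS = ("L", "R")


def coordinated(x, y):
    """(L, R) and (R, L) are the two pure-strategy NEs of the game."""
    return x != y


def reactive_choice(rng, my_last, other_last):
    """HYPOTHETICAL rule: keep the action once the previous round was an NE,
    otherwise draw a fresh action uniformly at random."""
    if my_last is not None and coordinated(my_last, other_last):
        return my_last
    return rng.choice(ACTIONS)


def play(rng, rounds):
    """Simulate the repeated coordination game and return the action pairs."""
    history = []
    x_last = y_last = None
    for _ in range(rounds):
        x = reactive_choice(rng, x_last, y_last)
        y = reactive_choice(rng, y_last, x_last)
        history.append((x, y))
        x_last, y_last = x, y
    return history


def probability_coordinated_by(t):
    """Exact probability that an NE has been reached in some round 0..t."""
    return 1.0 - 0.5 ** (t + 1)


history = play(random.Random(0), 20)
print(history)
print(probability_coordinated_by(9))
```

Once the two players hit an NE they stay on it, because each of them then repeats an action that is a best response to the other's.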

The Level-k Equilibrium
One assumption in game theory is that the players believe that a deviation in their own strategy will not cause deviations by any other players. This is not a reasonable assumption in repeated games because of the existence of reactive strategies.
Coordination among a group of players can be achieved when they adopt specific reactive strategies, which may lead, in repeated games, to equilibria that are not Nash equilibria of the stage game.

Definition 1:
In a repeated n-player game, a level-k coordination ($2 \le k \le n$) denotes that a group of k players coordinate their actions by adopting trigger strategies such that they will change their strategies simultaneously once any player in the group deviates from the assigned action.
The necessary condition for a level-k coordination is that the k players can be better off by coordinating their actions. Let $v_i$ be the minimax payoff of player $i$.
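The necessary condition can be read as a simple check: every member of the group must receive more than their minimax payoff under the coordinated profile. The sketch below applies this reading to the three-player example of the Introduction. The minimax payoffs (-3, -3, 0) and Z's payoff of 0 under (C, C, L) are taken from the text; the payoffs of 0 for X and Y under (C, C, L) are an assumption, as is the function name.

```python
def all_members_better_off(payoff, minimax, group):
    """True if every player in the group strictly beats their minimax payoff."""
    return all(payoff[player] > minimax[player] for player in group)


minimax = {"X": -3, "Y": -3, "Z": 0}   # quoted in the Introduction
payoff_ccl = {"X": 0, "Y": 0, "Z": 0}  # X and Y values are ASSUMED

print(all_members_better_off(payoff_ccl, minimax, ["X", "Y"]))       # group of two
print(all_members_better_off(payoff_ccl, minimax, ["X", "Y", "Z"]))  # all three
```

Under these values the condition holds for the group {X, Y} but not for all three players, which matches the observation that Z receives only its minimax payoff.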

Definition 2:
In an infinitely repeated n-player game, we call it a level-k equilibrium ($2 \le k \le n$) if a group of k players coordinate their actions and they have no incentive to deviate from their strategies simultaneously. We prove in the following proposition that any level-k equilibrium is also a NE.

Proposition 1:
In an infinitely repeated n-player game, any level-k equilibrium ($2 \le k \le n$) is a NE.
Proof: Let us first consider the case of $k = n$. Let $K$ denote the group of coordinating players. If a player $i \in K$ does deviate from $s_i$ in order to gain a higher payoff in the current round, all other members of $K$ will play their minimax strategies in future rounds. Player $i$ will then have to play the minimax strategy and will receive $v_i$ in future rounds. Knowing this, player $i$ has no incentive to deviate from $s_i$.
Since no player has an incentive to deviate from $\{s_i\}$, it is a NE. ■
Every level-k equilibrium is a NE, but a NE is not necessarily a level-k equilibrium.
Thus, the level-k equilibria are refinements of NE in repeated games.
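The deviation argument used in the proof of Proposition 1 can be made explicit under one additional assumption, namely that payoffs in the repeated game are evaluated as long-run averages. The symbols $u_i$ (player $i$'s per-round payoff under coordination, with $u_i > v_i$) and $d_i$ (the one-round payoff from deviating) are introduced here for illustration only. If player $i$ deviates once and is then held to the minimax payoff, the average payoff over $T$ rounds is

```latex
\[
  \lim_{T \to \infty} \frac{1}{T}\Bigl( d_i + (T - 1)\, v_i \Bigr) \;=\; v_i \;<\; u_i ,
\]
```

so the one-round gain $d_i - u_i$ is outweighed by the permanent loss $u_i - v_i$ in every later round, however large $d_i$ is.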
The set of level-k equilibria forms the Pareto frontier of all NEs in an infinitely repeated game. Any level-k equilibrium is a Pareto optimum for the group of k players.
In three-player games, for example, the relationship between Proposition 1 and the folk theorem can be illustrated by Fig. 3.
For any group of players in an n-player game, there must be a strategy profile $\{s_i\}$ such that these players cannot improve their payoffs by changing their strategies simultaneously. This strategy profile can be a level-k equilibrium if (4) and (5) are satisfied for some players. We prove the existence of level-k equilibrium in general repeated games in Proposition 2.

Proposition 2:
In an infinitely repeated n-player game where there exist two or more NE, there must be at least one level-k equilibrium.
Proof: When there exist two or more NE, there must be at least one NE whose strategy profile is different from the minimax profile. Let $\{s_i\}$ denote such a strategy profile. We first prove that there must be [...]
A NE is stable if a small change in one player's strategy leads to a situation where two conditions hold:
a. the player who did not change has no better strategy.
b. the player who did change is now playing with a strictly worse strategy.
A level-k equilibrium is not stable if it is not a NE in the stage game, because once a player within the coalition changes their strategy in a level-k equilibrium, all other $k - 1$ players will be triggered to change their strategies. We are not concerned with the players excluded from the coalition, because any change in their strategies has no influence on the coalition.
A level-k equilibrium is stable if it is also a NE in the stage game and the NE is stable. For example, the pure-strategy NEs in Fig. 2 are level-2 equilibria and they are stable.

An Example
This example shows the multiplicity of equilibria in repeated games. Consider the three-player game shown in Fig. 4. The option R is dominated by L for every player, and the strategy profile (L, L, L) is the unique NE in the stage game.
Figure 4: Payoff matrix of the three-player game (panel: if player Z chooses L).
There are numerous level-3 and level-2 equilibria. A strategy profile is a level-3 equilibrium when X and Y choose (R, R) and Z chooses any mixed strategy in every round. A typical level-2 equilibrium arises when X and Y alternately choose (L, R) and (R, L) and Z chooses L (the point F in Fig. 5).
Figure 5: The set of payoff profiles of all NE is a 3D polyhedron in the payoff space of players X, Y and Z, and △ADE is its projection onto the X-Y plane. △ABC is the projection of the set of feasible profiles.
The point A represents the minimax profile. The point G represents a level-3 equilibrium. Any point on the segment DE represents a level-2 equilibrium.

Conclusions and Discussion
Proposition 1 is a supplement to the folk theorem. The folk theorem proves the existence of the level-n equilibrium in repeated n-player games. Proposition 1 extends it to the general case of level-k equilibrium ($2 \le k \le n$).
The existence of level-k equilibrium explains, to some extent, why the biodiversity in evolutionary games is much more complex than classical game theory has predicted. The level-k equilibrium belongs to the set of NE that has been neglected in game theory. This set of NE possibly contains equilibria more complicated than the level-k equilibrium, for example equilibria that contain two or more coalitions.

The emergence of cooperation in evolutionary dynamics has attracted a great deal of research [15][16][17][18][19][20]. The level-k equilibrium suggests that cooperation in evolution starts from a group of players rather than an individual player. A group of players would coordinate their actions by adopting specific reactive strategies if they could be better off by doing so. This collective behaviour is much more effective and robust than any individual behaviour in building and maintaining cooperation in evolution.
A level-k equilibrium is a Pareto optimum for the group of k players. In a level-k equilibrium, no individual player has an incentive to change strategy unilaterally, and the group of k players has no incentive to deviate collectively either, which means that the level-k equilibrium is more stable than a NE.
The concept of level-k equilibrium is different from the refinements of NE, such as strong NE [21] and coalition-proof NE [22,23], in that a level-k equilibrium is not necessarily a NE in the stage game and does not need communication or mediation among players. The level-k equilibrium is based on reactive strategies, which are neither pure strategies nor mixed strategies. It is also different from the core in cooperative game theory [24].
Backward induction has led to some controversy despite its wide use in finitely repeated games [25,26]. It is obvious that the choices of players cannot be derived by backward induction when some of them adopt specific reactive strategies. In finitely repeated games, the endgame effect cannot prevent the players from adopting specific reactive strategies. When both past actions and the endgame effect have an influence on the strategies of players, transition from one NE to another is possible. We will discuss this issue in another paper.