On Nash Equilibrium and Evolutionarily Stable States That Are Not Characterised by the Folk Theorem

In evolutionary game theory, evolutionarily stable states are characterised by the folk theorem because exact solutions to the replicator equation are difficult to obtain. It is generally assumed that the folk theorem, which is the fundamental theory for non-cooperative games, defines all Nash equilibria in infinitely repeated games. Here, we prove that Nash equilibria that are not characterised by the folk theorem do exist. By adopting specific reactive strategies, a group of players can be better off by coordinating their actions in repeated games. We call it a type-k equilibrium when a group of k players coordinate their actions and they have no incentive to deviate from their strategies simultaneously. The existence and stability of the type-k equilibrium in general games is discussed. This study shows that the sets of Nash equilibria and evolutionarily stable states have greater cardinality than classic game theory has predicted in many repeated games.


Introduction
A population is considered to be in an evolutionarily stable state if its genetic composition is restored by selection after a disturbance [1]. In evolutionary game theory, an evolutionarily stable state has a close relationship with the concept of Nash equilibrium (NE) and the folk theorem for infinitely repeated games [2,3]. The folk theorem, which is the fundamental theory of non-cooperative repeated games, states that any feasible payoff profile that strictly dominates the minimax profile is a Nash equilibrium profile in an infinitely repeated game [4,5]. An evolutionarily stable state must be a refinement of NE in the corresponding evolutionary game.
The folk theorem has been intensively studied for decades. Different variants of it have been developed to account for factors such as indefinite iteration, incomplete information and discount rate [6][7][8][9][10][11][12][13][14][15]. It is generally assumed that the folk theorem characterises all NE in an infinitely repeated game. However, there do exist some NE that are neglected by classical game theory, as we show in this paper. Let us first consider a new variant of the prisoner's dilemma (PD).
We extend PD to a three-player zero-sum game by adding an extra player, the police (Z), whose payoff is equal to the negative sum of the payoffs of the two prisoners (X and Y). Suppose, for simplicity, that the police choose between two options, L and R, which lead to two PDs between X and Y with different payoff values, as shown in Fig 1. The dominant strategies for the three players are D, D and L respectively, and the corresponding payoffs are (-3, -3, 6). This strategy profile is the unique NE of the stage game.
In an infinitely repeated version of this game, X and Y can be better off by choosing (C, C) whatever Z chooses. The three players choosing (C, C, L) in every round should be a NE, since mutual cooperation is a NE in each PD according to the folk theorem. Note that the minimax payoffs for the three players are -3, -3 and 0 respectively. Z receives the minimax payoff in this equilibrium, which means that the payoff profile does not strictly dominate the minimax payoff profile. Thus, this equilibrium is not characterised by the folk theorem.
The strategies for players in a repeated game include not only simple aggregations of pure or mixed strategies in a sequence of stage games, but also reactive strategies, in which a player chooses their action in response to other players' previous actions. From Tit for tat, Grim trigger, Pavlov and Group strategies [16] to the more recent zero-determinant strategies [17][18][19], a number of reactive strategies have been developed and investigated in evolutionary game theory. Reactive strategies are the reason why a payoff profile that is not a NE in the stage game can be a NE in an infinitely repeated game.
Coordination among a group of players can be formed and maintained when specific reactive strategies are adopted by those players, which leads to equilibrium that does not exist in one-shot games.

Reactive Strategies
Consider a repeated n-player game $G = \{I, S, U\}^T$, where $I = \{1, \ldots, n\}$ is the player set and $S = \{S_1, \ldots, S_n\}$ and $U = \{U_1, \ldots, U_n\}$ are the strategy set and the payoff set respectively. The iterations of the game are counted by t, starting from t = 0. Each player has a pure action space $A_i^t$ in the t-th stage game. Let $h_i^t = (a_i^0, a_i^1, \ldots, a_i^{t-1})$ be the sequence of actions chosen by player $i \in I$ within t−1 periods, and $h_{-i}^t = (h_1^t, \ldots, h_{i-1}^t, h_{i+1}^t, \ldots, h_n^t)$ the past choices made by all players other than i. For simplicity of expression, the payoff of a player in a repeated game is computed as $U_i = \frac{1}{T+1} \sum_{t=0}^{T} u_i^t$, where $u_i^t$ is player i's payoff in the t-th stage game; $U_i$ denotes the average payoff over a period of T+1.
A player's strategy is reactive if it is a function of other players' past actions [18]. Player i's strategy, $s_i$, is a reactive strategy when there is $s_i^t = f(h_{-i}^t)$ for t > 0. The strategy in the first stage game, $s_i^0$, is either a pure strategy or a mixed strategy. Obviously, reactive strategies do not exist in one-shot games since there always are $h_i^t = h_{-i}^t = \emptyset$ for any i.
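The definition above can be illustrated with a minimal Python sketch. It is not taken from the paper: the payoff values are hypothetical prisoner's dilemma numbers, and the function names (`tit_for_tat`, `grim_trigger`, `always_defect`, `play`) are chosen here for illustration. Each reactive strategy reads only the opponent's history, opens with a pure action, and the returned payoff is the average over the rounds played.

```python
from typing import Callable, List, Tuple

Action = str                      # "C" (cooperate) or "D" (defect)
History = List[Action]
Strategy = Callable[[History], Action]   # maps the opponent's history to an action

# Hypothetical PD payoffs (row player, column player); not the values of Fig 1.
PAYOFF = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}

def tit_for_tat(other: History) -> Action:
    # Open with C, then copy the opponent's previous action.
    return "C" if not other else other[-1]

def grim_trigger(other: History) -> Action:
    # Cooperate until the opponent has defected once, then defect forever.
    return "D" if "D" in other else "C"

def always_defect(other: History) -> Action:
    # Not reactive: ignores the history entirely.
    return "D"

def play(s1: Strategy, s2: Strategy, rounds: int) -> Tuple[float, float]:
    """Return the average payoff of each player over `rounds` stage games."""
    h1: History = []
    h2: History = []
    total1 = total2 = 0
    for _ in range(rounds):
        a1, a2 = s1(h2), s2(h1)   # each player reacts to the other's past actions
        p1, p2 = PAYOFF[(a1, a2)]
        total1 += p1
        total2 += p2
        h1.append(a1)
        h2.append(a2)
    return total1 / rounds, total2 / rounds
```

With these assumed payoffs, two reactive players sustain mutual cooperation, whereas a reactive player facing unconditional defection is exploited only in the first round.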
Reactive strategies provide a way of coordination among a group of players in repeated games. In a repeated game with multiple Nash equilibria, for example, convergence to a Nash equilibrium can be guaranteed only if the players adopt specific reactive strategies.
There are two pure-strategy NEs, (L, R) and (R, L), in the coordination game shown in Fig 2. The two players do not have any a priori knowledge about which NE strategy profile to choose unless they can communicate with each other before the game. Coordination between X and Y can be achieved with probability ρ → 1 in an infinitely repeated game if the two players adopt the strategies below:

$$\begin{cases} \ldots & \text{if both players chose } (R, L) \text{ at } t-1 \\ \ldots & \text{otherwise} \end{cases}$$

Reactive strategies also provide a way of maintaining coordination among a group of players. Grim trigger, for example, is a reactive strategy with which the players in an iterated prisoner's dilemma maintain mutual cooperation. There exists a set of trigger strategies in a repeated game by which the coordination among a group of players can be enforced. Once a group of players have coordinated their actions, they switch to a trigger strategy under which each player will choose the minimax strategy if any other player in the group deviates from their coordination strategy.
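How coordination can emerge with probability approaching 1 may be sketched as follows. The rule used here is an assumption made for illustration, not necessarily the exact strategy of the paper: each player repeats their previous action if the previous profile was (L, R) or (R, L), and otherwise mixes between L and R. The names `prob_coordinated_by` and `simulate` are hypothetical.

```python
import random

def prob_coordinated_by(t: int, p: float = 0.5) -> float:
    """Probability that the players have hit (L, R) or (R, L) within t rounds,
    assuming each independently plays L with probability p until they do."""
    hit = 2 * p * (1 - p)          # chance of a mismatched profile in one round
    return 1 - (1 - hit) ** t

def simulate(rounds: int, seed: int = 0):
    """Play the assumed rule: mix until the actions differ, then repeat forever."""
    rng = random.Random(seed)
    locked = None                   # the coordinated profile once it is reached
    history = []
    for _ in range(rounds):
        if locked is None:
            profile = (rng.choice("LR"), rng.choice("LR"))
            if profile[0] != profile[1]:
                locked = profile    # previous profile is a NE: keep playing it
        else:
            profile = locked
        history.append(profile)
    return history
```

Under uniform mixing the probability of still being uncoordinated halves every round, so `prob_coordinated_by(t)` equals 1 − 2^(−t) and tends to 1 as t grows.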

Type-k Equilibrium
One assumption in game theory is that the players believe that a deviation in their own strategy will not cause deviations by any other players. This is not a reasonable assumption in repeated games because of the existence of reactive strategies. Coordination among a group of players can be achieved when they adopt specific reactive strategies, which may lead to equilibrium other than Nash equilibrium in repeated games.
Definition 1: In a repeated n-player game, a type-k coordination (2 ≤ k ≤ n) denotes that a group of k players coordinate their actions by adopting some trigger strategies such that they will change their strategies simultaneously once any player in the group deviates from the assigned action.
The necessary condition of a type-k coordination is that k players can be better off by coordinating their actions. Let $v_i$ be the minimax payoff of player $i \in I$ and $s_i^*$ the minimax strategy. Let K denote the group of k players, where $K \subseteq I$. The necessary condition for a type-k coordination is that there exists a strategy profile $\{\bar{s}_i\}$ such that

$$u_i(\{\bar{s}_i\}, \{s_j\}) > v_i \tag{4}$$

holds for all $i \in K$ and whatever $\{s_j\}$ ($j \notin K$). A type-k coordination can be maintained if all players involved adopt a trigger strategy like this: keep playing the coordination strategy if all other players play their coordination strategies; otherwise, play the minimax strategy.
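The necessary condition can be checked mechanically for a small stage game. The sketch below makes two assumptions that are not in the paper: minimax values are computed over pure actions only, and the payoff table is hypothetical, chosen only to agree with the values quoted in the Introduction (payoffs (-3, -3, 6) at (D, D, L) and minimax payoffs -3, -3 and 0). The helper names are illustrative.

```python
from itertools import product

# Players 0, 1, 2 stand for X, Y, Z. Hypothetical zero-sum payoffs.
ACTIONS = [("C", "D"), ("C", "D"), ("L", "R")]
PAYOFF = {
    ("C", "C", "L"): (0, 0, 0),   ("C", "D", "L"): (-5, 1, 4),
    ("D", "C", "L"): (1, -5, 4),  ("D", "D", "L"): (-3, -3, 6),
    ("C", "C", "R"): (1, 1, -2),  ("C", "D", "R"): (-4, 2, 2),
    ("D", "C", "R"): (2, -4, 2),  ("D", "D", "R"): (-2, -2, 4),
}

def build_profile(n, fixed, free_players, free_actions):
    """Combine fixed actions {player: action} with actions of the free players."""
    prof = [None] * n
    for i, a in fixed.items():
        prof[i] = a
    for j, a in zip(free_players, free_actions):
        prof[j] = a
    return tuple(prof)

def pure_minimax(payoff, actions, i):
    """Lowest payoff the others can hold player i to when i best-responds
    (pure actions only, an assumption of this sketch)."""
    n = len(actions)
    others = [j for j in range(n) if j != i]
    return min(
        max(payoff[build_profile(n, {i: ai}, others, choice)][i]
            for ai in actions[i])
        for choice in product(*(actions[j] for j in others))
    )

def coordination_condition(payoff, actions, coordinated):
    """True if every group member earns more than their minimax payoff
    whatever the players outside the group choose."""
    n = len(actions)
    outsiders = [j for j in range(n) if j not in coordinated]
    v = {i: pure_minimax(payoff, actions, i) for i in coordinated}
    return all(
        payoff[build_profile(n, coordinated, outsiders, choice)][i] > v[i]
        for choice in product(*(actions[j] for j in outsiders))
        for i in coordinated
    )
```

For this assumed table, `coordination_condition(PAYOFF, ACTIONS, {0: "C", 1: "C"})` holds, while coordinating on (D, D) does not, because under L the prisoners receive exactly their minimax payoff.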
Given $\{\bar{s}_i\}$ ($i \in K$), the best responses of the players who do not belong to K can be determined. Let $\{\bar{s}_i\}$ ($i \in I$) denote the strategy profile of all players. If the k players cannot further improve their payoffs by deviating from $\{\bar{s}_i\}$ ($i \in K$) simultaneously, the strategy profile $\{\bar{s}_i\}$ ($i \in I$) is a stable state (equilibrium) in the repeated game. This equilibrium is different from the concept of NE in that k players coordinate their actions.
Definition 2: In an infinitely repeated n-player game, we call it a type-k equilibrium (2 ≤ k ≤ n) if a group of k players coordinate their actions and they have no incentive to deviate from their strategies simultaneously.
A strategy profile $\{\bar{s}_i\}$ is a type-k equilibrium if

$$\ldots \tag{5}$$

are satisfied for any $\{s'_i\}$ ($i \in K$ and $s'_i \neq \bar{s}_i$) and $\{s_j\}$ ($j \notin K$). We prove in the proposition below that any type-k equilibrium is also a NE.
Proposition 1: In an infinitely repeated n-player game, any type-k equilibrium (2 ≤ k ≤ n) is a NE.
Proof: Consider a strategy profile $\{\bar{s}_i\}$ that satisfies (5). Let us first consider the case of k = n. We have $v_i < u_i(\bar{s}_1, \ldots, \bar{s}_n)$ for all $i \in I$. According to the folk theorem, $\{\bar{s}_i\}$ is a NE.
In the case of k < n, if $v_i < u_i(\bar{s}_1, \ldots, \bar{s}_n)$ is satisfied for all $i \in I$, $\{\bar{s}_i\}$ is a NE according to the folk theorem. It is impossible that, for any player i, there is $v_i > u_i(\bar{s}_1, \ldots, \bar{s}_n)$, because player i could deviate from $\bar{s}_i$ to the minimax strategy so that the payoff is guaranteed to be $v_i$. This conflicts with the fact that $\bar{s}_i$ is player i's best response. We simply need to consider $v_i = u_i(\bar{s}_1, \ldots, \bar{s}_n)$ for some players $i \notin K$. Let M denote the group of players who receive their minimax payoffs. Any player $i \in M$ cannot improve their payoff by deviating from $\bar{s}_i$, since $\bar{s}_i$ is the best response to $\bar{s}_{-i}$.
Any player $i \in K$ cannot improve their payoff by deviating from $\bar{s}_i$. If player i does deviate from $\bar{s}_i$ in order to gain a higher payoff in the current round, all other members of K will play their minimax strategies in the future rounds. Player i will have to play the minimax strategy and will receive $v_i$ in the future rounds. Knowing this, player i has no incentive to deviate from $\bar{s}_i$.
Since no player has an incentive to deviate from $\{\bar{s}_i\}$, it is a NE. Every type-k equilibrium is a NE, but a NE is not necessarily a type-k equilibrium. Thus, the type-k equilibria are refinements of NE in repeated games.
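The deviation argument in the proof can be made concrete with a short numerical sketch. The function name and the payoff values in the usage note are hypothetical. A member of K earns the coordination payoff until they deviate, collects the deviation payoff once, and is held to the minimax payoff in every later round.

```python
def average_if_deviating(u_bar: float, u_dev: float, v: float,
                         deviation_round: int, periods: int) -> float:
    """Average payoff over `periods` rounds for a player who deviates once.

    u_bar: per-round payoff under coordination (assumed constant here)
    u_dev: one-round payoff obtained by deviating
    v:     minimax payoff received in every round after the trigger fires
    """
    if not 0 <= deviation_round < periods:
        raise ValueError("deviation_round must lie within the horizon")
    total = (u_bar * deviation_round                 # rounds before the deviation
             + u_dev                                 # the deviation round itself
             + v * (periods - deviation_round - 1))  # punished rounds
    return total / periods
```

With the assumed values u_bar = 0, u_dev = 1 and v = -3, deviating pays in a one-shot game (average 1 against 0), but over 1000 rounds the deviator's average falls to about -2.996, close to v and well below the coordination payoff.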
The set of type-k equilibria forms the Pareto frontier of all NEs in an infinitely repeated game. Any type-k equilibrium is a Pareto optimum for the group of k players. In three-player games, for example, the relationship between Proposition 1 and the folk theorem can be illustrated by Fig 3. For any group of players in an n-player game, there must be a strategy profile $\{\bar{s}_i\}$ such that these players cannot improve their payoffs by changing their strategies simultaneously. This strategy profile can be a type-k equilibrium if (4) and (5) are satisfied for some players. We prove the existence of type-k equilibrium in general repeated games in Proposition 2.
Proposition 2: In an infinitely repeated n-player game where there exist two or more NE, there must be at least one type-k equilibrium.
Proof: When there exist two or more NE, there must be at least one NE whose strategy profile is different from the minimax profile. Let $\{s_i\}$ denote such a strategy profile. We first prove that there must be $v_i < u_i(s_1, \ldots, s_n)$ for at least two players. Assume that there is $v_a < u_a(s_1, \ldots, s_n)$ for the player a and $v_i = u_i(s_1, \ldots, s_n)$ for any $i \neq a$. Since all players except a play their minimax strategies and they have no incentive to deviate unilaterally (because $\{s_i\}$ is a NE), $s_a$ is the minimax strategy for a. This conflicts with the premise that $\{s_i\}$ is different from the minimax profile. Thus, there must be $v_i < u_i(s_1, \ldots, s_n)$ for at least two players.
Suppose that there are $v_i < u_i(s_1, \ldots, s_n)$ for k players in the NE. If those k players cannot improve their payoffs by changing their strategies simultaneously, this NE is a type-k equilibrium. Otherwise, there must be a strategy profile $\{s'_i\}$ such that those k players cannot further improve their payoffs by changing their strategies simultaneously, and $\{s'_i\}$ is a type-k equilibrium.
A NE is stable if a small change in the strategy of one player leads to a situation such that (a) the player who did not change has no better strategy, and (b) the player who did change is now playing with a strictly worse strategy.
A type-k equilibrium is not stable if it is not a NE in the stage game, because once a player within the coalition changes their strategy in a type-k equilibrium, all other k−1 players will be triggered to change their strategies. We are not concerned with the players excluded from the coalition because any change in their strategies has no influence on the coalition.
A type-k equilibrium is stable if it is also a NE in the stage game and the NE is stable. For example, the pure-strategy NEs in Fig 2 are type-2 equilibria and they are stable.
We have discussed the existence of type-k equilibrium in infinitely repeated games without discounting. In a repeated game with discounting, for each player in a type-k equilibrium to persist with their strategy, the discounted future payoffs must be greater than the excess current payoff gained by deviating from the type-k equilibrium. Consider a constant discount factor $\delta \in (0,1)$, so that the summation of player i's payoff in T+1 periods is $\sum_{t=0}^{T} \delta^t u_i^t$, with $u_i^t$ the stage payoff at t. Consider also the maximum payoff of player i in the stage game given that i deviates from the type-k equilibrium while all players except i keep their strategies unchanged. For each player i within the coordination group, there should be

$$\ldots$$

This is the necessary condition for the existence of type-k equilibrium in infinitely repeated games with discounting.
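One standard way to read this condition, offered here as an assumption rather than as the paper's exact inequality, is the usual trigger-strategy comparison: the one-round gain from deviating, u_dev − u_bar, must be smaller than the discounted loss from being held to the minimax payoff afterwards, δ/(1 − δ) · (u_bar − v). The symbols u_bar, u_dev and v and the function names below are introduced for this sketch only.

```python
def deviation_unprofitable(u_bar: float, u_dev: float, v: float,
                           delta: float) -> bool:
    """True if the one-shot gain is strictly smaller than the discounted
    future loss under a minimax trigger (assumed reading of the condition)."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    gain = u_dev - u_bar                         # excess current payoff
    loss = delta / (1 - delta) * (u_bar - v)     # sum over t >= 1 of delta^t (u_bar - v)
    return gain < loss

def critical_discount(u_bar: float, u_dev: float, v: float) -> float:
    """Discount factor above which deviation stops paying under this reading."""
    if u_bar <= v:
        raise ValueError("coordination must pay more than the minimax payoff")
    if u_dev <= u_bar:
        return 0.0                               # no temptation to deviate at all
    # gain < delta/(1-delta) * (u_bar - v)  <=>  delta > (u_dev - u_bar)/(u_dev - v)
    return (u_dev - u_bar) / (u_dev - v)
```

With the hypothetical values u_bar = 0, u_dev = 1 and v = -3, the threshold is 0.25: a player with δ = 0.5 keeps to the coordination strategy, whereas one with δ = 0.1 prefers to deviate.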

An Example
This example shows the multiplicity of equilibria in repeated games. Consider a three-player game as shown in Fig 4. The option R is dominated by L for every player and the strategy profile (L, L, L) is the unique NE in the stage game.
There are numerous type-3 and type-2 equilibria in the infinitely repeated version of this game. A strategy profile is a type-3 equilibrium when X and Y choose (R, R) and Z chooses any mixed strategy in every round. A typical type-2 equilibrium arises when X and Y alternately choose (L, R) and (R, L) and Z chooses L (the point F in Fig 5).
If it is a repeated game with discounting, the necessary condition of the above type-2 equilibrium is that, for players X and Y, there are lim

Conclusions
Proposition 1 is a supplement to the folk theorem. The folk theorem proves the existence of the type-n equilibrium in repeated n-player games. Proposition 1 extends it to the general case of type-k equilibrium (2 ≤ k ≤ n). Type-k equilibrium is a solution concept for repeated non-cooperative games. A type-k equilibrium is a Pareto optimum for the group of k players. In a type-k equilibrium, not only does no individual player have an incentive to unilaterally change their strategy, but the group of k players also has no incentive to deviate from it collectively, which means that the type-k equilibrium is stronger than NE in stability.
Type-k equilibrium is different from other refinements of NE, such as strong NE [20] and coalition-proof NE [21,22], in that a type-k equilibrium is not necessarily a NE in the stage game and it does not need communication or mediation among players. Type-k equilibrium is different from the concepts of coalition [23,24] or core [25] of cooperative games in that the players in a type-k equilibrium are payoff maximisers and they make their choices independently. A type-k equilibrium does not exist in a one-shot game because coordination among the players can be formed only if all players choose to adopt specific reactive strategies. Reactive strategies are strategies for repeated games and they are neither pure strategies nor mixed strategies in the stage game.
The emergence of cooperation in evolutionary dynamics has attracted a great deal of research [26][27][28][29][30][31][32][33][34][35][36]. The type-k equilibrium suggests that cooperation in evolution starts from a group of players rather than an individual player. A group of players would coordinate their actions by adopting specific reactive strategies if they could be better off by doing so. This collective behaviour is much more effective and robust than any individual behaviour in building and maintaining cooperation in evolution.
Backward induction has led to some controversy despite its wide use in finitely repeated games [37,38]. It is obvious that the choices of players cannot be derived by backward induction when some of them adopt specific reactive strategies. In finitely repeated games, the end game effect cannot prevent the players from adopting specific reactive strategies. When both past actions and the end game effect have an influence on the strategies of players, transition from one NE to another is possible.
The existence of type-k equilibrium explains, to some extent, why the biodiversity in evolutionary games is much more complex than classical game theory has predicted. The type-k equilibrium belongs to the set of NE that has been neglected in non-cooperative game theory. This set of NE possibly contains more complicated equilibria than the type-k equilibrium, for example equilibria that have two or more coalitions in them.