Emergence and suppression of cooperation by action visibility in transparent games

Real-world agents, humans as well as animals, observe each other during interactions and choose their own actions taking the partners’ ongoing behaviour into account. Yet, classical game theory assumes that players act either strictly sequentially or strictly simultaneously without knowing each other’s current choices. To account for action visibility and provide a more realistic model of interactions under time constraints, we introduce a new game-theoretic setting called transparent games, where each player has a certain probability of observing the partner’s choice before deciding on its own action. By means of evolutionary simulations, we demonstrate that even a small probability of seeing the partner’s choice before one’s own decision substantially changes the evolutionarily successful strategies. Action visibility enhances cooperation in an iterated coordination game, but reduces cooperation in a more competitive iterated Prisoner’s Dilemma. In both games, “Win–stay, lose–shift” and “Tit-for-tat” strategies are predominant for moderate transparency, while a “Leader-Follower” strategy emerges for high transparency. Our results have implications for studies of human and animal social behaviour, especially for the analysis of dyadic and group interactions.


Transparent games and reaction time distributions
In the Methods section we argue that evolution favours equal reaction times in both iPD and i(A)CG, since the optimal behaviour in iPD is to wait as long as possible, and in i(A)CG to act as quickly as possible. However, for iPD there is a notable exception: the Leader-Follower (L-F) strategy is better off acting fast and exposing its choice to the partner. Consider, for instance, a population consisting of L-F players of two types, the first acting fast and the second waiting. In all inter-type interactions, players of the first type have the upper hand since they take the role of Leaders, maximizing their own payoff. Thus the first type dominates the second and eventually takes over the population. The question then is whether this exception to the general rule for the transparent iPD (to wait as long as possible) changes the simulation results.
Additional simulations show that this is not the case. We used the same evolutionary simulations as before with one modification. Instead of using a fixed probability $p_{\text{see}}$ of seeing the partner's choice for all types, we computed this probability for each pair of types as shown in SN1 Fig. 1: from the reaction times (RT), modelled by exponentially modified Gaussian distributions, and from the visibility threshold $\Delta T$. Since we were mainly interested in the influence of the types' mean RT on the results, we set the other parameters to constants: $\sigma = 0.1$ and $\tau = 0.5$.
For each pair of types $i$ and $j$ we computed the probabilities to see the partner's choice as follows:
1. Using the exponentially modified Gaussian distribution, we generated for each type samples of reaction times $RT_{i,k}$, $RT_{j,k}$ for $k = 1, 2, \ldots, K$ with $K = 10^6$.
2. We computed the reaction time differences between types $i$ and $j$ as $\Delta RT_k = RT_{j,k} - RT_{i,k}$.
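3. We estimated the probabilities to see the partner's choice by
$$p^{ij}_{\text{see}} = \frac{|\{k : \Delta RT_k < -\Delta T\}|}{K}, \qquad p^{ji}_{\text{see}} = \frac{|\{k : \Delta RT_k > \Delta T\}|}{K},$$
where $|A|$ stands for the number of elements in the set $A$.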
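The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations. It assumes that $\tau$ is the mean of the exponential component of the exponentially modified Gaussian, and that a player sees the partner's choice whenever the partner acts earlier by more than $\Delta T$. The names `sample_rt` and `estimate_p_see` are hypothetical, and the default sample size is smaller than $K = 10^6$ to keep the run time short.

```python
import random


def sample_rt(mu, rng, sigma=0.1, tau=0.5):
    """Draw one reaction time from an exponentially modified Gaussian.

    Assumption: the RT is the sum of a Normal(mu, sigma) variate and an
    exponential variate with mean tau.
    """
    return rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau)


def estimate_p_see(mu_i, mu_j, delta_t, n_samples=200_000, seed=0):
    """Monte Carlo estimate of (p_ij, p_ji) for types with mean parameters mu_i, mu_j.

    p_ij: fraction of samples in which type i sees the choice of type j,
          i.e. j acted earlier than i by more than delta_t.
    p_ji: fraction of samples in which type j sees the choice of type i.
    """
    rng = random.Random(seed)
    i_sees_j = 0
    j_sees_i = 0
    for _ in range(n_samples):
        # Step 2: reaction time difference RT_j - RT_i for one sample pair.
        d_rt = sample_rt(mu_j, rng) - sample_rt(mu_i, rng)
        # Step 3: count the samples in which one player is late enough to see.
        if d_rt > delta_t:
            j_sees_i += 1
        elif d_rt < -delta_t:
            i_sees_j += 1
    return i_sees_j / n_samples, j_sees_i / n_samples


if __name__ == "__main__":
    # Intra-type interaction (equal mu) and a fast type meeting a slower one.
    print(estimate_p_see(2.0, 2.0, 0.478))
    print(estimate_p_see(2.0, 2.5, 0.478))
```

With equal `mu_i` and `mu_j` the two returned estimates should agree up to sampling noise, since the distribution of the difference is then symmetric about zero.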
We performed three series of evolutionary simulations for $\Delta T = 1.98, 0.478, 0.001$. These values were selected so that for any type $i$ the probability $p^{ii}_{\text{see}}$ was equal to 0.001, 0.2 and 0.499, respectively. Each series consisted of 80 runs of evolutionary simulations, and we traced $10^9$ generations in each run. Except for the way the values of $p^{ij}_{\text{see}}$ were computed, the simulations were as described in the main text of the manuscript.

As expected, results were similar to those with equal RT but noisier, since additional type variability increases the number of generations necessary for the population to reach the equilibrium state. In SN1 Fig. 2A, for low (but non-zero) transparency WSLS wins with a total relative frequency above 85% (without GWSLS), but as transparency increases the share of WSLS drops. In contrast, the Leader-Follower strategy performs best for high transparency, with a relative frequency of 27% (SN1 Fig. 2C). Note that all successful types have marginal RT: WSLS-players mostly have maximal reaction times, while L-F-players have minimal reaction times.

SN1 Fig. 2. RT were modelled by exponentially modified Gaussian distributions with $\mu$ randomly selected from the set $\{2.0, 2.1, \ldots, 3.0\}$, $\sigma = 0.1$ and $\tau = 0.5$. WSLS is considered here together with GWSLS; both have a strategy profile (1abc;1***;****) with $a, b < 2/3$, $c \geq 2/3$. We characterized as L-F all strategies with a profile (*00b;****;*11c), where $b < 1/3$ and $c < 2/3$. Finally, we considered a strategy as defecting if it has entries $s_4, s_{12} < 0.2$, $s_1, s_2, s_3 < 1/3$ and $s_8 < 2/3$. (A) For low transparencies WSLS is predominant and WSLS-players clearly prefer waiting over fast action. (B) For moderate transparencies the population is controlled either by the waiting WSLS players or by the fast-acting defectors, though the latter are successful only because many strategies may have $s^i_9, s^i_{10}, s^i_{11}, s^i_{12} > 0$, resulting in cooperation with apparent defectors. (C) For high transparencies Leader-Follower outperforms defecting strategies. Note that in all cases types with marginal RT prevail and the observed strategy frequencies are similar to those for equal RT.
The only principal difference from the simulations with fixed $p_{\text{see}}$ occurs for moderate transparencies, in particular for $\Delta T = 0.478$, when the probability to see the partner's choice in intra-type interactions is $p^{ii}_{\text{see}} = 0.2$. SN1 Fig. 2B shows that in this case defecting strategies have an unexpectedly high relative frequency. However, this seems to be an artefact caused by the fact that most types added to the population have strategy entries $s^i_9, s^i_{10}, s^i_{11}, s^i_{12} > 0$ (meaning that players may cooperate even when seeing that the partner defects). Playing against fast-acting defectors, these types take the role of Followers and become easy prey. Indeed, if a defecting strategy has $\mu_i = 2$, its opponent with $\mu_j = 2.5$ sees the choice of the defector with probability $p^{ji}_{\text{see}} > 0.5$, and an opponent with $\mu_j = 3$ with probability $p^{ji}_{\text{see}} > 0.8$. In this case the probabilities $s^i_9, \ldots, s^i_{12}$ are much more important than in the case when RT are equal and these entries are used only with probability $p^{ii}_{\text{see}} = 0.2$. Fast-acting defecting strategies can only be counteracted by TFT-like strategies with $s^i_9, s^i_{10}, s^i_{11}, s^i_{12} \approx 0$. Note that the L-F strategy is not successful against defecting strategies in this case, since L-F can only survive for high $p^{ii}_{\text{see}}$.
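The two bounds quoted above for a slower opponent can be checked with a short back-of-the-envelope calculation. The sketch below is not part of the original analysis. It assumes that each RT is the sum of a Gaussian term $G$ with standard deviation $\sigma$ and an exponential term $E$ with mean $\tau$, with $\sigma$ and $\tau$ shared by all types, and it uses the reaction time difference $\Delta RT = RT_j - RT_i$ defined earlier. The symbols $G$ and $E$ are introduced here for illustration only.

```latex
\begin{align*}
\Delta RT &= (\mu_j - \mu_i)
  + \underbrace{(G_j - G_i)}_{\sim\,\mathcal{N}(0,\,2\sigma^2)}
  + \underbrace{(E_j - E_i)}_{\sim\,\mathrm{Laplace}(0,\,\tau)} \\[4pt]
% Both noise terms are symmetric about zero, so \Delta RT is symmetric about \mu_j - \mu_i:
P(\Delta RT > \Delta T) &> \tfrac{1}{2}
  \quad\Longleftrightarrow\quad \mu_j - \mu_i > \Delta T,
  \qquad\text{e.g. } 0.5 > 0.478 \\[4pt]
% Neglecting the Gaussian term (\sigma \ll \tau), for \mu_j - \mu_i = 1:
P(\Delta RT > \Delta T) &\approx 1 - \tfrac{1}{2}
  \exp\!\left(-\frac{(\mu_j - \mu_i) - \Delta T}{\tau}\right)
  = 1 - \tfrac{1}{2}\, e^{-0.522/0.5} \approx 0.82
\end{align*}
```

The first bound follows from symmetry alone. The second is an approximation that ignores the Gaussian term, so it gives only a rough value, which is nevertheless above 0.8.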