Influence of Opinion Dynamics on the Evolution of Games

Under certain circumstances, such as lack of information or bounded rationality, human players can decide which strategy to choose in a game on the basis of simple opinions. These opinions can be modified after each round by observing one's own or others' payoffs, but they can also be modified by exchanging impressions with other players. In this way, the update of strategies goes beyond simple evolutionary rules based on fitness and becomes a social issue. In this work, we explore this scenario by coupling a game with an opinion dynamics model. The opinion is represented by a continuous variable that corresponds to the certainty of the agents with respect to which strategy is best. Opinions are transformed into actions by making the selection of a strategy a stochastic event with a probability regulated by the opinion. A certain regard for the previous round's payoff is included, but the main update rules of the opinion are given by a model inspired by social interchanges. We find that the fixed points of the dynamics of the coupled model are different from those of the evolutionary game or the opinion model alone. Furthermore, new features emerge, such as the independence of the fraction of cooperators from the topology of the social interaction network or the presence of a small fraction of extremist players.


Introduction
Evolutionary game theory has been introduced as a framework to study the processes of selection of genes or behaviors in biological and social systems [1][2][3]. Its aim is to characterize the choices in terms of strategies of individuals of a population playing a game. A particular strategy generates a payoff to the individual playing it that depends on the selection of the rest of individuals.
The key assumption of the evolutionary theory is that the fitness of an individual to reproduce is directly related to the payoff obtained [1]. Consequently, the most successful strategies in terms of payoff are also those that multiply fastest and can eventually become dominant after some generations.
These ideas find an analytical expression in the form of the so-called replicator equation [4][5][6]. If $x_i$ stands for the fraction of individuals in the population playing strategy $i$, $f_i(\mathbf{x})$ for their payoff and $\bar{f}(\mathbf{x})$ for the average payoff over the whole population, the replicator equation reads

$$\dot{x}_i = x_i \left[ f_i(\mathbf{x}) - \bar{f}(\mathbf{x}) \right]. \qquad (1)$$

The fixed points and limit cycles of the equation define the final state of the system regarding the distribution of strategies in the population [3][4][5][7]. Moreover, the study of the stability of the solutions against invasion by other strategies, particularly when the solutions are formed by single strategies, motivates the definition of evolutionarily stable strategies (ESS) [7]. To illustrate the predictions of this approach, one can consider social dilemmas such as the public goods game or the Prisoner's dilemma. In these games, each individual must choose between collaborating with her partners, which yields an intermediate payoff, and defecting, which attempts to take advantage of the partners that are collaborating in order to gain a higher payoff. Although collaboration is beneficial to the population as a whole, the egoistic inclination of each individual to maximize her payoff leads to generalized defection, as the replicator equation predicts, since this is the only stable solution [3,4]. This result can seem somewhat drastic, especially when considered in the light of everyday experience in human societies or the known behavior of social animals. Different mechanisms have been proposed to explain how collaboration levels can increase in a population. One is, for instance, taking into account the finite and discrete character of the individuals in the population. This goes beyond the assumptions of the continuous theory and thus provides an escape route toward more collaboration, or even toward the invasion of a fully defecting population by collaborative individuals [8][9][10]. 
However, its efficiency as an explanation does not extend to large systems, since the probability of survival or invasion of collaborative strategies decreases fast with the population size. Another possibility that has been discussed theoretically is that structured populations may increase collaboration. Geographically extended systems simulated on spatial lattices show a residual level of collaboration [11][12][13][14] and even chaotic patterns separating areas of collaborating and defecting individuals [11]. The structure of social networks enhances collaboration via the heterogeneity of individual roles that the different positions in the network produce [15][16][17][18][19][20]. Random mutations, or the individuals' free exploration in search of a best response to the strategies of their counterparts, are another element that can promote collaboration [13][21][22][23][24]. Finally, the fixed points of the system dynamics, including the level of cooperation, are also affected by the way in which the system is updated, either by considering discrete versus continuous dynamics [25,26] or by altering the update rules [27][28][29][30].
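The prediction of generalized defection by the replicator equation discussed above can be illustrated numerically. The following is only a minimal sketch: the payoff matrix (a Prisoner's-dilemma-like choice with values 3, 0, 5 and 1), the forward Euler step size and the function names are illustrative assumptions and are not taken from this work.

```python
def replicator_step(x, payoff, dt=0.01):
    """One forward-Euler step of dx_i/dt = x_i * (f_i(x) - average payoff)."""
    n = len(x)
    # Expected payoff of each pure strategy against the population mix x.
    f = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    f_mean = sum(x[i] * f[i] for i in range(n))
    new_x = [x[i] + dt * x[i] * (f[i] - f_mean) for i in range(n)]
    total = sum(new_x)
    return [v / total for v in new_x]  # renormalise against rounding drift


# Hypothetical payoffs (assumption): rows and columns are ordered (C, D).
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def run(x0, steps=5000):
    """Iterate the replicator step starting from the strategy fractions x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, PAYOFF)
    return x


# Start with 90% cooperators; defection pays more against any mix.
final = run([0.9, 0.1])
```

With these assumed payoffs defection earns more than cooperation against any population mix, so the fraction of cooperators in `final` decays toward zero even from a 90% cooperative start.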
In this work, we explore a mechanism that can also play an important role in raising collaboration levels in social systems. The basic idea goes back to the fact that humans do not always take the most rational option when presented with a dilemma [31][32][33]. This has been observed in experiments in controlled environments in which participants, in general students, played the Prisoner's dilemma [34][35][36][37][38][39]. At another level, it is also a well-known behavior in the world of finance, where decisions on buying and selling can be based on rumors or on a general state of opinion about the prospects of an investment [40]. Our proposal is to increase the dimensionality of the system by noting that the opinion on which is the best strategy is an important variable to incorporate, even though in some cases such a belief can be wrong or baseless with respect to actual performance in the game. The evolution of the system thus includes a purely social ingredient related to opinion formation [41], followed by a process of decision making that relies on the formed opinion. In the abstract representation of Equation (1), the addition of an opinion variable can be modeled as in Equation (2), where the index $j$ describing the opinion can be continuous or, as in this example, discrete, $w_j$ represents the fraction of individuals holding opinion $j$, $g()$ is a function that relates the opinion $j$ to the probability of playing strategy $i$, and the function $h()$ describes the evolution of the opinions given the state of the system and the outcome of the game. The addition of the new field $w$ corresponding to the opinions of the individuals, together with the new update rules given by the interchange of opinions between individuals, can lead to extremely different fixed points and solutions for this system. 
In the following, we provide an example with a simple model that shows how these ideas can be implemented in practice and how the dynamic and stationary predictions of evolutionary game theory can dramatically change due to the coupling between opinion and games.

Model
We take as a basis a well-known model for opinion dynamics, the Deffuant model [42], and a game inspired by the dilemma of the tragedy of the commons [43,44]. The opinion in the Deffuant model is described by a continuous variable $w$ between $-1$ and $1$. Considering a population of $N$ agents, each one placed on a node of a network, the update of opinions is carried out by randomly choosing an agent $i$ and one of her neighbors $j$ and comparing their opinions at time $t$, $w_i$ and $w_j$. If $|w_i - w_j| < \epsilon$, the interaction occurs and the new opinions are given by

$$w_i(t+1) = w_i(t) + \mu \left[ w_j(t) - w_i(t) \right], \qquad w_j(t+1) = w_j(t) + \mu \left[ w_i(t) - w_j(t) \right]. \qquad (3)$$

Otherwise, if the difference between $w_i$ and $w_j$ is larger than $\epsilon$, there is no interaction. The parameter $\mu$ is the so-called convergence parameter, since it regulates the value to which the opinions converge after the interaction. In this work, we set it at $\mu = 1/2$, which implies that the final opinion is the average of both agents' opinions. Deffuant's model shows bounded confidence in the sense that interactions between agents whose opinions are further apart than $\epsilon$ are forbidden. The value of $\epsilon$ is thus a key parameter in the following study.
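The bounded-confidence update described above can be sketched in a few lines. This is a minimal illustration of the standard symmetric Deffuant rule, not the authors' code; the function name `deffuant_step`, the adjacency-list representation `neighbors` and the `rng` argument are assumptions made for the example.

```python
import random


def deffuant_step(opinions, neighbors, eps, mu=0.5, rng=random):
    """Apply one symmetric Deffuant update in place.

    opinions  : list of floats in [-1, 1], one per agent
    neighbors : list of lists, neighbors[i] holds the neighbours of agent i
    eps       : bounded-confidence threshold
    mu        : convergence parameter (1/2 averages the two opinions)
    Returns True if the selected pair interacted.
    """
    i = rng.randrange(len(opinions))
    if not neighbors[i]:
        return False  # isolated agent, nothing to do
    j = rng.choice(neighbors[i])
    wi, wj = opinions[i], opinions[j]
    if abs(wi - wj) < eps:
        # Both agents move toward each other by a fraction mu of the gap.
        opinions[i] = wi + mu * (wj - wi)
        opinions[j] = wj + mu * (wi - wj)
        return True
    return False  # opinions too far apart: bounded confidence forbids it
```

Because both agents move by the same amount in opposite directions, the average opinion is conserved by each interaction in this symmetric version.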
For the game, we consider a simple set of rules that permits the exploration of a dilemma and a harmony scenario by tuning a single parameter. This allows us to show the validity of our findings regardless of the game's ESS. Every time an agent $i$ plays, she does so with all her $k_i$ neighbors. A unit of wealth is then distributed among all of them. If everybody cooperates, the payoff is $1/k_i$ for each agent. Otherwise, each defector is given priority and takes a portion $p$ as payoff. If the total amount requested by the defectors, $p\,n_D^i$, is larger than $1$, nobody takes anything. On the contrary, if $p\,n_D^i \le 1$, the cooperators evenly divide the remaining $1 - p\,n_D^i$. Note that for low values of $p$, $p < 1/k_i$, collaboration is the strategy with the largest payoff and, in a pure evolutionary framework, becomes the only surviving strategy. The same occurs at the other extreme for high values of $p$: strictly speaking, for $p > 1$ defection has a zero payoff. In the range of intermediate $p$ values, the equilibrium of our system is equivalent to that of the public goods game and shows the effects of the tragedy of the commons dilemma, because defection is the most advantageous strategy but, if every agent opts for it, none of them gets any payoff [43,44].
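The payoff rules above can be written as a small function. The sketch below is an illustration under one stated assumption: the unit of wealth is split among exactly the participants passed in `actions`, so the caller decides who counts as a participant in a round. The function name `round_payoffs` is hypothetical.

```python
def round_payoffs(actions, p):
    """Payoffs for one round of the game.

    actions : list of "C" or "D", one entry per participant
    p       : portion of the unit of wealth claimed by each defector
    """
    n = len(actions)
    n_def = actions.count("D")
    # Defectors have priority; if they request more than the unit, all get zero.
    if p * n_def > 1.0:
        return [0.0] * n
    n_coop = n - n_def
    # Cooperators evenly divide whatever the defectors leave.
    coop_share = (1.0 - p * n_def) / n_coop if n_coop else 0.0
    return [p if a == "D" else coop_share for a in actions]
```

For example, with five participants and p = 0.4, a single defector takes 0.4 and the four cooperators share the remaining 0.6, while three defectors would request 1.2 and leave everyone with nothing.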
After describing the opinion dynamics and the game rules, it is important to explain how both are coupled. As illustrated in Figure 1, the two extremes of the opinion variable $w$ are identified with the strategies D and C. Thus $w$ represents the opinion of the agents about which is the best strategy to win the game. The step from an agent's opinion to real action is made by assuming a probability $p_C = (1+w)/2$ of playing C and $p_D = 1 - p_C = (1-w)/2$ of choosing D. It is important to stress that the game is actually played in a mixed strategy framework and that this way of implementing opinion and action assumes incomplete information, actions based on impressions and a social component in the way the players move towards the selection of a strategy. In practice, the model is updated by choosing a random agent $i$ in each time step; she then plays the game with her neighbors, and afterwards her opinion is updated depending on the earned payoff. For updating the opinions, a neighbor $j$ of $i$ is randomly selected and the new opinion is calculated using Deffuant's model of Eq. (3) only if $j$'s payoff is equal to or higher than $i$'s. Note that only $i$'s opinion is updated, which introduces an asymmetry in Deffuant's rules. This asymmetry prevents players that are doing better from changing opinion due to interactions with others performing worse, and it also breaks the strong conservation of the average opinion that is a feature of the original Deffuant model.
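One time step of the coupled dynamics described above might be sketched as follows. This is a hedged illustration, not the authors' implementation. It assumes that the round involves agent i together with her neighbors, that the payoffs compared are those obtained in that same round, and that the bounded-confidence condition of the Deffuant rule still applies; all function names are hypothetical.

```python
import random


def play_probability_c(w):
    """Probability of playing C for an agent with opinion w in [-1, 1]."""
    return (1.0 + w) / 2.0


def choose_action(w, rng):
    """Draw a stochastic action from the opinion."""
    return "C" if rng.random() < play_probability_c(w) else "D"


def focal_round(i, opinions, neighbors, p, rng):
    """Agent i plays one round with her neighbours; return {agent: payoff}."""
    group = [i] + list(neighbors[i])  # assumption: i is one of the participants
    actions = {a: choose_action(opinions[a], rng) for a in group}
    n_def = sum(1 for a in group if actions[a] == "D")
    if p * n_def > 1.0:
        return {a: 0.0 for a in group}
    n_coop = len(group) - n_def
    coop_share = (1.0 - p * n_def) / n_coop if n_coop else 0.0
    return {a: (p if actions[a] == "D" else coop_share) for a in group}


def coupled_step(opinions, neighbors, p, eps, mu=0.5, rng=random):
    """One update: play the game, then asymmetric Deffuant update of agent i."""
    i = rng.randrange(len(opinions))
    if not neighbors[i]:
        return
    payoff = focal_round(i, opinions, neighbors, p, rng)
    j = rng.choice(neighbors[i])
    # Only i moves, and only toward a neighbour doing at least as well.
    if payoff[j] >= payoff[i] and abs(opinions[i] - opinions[j]) < eps:
        opinions[i] += mu * (opinions[j] - opinions[i])
```

Since only the opinion of agent i changes, the average opinion is no longer conserved, in contrast with the symmetric Deffuant rule.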

Results
Let us start by considering a mean-field situation in which, in each time step, a randomly selected agent interacts with a group formed by four other agents chosen at random. The first results can be seen in Figure 2, where the average opinion $\langle w \rangle$ and the average fraction of cooperators $f_C$ are displayed as a function of time. The curves of different colors correspond to three values of $p$: $p = 0.1$, $p = 0.4$ and $p = 0.8$. For games with 5 participants, cooperation C is the most advantageous strategy for $p < 1/5 = 0.2$. In general, if the number of players per game is $n$, C is the best strategy for $p < 1/n$. Results similar to those described next are found for any value of $n$ as long as the values of $p$ are rescaled accordingly. The blue curves ($p = 0.1$) thus correspond to a harmony game, where the C strategy becomes prevalent in the system from an evolutionary perspective. This is indeed the outcome when the state of the system is updated following replicator dynamics (see the plots in the left column of Figure 2). Otherwise, for $p > 0.2$, the replicator dynamics results in a final state formed mainly by defectors. The update based only on the opinion dynamics, without any coupling between the payoff of the game and the update of the opinions, leads to the selection of a few opinion values. These values of $w$ are separated by more than $2\epsilon$ and depend on the initial conditions of the model. The variability of the initial conditions causes the slight dispersion in the distributions $P(w)$. This is the known final state for the Deffuant model [41,42].
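The threshold $p < 1/n$ can be checked with a short calculation based on the game rules. The sketch below assumes that a focal agent compares her two pure strategies while the other $n-1$ participants keep their actions fixed; the symbols $d$ (number of defectors among the others), $\pi_C$ and $\pi_D$ are notation introduced only for this illustration.

```latex
\begin{align}
\pi_C &= \frac{1 - p\,d}{n - d}, && \text{if } p\,d \le 1,\\
\pi_D &= p, && \text{if } p\,(d+1) \le 1,\\
\pi_D > \pi_C &\iff p\,(n - d) > 1 - p\,d \iff p > \frac{1}{n}.
\end{align}
```

Under these assumptions the condition does not depend on $d$, as long as the defectors' total request stays within the unit. For $n = 5$ it gives the value $0.2$ quoted above.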
More interestingly, the combination of both game and opinion dynamics, shown in the right-hand plots, produces a final state that does not correspond to any of the fixed points of the uncoupled dynamics. Although the defectors are still a minority for $p = 0.1$ and a majority for the other values of $p$, the dispersion of opinions is noticeable and a small reservoir of agents with an opinion opposite to that of the majority remains. The origin of this small group of agents lies in the difference between the social and the evolutionary dynamics. Bounded confidence prevents the interaction of agents with very different opinions regardless of their difference in payoff. The members of the small group of rogue agents can play with any other agent, but they only update their opinion when confronted with their own peers. This behavior would be eliminated in an evolutionary framework, where payoff and fitness are strictly related, but this is not necessarily the case in a social environment. Indeed, this kind of stubbornness against facts has been observed in behavioral economics experiments in which people are asked to play a repeated Prisoner's dilemma. A fraction of the participants opted for pure defection or even pure collaboration despite the existence of more advantageous strategies such as tit for tat or a Markovian response [35,36,38,39]. These experiments also show a continuous exploration of strategies by participants, who may not be so certain of their own choices.
The fact that the small group of contrarian players dissolves when the social constraints are relaxed can be observed in Figure 3. In plot A), the distribution of agents' opinions is displayed for different values of the bounded confidence parameter $\epsilon$. If $\epsilon$ is very low, there is very little interaction between agents and therefore the opinions remain frozen in the initial condition, which is a uniform distribution. When $\epsilon$ increases, the agents are able to interact with other players holding very different opinions. This leads to the convergence of opinions to values close to the extreme $w = -1$, which corresponds to pure defection and which is the only ESS in the dilemma with evolutionary dynamics. The players thus recognize defection as the most adequate strategy in the limit $\epsilon \to 2$ but, due to the stochastic nature of the relation between opinions and actions, are not able to reach $w = -1$. Within each of the two games, these results are stable against variations in the portion $p$ taken by the defectors. The average fraction of cooperators $f_C$ can be seen in Figure 3B as a function of $p$. For all the values of $\epsilon$, a change can be observed at $p = 0.2$, coinciding with the change in the nature of the game from harmony to a dilemma. Apart from this, some minor corrections are seen due to the discreteness of the group of players. Only 5 players are considered in each round and, if $n_D$ stands for the number of defectors in a round, the total payoff reserved for the defectors is $n_D\,p$. If this amount exceeds the unit, neither defectors nor collaborators get any payoff. Therefore, the maximum number of defectors that a round can sustain complies with the relation $n_D\,p \le 1$. The values of $p$ coinciding with $1/n_D$ thus mark a change in the payoff partition in the game. A final detail that we also wanted to explore here is the stability of the solutions if the total wealth, instead of the instantaneous payoff, is taken as the main factor of the opinion update. 
The use of the total wealth adds a more consistent memory effect, since the choice of a successful strategy allows for a continuous income. The players are still able to recognize the optimal strategy for large values of $\epsilon$, but it is important to note the large dispersion of opinions and the peak around $w \approx -0.3$, far from the extreme $w = -1$. The stability of the system with respect to $p$ is also altered, with more violent bumps in $f_C$ when $p$ passes through the fractional values that modify the payoff partition.
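The values of $p$ at which the payoff partition changes follow directly from the game rules and can be enumerated with exact arithmetic. The sketch below is illustrative only; the function names are hypothetical, and `p` is expected as a `Fraction` or a decimal string, an assumption made to avoid binary rounding exactly at the breakpoints.

```python
from fractions import Fraction


def max_sustainable_defectors(p, n_players=5):
    """Largest number of defectors n_D with n_D * p <= 1, capped at the group size."""
    p = Fraction(p)
    # int() truncates, which equals the floor for the positive ratio 1 / p.
    return min(n_players, int(1 / p))


def partition_breakpoints(n_players=5):
    """Values p = 1 / n_D at which the sustainable number of defectors changes."""
    return [Fraction(1, n_d) for n_d in range(1, n_players + 1)]
```

For a group of 5 players the breakpoints are 1, 1/2, 1/3, 1/4 and 1/5, the last of which coincides with the transition between the harmony game and the dilemma.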
A simplistic mean-field configuration is not a valid match for the more complicated structure that social interactions can present. Social interactions are normally modeled as networks whose vertices and edges represent individuals and social relations, respectively. In theoretical works, it has been shown that the topological characteristics of such networks can affect the game outcome, increasing, for instance, the level of cooperation in the Prisoner's dilemma [16,18,19]. However, experimental results in which real individuals play the Prisoner's dilemma on different network topologies contradict this conclusion, since the level of cooperation seems to be similar across topologies [35][36][37][38][39]. The explanation provided for this effect is the presence of so-called moody conditional cooperators: individuals that make their strategic decisions regarding cooperation or defection based on their previous experience as much as on their neighbors' payoff.
The results of our model point in the same direction, with a very weak dependence on the topology of the interaction networks, as can be seen in Figure 4. In order to introduce different interaction topologies, we run the model on a 2D square lattice, on Erdős-Rényi (ER) graphs [45] and on Barabási-Albert (BA) scale-free networks [46]. The ER and BA graphs are particular types of complex networks with different levels of heterogeneity in the number of connections of the nodes (degree, $k$). For ER, the distribution of degrees is Poissonian centered around the average $\langle k \rangle$, while for BA the degree distribution is a decaying power law with exponent $-3$, $P(k) \sim k^{-3}$. In each case, an agent plays each round of the game with her nearest neighbors alone. In Fig. 4A, the fraction of cooperators $f_C$ is displayed as a function of the parameter $p$ for different network topologies and $\epsilon = 1/2$. The fraction of cooperators is not very sensitive to the topology. A stronger difference appears in the distribution of opinions, as can be seen in Figure 4B and C, where the model with random interactions or scale-free networks shows more marked peaks. We have also explored the spatial distribution of opinions and strategies when the game is played on a 2D square lattice with 4 neighbors per node (Fig. 4D and E). As occurs with the Prisoner's dilemma in replicator dynamics [11], the reduced dimensionality allows for the formation of clusters of agents with close opinions playing similar strategies. The local character of the interactions allows clusters of collaborators to survive. In Figure 4, we also explore the effect that the heterogeneity in the degree of the agents in the social networks can have on the opinion. The agents' opinion in one instance and the average opinion over many realizations are displayed as a function of the agents' degree (plots F and G). 
The average opinion tends to be more negative, closer to defection, for better connected agents regardless of the particular characteristics of the network. Even though all the results shown in Figs. 2-5 are for systems of approximately 1000 agents, we have explored larger systems and networks. For instance, for systems with 10000 agents the dynamics becomes slower but the main features such as opinion distributions, fraction of cooperators and formation of domains in lattices are maintained in the stationary regime.
A final aspect of the model that we analyze is the effect that a small fraction of radical agents can have on the opinions and strategies of the rest of the population. Two precedents justify the concern with the role that extremists can play. One is the existence in the experiments of such radical individuals, who always play the same strategy, either cooperation or defection [38,39]. The second is that the effect of extremists, who go under the name of contrarians or zealots in the literature, is well known in opinion dynamics models [47][48][49] and even in the evolution of games [50,51]. A small fraction of extremists can drive the system out of consensus. The fraction of cooperators obtained with the model as a function of $p$ and the opinion distribution for $p = 0.8$ are depicted in Figure 5. The curves for the model with a 5% fraction of extremists, playing either C or D, are superimposed on the baseline without extremists. As can be seen, the average fraction of collaborators $f_C$ depends only weakly on the presence of extremists or zealots. Apart from a slight shift due to the additional 5% of players with pure strategies, no major change is observed. However, the same cannot be said of the opinion distributions. Both models with extremists show different distributions, although the effect is more dramatic if the zealots play "defect".

Discussion
In summary, we have introduced a model that couples opinion dynamics and strategy selection in a game. Our main assumptions are that the agents have no certainty about which strategy is optimal and that they form an opinion on this issue which can be updated by social pressure. In particular, for the game we have selected a model based on the rules proposed in the Tragedy of the Commons by G. Hardin, which allows us to explore two possible final equilibria by tuning a single parameter $p$. For $p$ below $0.2$, the rules of our system produce a scenario that resembles a harmony game, while for $p > 0.2$ a social dilemma equivalent to the public goods game is found. For the opinion dynamics, we use Deffuant's model, which is characterized by a continuous opinion variable $w$ and a bounded confidence mechanism embodied by the parameter $\epsilon$. If the opinions of two agents are further apart than $\epsilon$, no interaction is possible. We take advantage of the continuous nature of $w$ to couple opinions and actions via a mixed strategy scenario. The two available strategies C and D thus become actions that are taken with certainty only in the limits of opinion $w = 1$ and $w = -1$, respectively. Any intermediate value of the opinion can be translated into a probability of choosing C or D, with a bias towards the closest extreme in $w$.

Figure 4. Influence of the topology of the interaction network on the outcome of the game. In A), the fraction of collaborators $f_C$ as a function of the parameter $p$. In B) and C), the opinion distribution for $p = 0.1$ and $p = 0.8$. Remember that the nature of the game passes from a harmony game to a public goods game dilemma at $p = 0.2$. In D) and E), maps showing the opinion and strategy played in an instance of the game. In F) and G), in the background in grey, the agents' opinion for one realization of the game and the average opinion over 100 realizations as a function of the agents' degree $k$. In all cases $\epsilon = 1/2$ and the system size is $N = 1000$, except for the 2D lattices, which have $32 \times 32 = 1024$ agents. The networks are built with $\langle k \rangle \approx 3$. doi:10.1371/journal.pone.0048916.g004
Once the coupling of opinion and game dynamics is switched on, the outcome of the game changes. Of course, the model is stochastic and so a certain amount of dispersion in the main descriptive variables is expected due to the inherent randomness. However, variables such as the average fraction of collaborators or the distribution of opinions reach fixed points in the dynamics that differ from those of the decoupled systems and that reflect the constraints that opinion and game payoff put on each other. This effect is enhanced when the parameter $\epsilon$ is decreased, imposing a stricter bounded confidence regime. Cooperation can thus be increased with a more social dynamics for the evolution of the strategies, but this is not the only feature that calls for attention in our results.
The presence of the opinion variable allows the system to adapt to different interaction topologies or to the existence of extremist players in a very particular way. In correspondence with the empirical observations, in the coupled model the fraction of cooperators is not altered by the consideration of different topologies or by the introduction of extremists. It is the opinion distribution instead that is modified to absorb the impact of the new conditions. In the experiments, this phenomenon was explained by the presence of moody players who take previous strategies into account when making a new strategic decision. In our model this role is played by the memory effect that the opinion variable provides. In this work, we have selected particularly simple rules for the game and the opinion dynamics. In order to gain further insight into the decision process of real players, more theoretical and experimental work is needed. Nevertheless, the interplay between opinion and actions and the fact that the opinion gets updated by social pressure can significantly modify the scenario in evolutionary games.