Evolution of Cooperation in Spatial Traveler's Dilemma Game

The traveler's dilemma (TD) is a social dilemma that has been well studied in the economics community but has attracted little attention in the physics community. The TD game is a two-person game. Each player selects an integer value between R and M (1 < R < M) as a pure strategy. If both players select the same value, the payoff to each of them is that value. If the players select different values, say i and j (i < j), then the payoff to the player who chooses the smaller value is i + R and the payoff to the other player is i − R. We term the player who selects a large value a cooperator, and the one who chooses a small value a defector, since mutual selection of large values yields a large total payoff. The Nash equilibrium of the TD game is to choose the smallest value R. However, in previous behavioral studies, players in the TD game typically selected values much larger than R, and the average selected value exhibited an inverse relationship with R. To explain this anomalous behavior, in this paper we study the evolution of cooperation in a spatial traveler's dilemma game where the players are located on a square lattice and each player plays TD games with his neighbors. Players in our model can adopt their neighbors' strategies following two standard models of spatial game dynamics. Applying Monte-Carlo simulation to our model, we find that the cooperation level of the system, which is proportional to the average value of the strategies, decreases with increasing R until R exceeds a critical value R_t, where cooperation vanishes. Our findings indicate that spatial reciprocity promotes the evolution of cooperation in the TD game and that the spatial TD game model can interpret the anomalous behavior observed in previous behavioral experiments.


Introduction
Cooperation is ubiquitous in biological and social systems [1][2][3]. In general, cooperation is costly, which leads to the so-called social dilemma: a group of individuals can achieve the maximal payoff through cooperation, but each individual does best by acting in his own interest. Understanding the origins of cooperation in a group of unrelated and self-interested individuals is a central problem in the biological, social, and physical sciences [3]. Evolutionary game theory provides an elegant framework for studying this problem [4][5][6][7]. Based on this framework, several mechanisms have been proposed to explain cooperative behavior in societies [3]. Among them, we focus on spatial reciprocity, which has been recognized as one of the five primary mechanisms promoting the emergence of cooperation [3]. Below, we briefly review some well-studied spatial evolutionary game models, including the spatial prisoner's dilemma (PD) game [8][9][10], the spatial snowdrift (SD) game [11,12], and the spatial public goods game [13][14][15]. For a comprehensive discussion of this topic, we refer readers to the surveys [7,16,17] and the references therein.
The spatial PD game model is perhaps the most popular spatial game model: each player is located on a node of a network and plays the PD game with his neighbors [8][9][10]. The PD game is a two-person game in which each player can choose either cooperation or defection. If both players cooperate, each gains R; if both defect, each receives P. If one chooses cooperation and the other chooses defection, the cooperator obtains S while the defector receives T. The parameters of the PD game are required to satisfy T > R > P > S and 2R > T + S. It has been shown that in well-mixed populations, defection is the only evolutionarily stable strategy [5]; that is to say, cooperators are doomed to extinction. However, as observed in the seminal work of Nowak and May [8], when the players of the PD game are located on a square lattice and each plays the PD game with its neighbors, cooperators can survive by forming clusters. This work has inspired a large number of studies on spatial game models [7,[17][18][19][20][21][22][23][24][25][26][27][28][29]. Besides the spatial PD game, another notable spatial game, the spatial snowdrift (SD) game, has also been investigated [11,12]. The SD game (also called the hawk-dove or chicken game) is likewise a two-person game in which the players can only choose either cooperation or defection [6]. Unlike the PD game, the parameters (R, S, P, and T) of the SD game are restricted to T > R > S > P. In the SD game, however, the spatial structure has been shown to inhibit the evolution of cooperation [11]. Both the PD and SD games are two-person games. The public goods (PG) game, by contrast, is an N-person game in which the players can choose either to contribute to a common pool (cooperation) or to contribute nothing (defection). The total investment is multiplied by a so-called multiplication factor, reflecting the synergy effects of cooperation.
Then, the multiplied investment is equally distributed among all individuals irrespective of their initial strategies. A rational player will select defection if his payoff is smaller than the investment cost [30]. As a consequence, the society evolves toward the "tragedy of the commons" [30], i.e., all individuals become free riders. However, similar to the spatial PD game, when the players of the PG game have spatial neighborhood interactions, the number of free riders in the system can be significantly reduced [13,14,25,31,32]. Another spatial game, the spatial ultimatum game [33], also deserves mention. It has recently been used to study the evolution of fairness [33], and it turns out that the spatial structure promotes the dominance of fair players [33]. More recently, Szolnoki et al. [34] showed that the spatial structure promotes the evolution of fairness only if the players have a multitude of choices in posing their ultimatums.
In this paper, we consider the traveler's dilemma (TD) game, which has received extensive attention in the economics community but has attracted little attention in the physics community so far. Similar to the PD game, the TD game is also a two-person game, proposed by Basu [35]. We give a brief description of the TD game as follows: assume that two travelers have identical souvenirs, both of which have been lost by the airline. The two travelers go back to their airline to ask for compensation. The airline representative does not know the accurate price of the souvenirs, but he knows that the price falls within an interval [R, M]. Therefore, the airline representative asks the two travelers each to write down a value from R to M separately. If both travelers claim the same value, then the airline compensates both with that amount. However, if they declare different values, the airline representative assumes that the lower value is more accurate. Therefore, the representative pays the traveler who claims the lower value that amount plus a bonus of R for his honesty, and gives the other traveler the lower value minus R as a penalty. For example, suppose one traveler declares that the price of the souvenir is 20 while the other declares that it is 30. With R = 2, the first traveler receives 22 while the other gets 18. Following [36], we assume that both travelers declare integer values and that both R and M are integers. To create a social dilemma, we require R > 1; a similar restriction was imposed in [36].
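The compensation scheme above can be written as a small payoff function (a sketch in Python; the function name and the tuple return are our own convention, not the paper's notation):

```python
def td_payoffs(i, j, R):
    """Return the compensations (to the player claiming i, to the player claiming j)."""
    if i == j:
        return (i, i)              # equal claims: both are paid the claim
    low = min(i, j)                # the representative trusts the lower claim
    if i < j:
        return (low + R, low - R)  # honesty bonus +R, penalty -R
    return (low - R, low + R)

# the worked example from the text: claims 20 and 30 with R = 2
print(td_payoffs(20, 30, 2))       # -> (22, 18)
```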
According to classical game theory, the Nash equilibrium of the TD game is for both travelers to claim the minimal value R [35]. Clearly, the maximal total payoff of the travelers, 2M, is achieved when both declare the maximal value M. As a result, the TD game yields a social dilemma. Many previous experimental studies found that players' behavior deviates significantly from the prediction of classical game theory. Capra et al. [37] found an inverse relationship between R and the average claim; that is to say, for a small R, the average claim can be a large value. Subsequently, Goeree and Holt [38] presented a learning framework to interpret this anomalous behavior. More recently, Manapat et al. [36] proposed a stochastic evolutionary framework to explain the cooperative behavior observed in the TD game. Specifically, they studied stochastic evolutionary dynamics in finite populations with varying selection and mutation rate parameters, and their theoretical results reproduced the observed cooperative behavior. In this paper, we study the TD game on a square lattice by adopting the standard spatial game model. Using Monte-Carlo simulation, we find that the cooperative behavior observed in our system is consistent with the previous experimental observations. Furthermore, we present an analysis of an ideal model where the players can select only two pure strategies (R and M) to explain the observed phenomenon, which further confirms our results. Our findings indicate that spatial reciprocity can facilitate the evolution of cooperation in the TD game, and thereby that the spatial TD game model can be used to interpret the cooperative behavior observed in the TD game.

Model
The TD game is a two-person game with multiple strategies. Each player selects an integer value from the range [R, M] as a pure strategy, so there are M − R + 1 strategies, which we label R, ..., M, where 1 < R < M. Without loss of generality, we set M = 100; a similar setting was considered in [36]. The payoff, denoted by A_ij, for a traveler claiming an integer value i ∈ [R, M] (strategy i) when the other declares an integer value j ∈ [R, M] (strategy j) is given by A_ij = i if i = j, A_ij = i + R if i < j, and A_ij = j − R if i > j (eq. (1)). In the above TD game, the Nash equilibrium is to choose the minimal value R [35]. As in the prisoner's dilemma game, defection (claiming a low value) dominates cooperation (claiming a high value) in the TD game. In many previous behavioral studies [37,38], however, the players in the TD game tended to select values much higher than the minimal value. In this paper, we examine the impact of spatial structure on the TD game. More specifically, we study the evolutionary TD game in a finite structured population where each player is located on a site of a square lattice with periodic boundary conditions. In our model, each player plays the TD game with his nearest neighbors, and the total payoff of a player is the sum of the payoffs gained by playing the TD game with his neighbors. Following the standard spatial game model [8,39], a randomly chosen player u can revise his strategy by adopting a strategy from his neighbors. We consider two strategy-adoption rules. The first is the best-take-over rule, where the player updates his strategy based on his own payoff and his neighbors' payoffs. Specifically, under this rule, if the payoff of the player is smaller than the maximal payoff among his neighbors, the player adopts the strategy of the neighbor with the maximal payoff; otherwise his strategy is unchanged.
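The best-take-over update for a single player can be sketched as follows (a minimal illustration; the dictionary-based bookkeeping and the choice to keep the current strategy on payoff ties are our own assumptions):

```python
def best_take_over(u, strategy, payoff, neighbors):
    """Return u's next strategy: imitate the best-paid neighbor if it strictly
    outperforms u, otherwise keep the current strategy."""
    best = max(neighbors[u], key=lambda v: payoff[v])
    if payoff[best] > payoff[u]:
        return strategy[best]
    return strategy[u]

# a tiny example: player 0 earns 5 while neighbor 1 earns 7, so 0 imitates 1
print(best_take_over(0, {0: 40, 1: 90, 2: 40}, {0: 5, 1: 7, 2: 3}, {0: [1, 2]}))  # -> 90
```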
A similar strategy-adoption rule has been used for studying the spatial prisoner's dilemma game [8,30]. The second is the Fermi rule [16]: a player u randomly selects one of his neighbors v and adopts the strategy of player v with the probability W(s_u ← s_v) = 1/{1 + exp[(P_u − P_v)/t]} (eq. (2)), where s_u denotes the strategy of player u, P_u and P_v denote the total payoffs of players u and v, and t denotes the noise parameter modeling the uncertainty of strategy adoption. As explained in many previous studies [15,26,39], for any finite positive t, better-performing strategies are readily adopted, while poorly performing strategies are adopted only with a very small probability. In the t → 0 limit, strategy adoption is nearly deterministic and the players always select the better strategies, while in the t → ∞ limit, strategy adoption is random. We apply Monte-Carlo simulation to the above spatial TD game. The size of the square lattice in our simulations is 500 × 500. Initially, unless specified otherwise, each player on site u is randomly assigned a strategy from R to M with equal probability, i.e., 1/(M − R + 1). In each Monte-Carlo step, for the best-take-over rule, each player revises his strategy based on his payoff and his neighbors' payoffs as described above. For the Fermi rule, each player randomly selects one of his neighbors v and adopts the strategy of player v with the probability given in eq. (2). In all simulations, we consider both synchronous and asynchronous updating of the players' strategies. To measure the cooperation level of the system, we define a quantity ρ_c as the normalized difference between the average value of all players' strategies and the minimal strategy value R. More formally, ρ_c = (⟨s⟩ − R)/(M − R) (eq. (3)), where ⟨s⟩ denotes the average strategy value over all players. Clearly, ρ_c is proportional to the average claim over all players, and the value of ρ_c falls within the range [0, 1].
ρ_c = 0 indicates that all players declare the minimal value R, so the system has the lowest cooperation level, while ρ_c = 1 indicates that all players declare the maximal value M, so the system has the highest cooperation level. For the best-take-over rule, we run 11,000 Monte-Carlo steps and obtain ρ_c by averaging over the last 1,000 steps. For the Fermi rule, we perform 25,000 Monte-Carlo steps and calculate ρ_c by averaging over the last 5,000 steps. All the results presented below are averaged over 30 realizations of the initial strategies.
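The Fermi adoption probability and the cooperation measure ρ_c described above can be sketched as follows (a minimal illustration; the overflow guard in the exponential is our own addition for numerical safety):

```python
import math

def fermi_prob(payoff_u, payoff_v, t):
    """Probability that player u adopts neighbor v's strategy (Fermi rule, eq. (2))."""
    x = (payoff_u - payoff_v) / t
    if x > 700:                    # guard against math.exp overflow; result is ~0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def cooperation_level(strategies, R, M):
    """rho_c: the mean claim rescaled so that all-R gives 0 and all-M gives 1."""
    mean = sum(strategies) / len(strategies)
    return (mean - R) / (M - R)

# at low noise (t = 0.1) a better-performing neighbor is adopted almost surely
print(round(fermi_prob(10.0, 20.0, 0.1), 6))            # -> 1.0
print(cooperation_level([100, 100, 100, 100], 2, 100))  # -> 1.0
```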

Results
We start by reporting the results of the spatial TD game under the best-take-over rule. Fig. 1 depicts the simulation results for ρ_c as a function of the parameter R on two different square lattice models with synchronous and asynchronous strategy updating. From Fig. 1, we find that the results obtained with asynchronous strategy updating (Fig. 1(c), (d)) are very similar to those obtained with synchronous strategy updating (Fig. 1(a), (b)). Therefore, we focus on the results obtained with synchronous updating. For the square lattice with 4-player neighborhood (Fig. 1(a)), we observe that (1) cooperation emerges when R is smaller than a critical value R_t (R_t ≈ 40), and (2) the cooperation level ρ_c decreases monotonically with increasing R until R reaches R_t, where the cooperation level becomes 0. These results are consistent with previous experimental observations in the traditional TD game [36,37], which show an inverse relationship between R and the mean value claimed by the players. Further, the results suggest that the spatial TD game model can be used to interpret the anomalous behavior observed in the traditional TD game. Similar results (Fig. 1(b)) are observed on the square lattice with 8-player neighborhood (Moore neighborhood), with one minor difference: at certain points in Fig. 1(b), ρ_c does not decrease monotonically with increasing R, although the general trend conforms with that of the previous model. As with traditional spatial game models (e.g., the spatial prisoner's dilemma game and the spatial public goods game), our findings indicate that spatial reciprocity also promotes cooperation in the TD game. Fig. 2 and Fig. 3 show the cooperation level ρ_c and the standard deviation of the strategies d as functions of the evolution time on different square lattices for R = 10. We have also confirmed that the results are very similar for other values of R (e.g., R = 20). From Fig. 2 and Fig. 3, we can see that the cooperation level and the standard deviation remain unchanged after 10,000 Monte-Carlo steps. That is to say, the system converges to a stable state within 10,000 Monte-Carlo steps on the different square lattices, although earlier convergence can also be observed. Moreover, the convergence time of the system with asynchronous updating is slightly longer than that with synchronous updating. In addition, we also study the effect of the initial strategy distribution on the evolution of cooperation in the spatial TD game. Fig. 4 depicts the cooperation level ρ_c as a function of R when the initial strategies are drawn from a power-law distribution (with exponent 3) over the strategy space (we have also checked other exponents; the results are very similar). As can be seen, the results resemble those in Fig. 1. These results indicate that the spatial TD game is robust to the initial strategy distribution. In the following, we interpret the emergence of cooperation in the spatial TD game and the observed inverse relationship between ρ_c and R, respectively.
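The power-law initial assignment used for Fig. 4 can be sketched as follows (a hypothetical sampler; the paper does not specify the normalization, so we simply weight each strategy s in [R, M] by s^(-3)):

```python
import random

def powerlaw_initial(n_players, R, M, alpha=3.0, rng=random):
    """Draw one strategy per player from [R, M] with P(s) proportional to s**(-alpha)."""
    strategies = list(range(R, M + 1))
    weights = [s ** (-alpha) for s in strategies]
    return rng.choices(strategies, weights=weights, k=n_players)

random.seed(1)
sample = powerlaw_initial(10000, 2, 100)
# low claims dominate the initial state: strategy 2 is drawn far more often than 100
print(sample.count(2) > sample.count(100))   # -> True
```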
To further reveal the mechanism behind the emergence of cooperation in the spatial TD game, we examine the spatial patterns generated in our simulations. Figs. 5(a-d) each show a series of three characteristic snapshots taken at different times, describing the cooperation level for R = 2 and R = 10 on two different square lattice models of size 100 × 100. The time evolution starts from a random initial state and ends in a stationary state (from the left snapshot to the right snapshot of Figs. 5(a-d)). From Fig. 5(a), we observe that the cooperation level of the system increases over the iterations until the system reaches the stationary state. In addition, cooperators who declare the same large value form scattered clusters (middle snapshot of Fig. 5(a)), and such clusters spread over the territory of defectors who declare small values. In the stationary state (right snapshot of Fig. 5(a)), the strategy values become very large (close to M) and the number of distinct strategies becomes very small compared with the initial state. Moreover, the players who declare the same large value form a stationary cluster, and such stationary clusters can resist the invasion of the defectors. Similar results can be observed in Figs. 5(b-d). These results indicate that the square lattice structure promotes the formation of clusters of cooperators and thereby enhances the cooperation level of the system, further confirming that spatial reciprocity works well in the TD game. In addition, comparing the right snapshots of Fig. 5(a) and Fig. 5(c), we observe that the cooperation level for R = 2 is clearly higher than that for R = 10. The reason is that, for R = 2, the players with the same large strategy (nearly M) form a large cluster (see the right snapshot of Fig. 5(a)), while for R = 10 the size of such a cluster is small. Furthermore, for R = 10, a large territory is occupied by players who declare the same medium value (around 75). As a consequence, the cooperation level for R = 10 is smaller than that for R = 2.
As observed in Fig. 1, the cooperation level decreases monotonically as R increases until R exceeds the critical value R_t (R_t ≈ 40). To interpret this observation, we study the relationship between the cooperation level ρ_c and the parameter R in an ideal model where the players on the square lattice can select only two pure strategies: R or M. First, we consider the case of cooperator invasion. For simplicity, we assume that the system initially has four cooperators (players selecting strategy M) forming a square cluster, and all other players are defectors (players selecting strategy R). Under this initial state, for the square lattice with 4-player neighborhood, we obtain the following results: (1) if R < 2M/5, the cooperators conquer the whole population; (2) if R > 2M/5, the cooperators go extinct; and (3) if R = 2M/5, cooperators and defectors coexist (the initial state is unchanged). Similarly, for the square lattice with 8-player neighborhood, we obtain: (1) if R < 3M/10, the cooperators invade the whole population; (2) if R > 3M/10, the cooperators go extinct; and (3) if R = 3M/10, cooperators and defectors coexist (the initial state is unchanged). Fig. 6 and Fig. 7 illustrate the time evolution of cooperator invasion on a 10 × 10 square lattice with 4-player and 8-player neighborhoods, respectively. As expected, if the conditions for cooperator invasion are satisfied, the cooperators take over the whole population in the stationary state, as illustrated in Fig. 6 and Fig. 7.
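The 2M/5 threshold for the 4-player neighborhood can be checked by comparing the two payoffs that compete at the cluster boundary (a worked check; the cell labels and function names are ours):

```python
M = 100

# a corner cooperator of the 2x2 cluster plays two cooperators (M each) and
# two defectors (each such game pays R - R = 0), independent of R
cooperator_payoff = 2 * M                 # = 200

def frontier_defector_payoff(R):
    """A defector bordering the cluster: one game vs a cooperator pays R + R = 2R,
    and three games vs fellow defectors pay R each."""
    return 2 * R + 3 * R                  # = 5R

# under best-take-over the bordering defector imitates the corner cooperator
# exactly when 2M > 5R, i.e. R < 2M/5 = 40
for R in (30, 39, 41, 50):
    print(R, cooperator_payoff > frontier_defector_payoff(R))
```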
Second, we consider the case of defector invasion. Suppose the system initially has only one defector, who selects strategy R, and the rest of the players are cooperators, who select strategy M. Under this initial configuration, for the square lattice with 4-player neighborhood, we obtain: (1) if R < 3M/8, the defectors vanish in the stationary state; (2) if 3M/8 ≤ R ≤ 2M/3, defectors and cooperators coexist in the stationary state; and (3) if R > 2M/3, the defectors conquer the whole population in the stationary state. Likewise, for the square lattice with 8-player neighborhood, we derive that (1) if R < 7M/16, the defectors disappear in the stationary state; (2) if 7M/16 ≤ R ≤ 8M/11, defectors and cooperators coexist; and (3) if R > 8M/11, the defectors take over the entire population. Fig. 8 and Fig. 9 depict the time evolution of defector invasion on a 7 × 7 square lattice with 4-player and 8-player neighborhoods, respectively. From Fig. 8 and Fig. 9, we can clearly see that if the conditions for defector invasion are met, the defectors occupy the whole lattice.
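The extreme defector-invasion regimes for the 4-player neighborhood can be reproduced by simulating the two-strategy ideal model directly (a sketch assuming synchronous best-take-over updating on a 7 × 7 periodic lattice, with ties resolved by keeping the current strategy; these conventions are ours):

```python
M = 100

def claim_payoff(i, j, R):
    """TD payoff to a player claiming i against a player claiming j."""
    if i == j:
        return i
    return i + R if i < j else j - R

def step(grid, R):
    """One synchronous best-take-over sweep on a periodic square lattice."""
    n = len(grid)
    def nbrs(x, y):
        return [((x + 1) % n, y), ((x - 1) % n, y), (x, (y + 1) % n), (x, (y - 1) % n)]
    pay = [[sum(claim_payoff(grid[x][y], grid[a][b], R) for a, b in nbrs(x, y))
            for y in range(n)] for x in range(n)]
    new = [row[:] for row in grid]
    for x in range(n):
        for y in range(n):
            bx, by = max(nbrs(x, y), key=lambda p: pay[p[0]][p[1]])
            if pay[bx][by] > pay[x][y]:      # imitate a strictly better neighbor
                new[x][y] = grid[bx][by]
    return new

def defector_invasion(R, n=7, steps=20):
    """Evolve a lone R-claimer in a sea of M-claimers and return the final grid."""
    grid = [[M] * n for _ in range(n)]
    grid[n // 2][n // 2] = R
    for _ in range(steps):
        grid = step(grid, R)
    return grid

# R = 30 < 3M/8: the defector vanishes; R = 70 > 2M/3: defectors conquer
print(all(v == M for row in defector_invasion(30) for v in row))   # -> True
print(all(v == 70 for row in defector_invasion(70) for v in row))  # -> True
```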
Based on our analysis of the ideal model, for the square lattice with 4-player neighborhood, we conclude that if R < 2M/5, cooperator invasion emerges, and if R > 2M/3, there are no cooperators in the system. For a large square lattice system, we can approximately analyze a small sub-lattice (e.g., 10 × 10) by applying our results for the ideal model. In such a small sub-lattice, assume that four players forming a 2 × 2 cluster select the same large strategy M′ (R < M′ ≤ M), while all the other players select strategy R. If R is large, the cooperator-invasion condition (i.e., R < 2M′/5) cannot be met. As a result, all the cooperators vanish, and the defectors occupy the small sub-lattice. Further, the sub-lattice occupied by the defectors would spread over the whole lattice for large R, resulting in a low cooperation level. In contrast, if R is small, the cooperator-invasion condition (i.e., R < 2M′/5) can be satisfied, so the cooperators invade the small sub-lattice and form a cooperator cluster which can withstand the invasion of defectors. If R is small enough, the cooperator cluster can spread over the whole system, leading to a high cooperation level. On the other hand, suppose only one player selects the smallest strategy R and all the other players in the sub-lattice select strategy M′ (i.e., R < M′ ≤ M). If R is large, the defector-invasion condition R > 2M′/3 can easily be satisfied, so the sub-lattice can be occupied by the defectors. The defectors then form a cluster which can spread over the whole lattice, resulting in a low ρ_c. On the contrary, if R is small, the defector-invasion condition (i.e., R > 2M′/3) cannot be satisfied. Moreover, if the condition R < 3M′/8 is met, the cooperators will occupy the small sub-lattice.
Consequently, the cooperators will form a cooperator cluster, which can then spread over the whole lattice, leading to a high ρ_c. Putting it all together, we conclude that a large R promotes defector invasion, while a small R facilitates cooperator invasion. Therefore, the cooperation level ρ_c exhibits an inverse relationship with the parameter R. In addition, it is worth noting that if R > 2M/3 (implying R > 2M′/3), the system will be dominated by the defectors. Hence, the critical value of the system must be smaller than 2M/3. Our result in Fig. 1 (left panel) shows that the critical value is around 40, which is clearly smaller than 2M/3 ≈ 67. A similar analysis can be carried out for the square lattice with 8-player neighborhood. We now turn to the results of the spatial TD game with the Fermi rule. Fig. 10 depicts ρ_c as a function of R on the two lattice models at t = 0.1 under both synchronous and asynchronous strategy updating. As in the best-take-over case, we observe that cooperation emerges when R is smaller than the critical value R_t. In general, ρ_c decreases monotonically with increasing R until the critical value R_t, where ρ_c = 0. Fig. 11 shows the critical value R_t as a function of the noise parameter t for the different lattice models. In Fig. 11, the region below the curve denotes the parameter space where the cooperation level of the system is greater than 0, i.e., ρ_c > 0, while the region above the curve is the parameter space in which the cooperation level equals 0, i.e., ρ_c = 0. Compared with the model with the best-take-over rule, there are two differences in the model with the Fermi rule. First, the critical values of the model with the Fermi rule are slightly smaller than those of the model with the best-take-over rule. Second, ρ_c fluctuates slightly, although the values are obtained by averaging over a large number of Monte-Carlo steps.
Moreover, we have checked that using a longer transient time (e.g., 50,000 steps) and averaging over a larger number of Monte-Carlo steps do not significantly affect the simulation results. Fig. 12 and Fig. 13 illustrate the evolving behavior of the system. Indeed, from Fig. 12 and Fig. 13, we can observe that the system converges to a relatively stable state after 20,000 Monte-Carlo steps. These results indicate that the best-take-over rule is more effective than the Fermi rule at promoting the emergence of cooperation in the spatial TD game.

Conclusions
To summarize, we have investigated the evolution of cooperation in the spatial TD game, where the players are placed on a square lattice. An individual gains payoff by playing the TD game with his immediate neighbors. Two evolutionary rules, namely the best-take-over rule and the Fermi rule, are studied in the spatial TD game model. More specifically, under the best-take-over rule, each player revises his strategy based on his own payoff and his neighbors' payoffs. Under the Fermi rule, a randomly selected player adopts one of his neighbors' strategies with a probability depending on the difference of their payoffs. We apply Monte-Carlo simulation to our models, and the results show that the cooperation level of the spatial TD game has an inverse relationship with the parameter R. In particular, the cooperation level decreases monotonically with increasing R until R reaches the critical value R_t, where the cooperation level vanishes. By visualizing the spatial patterns of our models, we find that cooperators who select the same large strategy form clusters in the stationary state, and such clusters can resist the invasion of defectors. To further explain our findings, we analyze the conditions for both cooperator invasion and defector invasion in an ideal model, where the players are given only two pure strategies: R or M. Our analysis implies that a large R hampers cooperator invasion and facilitates defector invasion, while a small R promotes cooperator invasion and impedes defector invasion. As a result, the cooperation level of the system exhibits an inverse relationship with the parameter R. Our findings suggest that spatial reciprocity can promote the evolution of cooperation in the TD game. Furthermore, these findings indicate that the spatial TD game model can be used to interpret the anomalous behavior in the TD game observed in many previous behavioral studies [37,38].
We hope that this work will inspire future studies on the evolution of cooperation in the spatial TD game, which has so far attracted little attention in the physics community. For example, one promising direction is to study the impact of network structure on the evolution of cooperation in the spatial TD game. In addition, evolutionary dynamics are typically affected by the mutation rate [40]; another promising direction is therefore to investigate how mutation influences the evolution of cooperation in the spatial TD game.