## Abstract

A game of rock-paper-scissors is an interesting example of an interaction where none of the pure strategies strictly dominates all others, leading to a cyclic pattern. In this work, we consider an unstable version of rock-paper-scissors dynamics and allow individuals to make behavioural mistakes during the strategy execution. We show that such an assumption can break a cyclic relationship leading to a stable equilibrium emerging with only one strategy surviving. We consider two cases: completely random mistakes when individuals have no bias towards any strategy and a general form of mistakes. Then, we determine conditions for a strategy to dominate all other strategies. However, given that individuals who adopt a dominating strategy are still prone to behavioural mistakes in the observed behaviour, we may still observe extinct strategies. That is, behavioural mistakes in strategy execution stabilise evolutionary dynamics leading to an evolutionary stable and, potentially, mixed co-existence equilibrium.

## Author summary

A game of rock-paper-scissors is more than just a children’s game. This type of interaction is often used to describe competition among animals or humans. A special feature of such an interaction is that none of the pure strategies dominates, resulting in a cyclic pattern. However, in wild communities such interactions are rarely observed by biologists. Our results suggest that this lack of cyclicity may stem from the imperfection of interacting individuals. In other words, we show analytically that heterogeneity in behavioural patterns may break a cyclic relationship and lead to a stable equilibrium in pure or mixed strategies.

**Citation:** Kleshnina M, Streipert SS, Filar JA, Chatterjee K (2021) Mistakes can stabilise the dynamics of rock-paper-scissors games. PLoS Comput Biol 17(4): e1008523. https://doi.org/10.1371/journal.pcbi.1008523

**Editor:** Attila Csikász-Nagy, King’s College London, UNITED KINGDOM

**Received:** November 17, 2020; **Accepted:** March 18, 2021; **Published:** April 12, 2021

**Copyright:** © 2021 Kleshnina et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** All relevant data are within the manuscript and its Supporting information files.

**Funding:** This study was supported by the following grants: the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie Grant Agreement #754411, to MK; the Australian Research Council Discovery Grant DP180101602, to JAF; the European Research Council Consolidator Grant 863818 (FoRM-SMArt) to KC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

A question that frequently arises in ecology is: *Under which conditions does a particular type of species survive?* This question is also relevant in the context of understanding a wide range of environmental, social, genetic and other conditions potentially influencing evolutionary trajectories. Evolutionary game theory, a branch of game theory and ecological sciences, aims to answer that question [1–5]. One of the most well-known games applied to biology is the rock-paper-scissors game (RPS). Here, rock beats scissors, scissors beat paper and paper beats rock. Whether we are talking about population dynamics or economics and human behaviour, this game is known to illustrate salient features while being easy to understand (for a thorough review of the models used to study RPS games see [6]). In biology, this game was applied to explain cyclic dynamics in some species such as mating strategies of side-blotched lizards [7, 8] and phenotypic competition in bacterial strains of *E. coli* [9, 10]. Furthermore, in engineered microbial populations, the introduction of such competition seemed to stabilise the community [11] and even promote cooperation [12]. Moreover, it was suggested that the introduction of new strategies into classic social dilemmas, such as loners [13–15] or risk-averse hedgers [16], can lead to cyclic competition. Nevertheless, cyclicity is rarely observed in wild communities of microbes [17], even though it was shown experimentally that behavioural heterogeneity in microbes can stabilise communities [18]. Recently, it was suggested that it might be challenging for such non-transitive competition to evolve in the first place [19]. However, even if cyclic competition emerges, its stability can be very sensitive to the exact balance in the community, potentially leading to the dominance of only one strategy [20].

In this paper, we utilise the game-theoretic concept of incompetence [21, 22], which allows individuals to make mistakes during the execution of their strategy. This results in a potentially unintended strategy being actually played during the interaction with another individual. We show that such an assumption can induce evolutionary stability in the initially unstable rock-paper-scissors dynamics and predict possible outcomes of the competition under the assumption of execution errors.

Behavioural stochasticity is an expanding field rich in different approaches to the problem. An approximation of behavioural errors of players in games was first considered as “trembling hands” [23] with the presence of mistakes during the strategies’ execution with some small probability. Later, in evolutionary games it was modelled via mutations [24, 25], language learning [26–28] or other experimental learning processes [29–32], adaptation dynamics [33], phenotypic plasticity [34], edge diversity in games on graphs [35, 36], and noise in continuous and discrete-time replicator dynamics [37–40]. Furthermore, mutations of players were introduced to the replicator dynamics via the replicator-mutator dynamics [28, 41], where each type has its own mutation rate but these mutations do not occur simultaneously. However, behavioural stochasticity at the moment of interaction was not considered in these studies.

An attempt to generalise players’ behavioural mistakes via the notion of incompetence was made in classic game theory [42]. Later, the concept of evolutionary games under incompetence was suggested to model such social problems of species in biological settings [22]. The notion of incompetence proposes a general framework for modelling behavioural mistakes with the underlying assumption that only one of the *n* non-cooperative strategies can be executed. That is, with a certain probability, individuals might execute a strategy different from the one they chose. In these settings, both players are prone to making mistakes resulting in stochastic payoffs of all involved individuals, altering overall population’s fitness.

Here, we consider the following scenario. Imagine that each randomly chosen individual finds itself in a pairwise interaction with another randomly chosen individual. Both of them choose a strategy to play. However, the chance that they will play their chosen strategies depends on two factors: the overall level of behavioural plasticity in the population and the distribution of behavioural mistakes. If the population is completely homogeneous, then all interactions among the individuals are deterministic (λ = 1, see Fig 1A). However, if the population’s behaviour is plastic (λ < 1), then individuals may make mistakes when executing their chosen strategies. The probabilities of playing one or another strategy are determined both by the degree of plasticity, λ, and by the maximal probabilities of mistakes captured in matrix *S* (reached at λ = 0). The latter results in behavioural plasticity that perturbs the game outcome (see Fig 1B). In some games, execution errors mean that organisms are able to execute strategies required by the environmental conditions even when they make a wrong choice. That is, species execute, by mistake, strategies that are required for their survival in the environment. We do not assume that they carry out this execution consciously. However, this random characteristic may be crucial when we consider changing environments where adaptation becomes particularly important and depends strongly on the interplay between behavioural patterns and fitness.

**Fig 1.** (A) Rock-paper-scissors dynamics with pure strategies is described by a fitness matrix such that the cyclic relationship between the three strategies is promoted. (B) The effect of execution errors on the example of one interaction: here individual 1 has chosen strategy paper and individual 2 has chosen strategy rock. Without mistakes, individual 1 would win this instance of the contest. However, a mistake in the execution leads to mixed strategies being played by both individuals, resulting in different possible outcomes of the interaction. Hence, the outcome of the game is no longer deterministic but stochastic and depends on the probability distribution of mistakes.

In low-dimensional games this interplay can be captured and analysed in detail. Unfortunately, it becomes challenging as the dimensionality of a game grows, where even small perturbations may impact an evolutionary outcome. However, under a natural assumption that behavioural mistakes are completely random, we can describe game behaviour for general *n* dimensions. We show that in such settings, strategies (or behavioural types) leverage their fitness advantage. This in turn might lead to only one strategy dominating. Further, we assume that mistakes do not have to be completely random. We consider a symmetric case of an unstable RPS game where no choice of strategies yields a fitness advantage. Such games lead to a heteroclinic orbit where none of the strategies dominates. We choose such settings precisely because it is challenging to induce stability in these games. By contrast, an initially stable version of the RPS game can promote biodiversity even in finite-population settings [43], and even very small perturbations can stabilise a classic version of the RPS game [44]. We show that behavioural mistakes bring asymmetry to the game, breaking the cyclic relationship and potentially leading to dominance of one of the strategies. That is, the structure of execution errors may technically imply the existence of an evolutionary stable interior point.

## Model

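In this paper we focus on the RPS dynamics. Hence, we shall mostly work with the general form of *R* given by
$$R = \begin{pmatrix} 0 & -a_{2} & b_{3} \\ b_{1} & 0 & -a_{3} \\ -a_{1} & b_{2} & 0 \end{pmatrix}, \tag{1}$$
where *a*_{i}, *b*_{i} > 0 for *i* = 1, 2, 3 [4].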

In classic games, there is an underlying assumption that players are able to execute the chosen actions perfectly. We assume that actions selected by players may not coincide with the executed actions. Such behavioural stochasticity results in executing unintended strategies and is captured in matrix *Q*(λ) from [21] defined as
$$Q(\lambda) = (1 - \lambda) S + \lambda I, \quad \lambda \in [0, 1]. \tag{2}$$
In [21] the authors called *Q*(λ) the incompetence matrix with elements *q*_{ij}(λ). However, in the biological context considered here the name *plasticity matrix* is more appropriate. This stochastic matrix is constructed from the set of all probabilities of player 1 executing action *j* given that she selects action *i*. When λ = 1, *Q*(1) = *I* and no mistakes are observed in the population. Hence, the population is behaviourally homogeneous and all interactions are deterministic. However, if λ < 1, then with probabilities *q*_{ij}(λ) an individual chooses to play strategy *i* but plays strategy *j* instead. We say that in such a case the population is *Q*(λ)-heterogeneous and the outcomes of the interactions are now stochastic. We shall call λ the *strength of behavioural plasticity*. In the limit as λ → 0, the matrix *Q*(0) is equal to *S*, which is defined as a limiting distribution of behavioural mistakes. Such a matrix in the case of a three-strategy matrix game has the form
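$$S = \begin{pmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{pmatrix}, \tag{3}$$
and is also a stochastic matrix. The *i*-th row of this matrix defines the mixed profile of strategy *i*. We define the expected incompetent reward matrix as a perturbation of the fitness matrix by plasticity (or incompetence), namely
$$R(\lambda) = Q(\lambda)\, R\, Q(\lambda)^{T}. \tag{4}$$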

It is sufficient to consider the following simpler canonical form of the fitness matrix
$$\tilde{R}(\lambda) = R(\lambda) - D(R(\lambda)), \tag{5}$$
where *D*(*R*(λ)) is a matrix with each column *j* consisting of the diagonal element *r*_{jj}(λ) of *R*(λ), inducing $\tilde{r}_{jj}(\lambda) = 0$, *j* = 1, 2, 3, since such a positive linear transformation of the fitness matrix does not affect the qualitative behaviour of replicator dynamics [45]. In our further analysis, we will focus on the equilibrium analysis of the games with the fitness matrix $\tilde{R}(\lambda)$, and explore possible transitions caused by λ changing values in [0, 1]. Then, substituting (4) into (5), we obtain a new game with mistakes given by
$$\tilde{R}(\lambda) = Q(\lambda)\, R\, Q(\lambda)^{T} - D\left(Q(\lambda)\, R\, Q(\lambda)^{T}\right), \tag{6}$$
where every element of the fitness matrix has the form
$$\tilde{r}_{ij}(\lambda) = (\mathbf{q}_{i} - \mathbf{q}_{j})\, R\, \mathbf{q}_{j}^{T},$$
with **q**_{i} being the *i*-th row of matrix *Q*(λ) from (2).
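The construction of the perturbed game can be sketched numerically. The snippet below is a minimal illustration rather than the authors' code: it assumes the linear interpolation *Q*(λ) = (1 − λ)*S* + λ*I*, the perturbed matrix *R*(λ) = *Q*(λ)*RQ*(λ)^T, and the canonical form obtained by subtracting from every column its diagonal entry. The function names and the numerical entries of `R` and `S` are hypothetical and chosen only for demonstration.

```python
import numpy as np

def plasticity_matrix(S, lam):
    """Q(lam) = (1 - lam) * S + lam * I (assumed form of the plasticity matrix)."""
    S = np.asarray(S, dtype=float)
    return (1.0 - lam) * S + lam * np.eye(S.shape[0])

def perturbed_fitness(R, S, lam):
    """R(lam) = Q(lam) R Q(lam)^T: both players are prone to execution errors."""
    Q = plasticity_matrix(S, lam)
    return Q @ np.asarray(R, dtype=float) @ Q.T

def canonical_form(M):
    """Subtract the diagonal entry M[j, j] from every entry of column j."""
    M = np.asarray(M, dtype=float)
    return M - np.diag(M)[np.newaxis, :]

# Hypothetical inputs (not taken from the paper).
R = np.array([[0.0, -2.0, 1.0],
              [1.0, 0.0, -2.0],
              [-2.0, 1.0, 0.0]])
S = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
lam = 0.4

Q = plasticity_matrix(S, lam)
R_tilde = canonical_form(perturbed_fitness(R, S, lam))
print(np.round(R_tilde, 4))
```

Each off-diagonal entry of `R_tilde` equals `(Q[i] - Q[j]) @ R @ Q[j]`, which is the element-wise form stated above, and the diagonal is zero by construction.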

In the evolutionary sense, behavioural mistakes lead to perturbations in fitness that populations obtain over time. This might be due to populations’ migration to new and unexplored environments or due to changing environments. Then, interacting individuals obtain a finite number, *n*, of available behavioural strategies. In the absence of mistakes, both interacting individuals make their strategic choices, which lead to some payoff according to the fitness matrix *R*. However, mistakes from matrix *Q*(λ) perturb the outcome of the interaction twice, as both interacting individuals are prone to execution errors. Hence, the population dynamics now depend on the degree of plasticity, that is, the competency of individuals, according to replicator equations [46] defined as
$$\dot{x}_{i} = x_{i}\left(f_{i}(\lambda) - \phi(\lambda)\right), \quad i = 1, \ldots, n,$$
where the fitness of the *i*-th strategy is given by
$$f_{i}(\lambda) = \mathbf{e}_{i}^{T} \tilde{R}(\lambda)\, \mathbf{x},$$
where $\mathbf{e}_{i}^{T}$ is the *i*-th transposed unit basis vector. The mean fitness of the entire population is defined as
$$\phi(\lambda) = \mathbf{x}^{T} \tilde{R}(\lambda)\, \mathbf{x}.$$
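To illustrate how such replicator dynamics behave, the following sketch integrates the standard replicator equation for a generic fitness matrix `A` (which could be the perturbed matrix for a fixed λ). It is an illustration under stated assumptions, not the authors' implementation: the fourth-order Runge-Kutta scheme, the step size, the renormalisation safeguard and both example matrices are choices made here for demonstration.

```python
import numpy as np

def replicator_rhs(x, A):
    """Right-hand side x_i * ((A x)_i - x^T A x) of the replicator equation."""
    fitness = A @ x
    return x * (fitness - x @ fitness)

def integrate(x0, A, dt=0.01, steps=20000):
    """Classical fourth-order Runge-Kutta integration on the simplex."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        k1 = replicator_rhs(x, A)
        k2 = replicator_rhs(x + 0.5 * dt * k1, A)
        k3 = replicator_rhs(x + 0.5 * dt * k2, A)
        k4 = replicator_rhs(x + dt * k3, A)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        # Numerical safeguard: keep the state on the simplex.
        x = np.clip(x, 0.0, None)
        x = x / x.sum()
    return x

# Hypothetical symmetric RPS matrices with win payoff b and loss payoff -a.
A_stable = np.array([[0.0, -1.0, 2.0],     # a = 1 < b = 2
                     [2.0, 0.0, -1.0],
                     [-1.0, 2.0, 0.0]])
A_unstable = np.array([[0.0, -2.0, 1.0],   # a = 2 > b = 1
                       [1.0, 0.0, -2.0],
                       [-2.0, 1.0, 0.0]])

x_stable = integrate([0.6, 0.3, 0.1], A_stable)
x0_near_centre = np.array([0.34, 0.33, 0.33])
x_unstable = integrate(x0_near_centre, A_unstable)
print(x_stable, x_unstable)
```

In the first run the trajectory spirals into the centre of the simplex, whereas in the second it moves away from the centre towards the boundary, which is the unstable behaviour studied in this paper.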

### Interpretation of λ

The model proposed here was first referred to as a “game with incompetence of players” [21, 42]. That is, the matrix *Q* consisted of probabilities of players’ mistakes, when they intended to execute strategy *i* but played strategy *j* instead. Such a model was inspired by an analogy with tennis players, where less experienced players are prone to hitting a different shot from the one they initially intended. Here, players have a set of *n* possible shots to hit. Given the complexity level of the shot as well as players’ talents, those probabilities of mistakes will not be uniform. Moreover, players learn while training and, hence, reduce their incompetence. This was captured in the parameter λ, with the level of mistakes decreasing as λ → 1.

This concept was next considered in the evolutionary settings as a modelling approach to adaptation to a new environment [22]. First, it was assumed that a population is immersed into a new environment, which can happen either due to migration of animals or changing environmental conditions. It is assumed that there are *n* behavioural types or strategies available to individuals. Then, new conditions might increase stress levels and force individuals’ behaviour to deviate from the one in the old environment. Such deviations are then captured in the matrix *S*. As time passes by, animals learn and adapt to their new environmental conditions, which is then reflected in the parameter λ. In such settings, one can also assume some form of learning dynamics, λ(*t*) [47].

Another possible way to think about this model, is to apply it at a genetic level [48]. That is, we would construct a game between *n* pure types, for instance, genes in microbes. The time-dependent process of λ(*t*) evolving from 1 to 0 can then be considered more as environmental stimuli dynamics and have various functional forms reflecting environmental fluctuations. Matrices *S* and *Q* would represent levels of phenotypic plasticity, where each phenotype would allow some mixing between *n* genes that depend on the level of environmental stimuli. Then, natural selection would drive the evolution, which might result in extinction of one type or another. This also depends on the assumption concerning the exact form of environmental fluctuations.

Here, we focus on the more general interpretation of λ as the strength of behavioural plasticity. For this general approach we do not impose any time-dependence on λ. Instead, we study all possible equilibria for each of the values of λ in the interval [0, 1]. Every pure strategy *i* has an assigned probability distribution captured in the matrix *Q*(λ). When λ = 0, the population utilises a limiting distribution of mistakes *S* and has maximal plasticity. When λ = 1, the population’s behaviour is deterministic and no plasticity is observed. This can be interpreted as an approach to modelling behavioural heterogeneity or noise in interactions. Specifically, in the settings of phenotypic plasticity, it is natural to assume a complete randomisation in the strategy execution corresponding to *S* being comprised of uniformly distributed probability vectors. However, in terms of adaptations to new environmental conditions, probability of mistakes may differ depending on the strategy being chosen. Thus, we shall assume a general form of matrix *S*. Next, we shall first demonstrate this model on some examples.

### Motivating example 1

First, consider phenotypic behavioural plasticity as an interpretation of the model. In such settings, it is natural to assume that “execution errors” are symmetric and equally likely. That is, let us assume that if λ = 0, then individuals are completely random in their strategic choice. Then, all components of matrix *S* are equal and are given by *s*_{ij} = 1/3. Next, let us assume that strategies are not equal in their fitness advantages by considering the fitness matrix *R* given by

Game flows for different values of λ are depicted in Fig 2. For λ = 1, the game possesses an unstable mixed equilibrium (see Fig 2A). As the strength of behavioural plasticity increases (λ decreases), the game dynamics experiences several transitions. First, the interior equilibrium of the game with pure strategies is pushed to the population adopting only rock strategy (Fig 2B and 2C) via the existence of the unstable equilibrium point on the paper-rock edge. Note that in panel B an interior equilibrium point still exists whereas in panel C the game transits to having no interior equilibrium.

**Fig 2.** Here, a stable fixed point is denoted by a red circle and an unstable fixed point is denoted by a white circle. The colour in the interior of the simplex indicates the rate of change: from slow (blue) to fast (red). In this example, completely random execution errors lead to the dominance of the rock strategy. We use the Wolfram Mathematica project [49] to produce these phase planes.

Since the stable equilibrium is a strict Nash equilibrium, it is an evolutionary stable strategy (ESS) [4]. However, for any given λ and strategy choice, $\mathbf{x}$, the observed stochastic behaviour of organisms, $\tilde{\mathbf{x}}$, is defined by the matrix *Q*(λ) as a result of
$$\tilde{\mathbf{x}} = Q(\lambda)^{T} \mathbf{x}. \tag{7}$$
Hence, at λ = 0 the game possesses a stable pure equilibrium that corresponds to the execution vector $\tilde{\mathbf{x}} = \mathbf{s}_{1}$, where $\mathbf{s}_{1}$ is the first column of the matrix $S^{T}$ (equivalently, the first row of *S*). That is, if λ is sufficiently close to 0, the game obtains a stable interior completely mixed equilibrium. Hence, given the execution vector, the assumption of behavioural plasticity introduces a stable interior ESS to an unstable rock-paper-scissors game. Note that, from the perspective of the observed strategy, it does not matter which of the strategies will dominate in this case as they all have the same probabilities of mistakes.
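The distinction between chosen and executed strategies can be made concrete with a small sketch. It assumes that the vector of chosen strategy frequencies is mapped to executed frequencies by the transpose of the plasticity matrix, with *Q*(λ) = (1 − λ)*S* + λ*I*; the function name `executed_distribution` is hypothetical and used only for illustration.

```python
import numpy as np

def executed_distribution(x, S, lam):
    """Executed strategy frequencies Q(lam)^T x for chosen frequencies x."""
    x = np.asarray(x, dtype=float)
    Q = (1.0 - lam) * np.asarray(S, dtype=float) + lam * np.eye(len(x))
    return Q.T @ x

S_uniform = np.full((3, 3), 1.0 / 3.0)   # completely random mistakes
rock_only = np.array([1.0, 0.0, 0.0])    # every individual chooses rock

for lam in (1.0, 0.5, 0.0):
    print(lam, executed_distribution(rock_only, S_uniform, lam))
```

With uniform mistakes and λ = 0, a population that unanimously chooses rock executes all three strategies with equal frequency, so a pure equilibrium in choices is observed as a completely mixed profile.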

### Motivating example 2

The assumption that individuals make mistakes completely at random is somewhat limiting. In some cases, more freedom in the definition of individuals’ plasticity is required. For instance, if we assume that λ is interpreted as an adaptation process to new environmental conditions, then some behavioural choices may have different distributions of mistakes. For instance, let us consider an example where the fitness matrix *R* is given as follows

Note that the determinant of *R* is negative, which implies that this game possesses an interior fixed point () that is unstable. Hence, there exists a heteroclinic orbit as there are no stable equilibria. The dynamics then oscillate from the centre of the simplex to the boundaries. Such dynamics are generally quite robust under perturbations. We now consider how the game dynamics behave under our assumptions.

The exact probability distributions captured in *S* would depend on the particular situation and species under consideration. Let us demonstrate the influence of execution errors on the following example of matrix *S*. Assume that at the highest level of execution errors (λ = 0) individuals play each of their chosen strategies with probability not less than . When an individual plays a scissors strategy, her strategy execution is completely random. However, choosing a rock or paper strategy may induce some asymmetry in the strategy execution. An individual playing a rock strategy executes only rock and paper strategies with probabilities and , respectively. Individuals, who choose a paper strategy, obtain a limiting distribution of mistakes of (). Then, the matrix *S* is given by

Game flows for different values of λ are depicted in Fig 3. As λ varies from 1 to 0, the game dynamics go through several transitions (see panel A for the overview). The first transition happens at when pure stable equilibrium of scissors emerges (see Fig 3B and 3C). Note that the interior equilibrium still exists but the heteroclinic orbit does not—the dynamics converge to a stable point. Next, at λ ≈ 0.287, a paper strategy becomes stable (Fig 3D). Further, the interior fixed point vanishes at leaving unstable fixed points on the rock-scissors and paper-scissors edges (Fig 3E), which is followed by a rock strategy becoming stable at λ ≈ 0.209 (Fig 3F). This interval of all three strategies being stable is rather short as at paper loses its stability (Fig 3G). However, these are not the only transformations occurring: at the interior equilibrium emerges again (Fig 3H). While the interior equilibrium exists again, scissors lose their stability too at and at λ = 0 rock is the only stable equilibrium. Note that for λ = 0 (Fig 3J), the stable observed pure equilibrium will again be a mixed strategy due to the execution errors given by .

**Fig 3.** (A) Frequencies of each strategy in the interior equilibrium as functions of λ. Here, *x*_{1} represents the rock frequency, *x*_{2} the paper frequency and *x*_{3} the scissors frequency. The interior equilibrium exists for most values of λ but (, ). Further, the coloured bar at the top of the plot indicates stability intervals of λ for different vertices (a stable vertex is indicated on top of the bar). For instance, vertex 3 is the only stable vertex for λ ∈ (≈ 0.287, ). Game flow for the unstable rock-paper-scissors game is depicted for different values of λ as follows: (B) λ = 1, (C) λ = 0.3, (D) λ = 0.27, (E) λ = 0.22, (F) λ = 0.205, (G) λ = 0.16, (H) λ = 0.1, (I) λ = 0.05, (J) λ = 0. We depicted each transition in the game from panel A. Here, a stable fixed point is denoted by a red circle and an unstable fixed point is denoted by a white circle. Hence, as λ changes its values from 1 to 0, the game experiences several transitions in its equilibria and, for different degrees of execution errors, each of the pure strategies has a chance to dominate. However, for the maximum plasticity, only the pure rock strategy survives.

These examples demonstrate that execution errors might break the heteroclinic orbit by introducing a stable equilibrium in the game. That is, stochasticity induced by mistakes might stabilise dynamics that were unstable before. In addition, in the case of players executing only mixed strategies, the game might obtain a stable interior point altering its original dynamics (see Figs 2C and 3J). In the following analysis we shall examine possible transitions in unstable RPS games. We aim to define conditions under which we can secure existence of a stable equilibrium.

## Results

### Games with completely random plasticity

Let us first consider the case when behavioural mistakes are completely random. Such settings can be interpreted as either a form of phenotypic plasticity or just noise in the interactions. Then, the matrix *S* is such that any strategy obtains the same probability of mistakes, that is, *s*_{ij} = 1/3, ∀*i*, *j* = 1, 2, 3. For such a game, the canonical fitness matrix simplifies to
$$\tilde{R}(\lambda) = \lambda^{2} R + \frac{\lambda (1 - \lambda)}{3}\left(R J - J R^{T}\right),$$
where *J* is a matrix of ones, *R* is the fitness matrix of the original game and λ ∈ (0, 1]. In a game with λ ∈ (0, 1], if no strategy has an overall fitness advantage (that is, *R* is a row-sum-constant matrix), then uniform execution errors will not affect the resulting equilibrium (see S1 File, Proposition 1).

**Result 1**. *Let* $\hat{\mathbf{x}}$ *be an interior equilibrium of R. If the limiting distribution of mistakes, S, is a uniform matrix, that is*, *s*_{ij} = 1/3, ∀*i*, *j* = 1, 2, 3, *and R is a row-sum-constant matrix, then* $\hat{\mathbf{x}}$ *is an interior equilibrium for the game* $\tilde{R}(\lambda)$ *for any* λ ∈ (0, 1].

In other words, if in a row-sum-constant game everyone is making mistakes with the same probabilities, then population dynamics are invariant under these mistakes. However, diversity in fitness advantages between the strategies might help one of the groups to benefit from behavioural heterogeneity of the population by leveraging its fitness advantage. We can calculate the interior fixed point in a general row-sum case as follows:

**Result 2**. *Let* $\hat{\mathbf{x}}$ *be an interior fixed point of the original game R. Then, for* λ *sufficiently close to* 1 *and the limiting distribution of mistakes, S, being a uniform matrix, that is*, *s*_{ij} = 1/3, ∀*i*, *j* = 1, 2, 3, *we obtain that*
$$\hat{\mathbf{x}}(\lambda) = \frac{1}{\lambda}\left(\hat{\mathbf{x}} - \frac{1 - \lambda}{3}\, \mathbf{1}\right) \tag{8}$$
*is an interior fixed point for the game* $\tilde{R}(\lambda)$, *where* $\mathbf{1}$ *is a vector of ones*.

Note that the point from Eq (8) remains a fixed point of the replicator dynamics as long as it is preserved in the interior of the simplex, that is, as long as $\hat{x}_{i} > (1 - \lambda)/3$ for all *i* (see S1 File, Theorem 1). Hence, it can be easily verified that for this point to remain in the interior of the simplex for all λ ∈ [0, 1] we must have that $\hat{\mathbf{x}} = (1/3, 1/3, 1/3)$. The exact position of $\hat{\mathbf{x}}$ is determined by the entries of the matrix *R*. Since for a general form of *R* the interior equilibrium will not be located in the exact centre of the simplex, some strategies can go extinct first. This confirms that not only the strength of behavioural plasticity and probabilities of mistakes are important, but also the relative fitness advantages captured in the fitness matrix *R*. That is, mistakes can give a chance to some strategies to make use of their fitness advantage, leading to the dominance of a particular strategy.
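The effect of completely random mistakes on the interior fixed point can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' derivation: it takes *Q*(λ) = (1 − λ)*S* + λ*I* with uniform *S*, a zero-diagonal fitness matrix, and the canonical form obtained by subtracting column-wise diagonal entries. Under these assumptions a direct calculation gives the perturbed matrix λ²*R* + λ(1 − λ)(*RJ* − *JR*^T)/3 and shifts an interior fixed point **x** of *R* to (**x** − (1 − λ)/3)/λ. The matrix `R` is a hypothetical example with unequal row sums.

```python
import numpy as np

n = 3
J = np.ones((n, n))
S = J / n  # completely random mistakes

# Hypothetical RPS fitness matrix with unequal row sums (not from the paper).
R = np.array([[0.0, -2.0, 1.0],
              [1.5, 0.0, -1.0],
              [-1.0, 2.0, 0.0]])

def canonical_game(R, S, lam):
    """Canonical perturbed fitness matrix with zero diagonal."""
    Q = (1.0 - lam) * S + lam * np.eye(len(R))
    M = Q @ R @ Q.T
    return M - np.diag(M)[np.newaxis, :]

def interior_point(A):
    """Solve A x = c * 1 together with sum(x) = 1 and return x."""
    k = len(A)
    K = np.zeros((k + 1, k + 1))
    K[:k, :k] = A
    K[:k, k] = -1.0   # unknown common payoff c
    K[k, :k] = 1.0    # normalisation sum(x) = 1
    rhs = np.zeros(k + 1)
    rhs[k] = 1.0
    return np.linalg.solve(K, rhs)[:k]

lam = 0.8
R_tilde = canonical_game(R, S, lam)
closed_form = lam**2 * R + lam * (1.0 - lam) / n * (R @ J - J @ R.T)

x_hat = interior_point(R)                     # fixed point of the original game
x_shifted = (x_hat - (1.0 - lam) / n) / lam   # shifted point for the perturbed game
print(x_hat, x_shifted, interior_point(R_tilde))
```

For this example the smallest component of the original fixed point is 0.2, so the shifted point stays in the interior only while (1 − λ)/3 < 0.2, that is, for λ > 0.4. Below that value the corresponding strategy is the first to go extinct.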

#### Remark.

Note that Result 2 holds for any number of strategies *n* and any game. For a general form of the result see S1 File.

Result 2 implies that the interior equilibrium of the original game is shifted by behavioural heterogeneity and drives less fit strategies to extinction. However, the observed strategy will remain the same for any dominating pure strategy due to the symmetry in mistakes distributions. Hence, uniform *S* introduces evolutionary stability in the games with heteroclinic cycles. Moreover, for the extreme case of behavioural plasticity (λ ≈ 0), this equilibrium will be close to a completely mixed equilibrium (1/3, 1/3, 1/3). Note that in the case of λ = 0, the matrix $\tilde{R}(0)$ is a zero matrix, meaning that the strategies are neutral and any point in the simplex is stable.

### Breaking the cyclic relationship

Next we address the question: What if behavioural mistakes of individuals are not necessarily uniformly distributed? For instance, if we treat the parameter λ as some form of adaptation or learning, then the probabilities of mistakes might be different for different strategies. In such a case, we consider the general form of matrix *S* as in Eq (3). In order to study the effect of the limiting distribution of mistakes (as λ → 0), we shall focus on the form of an RPS game where no strategy gains a fitness advantage. That is, we assume a row-sum-constant fitness matrix with an unstable equilibrium in the centre of the simplex by letting *a*_{1} = *a*_{2} = *a*_{3} = *a* and *b*_{1} = *b*_{2} = *b*_{3} = *b* in the matrix (1). The condition *a* > *b* ensures instability of the interior fixed point (1/3, 1/3, 1/3) and, hence, the existence of a heteroclinic orbit.

In three dimensions (see (6)), transitions in a game are caused by the elements changing sign, by cofactors changing sign, or by the determinant of the fitness matrix changing its sign [22, 50]. In the case of RPS games, the stability of the interior equilibria is determined by the sign of the determinant of the fitness matrix [51]. There are three cases: (a) if det(*R*) < 0, then such a game obtains an unstable interior equilibrium resulting in a heteroclinic cycle; (b) if det(*R*) > 0, then such an equilibrium is a stable fixed point; (c) if det(*R*) = 0, then there exists an unstable interior equilibrium and periodic orbits. Then, under the assumptions of our model, the interior point’s stability could potentially switch. That is, if the determinant of the fitness matrix changes its sign while the game is still an RPS game, the unstable interior point could become stable. However, the equilibrium behaviour of the game with $\tilde{R}(\lambda)$ is the same as that with *R*(λ) by [45], given the relation between the two matrices in Eq (5). Since the determinant of the fitness matrix *R*(λ) always preserves the same sign as det(*R*), det($\tilde{R}(\lambda)$) also cannot change its sign while the interior point exists. Hence, stability properties of the interior equilibrium cannot be changed in our model. However, the equilibrium can be pushed to the boundary of the strategy space due to the asymmetry in the matrix *S*.
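The sign-preservation argument for the determinant rests on the multiplicativity of determinants: if *R*(λ) = *Q*(λ)*RQ*(λ)^T, then det *R*(λ) = det(*Q*(λ))² det(*R*), so the sign of det *R*(λ) agrees with that of det(*R*) whenever *Q*(λ) is non-singular. The following sketch checks this identity numerically. It assumes *Q*(λ) = (1 − λ)*S* + λ*I*, and both the symmetric matrix `R` and the randomly drawn stochastic matrix `S` are hypothetical inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric RPS matrix with a = 2, b = 1, so det(R) = b^3 - a^3 = -7.
R = np.array([[0.0, -2.0, 1.0],
              [1.0, 0.0, -2.0],
              [-2.0, 1.0, 0.0]])
# Hypothetical limiting distribution of mistakes: each row is a probability vector.
S = rng.dirichlet(np.ones(3), size=3)

records = []
for lam in np.linspace(0.0, 1.0, 11):
    Q = (1.0 - lam) * S + lam * np.eye(3)
    det_perturbed = np.linalg.det(Q @ R @ Q.T)
    det_product = np.linalg.det(Q) ** 2 * np.linalg.det(R)
    records.append((lam, det_perturbed, det_product))

for lam, det_perturbed, det_product in records:
    print(f"lambda={lam:.1f}  det R(lambda)={det_perturbed:+.6f}  det(Q)^2 det(R)={det_product:+.6f}")
```

Because det(*Q*(λ))² is non-negative, the perturbed determinant can reach zero (when *Q*(λ) becomes singular) but can never take the opposite sign to det(*R*).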

Note that for a homogeneous population (λ = 1) the interior fixed point, $\hat{\mathbf{x}}$, can be calculated as
$$\hat{x}_{i} = \frac{\sum_{j} R_{ji}}{\sum_{i, j} R_{ji}}, \quad i = 1, 2, 3, \tag{9}$$
where *R*_{ji} are cofactors of the matrix *R*. Then, as the rate of execution errors of players increases (λ → 0), the interior point might transform and become infeasible as one or two of the components reach 0, that is, $\hat{x}_{i}(\lambda) = 0$ for some *i*.
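The cofactor expression for the interior fixed point can be illustrated with a short computation. This is a sketch under the assumption that component *i* of the fixed point is the sum of cofactors *R*_{ji} over *j*, normalised so that the components sum to one (the adjugate form of solving *R***x** = *c***1**). The helper names and the example matrix are hypothetical.

```python
import numpy as np

def cofactor(M, i, j):
    """Cofactor of entry (i, j): signed determinant of the corresponding minor."""
    minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

def fixed_point_from_cofactors(M):
    """x_i proportional to sum_j C[j, i], normalised to sum to one."""
    n = M.shape[0]
    C = np.array([[cofactor(M, i, j) for j in range(n)] for i in range(n)])
    numerators = C.sum(axis=0)  # numerators[i] = sum over j of C[j, i]
    return numerators / numerators.sum()

# Hypothetical symmetric RPS matrix with a = 2 and b = 1.
R = np.array([[0.0, -2.0, 1.0],
              [1.0, 0.0, -2.0],
              [-2.0, 1.0, 0.0]])

x_hat = fixed_point_from_cofactors(R)
payoffs = R @ x_hat  # all strategies earn the same payoff at a fixed point
print(x_hat, payoffs)
```

Applying the same function to the perturbed matrix for a given λ yields the candidate interior point of the game with mistakes; it is feasible only while all of its components remain positive.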

Hence, depending on the probabilities of mistakes, we can describe possible transitions in the game dynamics induced by the changes in the strength of behavioural plasticity, λ. For instance, the game might possess an unstable interior equilibrium for any λ. However, the stability of the vertices will be disturbed as the entries of $\tilde{R}(\lambda)$ change their signs. An example of such a case can be found in Fig 4A, where the components of the interior equilibrium are plotted. The coloured bar at the top of each plot indicates the interval of λ where the vertices (either one or two or all three) are stable. Further, Fig 4B shows an example of game transitions in which the interior equilibrium disappears after some λ and the stability of the vertices varies. However, given the rational functional form of $\hat{\mathbf{x}}(\lambda)$, it is possible that the interior equilibrium exists for more than one sub-interval of λ. For instance, in Fig 4C and 4D the interior fixed point emerges twice as λ changes from 0 to 1. Furthermore, stability transitions of the vertices in such cases are rich in structure. This is especially the case for the example depicted in Fig 4D.
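Transitions of this kind can be explored by scanning λ and recording which vertices are stable and whether a strictly positive interior fixed point exists. The sketch below is illustrative only. It assumes *Q*(λ) = (1 − λ)*S* + λ*I*, the canonical zero-diagonal form of the perturbed matrix, and the standard criterion that vertex *j* of the replicator dynamics is stable when all off-diagonal entries of column *j* are negative. The matrices `R` and `S`, the tolerance and the helper names are hypothetical.

```python
import numpy as np

def canonical_game(R, S, lam):
    """Canonical perturbed fitness matrix with zero diagonal."""
    Q = (1.0 - lam) * S + lam * np.eye(len(R))
    M = Q @ R @ Q.T
    return M - np.diag(M)[np.newaxis, :]

def stable_vertices(A, tol=1e-12):
    """Indices j whose off-diagonal column entries are all strictly negative."""
    n = len(A)
    return [j for j in range(n)
            if all(A[i, j] < -tol for i in range(n) if i != j)]

def interior_equilibrium(A):
    """Strictly positive solution of A x = c * 1, sum(x) = 1, or None."""
    n = len(A)
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = A
    K[:n, n] = -1.0
    K[n, :n] = 1.0
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    try:
        x = np.linalg.solve(K, rhs)[:n]
    except np.linalg.LinAlgError:
        return None
    return x if np.all(x > 0) else None

# Hypothetical unstable symmetric RPS game (a = 2 > b = 1) and mistake matrix.
R = np.array([[0.0, -2.0, 1.0],
              [1.0, 0.0, -2.0],
              [-2.0, 1.0, 0.0]])
S = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [1 / 3, 1 / 3, 1 / 3]])

for lam in np.linspace(1.0, 0.0, 6):
    A = canonical_game(R, S, lam)
    print(f"lambda={lam:.1f}", "stable vertices:", stable_vertices(A),
          "interior:", interior_equilibrium(A))
```

Indices are zero-based, so index 1 corresponds to vertex 2 (paper) in the notation of the text. A finer grid in λ, or root-finding on the individual matrix entries, would be needed to locate transition points accurately.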

**Fig 4.** The components of the interior fixed point are plotted as functions of λ. Further, the coloured bar at the top of the plot indicates stability intervals of λ for different vertices (a stable vertex is indicated on top of the bar). (A) The interior fixed point exists for all λ but vertices interchange their stability. In the limit of mistakes (λ → 0), two vertices are stable. (B) The interior fixed point exists for a sub-interval and vertices interchange their stability. As λ → 0, two vertices are stable. (C) The interior fixed point exists for two sub-intervals of (0, 1). In the limit of mistakes (λ → 0), only vertex 1 is stable. (D) The interior fixed point exists for almost all values of λ. In the limit of mistakes (λ → 0), all three vertices are stable. Generally, the exact equilibria transitions and the existence of an interior equilibrium are determined by the limiting distribution of mistakes, *S*. We found that for almost all matrices *S* there is a high chance that at least one of the pure strategies will become dominant.

Generally, the components of the interior equilibrium are rational functions whose numerators and denominators are fourth-order polynomials in λ. Consequently, a variety of behaviours is possible. However, we can determine strict conditions for a vertex to be stable, based on its behaviour for a mixed strategy profile captured in the corresponding rows of the matrix *S*. These conditions are stated in the following result (see S1 File for more details).

**Result 3**. *Let* λ^{c} ∈ (0, 1) *be such that* , *where* *and* , *i, j, k are all distinct. Vertex j is a stable point of the replicator dynamics under incompetence for* λ ∈ [0, λ^{c}) *if and only if* (10)

This result follows from the fact that, as the population becomes more plastic (λ → 0), *R*(λ) → *SRS*^{T} and the canonical form of the fitness matrix reduces to

$$\begin{pmatrix} 0 & C_{12} & C_{13} \\ C_{21} & 0 & C_{23} \\ C_{31} & C_{32} & 0 \end{pmatrix},$$

where *C*_{ij} = (**s**_{i} − **s**_{j})^{T}*R***s**_{j} and **s**_{i} are the corresponding rows of *S*. If we think of **s**_{j} as the mixed strategy that the population uses when vertex *j* is stable, we can interpret those conditions as stability requirements. That is, the plastic behaviour of strategy *j* has to be a better response to itself than the mixed profiles of both strategies *i* and *k*. Hence, the stability of the strategic choice of pure strategy *j* is determined by the stability of its mixed profile.
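The quantities *C*_{ij} are straightforward to evaluate for a given pair of matrices. The sketch below computes them directly from the definition *C*_{ij} = (**s**_{i} − **s**_{j})^{T}*R***s**_{j} and checks which vertices satisfy the interpretation given above, namely that **s**_{j} is a strictly better response to itself than the other two profiles. The matrices `R` and `S` and all function names are hypothetical, chosen only to illustrate the calculation.

```python
# Hypothetical payoff matrix of an unstable RPS game (losses outweigh wins).
R = [[0, -2, 1],
     [1, 0, -2],
     [-2, 1, 0]]

# Hypothetical limiting distribution of mistakes; row i is the profile s_i.
S = [[0.4, 0.5, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]


def bilinear(u, A, v):
    """Return u^T A v for vectors u, v and a square matrix A."""
    n = len(u)
    return sum(u[a] * A[a][b] * v[b] for a in range(n) for b in range(n))


def limiting_canonical_form(R, S):
    """C[i][j] = (s_i - s_j)^T R s_j, which makes the diagonal zero."""
    n = len(S)
    return [[bilinear(S[i], R, S[j]) - bilinear(S[j], R, S[j])
             for j in range(n)] for i in range(n)]


def stable_vertices(C):
    """Vertices j (0-based) whose off-diagonal column entries are all negative."""
    n = len(C)
    return [j for j in range(n)
            if all(C[i][j] < 0 for i in range(n) if i != j)]


C = limiting_canonical_form(R, S)
for row in C:
    print([round(entry, 2) for entry in row])
print("stable vertices at lambda = 0:", stable_vertices(C))
```

With these illustrative values only the third vertex (index 2) passes the test: both entries in its column are negative, whereas each of the other two columns contains a positive entry.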

#### Remark.

Note that for λ = 0, conditions (10) imply stability of vertex *j* for any number of strategies *n* and any game.

Note that since the stable equilibrium is pure, it is a strict Nash equilibrium. Hence, the original replicator dynamics obtains an evolutionary stable point under the assumptions of our model. In fact, by Eq (7), according to the strategy execution of individuals, we obtain a stable point in the interior that corresponds to **s**_{i}, where *i* is a stable vertex. A schematic representation of such a transformation can be found in Fig 5.
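The difference between the chosen and the executed strategy can be illustrated with a short calculation. Eq (7) is not reproduced in this section, so the sketch below rests on two loudly stated assumptions: that the mistake matrix has the linear form *Q*(λ) = (1 − λ)*S* + λ*I*, and that the executed frequencies are obtained by weighting the rows of *Q*(λ) with the population state. The matrix `S` is hypothetical.

```python
# Hypothetical limiting distribution of mistakes; row i is the profile s_i.
S = [[0.4, 0.5, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]


def incompetence_matrix(S, lam):
    """ASSUMED form of the mistake matrix: Q(lam) = (1 - lam) * S + lam * I."""
    n = len(S)
    return [[(1 - lam) * S[i][j] + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def executed_frequencies(x, Q):
    """ASSUMED reading of Eq (7): y_j = sum_i x_i * Q[i][j]."""
    n = len(x)
    return [sum(x[i] * Q[i][j] for i in range(n)) for j in range(n)]


# Population sitting at the third vertex: everyone chooses strategy 3.
vertex_3 = [0.0, 0.0, 1.0]

for lam in (1.0, 0.5, 0.0):
    y = executed_frequencies(vertex_3, incompetence_matrix(S, lam))
    print(f"lambda = {lam}: executed frequencies {[round(v, 3) for v in y]}")
```

Under these assumptions the executed frequencies coincide with the vertex at λ = 1 and with the row **s**_{3} at λ = 0, so a pure choice is observed as a mixed, interior distribution whenever all entries of that row are positive.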

The original game possesses an unstable interior equilibrium. Once execution errors are introduced, for some λ the game can obtain a stable equilibrium represented by a vertex *i*. As the probability of playing mixed strategies is high in this case, it keeps disturbing the strategy execution of the players choosing strategy *i* according to Eq (7), resulting in an equilibrium that is possibly in the interior.

As demonstrated in Examples 1 and 2, as λ decreases from 1 to 0, the dynamics can experience several bifurcations in which an equilibrium emerges on one of the edges. An edge equilibrium is characterised by exactly one of its components being 0. Hence, this edge point is determined by the interaction between the two remaining strategies. However, for the version of an RPS game considered here, those equilibria will be mostly unstable (see S1 File for more details). Note that the point on the corresponding edge might exist before the interior equilibrium reaches the boundary.

Overall, when strategies are initially equivalent in their fitness advantages in the non-plastic game, the asymmetry in matrix *Q*(λ) introduces asymmetry in the game. This in turn leads to competition between pure strategies, resulting in some stable fixed point of the dynamics. The outcome of the competition is determined by the interplay among the mixed profiles of all three strategies. Specifically, the mixed profile of strategy *j* has to be uninvadable by both populations consisting of individuals following strategies *i* and *k*. This observation is different from the case of a uniform mixed strategies profile. That is, whereas in the uniform case mixed profiles were likely to introduce a completely mixed equilibrium, in the case of asymmetric *S* competition between the mixed strategies is important.
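The competition described above can be visualised by integrating the replicator dynamics for the limiting game *SRS*^{T}. The following is a minimal sketch using a forward Euler scheme. The matrices `R` and `S`, the initial state, the step size and the number of steps are all hypothetical choices made for illustration, and the initial state is deliberately placed near the vertex that is locally stable for these matrices.

```python
# Hypothetical payoff matrix of an unstable RPS game (losses outweigh wins).
R = [[0, -2, 1],
     [1, 0, -2],
     [-2, 1, 0]]

# Hypothetical limiting distribution of mistakes; row i is the profile s_i.
S = [[0.4, 0.5, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]


def limiting_game(R, S):
    """Fitness matrix at lambda = 0: entry (i, j) of S R S^T is s_i^T R s_j."""
    n = len(S)
    return [[sum(S[i][a] * R[a][b] * S[j][b]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]


def replicator_step(x, A, dt):
    """One forward Euler step of dx_i/dt = x_i * ((A x)_i - x^T A x)."""
    n = len(x)
    fitness = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]


def simulate(x0, A, dt=0.01, steps=5000):
    """Iterate the Euler step from x0 and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, A, dt)
    return x


A = limiting_game(R, S)
x_final = simulate([0.05, 0.05, 0.9], A)
print([round(v, 4) for v in x_final])
```

Starting close to the third vertex, the trajectory approaches it, in line with the uninvadability condition on the mixed profile of that strategy. Other initial states may behave differently, since the condition only guarantees local stability.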

## Discussion

Much research has been devoted to describing behavioural mistakes of organisms and how those mistakes affect the outcome of evolutionary competition. In addition, the RPS game itself has received a lot of attention due to its ability to describe cyclic competitive interactions. However, such cycles are rarely observed in nature. We propose that behavioural heterogeneity or noise can stabilise communities, driving them to evolutionary stable outcomes. Our model introduces behavioural mistakes in the context of a cyclic RPS game. Here, behavioural mistakes imply that individuals might execute a strategy different from the intended one. We encode all probabilities of mistakes in a matrix *Q*(λ) and allow individuals to play either a mixed or pure strategy. The degree of plasticity is captured by the parameter λ, which varies from 1 (no plasticity) to 0 (maximum plasticity).

We then explore the influence of the limiting distribution of mistakes, captured in matrix *S*, on the evolution of the social behaviour of species. This matrix captures the mistake probabilities for the limiting case of λ = 0. Depending on the matrix *S*, different pure strategies might benefit from those mistakes. We analyse the interplay of learning and fitness advantages and define conditions under which strategies can prevail. For example, in the case of completely random mistakes, the most beneficial strategy is the strategy with the highest relative fitness advantage (see Result 2). However, this does not change the outcome of evolution, since in this case it will be a completely mixed interior point.

One can also interpret our model as adaptation to new environmental conditions. Then, it is natural to expect that specific environments require different strategies to be adopted. For instance, in the case of an RPS game with an interior equilibrium and a general form of *S*, different strategies might become stable depending on their behavioural plasticity as their competence evolves (see Result 3). However, even if the behavioural choice of organisms evolves to a stable pure strategy, their executed strategy (for λ ≠ 1) might differ from their actual type. Instead, we obtain a vector of mixed strategies given by Eq (7). Hence, *S* can introduce stability in the game, which might preserve all three strategies from extinction.

Interestingly, at λ = 0, strategies leverage the advantage they can gain from mistakes under maximum plasticity. For instance, in the case of a general form of the limiting probability distribution, the stability of a pure strategy is determined by its plastic response to itself (see Result 3). For a strategy to become stable, it has to be uninvadable by the other two plastic strategies.

Overall, behavioural heterogeneity or plasticity, captured through execution noise, might benefit species. The ability of our model to induce a stable equilibrium in the unstable game might help to explain why such unstable RPS dynamics are not observed in wild communities. That is, plasticity in behaviour might help to stabilise the evolutionary outcome and sometimes enable one of the strategies to become dominant.

## Supporting information

### S1 File. Mathematical appendix.

In this document we derive all results presented in the manuscript.

https://doi.org/10.1371/journal.pcbi.1008523.s001

(PDF)

## Acknowledgments

The authors would like to thank Christian Hilbe and Martin Nowak for their inspiring and very helpful feedback on the manuscript.
