Evolutionary games, climate and the generation of diversity

Environmental stochasticity and climate affect outcomes in evolutionary games, which can thereby affect biological diversity. Our maximum likelihood (ML) estimates of replicator dynamics for morph frequency data from control (25 years) and three experimentally perturbed populations (14 years) of side-blotched lizards yield a 3 × 3 payoff matrix in the generalized Rock-Paper-Scissors family; it has intransitive best replies, and each strategy is its own worst reply. ML estimates indicate significant interactive effects of density and temperature on morph frequency. Implied dynamics feature a powerful interior attractor and recover (for the first time) observed 4-5 year oscillations. Our evolutionary experiment on morph frequency confirms that oscillations are driven by frequency dependent selection, but climate entrains the cycles across the perturbed and control populations within 10 generations. Applying the model across the species range, we find that climate also accounts for morph fixation and mating system diversity, suggesting climate may similarly impact ecosystem diversity.

Stochastic Structure. Any statistical procedure to estimate a parameter vector θ must specify a general form G(x, θ, e) for the relevant variables x (here, morph frequencies, which we refer to as shares, and covariates), including a stochastic structure for error e to account for data variability y not fully captured in the deterministic component G(x, θ, 0).
The best known error structure is Normal-additive: the investigator finds the vector θ that best fits the equation y t = G(x t , θ, 0) + e t , e.g., by minimizing residual sum of squareŝ e t = y t − G(x t , θ, 0). One assumes that G(x t , θ, 0) is independent of the random variable e t , with mean zero and a Normal distribution. This error model has the advantage of familiarity and ease of interpretation. For the share data to which standard evolutionary models apply, however, it has a severe disadvantage -it doesn't respect the simplex constraints, and so can predict negative shares and other nonsensical behavior.
For fitting share or proportional data, then, we need an error model that obeys simplex constraints that (a) each component of G(x, θ, e) is nonnegative and (b) they sum to 1. The Logit error model is also widely known and automatically satisfies (a). As explained later, it can be generalized and normalized so that it also satisfies (b). Its main disadvantage is that, as explained below, the estimated coefficients don't quite correspond to the parameters of interest, i.e., estimates are not statistically consistent and can be difficult to interpret. The Dirichlet error model is less well known, but by construction it satisfies (a) and (b), delivers consistent estimates of θ, and can be extended to deal with interactive covariates.
Simulation Exercise. To check that the estimation procedure is consistent and robust, we use the Dirichlet-Replicator model with various parameters to generate artificial data samples of a size similar to the field data. We then checked whether the estimation procedure could recover the specific parameters chosen to generate the data.
Our first such exercise sets the parameter η = 25 and sets W as in Panel (a) of Table A below, an apostatic RPS matrix with interior NE at (1/3, 1/3, 1/3). Apply the Dirichlet-Replicator model to generate sample time series of 60 periods each. The average estimated payoff matrix over 50 such samples appears in Panel (b) of the table. Somewhat reassuringly, the average fitted matrix is also an apostatic RPS, and its coefficients are all reasonably close to the true values used in the simulation.
A very tough example sets η = 25 and but now takes W as in Panel (a) of Table B below, a barely apostatic RPS matrix with interior NE at (0.8, 0.1, 0.1) not far from a corner.
Again we use the model to generate samples of length 60 and, in panel (b), report average estimates over 50 samples. The estimated W is almost (but not quite) apostatic RPS: the entries in the first column are biased downward and there is a virtual tie there for the best reply.    The generalized log-odds function g maps a point s = (s 1 , ..., s m ) ∈ S m to the point (ln s 1 , ..., ln s m ) ∈ R m and then projects orthogonally onto the m − 1 dimensional null subspace R m 0 -the vectors in R m whose components sum to zero -by subtracting 1 m m j=1 ln s j from each component. The g function has a global inverse known as the multinomial logistic function, L : (y 1 , ..., y m ) ∈ R m 0 → ( e y 1 j e y j , ..., e yn j e y j ) ∈ S m . In particular, for m = 3, the map g : To verify that L −1 = g for m = 3, simply confirm that for s i = L i (y 1 , y 2 , y 3 ) = e y i e y 1 +e y 2 +e y 3 we have 1 3 ln s 2 1 s 2 s 3 = 1 3 ln e 3y 1 −0 = y 1 if y 1 + y 2 + y 3 = 0, and similarly for y 2 and y 3 .
The generalized Logistic regression model applies the logistic function to the dependent variable y t , which in our application is the current state vector S(t) = (s 1 , s 2 , s 3 ) ∈ S. Thus the left hand side variable in the regression will be Likewise, the main explanatory variable will be the generalized log odds of the replicator Taking products over the sample of observations and then taking logs, we obtain the log-likelihood function To save parameters (without doing any apparent violence to our data) we assume that all three σ i 's have the same value, which we refer to as the error amplitude. Of course, when there are covariates such as ν and τ , one includes with X i (t) the other systematic terms on the right hand side of (3). We again use the Matlab routine fmincon for the numerical optimization, and again impose the usual coefficient constraints and starting values.
The logistic normal model keeps the likelihood of the data very tractable and at the same time restricts the variables to the simplex. However, note that a structural interpretation of the coefficients is compromised. To explain, for notational simplicity suppose again that the covariates are known to be zero.
and L is nonlinear. That is, the maximum likelihood parameter estimates in (6) are biased. In some cases the bias is small but in others it is large, and it is hard to assess a priori.   Can the Logistic regression model recover parameters used in the simulated data? We see from Tables E and F that it does a decent job in the easy symmetric case and even in the difficult asymmetric case the estimates bear some resemblance to the true parameters.   proper model, such as logit, which respects the bounded range of the variables but is more complicated to set up and interpret. In that spirit, we report estimates of the payoff matrix and covariate sensitivities obtained by a linear regression model applied to our lizard data.
The basic linear model is where W i,· is the i-th row of W . This model assumes Gaussian additive noise as in the previous section. Condition (9) guarantees that the estimated shares sum to one. On the other hand, this model does not guarantee that S i ∈ [0, 1], although this condition is likely to be satisfied for sufficiently small noise variances.
The parameters of the model are the payoff matrix W and the noise variances (σ 2 1 and σ 2 2 ). As usual, set Z i (t + 1) = W i (t) W (t) S i (t), the prediction from deterministic replicator dynamics.
Taking products and logs as usual (or logs and sums), we obtain the log-likelihood function A simplifying assumption is that σ 1 = σ 2 = σ. Then maximizing the log-likelihood function with respect to entries of W is equivalent to the simpler non-linear least squares problem: We solve this least squares minimization problem numerically using the Matlab routine fmincon, after imposing the usual non-negative entries and overall sum constraints on W .
The algorithm converges quickly from neutral initial conditions and returns the estimated payoff matrices in Tables G and H Table H: Estimates of (W, β ν , β τ ) in the VAR model with covariates.σ = 0.40 ± 0.14.
Applied to the simulated data, we obtain the estimates reported in Tables I and J.
The coefficient vectors (β ν 10 , β ν 20 , β ν 30 ) and (β τ 10 , β τ 20 , β τ 30 ) capture level effects, while as before matrices ((β ν ij )) and ((β τ ij )) capture interactive effects for the usual covariates (Table K).  If significant, level effects would imply differences in the impact of covariates on payoffs, among populations, which would affect the location of the interior attractor. Note that none of the level effects is highly significant individually and only two are marginally significant. We will soon see (in the last line of see for example Davidson & MacKinnon (2004). The first row of the table shows that we can reject the null hypothesis that population density effects are insignificant with great confidence (p < 0.001). The second and third rows shows that temperature effects are marginally significant (p < 0.10, barely) by themselves, and that the combination is highly significant (p < 0.0001). The last row rejects level effects, i.e., the hypothesis that the covariates operate directly on the fitness of each morph, rather than interactively via all other morphs. Other robustness checks, e.g., showing that the fit is not appreciably improved by setting to zero the β coefficients that are not significant at the 5% level, are omitted to save space.  Hadamard products are convenient to work with because they commute and are readily decomposed. The identity matrix for this operation is 1 n×n = ((1)), i.e., all entries are 1, in contrast to the identity matrix for ordinary matrix multiplication, I n , whose off-diagonal entries are 0.
Let the multi-matrix Q summarize genotype transmission from parents to progeny; examples appear below. Let matrices F and M summarize fertility selection for female and male parents, let A and B summarize their respective sexual preferences, and let s F and s M denote phenotype shares in females and males respectively. Then, for the appropriate normalization factor c > 0, the general expression is where s is the phenotype share vector in the next generation for the target sex, assuming underlying additive effect of alleles. In our application, the target sex is males, but both sexes have the same share vector in many applications, as noted below.
What does this general expression look like for our lizard system? With six genotypes and Mendelian transmission, the following hex-matrix shows the probability that each genotype of progeny will be produced by a given pair of parental genotypes:  Since the observed female genotype data from one generation to the next will capture any impact of the female game on females, we can simplify some aspects of the structural model for male payoffs, which is our focus here. Thus we defer estimating a female game 2 , which would more than double the number of parameters to estimate. By setting F = I 6 , we capture the impacts of the female genotypes on male payoffs via their observed matings with males, and the male payoff matrix captures any strategic impacts on male progeny. As noted in the text, we decompose overall fitness matrix via Hadamard products for temperature and density impacts on male payoffs: preferences subsume both direct female choice (Bleay & Sinervo 2007) and cryptic female choice (Calsbeek & Sinervo 2004), both of which operate in the side-blotched lizard system.
Assuming that survival (and mating preferences) of phenotypes are additively determined by underlying alleles, we write 6 × 6 genotype matrix M in terms of entries from the 3 × 3 phenotype fitness matrix expressing male versus male game payoffs Ω = ((ω ij )) as in Table   N.  . The reader should note that these three heterozygous classes and the system of additivity will contribute to a rapid rise in frequency of the three strategies, even when homozygous classes are very low in frequency.
Moreover, this structural model with sexual reproduction is analogous to the simple replicator model presented in the text.
For example, the second component of s , for genotype bo, is given by the quadratic form Thus, given non-negative fitness values ω ij , i, j ∈ {O, B, Y }, and the current genotype vector, one can use equation (15) to predict the next phenotype vector. One fits the ω ij s to the data as described earlier, by finding values (plus a value of η, the Dirichlet parameter for precision or effective sample size) that maximize the likelihood of the observed data.
Otherwise put, one replaces the Z expression obtained from deterministic replicator dynamics on the (phenotype) 3-simplex by an alternative Z expression obtained from (15) on the (genotype) 6-simplex.
Estimated parameters for diploid model. We estimated the diploid sexual model using data in Table 1 in two runs using different subsets of the data. Table P reports fitted parameters to the diploid model using male genotype data as inputs for both the male and female genotypes. This enables direct comparison to the featured replicator model, which only uses information from male genotypes. Table Q reports runs using both observed male and female genotypes as inputs. This allows us to assess the additional contribution of female genotypes per se to the payoffs of the male game. Using male only genotypes as a test is justified on the basis of the significant correlation observed between the sexes for each genotype (oo: P < 0.05, t = 2.08; bo: P < 0.0001, t = 7.11; yo: P < 0.0001, t = 4.62; bb: P < 0.0001, t = 4.92; by: P < 0.0001, t = 10.08; bo: P < 0.0001, t = 7.11; yy:        We used this information to estimate the reproductive period of the 1st clutch at nearby sites, at a comparable latitude, longitude, and elevation. The hatchling period spanning 3 months was estimated to start 2 months after the start of the reproductive period, since it takes 2 months to yolk a clutch and for eggs to incubate. We computed average h r for each site using the T max from worldClim.org for conditions from 1950 to 2000, the T b from the regression relationship across the Uta range and we standardized h r to the h r computed for the Los Banos site to obtain τ in the climate driven payoff matrix for each site across the species range. Values for fitted T b and fitted h r , observed OBY alleles, and month for the start of the reproductive period are reported in Table S