A guide to using a multiple-matrix animal model to disentangle genetic and nongenetic causes of phenotypic variance

Non-genetic influences on phenotypic traits can affect our interpretation of genetic variance and the evolutionary potential of populations to respond to selection, with consequences for our ability to predict the outcomes of selection. Long-term population surveys and experiments have shown that quantitative genetic estimates are influenced by nongenetic effects, including shared environmental effects, epigenetic effects, and social interactions. Recent developments to the “animal model” of quantitative genetics can now allow us to calculate precise individual-based measures of non-genetic phenotypic variance. These models can be applied to a much broader range of contexts and data types than used previously, with the potential to greatly expand our understanding of nongenetic effects on evolutionary potential. Here, we provide the first practical guide for researchers interested in distinguishing between genetic and nongenetic causes of phenotypic variation in the animal model. The methods use matrices describing individual similarity in nongenetic effects, analogous to the additive genetic relatedness matrix. In a simulation of various phenotypic traits, accounting for environmental, epigenetic, or cultural resemblance between individuals reduced estimates of additive genetic variance, changing the interpretation of evolutionary potential. These variances were estimable for both direct and parental nongenetic variances. Our tutorial outlines an easy way to account for these effects in both wild and experimental populations. These models have the potential to add to our understanding of the effects of genetic and nongenetic effects on evolutionary potential. This should be of interest both to those studying heritability, and those who wish to understand nongenetic variance.

In this tutorial, we demonstrate how to run animal models that include non-genetic causes of similarity between individuals. This is shown using the proprietary R package ASREML-R, and so users must have a licence to use this software. For those who do not have a licence, we also have a brief tutorial for the same analyses in MCMCglmm at the end of this document.
For each of the models in the main article, we show how the data is prepared for analysis, and the code to run the models. Additionally, we then show how the proportion of variance explained by the different variance parameters can be calculated, including narrow sense heritability.

Simulated Data
We have simulated a population of Mermaids, over 10 generations, with the population size varying at random between 150 and 250 individuals. Individuals reside in a lagoon defined by a 50x50 grid, over which 5 environmental measures vary. Here, and in the main text, we demonstrate the models using three traits that have all been simulated to have additive genetic and direct environmental effects. The second and third traits also include maternal enviromental effects, and the third trait also includes maternal genetic effects (for those who are interested in seeing how these were simulated in detail, we provide separate Data Simulation Documentation).
In order to show how other non-genetic factors could be used to account for variance in a trait, we have also simulated epigenetic differences between individuals, as well as a social network. In the first case, individuals vary in the number of methylated CpG loci at 230 CpG islands. Of these 230, 100 vary at random, 100 are based upon the environment, and 30 are based upon the environment, but are transmitted (or reconstructed) from mothers to offspring. Individuals inherit epigenetic states depending on the mother's environment, with a 50% probability of reset to a state determined by the individual's own environment For the social network, we simulate interactions between individuals in the same generations, and between individuals in adjacent generations (e.g. those in generation 3 interact with individuals in generation 2, 3 and 4, but no others). Interactions are most likely within generations, particularly between those in close spatial proximity. Between generations, interactions are most likely between mother-offspring pairs, and again with weighting towards those closer together in space. Least likely are interactions between non-related pairs in separate environments (with this likelihood declining with distance).
Both the epigenetic information and the social network are simulated to follow the environment (and thus explain some variance in the trait if included in the model). However, the epigenetic and social information are not used to simulate traits directly.

Reading in the data
The following code shows the data frames needed for this tutorial and loads the data that are needed. Modify the filepaths to where you have saved the relevant files. Mermaids.csv contains the pedigree, environmental, and phenotypic data. Columns id, dam and sire identify an individual, its mother, and its father. The column generation is an individual's generation/cohort, sex is 1=female or 0=male. Spatial coordinates are given by Xloc and Yloc, with disp_dist the distance an individual dispersed away from its mother. The three phenotypic traits are given by trait_ae (tail-fin colour), trait_mee (body size) and trait_mgee (swimming speed). Finally, the 5 environmental variables are in the columns e_var_1 to e_var_5. # libraries needed for these analyses library(asreml) library(nadiv) library(igraph) library(tnet) library(pedantics)

Setting up
To connect phenotypic and pedigree data, the pedigree columns must be factors (rather than characters). We also generate duplicates of these columns, so that they can be linked to the similarity matrices separately (from the environment, epialleles, or social network). In order to run an animal model we need to provide relatedness information to the model. This is derived from the pedigree data in this case (genomic relatedness matrices are also increasingly being used, see page 25). Asreml-r requires the inverse of the relatedness matrix between individuals: # gets pedigree inverse for the model # first get the pedigree from the data # asreml-r expects this in the order "id","sire,"dam"" pedigree<-Mermaids[,c("id","sire","dam")]

# inserts any missing individuals into the pedigree
pedigree<-insertPed(pedigree, missing) } # Orders the pedigree so that parents come before their offspring pedigree<-orderPed(pedigree) # asreml.Ainverse then creates the inverse of the additive genetic relationship matrix # from the pedigree data ainv<-asreml.Ainverse(pedigree)$ginv We can visualise the (raw) pedigree using the drawPedigree function from the Pedantics package: # drawPedigree expects "id" "dam" "sire" # so change order of columns drawPedigree(pedigree[,c(1,3,2)]) Alternatively, we can visualise the pedigree in 3D space, so that dispersal can be seen. Here, the physical locations of individuals are plotted on the X and Y axes (locations on spatial grid). The Z-axis shows the generations (with the 1st generation at the bottom of the axis, and the 10th generation at the top). Individual points are coloured by sex (blue for males, red for females). Lines connect offspring to their mothers (red lines) and fathers (blue lines). Note that the image we show below is a 2D capture of the full 3D plotrunning the code below will produce a moveable plot wherein details are more evident.

Basic model
We show first a basic animal model, which includes only the additive genetic variance and residual variance, using the first phenotypic trait of tail-fin colour. Here, the only fixed effect is the mean/intercept of the reponse variable (trait_ae~1 ). A single random effect (additive genetic) is included (random=~ped(ANIMAL)). The identity of the individual in the data is linked to the inverse relatedness matrix using the ginverse argument (ginverse=list(ANIMAL=ainv)). This model estimates additive genetic effect variance, assuming that it has a mean of 0 and variance-covariance A * V a , such that the additive genetic effects are estimated according to the distribution a ∼ N (0, A * V a ). The covariance between individuals is determined by the additive genetic relatedness matrix (A, derived from the pedigree), # Basic model -estimates only additive genetic variance and residual variance m_basic_1<-asreml(fixed=trait_ae~1 , random=~ped(ANIMAL) , data=Mermaids, ginverse=list(ANIMAL=ainv), na.method.X="omit", na.method.Y="omit") We can see the estimated variance components by calling ). In this model, we fitted just one random effect, and so the variance is partitioned into two components. The ratio of the value of a component to its standard error (the z-ratio) gives us an indication of the significance of the random effects. As the z-ratio of V a is greater than 2 (ie. the parameter is more that two standard errors from zero), it looks to be highly significant.

Random effect significance
We can formally test the significance of random effects by rerunning the model with the random effect of interest removed, and comparing the log-likelihoods of the full and reduced model. We demonstrate this here for the above model only, but the same has been done to generate the tables of results for all the models we show in this tutorial (and has been done to generate the significance of random effects in the main paper).

Heritability
Heritability in the simplest case is calculated as the proportion of total variance explained by the additive genetic variance. As there are only two components of variance in this model, this is simply calculated as V a /(V a + V r ). This can be done directly with the nadiv package in R to obtain both the estimate and the standard error for heritability: nadiv:::pin(m_basic_1, herit~V1 / ( V1 + V2 ) ) ## Estimate SE ## herit 0.5378937 0.04249887 Table 1 shows the absolute and proportional values of the variance explained for the additive genetic and residual variance, generated using the codes shown above.

Including environmental similarity
In order to account for non-genetic similarity between individuals, we can fit additional random effects that account for variance in tail-fin colour. In the first instance, we show how this can be done using continuous environmental measures.
Each individual in the Mermaids data frame has 5 measurements corresponding to the 5 environmental variables, in the columns named e_var_1 to e_var_5.

rownames(enm)<-Mermaids$id
To obtain the environmental similarity between individuals, we first calculate the Euclidean distance between each pair of individuals in the data, calculated as d (i,j) = k 1 (p k − q k ) 2 where p and q are the scaled environmental measures (from environment measures 1 to k) of individuals i and j. Note that this is not the physical distance between the locations of individuals in space. We then scale this matrix of distances so that they are between 0 and 1. There must be 1's on the diagonal, as an individual has an identical environment to itself. Values closer to 0 show individuals with more dissimilar environments. Thus elements of the matrix S n are: where max(d) is the maximum euclidean distance found between all pairs of individuals. # similarity matrix between individuals # calculates the euclidean distance between each individual s environment parameters p<-as.matrix(dist(enm, method="euclidean", diag=TRUE,upper=TRUE)) # then scales so that the values are between 0 and 1 p<-  ID1  ID2  ID3  ID4  ID5  ID6  ID7  ID8  ID9 ID10  ## ID1  We then invert the matrix using the solve() function so that the matrix inverse can be used in ASReml-R. # gets the matrix inverse p<-as(solve(p),"dgCMatrix") p<-as.matrix(p) We can include this matrix in the animal model in an analagous manner to including the additive genetic relatedness. The identities of individuals in the phenotypic data are linked to the inverse of the environmental similarity matrix by inlcuding it in the ginverse, and the random effect is specified as giv(ANIMAL2).
Note that as these models take longer to converge, and require more memory to run, we have increased the maxiter (number of iterations), workspace and pworkspace arguments. As a consequence, these models may struggle to run on computers with limited memory. For example, the model containing only additive genetic variance takes around 3 seconds on a Windows 10 desktop with 32GB RAM and 3.40GHz CPU, using only around 2% of the computers CPU. The model below takes around 7 minutes, and uses uses around 15% of the CPU. As more random effects are included in the sections that follow, computing requirements increase futher.
For tail-fin colour, the model specification is: We can then see the variance components that have been estimated from this model (Table 3). This model estimates each variance to be close to 1, which was the variance that was simulated. Thus, the variances in tail-fin colour are accurately estimated using this model. As is shown in the main manuscript ( Figure 3), the inclusion of the environmental similarity matrix reduces the proportion of variance in additive genetic and residual components.

summary(m_env_1)$varcomp
As before, we can estimate the proportion of variance explained by the different components. For heritability: nadiv::: Giving a heritability close to the expected value of 0.33 (i.e. 1/3 of the total variance).

Body size and swimming speed
As before, we can run this model for the other two traits, to estimate the variance caused by the environments that individuals experience. For body size: m_env_2<-asreml(fixed=trait_mee~1 , random=~ped(ANIMAL) + giv(ANIMAL2), We see an estimate of environmental variance that is close to the simulated variance of 1. However, the additive genetic and residual variance estimates remain above the simulated values of 1. Thus, this model does not account for all the variance (as the trait also includes maternal effect variance).
Similar overestimation of variances will be found for an equivalent model of swimming speed. Again, this is because there are other sources of variance in the trait which are not accounted for in this model.

Parental effects
Parental effects occur when an individual's phenotype is partially determined by the environment provided by their parents. This could occur early in development, such as provisioning of micronutrients in eggs, or later, such as through post-natal food provisioning. Siblings raised together will typically experience similar parental effects, which can increase phenotypic similarity between relatives, beyond that caused by shared genetic effects. The effects that parents have on their own offspring can be driven by their own genes, or their own environments. Thus, we show here some different ways to include parental variance in the animal model. In all cases, maternal effects were simulated, and so maternal effect variances are estiamted in the models. Paternal effects, however, can also be estimated in the same way.
Both body size and swimming speed were simulated to have maternal effects on trait values. Specifically, body size is affected by maternal environmental effects, and swimming speed is affected by both maternal environmental and maternal genetic effects. We show first how maternal effect variance (with no assumption as to the cause of the maternal effect) can be included in the model, and then how specific maternal environmental and genetic variances can be estimated.
In the simplest case, maternal effect variance can be estiamted by including maternal identity as an additional random effect in the animal model. This accounts for variance shared by individuals that share a mother (full siblings and maternal half-siblings), above that which is due to additive genetic variance. The maternal effect is assumed to be normally distributed, with mean=0 and variance= V M . There are no covariances between mothers, thus the variance-covariance matrix of this effect is the identity matrix (I).
To separate maternal effect variance from additive genetic variance, the data needs to have related individuals that do not share mothers (e.g. paternal half-siblings, cousins, or cross-fostered individuals). Equivalent models can be run for paternal effect variances (not shown in this tutorial), which requires there to be related individuals that do not share fathers (e.g. maternal half-siblings).
As the maternal effect is derived from the environment that mothers experience, and limited dispersal means that related mothers experience similar environments, related individuals experience similar maternal effects. Thus, we expect that mothers covary in their effects on offspring trait due to their shared environments, and we need to account for this covariance.
And for swimming speed: m_dam_3<-asreml(fixed=trait_mgee~1 , random=~ped(ANIMAL) + giv(ANIMAL2) + dam, data=ped, ginverse=list(ANIMAL=ainv, ANIMAL2=p), na.method.X="omit", na.method.Y="omit", workspace=300e6,pworkspace=300e6, maxiter=50) Again, there is an overesmtation of additive genetic variance. Maternal variance is estimated to be 1.3, but this underestimates the total amount of maternal variance, as there are both environmental and genetic maternal effects, each simulated with a variance of 1. As for body size, relatives have mothers that share environments, and so experience similar maternal environmental effects. Additionally, swimming speed includes a maternal genetic effect, so relatives have mothers that share genes affecting their maternal effects. Thus, covariance between mothers in their effects on offspring is caused both by environmental similarity and additive genetic relatedness. Although they are not simulated here, traits measured in real populations may have permanent maternal environment effects, in which case the maternal effect variance estimation above (V m ) should be retained.

Maternal environmental variance
We can estimate maternal environmental variance in the same way that we estimated direct environmental variance. This makes use of the S n matrix, where each element describes the similarity of environment that two individual's experience. The maternal environmental variance is estimated with mean 0 and variance V M n , with the variance-covariance matrix determined by the S n matrix, u Mn ∼ N (0, S n * V M a . Here, instead of considering the similarity in the trait between individuals i and j with environmental covariance S n(i,j) , we instead consider the similarity in the trait between offspring of mothers k and l with environmental covariance S n(k,l) .

Maternal genetic variance
We know from the simulation of swimming speed that there is a maternal genetic effect upon this trait. Neglecting this effect causes overestimation of additive genetic variance (because relatives share both direct genetic effects, and can share maternal genetic effects that they experience). We therefore need to include maternal genetic variance in the model of this trait. In this case, we include a random effect with the variance-covariance of effects being determined by the additive genetic relatedness matrix (A), as was done for the additive genetic variance. The difference here being that the variance is considered at the level of the mother, rather than at the level of the individual, such that for individual i with mother j the matrix is used to find the relatedness A (j,k) , i.e. the relatedness between the mother j and another individual k. Thus the random effect for maternal genetic variance is assumed to be normally distributed with a mean of zero and variance A * V M a (i.e. u Ma ∼ N (0, A * V M a )).
To estimate maternal genetic variance in the model, we include the maternal identity (MOTHER) as a random effect, where this then has the variance-covariance structure determine by the inverse additive genetic relatedness matrix in the ginverse argument. Note that this model takes around 11 minutes to run on the same computer as before.

Including epiallelic similarity
As well as environmental variables, we simulated epigenetic marks (CpG islands) for all individuals (described in the data simulation supplement). For each epigenetic island, the number of methylated CpG points varied between 0 and 20, with some determined by the environment an individual experienced.
As some of the epigenetic states varied according to the environments, they should account for variance in the phenotypic traits. We can therefore use them here to estimate (direct and maternal) epigenetic variance. We do this in a similar way to which we estimate environmental variance, with a matrix of epigenetic similarity between individuals.
In order to generate the epiallele similarity matrix, we first center and scale the epigenetic data, so that each island has a mean of 0 and variance of 1, as was done for environmental measures: # mean centre, scale to have variance of 1 epl<-scale(epiallele, center=TRUE, scale=TRUE) From the scaled epigenetic data, we generate the similarity matrix by estimating the Euclidean distance between all individuals. We then normalise values so that similar individuals have values closer to 1, and dissimilar have values closer to 0: EP<-as.matrix(dist(epl, method="euclidean", diag=TRUE,upper=TRUE)) EP<-1-EP/(max(EP, na.rm=TRUE)) round(EP[1:10,1:10], digits=3) ##  ID1  ID2  ID3  ID4  ID5  ID6  ID7  ID8  ID9 ID10  ## ID1

# The matrix inverse is then taken for use in the model. EP<-as(solve(EP),"dgCMatrix") EP<-as.matrix(EP)
We can then include the model of epigenetic similarity between individuals, which accounts for its effects on the direct and parental environmental effects. For tail-fin colour we estimate the additive genetic variance and the direct epigenetic variance: m_epiallele_1<-asreml(fixed=trait_ae~1 , random=~ped(ANIMAL) + giv(ANIMAL2), data=Mermaids, ginverse=list(ANIMAL=ainv, ANIMAL2=EP), na.method.X="omit", na.method.Y="omit", workspace=300e6,pworkspace=300e6, maxiter=50) The results from this model then show that the epigenetic similarity explains a significant amount of variance.  The epigenetic similarity matrix estimates significant direct epigenetic variance in both traits, and maternal epigenetic variance for body size, although this underestimates true maternal variance.

Including a social network
We simulated a social network wherein individuals had connections within generations and between adjacent generations. Connections between generations were more likely for individuals that were close in space, and decreased with increasing distance. Between adjacent generations there was a higher probability of connections between mothers and offspring than for other pairs of individuals, again with decreasing probability of connections with spatial distance.
We can visualise the social network in 3D space, in a similar way to the pedigree. The physical locations of individuals are plotted on the X and Y axes (locations on spatial grid), and generations on the Z axis. Individual points are coloured by sex (blue for males, red for females). Each line represents an (unweighted) connection in the social network. Running the code below will produce a moveable 3D plot. We show the equivalent plot for just the first 3 generations below.  c(z1,z2), y = c(y1,y2), x = c(x1,x2), col = "grey85", lwd =1) } } Here, we show how social network links between individuals can be converted to a matrix of distances between individuals in the network, and passed to the model to account for social structure and possible cultural inheritance. Initially, we carry this out using unweighted distances between individuals: individuals that have a direct connection have 1 in the matrix, and all other individuals have 0. ### Go from weighted to unweighted distances social2<-social social2[which(social2>0)]<-1 From this, we then want to find the geodesic distances between individuals. This is the length of the shortest path between individuals in the matrix, such that directly connected individuals have a path length of 1, individuals with one intermediate connection get a path length of 2, and so on # gets geodesic distances between individuals # (ie number of steps on the matrix to connect them) net<-graph.adjacency(social2, mode = c( "undirected"), weighted = NULL, diag=FALSE) net2<-distances(net) net2[1:10,1:10] As we want individuals to be connected to themselves, we add 1 to every element of the matrix, so that the shortest paths are from an individual to itself.
We then normalise the matrix so that values lie between 1 and 0, which gives individual proximities between two individuals, with 1 being the closest proximity and 0 the furthest. # add 1 so individuals are connected most strongly to themselves net3<-net2+1 # scale matrix between 0 and 1 net4<-1/net3 We then take the matrix inverse, as with the other matrices. # inverse N<-as(solve(net4),"dgCMatrix") N<-as.matrix(N) This matrix can then be passed to the model to estimate the variance associated with the social network connections between individuals. For tail-fin colour: m_social_1<-asreml(fixed=trait_ae~1 , random=~ped(ANIMAL) + giv(ANIMAL2) , data=Mermaids, ginverse=list(ANIMAL=ainv, ANIMAL2=N), na.method.X="omit", na.method.Y="omit", workspace=300e6,pworkspace=300e6, maxiter=50) The results from this model show that there is a small but significant amount of variance explained by the direct social network effects, but it does not account for all the variance, so the additive genetic and residual variances remain overestimated. Again, whilst this estimates significant social network variance for both direct and maternal cases, these underestimate the true environment variance. This is because of the way the social network was simulated, compared to the actual simulations structure, which was based on environmental variables.

Including a weighted social network
In the example shown above, we reduced the social network to a binary state; individuals were either linked directly or not, and from that we calculated path lengths. Researchers may want instead to include weighted paths, for example with numbers representing the number of times a pair of individuals was recorded interacting.

Using a genomic relatedness matrix
Some readers may wish to use a genomic relatedness matrix (GRM), such as can be created using SNP chips, rather than a pedigree. These can be accomodated in such models easily. We have not simulated one here, but give code that could be used if one exists. Where users have a GRM matrix, it should be a matrix of values between 1 and 0 that give the proportion of genome (e.g. SNPs) shared between each pair of individuals. # read in matrix from file GRM<-read.

Running models in MCMCglmm
Some readers may wish to use MCMCglmm, a freely available software for running Bayesian mixed models, to run these models. Generally, these models have a similar set up to those shown above, with a few differences in the way that the pedigree and similarity matrices are prepared for the models.
It should be noted that these models are considerably slower to run in MCMCglmm than in ASreml-R. Due to this time constraint, we would recommend that researchers consider carefully whether they wish to use MCMCglmm or ASreml-R. For the models including a non-genetic similarity matrix we found that 1000 iterations took around 24 hours to run on a windows computer with 32GB of RAM and 3.40GHz CPU. Evidently, a more powerful computer would reduce this time to some extent.
As such, we include a minimal tutorial here, with models run for only a small number of iterations, to demonstrate how to utilise such models. To carry out a complete analysis using this tutorial, however, we recommend increasing the number of iterations, the burn-in period, and the thinning interval used. This is done by changing the arguments nitt, burnin, and thin in the model codes. MCMCglmm has a default of 13000, 3000, and 10 for each of these measures. We recommend increasing these considerably to carry out robust analyses.
MCMCglmm requires priors to be passed to models. We do not intend this tutorial to be a complete guide to priors, and we include simple priors for the models here. For those who are interested, we recommend reading Hadfield (2016) and Villemereuil (2012).

Additive genetic relatedness
In order to run the model, we then must take the inverse of the relatedness matrix between individuals. Note that this uses a different function than the equivalent function for models run in ASreml-r. # gets pedigree inverse for the model # first get the pedigree from the data pedigree<-Mermaids[,c("id","dam","sire")]

Basic model
We show first a basic animal model, which includes only the additive genetic variance, and residual variance, on the trait. Here, we fit only the mean/intercept of the reponse variable, by calling trait_ae~1 , and fit the single random effect of the pedigree with random=~ANIMAL. This is linked to the inverse relatedness matrix with the ginverse argument (ginverse=list(ANIMAL=ainv)).  The values given under G-structure in the model summary are the values for the random effects (in this case additive genetic variance, the ANIMAL term) and R-structure gives the residual variance estimates (units).

# Basic model -includes only additive genetic variance
The posterior means of estimates (post.mean) are given, along with the lower (l-95% CI ) and upper (u-95% CI ) limits of the 95% confidence intervals for these estimates. eff.samp shows the effective sample size for each parameter. Ideally these should be over 1000, thus the values for the random and residual variance estimates are low. This can be improved by running the model for a longer period of time, and increasing the thinning interval to reduce autocorrelation. This is done by altering the nitt (number of iterations) argument, burnin (the burn-in period, which is discarded) and thin (the thinning interval -how often iterations are sampled). Be aware that this means that the model runs for a longer period of time: # Basic model -includes only additive genetic variance prior1 <list(G = list(G1 = list(V = 1, nu = 0.002)), R = list(V = 1, nu = 0.002)) MCMC_basic2<-MCMCglmm(trait_ae~1 , random=~ANIMAL , data=Mermaids, ginverse=list(ANIMAL=Ainv), prior=prior1, nitt=13000*10, thin=10*10, burnin=3000*10) This gives effective sample sizes that are much larger, although there is not much change to the vraiance estimates.

Including environmental similarity
In order to account for non-genetic similarity between individuals, we can fit an additional random effect that account for variance in tail-fin colour. We show how this can be done using continuous environmental measures, as above.
As with the matrix shown for the ASreml-R models, we first scale and centre the data, so that the mean is 0 and variance is 1 for each of the environmental measures. The Euclidean distances between individuals are then taken, and values normalised to be between 0 and 1, just as above. # First gets the relevant columns from the data enm<-Mermaids[,c("e_var_1", "e_var_2", "e_var_3", "e_var_4", "e_var_5")] # Centers the data, and scales by standard deviation enm<-scale(enm, center=TRUE, scale=TRUE)

# similarity matrix between individuals # gets the euclidean distance between each individual s environment parameters
p<-as.matrix(dist(enm, method="euclidean", diag=TRUE,upper=TRUE)) # then scales so that the values are between 0 and 1 p<-1-p/max(p, na.rm=TRUE) We then need to pass the matrix inverse to MCMCglmm, so it is inverted using the solve() function. This is the same as the way we generate the matrix for ASreml-R, but we do not need the additional step of changing back into a matrix (as.matrix(p) argument is included after this step for ASreml-R only). # gets the matrix inverse p<-as(solve(p),"dgCMatrix") The full model that includes additive genetic and environmental variance is then run with the following code. The environmental similarity matrix is now passed as an additional argument to the ginverse, and connected to the phenotypes using the giv(ANIMAL2). Before starting this model, it should be noted that this is considerably slower than the models above -this took over 7 hours to run on a computer using 32GB of RAM and 3.40GHz CPU. The model below is run with the default number of iterations, thinning interval, and burn-in period, but o give reliable estimates these parameters should be increased further, which will increase the running time. # Environmental similarity model MCMC_env<-MCMCglmm(trait_ae~1 , random=~ANIMAL + ANIMAL2 , data=Mermaids, ginverse=list(ANIMAL=Ainv, ANIMAL2=p), prior=prior2) The variance estimates are now both around 1 (ANIMAL is additive genetic variance as before, ANIMAL2 is environmental variance): Nevertheless, the effective sample sizes are small, particularly for the environmental variance (ANIMAL2 ). Thus, the nitt, thin and burnin values should be increased. We do not present that here due to time constraints (for example, increasing the iterations by 10 as shown above will increase the run time to around 70 hours).

Including parental environmental variance
Finally, maternal environmal variance can also be included in the model, in the same way as the ASreml models shown above. We run this model only for 1000 iterations sing this took around 1.5 hours. In reality this model should be run much longer (likely upwards of 130000 iterations). # Environmental similarity model