Generalized structural equations improve sexual-selection analyses

  • Sonia Lombardi,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy, National Research Council-ISC (The Institute for the Complex Systems), Sesto Fiorentino, Florence, Italy

  • Giacomo Santini,

    Roles Conceptualization, Funding acquisition, Methodology, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy

  • Giovanni Maria Marchetti,

    Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Software, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, Computer Science, Applications, University of Florence, Florence, Italy

  • Stefano Focardi

    Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation National Research Council-ISC (The Institute for the Complex Systems), Sesto Fiorentino, Florence, Italy



Abstract

Sexual selection is an intense evolutionary force, which operates through competition for access to breeding resources. There are many cases where male copulatory success is highly asymmetric, and a few males obtain most of the copulations. Two main hypotheses have been proposed to explain this asymmetry: “female choice” and “male dominance”. The literature reports contrasting results. This variability may reflect actual differences among the studied populations, but it may also be generated by methodological differences and statistical shortcomings in data analysis. A review of the statistical methods used so far in lek studies shows a prevalence of Linear Models (LM) and Generalized Linear Models (GLM), which may be affected by problems in inferring cause-effect relationships, multicollinearity among explanatory variables, and erroneous handling of non-normal and non-continuous distributions of the response variable. In lek breeding, selective pressure is maximal, because large numbers of males and females congregate in small arenas. We used a dataset on lekking fallow deer (Dama dama) to contrast the methods and procedures employed so far, and we propose a novel approach based on Generalized Structural Equation Models (GSEMs). GSEMs combine the power and flexibility of both SEM and GLM in a unified modeling framework. We showed that LMs fail to identify several important predictors of male copulatory success and yield very imprecise parameter estimates. Minor variations in data transformation yield wide changes in results, and the method appears unreliable. GLMs improved the analysis, but GSEMs provided better results, because the use of latent variables decreases the impact of measurement errors. Using GSEMs, we were able to test contrasting hypotheses and calculate both direct and indirect effects, and we reached a high precision of the estimates, which implies a high predictive ability.
In summary, we recommend the use of GSEMs in studies on lekking behaviour, and we provide guidelines to implement these models.


Introduction

Sexual selection is a fundamental evolutionary force that operates either through (i) direct competition between males or (ii) female mate choice, which leads to the evolution of exaggerated and useless ornaments in males (e.g. the peacock’s tail). The ornaments are supposed to display male genetic quality or the absence of sexually transmissible diseases [1]. Although a long record of studies since Darwin’s time has addressed this problem, many questions about sexual selection remain open, and this continues to be a major research theme. For the present contribution, the main question is how to investigate the factors affecting male copulatory success in lek mating. In lekking species, the two sexes interact mainly during the rut [2, 3], when males defend small display territories inside an arena or lek. For males, lekking is a high-cost, high-benefit strategy, in which the risk of injuries and even death is high, but a few dominant males may monopolize most of the copulations [4, 5]. On the other hand, females are supposed to benefit from visiting a lek, since they can choose among several potential partners [6, 7].

Lekking has been described in many different taxa (reviewed by Hoglund & Alatalo [3]) such as insects, fishes, amphibians, reptiles, birds [3, 8, 9, 10] and mammals [2, 7, 11]. In leks, male mating success is highly skewed [12]. However, this feature is not unique to leks and is found in other reproductive systems as well [13].

As a specific example, in fallow deer (Dama dama) [11, 14], the breeding system is highly variable and lekking is not the only strategy [5, 15]. Independently of the breeding system, the skew of male copulatory success in this species always appears very high. Two main hypotheses have been proposed to explain the observed asymmetry in copulatory success: female choice (FCH) and male dominance (MDH) [16, 17, 18]. FCH [19] assumes that females select mates on the basis of the phenotypic traits of males, while according to MDH copulatory success is determined by lek attendance and a high dominance rank [20]. In fallow deer, several studies pointed out that female choice is the most likely determinant of copulatory skew [15, 21, 22, 23]. However, Clutton-Brock et al. [11] argue that copulatory success may not be solely related to female preferences for specific male traits, but may also arise for different reasons, such as the need to minimize the risk of predation or harassment. Other authors, on the contrary, suggested that copulatory success strictly depends on male dominance rank [5, 24, 25, 26, 27, 28].

A number of different statistical techniques have been used to investigate copulatory success in lekking species (e.g. [12]). Most papers have applied standard linear models (e.g., [12, 29, 30]), mixed models to account for repeated observations (e.g. [31, 32]), or Generalized Linear Models to manage non-normal distributions. Finally, a few papers have used different approaches, such as logistic regression [33], path analysis [34], and partial correlations [35]. A detailed list of the methods used in the literature is reported in S2 Table. A critical reading of this literature brings to light several methodological shortcomings: (i) multicollinearity among explanatory variables [35], (ii) erroneous handling of non-normal and non-continuous distributions of the response variable, and (iii) problems in inferring cause-effect relationships, so that no firm decision on the prevalence of female choice or male dominance could be established [34].

Multicollinearity, which occurs when two or more predictors in a multiple regression model are highly correlated, leads to variance inflation and increases type-I errors, thus making some of the coefficients appear significant when they are not [36].
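As a minimal sketch of how variance inflation can be quantified, the variance inflation factor (VIF) of a predictor in a two-predictor model equals 1/(1 − r²), where r is the correlation between the two predictors. The data below are simulated for illustration only; the variable names `dom`, `ds`, and `unrelated` are hypothetical and are not the study's measurements.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
dom = [rng.gauss(0, 1) for _ in range(200)]        # simulated index
ds = [d + rng.gauss(0, 0.3) for d in dom]          # near-duplicate of dom
unrelated = [rng.gauss(0, 1) for _ in range(200)]  # independent predictor

vif_collinear = vif_two_predictors(dom, ds)
vif_independent = vif_two_predictors(dom, unrelated)
print(f"VIF, collinear pair:   {vif_collinear:.1f}")
print(f"VIF, independent pair: {vif_independent:.2f}")
```

Two indices that measure nearly the same trait produce a large VIF, whereas unrelated predictors give a VIF close to 1.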

Another important source of bias depends on erroneous handling of non-normal and non-continuous distributions of the response variable. Copulatory success is a classic example of such a variable; in leks, only a few males have access to mating, and this process leads to a zero-inflated distribution of copulations. In many cases, this problem is dealt with using square root or logarithm transformations [12, 33, 35, 37], but although this procedure is recommended in general biometry textbooks (e.g., [38]), its validity is restricted to cases where deviations from normality are limited. Moreover, discrete response variables containing many zeros cannot be transformed into normal distributions, and inference is doomed to be severely biased [39, 40].
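The following sketch illustrates why a transformation cannot normalize a count variable with many zeros. The counts are hypothetical, chosen only to mimic a zero-heavy distribution. Any monotone transformation maps all zeros to one and the same value, so the spike remains, and the distance between the zero class and the first positive class depends strongly on the arbitrary constant c in log(y + c).

```python
import math

# Hypothetical copulation counts for ten bucks (not the study's data).
counts = [0, 0, 0, 0, 0, 0, 0, 1, 3, 12]


def log_shift(y, c):
    """Log transformation with an added constant c, as in log(y + c)."""
    return math.log(y + c)


def share_at_mode(values):
    """Fraction of observations that sit on the single most frequent value."""
    most_common = max(set(values), key=values.count)
    return values.count(most_common) / len(values)


zero_share_raw = share_at_mode(counts)
spike_after_transform = {}
gaps = {}
for c in (1.0, 0.5, 0.1):
    transformed = [log_shift(y, c) for y in counts]
    # The spike of zeros survives the transformation unchanged.
    spike_after_transform[c] = share_at_mode(transformed)
    # Distance between "no copulations" and "one copulation" on the new scale.
    gaps[c] = log_shift(1, c) - log_shift(0, c)

print(zero_share_raw, spike_after_transform)
print({c: round(g, 3) for c, g in gaps.items()})
```

The gap between zero and one copulation grows as c shrinks, so the choice of c alone changes how much leverage the zero class has in a least-squares fit.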

There are concerns related to the link between correlations and causation, which are tricky to deal with. Explanatory variables and copulatory success may, in fact, appear unrelated when they are related, or on the contrary, they may be correlated even when no causal link is present. A spurious or missing correlation may arise for several reasons which include (i) a common causation that induces a false relationship or cancels out an existing association, (ii) a reciprocal association loop, (iii) a conditional relationship between explanatory and response variables following the value of a third control variable, or (iv) a non-linear association between dependent and independent variables [41, 42, 43, 44]. When a correlation between two variables is detected, cause-effect relationships cannot be easily deduced without further assumptions [45,41]. The best way to test causal relationships is to use a proper experimental design where the hypothetical cause is directly manipulated [45]. However, manipulative experiments are difficult to achieve, and researchers have to rely mainly on observational studies [12, 3, 46].

The problem of inferring cause-effect relationships among variables can be addressed by path analysis or Structural Equation Models (SEM) [47]. In field studies, the variables of interest often cannot be directly recorded by the observers. For instance, we cannot measure the “sex appeal” of males [48]. However, we can measure some traits we expect to be correlated with “sex appeal” and so obtain an indirect evaluation of the variable of interest. This is analogous to what is done in principal component analysis: a reduced number of meaningful factors is estimated from the correlations among a large number of descriptors. In SEM terminology, we refer to the unobservable factors as latent and to the observed descriptors as manifest (a detailed discussion is presented in S3 Text and in S3 Fig). A SEM is a combination of a measurement model, which defines latent variables using one or more manifest variables, and a structural model, which imputes causal relationships between latent variables [41]. The development of a measurement model is also important to control for the errors introduced during observations, i.e., it represents a state space model for the unobserved variables of interest. In this way, a latent variable is not directly observed, but its existence is inferred from the way it influences manifest variables that can be directly observed [41].

One known limitation of standard SEM is the assumption that all variables are normally distributed [49]. The introduction of Generalized Structural Equation Models (GSEM) may overcome this limitation. In GSEM, it is possible to have a model with both continuous and discrete variables grouped together in the same latent construct. As such, GSEM combines the power and flexibility of both SEM and GLM in a unified modeling framework. The advantages of GSEM are: (i) the evaluation of potential causal relationships through the “structural model”; (ii) the simultaneous consideration of both direct and indirect effects of multiple interacting factors [41, 47, 50, 51]; (iii) the possibility of using appropriate probability distributions other than the normal one for manifest indicators and latent constructs.

In this paper, we contrast the main statistical methods used in the literature with GSEM, using data from a specific case study on fallow deer lekking behaviour. First, we reviewed the available literature on lekking behaviour to obtain an overview of the statistical methods used. Second, we fitted the main types of models used. Third, within a SEM framework, we formulated two models, one describing the FCH and the other the MDH hypothesis, and fitted them using both SEM and GSEM, for comparison. Finally, we compared the predictive performances of the different methods using information theoretic indexes (AIC and BIC), residual analysis, and precision of regression coefficients.

Materials and methods

Study area and data collection

Field observations were carried out during the 1991 and 1992 ruts (September-October) in the Preserve of Castelporziano near Rome (Italy) (coordinate), an area covering 42 km². The habitat is characterized by an old-growth natural oak wood, with both evergreen (Quercus ilex and Q. suber) and deciduous (mainly Q. cerris and Q. frainetto) tree species. A detailed description of the vegetation of the study area can be found in Bianco et al. [52]. Information on ungulate populations is given in Focardi et al. [53] and Imperio et al. [54]. The dataset was used to estimate two different dominance indexes: (a) Dom [55]; (b) David’s score, Ds [56]. To obtain index values comparable across years, Dom and Ds were relativized to the number of fights observed in each year. The number of observed copulations achieved by a buck in one rut was used as a measure of copulatory success (CopS).

Two measures of lek attendance were computed: LA1 is the total number of days on which an animal was seen at the lek, and LA2 is the number of days on which the animal was able to hold a territory. Finally, we estimated the average number of females observed in one buck’s territory (harem size, HS) and courtship success (CourtS) as the number of courtships terminated with a copulation divided by the total number of attempts (number of copulations / number of courtship events, for every male).

Two antler variables were used: (a) the total number of spellers (TotS) and (b) a measure of fluctuating asymmetry for small spellers (ASST) [57, 58].

Further details on study area, data collection, data validation and measures computations are provided in S1 Text, S1 Table, S1 and S2 Figs in Supporting information.

Ethics statement

This work did not involve animal handling or capture. The “Segretariato alla Presidenza della Repubblica” was the authority responsible for the permission to work in the Preserve of Castelporziano, Rome (Italy). The fieldwork was based on a research and management agreement between the I.S.P.R.A. (the Italian National Institute for Environmental Protection and Research, formerly I.N.F.S., National Institute for Wildlife; the former institution of SF, 1988–2011), the Director of the Preserve of Castelporziano, Dr. A. Demichelis, and the Preserve research manager, Dr. A. Tinelli, in collaboration with the Presidential Estate rangers and the Corpo Forestale dello Stato (C.F.S.), under the combined prescriptions of the Italian law that regulates studies on wild species and does not require the I.S.P.R.A. to obtain permits from any other authorities. The field study did not involve endangered or protected species, and therefore no approval from an Institutional Animal Care and Use Committee was required. The study was not carried out on private land.

Statistical analysis

We compared several modelling approaches described in the literature. We were aware that some of these approaches are inherently flawed, but we decided to use them due to their widespread use in the pertinent literature on leks (cfr. S2 Text and S2 Table). All the tested models have CopS as the response variable. Note that CopS is discrete by definition (because it is a count) and hence cannot be assumed to be normally distributed.

Linear Models and Generalized Linear Models

The copulatory success of the i-th buck (CopS_i) is modelled as:

\[ \mathrm{CopS}_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \dots + \beta_p x_{p,i} + \varepsilon_{\mathrm{CopS}} \qquad (1) \]

where the x_{p,i} are predictor variables, the βs are regression coefficients and ε_CopS is the error term.

Following the approaches described in the literature, we first used ordinary least squares regression where the response variable CopS was untransformed, log-transformed, or square-root transformed. Secondly, we used GLMs for count data. The following models were considered:

LM1, multiple regression model without CopS transformation;

LM2 where the dependent variable is log(CopS +1);

LM3 where the dependent variable is log(CopS +0.5);

LM4 where the dependent variable is log(CopS +0.1);

LM5 where the dependent variable is CopS^0.5 (square root);

GLM1, Generalized Linear Model where CopS follows a Poisson distribution;

GLM2, assuming that CopS follows a Negative Binomial distribution;

GLM3, assuming that CopS follows a Zero Inflated Poisson distribution (ZIP);

GLM4, assuming that CopS follows a Zero Inflated Negative Binomial distribution (ZINB);

GLM5, assuming that CopS follows a Hurdle at Zero Distribution (Hurdle). In Hurdle models, a Bernoulli probability governs the binary outcome of whether a count variable has a zero or a positive realization. When the realization is positive, the conditional distribution is modelled by a zero-truncated count data model.
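The count distributions listed above differ mainly in how they generate zeros. The sketch below writes out the probability mass functions of the Poisson, negative binomial, zero-inflated Poisson, and Poisson hurdle models in plain Python. The parameter names (`lam`, `mu`, `size`, `pi_zero`, `p_zero`) and their values are illustrative assumptions, and the sketch is not the code used with the R packages cited in the paper. A zero-inflated negative binomial follows the same pattern as the ZIP, with the negative binomial as the base distribution.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k events with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def negbin_pmf(k, mu, size):
    """Negative binomial with mean mu and dispersion parameter size."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson: a structural zero with probability pi_zero,
    otherwise an ordinary Poisson draw (which can also be zero)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


def hurdle_pmf(k, lam, p_zero):
    """Poisson hurdle: a Bernoulli trial decides zero versus positive;
    positive counts follow a zero-truncated Poisson."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))


# Illustrative parameter values only.
lam, mu, size, pi_zero, p_zero = 3.0, 2.0, 0.5, 0.6, 0.686
support = range(400)
totals = {
    "poisson": sum(poisson_pmf(k, lam) for k in support),
    "negbin": sum(negbin_pmf(k, mu, size) for k in support),
    "zip": sum(zip_pmf(k, lam, pi_zero) for k in support),
    "hurdle": sum(hurdle_pmf(k, lam, p_zero) for k in support),
}
print(totals)
```

In the hurdle model all zeros come from the Bernoulli part, whereas in the zero-inflated model zeros arise both from the structural-zero part and from the count distribution itself.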

For each type of model we considered both the full model, which includes all significant (P<0.05) and non-significant coefficients and the Minimal Adequate Models (MAM) which include only significant values [59]. MAMs, hereafter denoted by the suffix r (e.g. GLM4,r) were obtained using a p-value selection procedure [60].

Akaike information criterion (AIC) and Bayesian information criterion (BIC) were also computed to assess model performances.
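For reference, both criteria are simple functions of the maximized log-likelihood: AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, with k the number of estimated parameters and n the number of observations. The sketch below applies them to an intercept-only Poisson model fitted to hypothetical counts; the data and function names are illustrative assumptions, not part of the original analysis.

```python
import math


def poisson_loglik(counts, lam):
    """Log-likelihood of independent Poisson counts with common mean lam."""
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in counts)


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


# Hypothetical copulation counts (not the study's data).
counts = [0, 0, 0, 0, 0, 0, 0, 1, 3, 12]
lam_hat = sum(counts) / len(counts)  # maximum likelihood estimate of the mean
loglik_hat = poisson_loglik(counts, lam_hat)

model_aic = aic(loglik_hat, k=1)
model_bic = bic(loglik_hat, k=1, n=len(counts))
print(f"lambda = {lam_hat:.2f}, AIC = {model_aic:.2f}, BIC = {model_bic:.2f}")
```

Lower values indicate better support; BIC penalizes each additional parameter more heavily than AIC whenever n exceeds about 7.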

Statistical analysis was carried out in R [61], using the packages fitdistrplus, gamlss, pscl, vcd.

Generalized Structural Equation Models.

A SEM requires the a priori definition of links among model variables in the form of a system of regression equations. The goal of this class of models is to minimize the difference between the model-implied and the observed variance-covariance matrices of the data.

Latent variables are unobserved factors, denoted η1, η2, …, ηn, that represent hypothetical constructs which can be inferred from the way they influence manifest or observed variables (continuous, Yi = y1, y2, …, yn) [41, 51].

A SEM is composed of two sub-models: a measurement model that describes the relationships between latent variables and their manifest variables, and a structural or causal model that constitutes a directional chain system describing the hypothetical causal relationships between the constructs of theoretical interest (latent variables) using path diagrams (Fig 1a and 1b).

Fig 1. Path diagrams for a) the “male dominance” model (MDH) and b) the “female choice” model (FCH).

Variable names are: ASST = the fluctuating asymmetry of small antler’s spellers; TotS = total number of small and large antler’s spellers; Dom = Dominance Index (Clutton-Brock Index [55]) divided by the total number of bucks of each year; Ds = the David’s score (Gammel et al. [56]) divided by the total number of bucks of each year; LA1 = number of days in which the animal was present in the lek; LA2 = total number of days of presence/territory in different locations of the same lek; HS = average number of females in a male’s territory; CourtS = the fraction of courtship events terminated with a copulation (number of copulations / number of courtship events, for every male); CopS = total copulatory success of the i-th buck in one rut. The number of observations is the same for all models (N = 118). Symbols and variables are described in the text and in S1 Table.

Structural or regression coefficients (γ, β, λ) represent the effect of each independent variable on the dependent variable (Fig 1a and 1b).

A manifest variable, in a SEM with latent variables, plays the role of an endogenous variable if it is predicted by another variable in the model and is therefore a response variable; it is assumed to be generated as a linear function of its latent dimension, and the residual error term represents the imprecision in the measurement process. An exogenous variable is one whose variation is not explained within the model (e.g. the fluctuating asymmetry of small spellers, ASST, or Dom). A description of SEM modelling is reported in S3 Text, S3 Table and S3 Fig.
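For readers unfamiliar with the notation, the linear SEM described in this section is commonly written in the standard LISREL form shown below. This is a textbook formulation given here as an assumption about the notation; it is not copied from S3 Text. B and Γ hold the structural coefficients (β, γ) among latent variables, Λ_y and Λ_x hold the loadings (λ) of the measurement model, and ζ, ε, δ are error terms.

```latex
\begin{aligned}
\eta &= B\,\eta + \Gamma\,\xi + \zeta
  && \text{(structural model: latent endogenous on latent variables)} \\
y &= \Lambda_y\,\eta + \varepsilon
  && \text{(measurement model for indicators of } \eta\text{)} \\
x &= \Lambda_x\,\xi + \delta
  && \text{(measurement model for indicators of } \xi\text{)}
\end{aligned}
```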

GSEMs represent a generalization of SEMs by allowing the use of discrete variables and non-Gaussian distributions. They combine observed (or manifest) and latent variables representing unmeasured constructs. A GSEM [62] reads:

\[ \eta = f_{\eta}(\eta, \xi, \zeta), \qquad y = f_{y}(\eta, \varepsilon), \qquad x = f_{x}(\xi, \delta) \qquad (2) \]

where x and y are vectors of manifest variables and η, ξ, ζ represent the latent variables, while δ and ε denote the error terms. The functions (fη, fy, fx) provide a general way to represent the connections between the variables within the parentheses and those on the left-hand side of each equation. We developed and compared two different causal models, one assuming that copulatory success is determined by MDH and the other one based on FCH.

We verified that the parameters are identifiable according to rules 1 and 3 of Shipley [41]. We used a robust maximum likelihood estimator and a sandwich estimator [63]. We fitted GSEMs with both Mplus [64] and STATA [65], using both software packages to check that the results were identical. Furthermore, STATA provides case-specific residuals, which are not output by Mplus. On the other hand, Mplus returns the standardized path coefficients and the total, direct, and indirect effects, which STATA does not compute. The STATA and Mplus codes used to generate SEM and GSEM models are presented in S4 Text.

Models’ comparison.

Unfortunately, there is no simple method for comparing these different sets of models. GLMs and LMs can be compared by AIC or BIC, but only if the dependent variable is not transformed [66]. To overcome this problem and make all LMs and GLMs comparable, we calculated the maximum likelihood estimates from the log-transformed or square-root-transformed model by applying the formula reported in Weiss [67] (see S5 Text for details).
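The usual way to place the likelihood of a transformed-response model on the original scale is the change-of-variables (Jacobian) adjustment sketched below. We assume, without having verified it against S5 Text, that the formula in Weiss [67] is of this type. Here z = g(y) is the transformed response, f_Z is the normal density fitted to z, and c is the constant added before taking logarithms.

```latex
% General adjustment for a monotone transformation z = g(y)
\ell_Y(\theta) = \sum_{i=1}^{n} \log f_Z\bigl(g(y_i);\theta\bigr)
               + \sum_{i=1}^{n} \log\bigl\lvert g'(y_i)\bigr\rvert

% Special case g(y) = \log(y + c), for which g'(y) = 1/(y + c)
\ell_Y(\theta) = \ell_Z(\theta) - \sum_{i=1}^{n} \log(y_i + c),
\qquad
\mathrm{AIC}_Y = \mathrm{AIC}_Z + 2\sum_{i=1}^{n} \log(y_i + c)
```

Because the correction term depends directly on c, the adjusted AIC of a log(y + c) model changes with the arbitrary choice of the constant.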

The comparison of SEM or GSEM with AIC is questionable due to the presence of latent variables, which increase AIC values, making these models not comparable to GLMs [68]. On the other hand, the use of absolute fit indexes is vulnerable to criticism [69, 70]. We compared models by two different approaches. First, we measured the precision of each estimated regression coefficient by computing its coefficient of variation (CV = SE/|estimate| = 1/|T| = 1/√χ²₁, where T is the test statistic and χ²₁ is the chi-square test statistic with one degree of freedom). For a more general evaluation of the model’s precision, we computed the median CV of the parameters estimated by each model [71]. Second, we performed an analysis of case-specific residuals. In principle, if a model correctly fits the data, the residuals are expected to have zero mean and a normal distribution, without any pattern or structure. We visually checked residual distributions and computed their mean, variance, and kurtosis. The best distribution is the one with the smallest variance of residuals, symmetrical and centered around zero.
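A minimal sketch of the precision measure follows, assuming that the coefficient of variation is the standard error divided by the absolute value of the estimate, which equals the reciprocal of the square root of a one-degree-of-freedom Wald chi-square. The coefficient values are hypothetical and serve only to show the calculation.

```python
import math
import statistics


def cv_from_estimate(estimate, se):
    """Coefficient of variation of a regression coefficient: SE / |estimate|."""
    return se / abs(estimate)


def cv_from_wald_chi2(chi2):
    """Same quantity from a Wald chi-square statistic with 1 degree of freedom."""
    return 1.0 / math.sqrt(chi2)


# Hypothetical (estimate, standard error) pairs, not values from the paper.
coefs = {"HS": (0.80, 0.10), "CourtS": (1.50, 0.30), "LA1": (-0.20, 0.25)}

cvs = {name: cv_from_estimate(b, se) for name, (b, se) in coefs.items()}
median_cv = statistics.median(cvs.values())

for name, cv in cvs.items():
    print(f"{name}: CV = {cv:.3f}")
print(f"median CV = {median_cv:.3f}")
```

The median is used rather than the mean so that one very imprecise coefficient (here the hypothetical LA1) does not dominate the summary.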

Definition of working hypotheses.

In this paper, we contrast two non-nested working hypotheses, “male dominance” (MDH) and “female choice” (FCH). The structure of the models corresponding to the Male Dominance Hypothesis (MDH) and the Female Choice Hypothesis (FCH) is shown in Fig 1. We have assumed, according to the literature, the existence of four latent variables: ξ1 represents the effect of antler shape and is described by ASST and TotS, ξ1a represents male dominance and is described by Dom and Ds, and η1 represents lek attendance (LA1 and LA2). Finally, η2 represents courtship and is measured by HS, CourtS, and CopS. The use of latent variables allowed us to reduce the unavoidable errors in the measurement of manifest variables. For MDH we assume that ξ1a influences η1, or in other words that the fighting ability of bucks determines their lek attendance and territory holding. Being able to defend a territory allows a buck to keep a harem and finally to sire females. For the FCH we assume that male phenotypic quality, ξ1, which represents its health and physical fitness, allows the buck to stay in the lek for a long time and to be selected by wandering females.

Note that SEM allows us to study the effects of remote and proximate causes of male copulatory success in the same statistical framework. Further, the use of latent variables reduces the unavoidable errors in the measurement of manifest variables. Once the measurement model is defined, we can establish appropriate causal relationships among latent variables.

The MDH is implemented by the following system of regression equations (Fig 1a): (3)

The model for FCH is represented in Fig 1b and reads: (4)

FCH and MDH used 21 and 19 free parameters, respectively, which are identifiable, according to Shipley [41].


Results

The distribution of CopS is shown in Fig 2. Most of the bucks (68.6%) had no copulations. The number of copulations per individual ranged from 0 to 43, and the distribution has high kurtosis (32.33) and skewness (4.99). The distribution of CopS is best fitted by a negative binomial distribution (χ2 = 0.28, P = 0.595), which is much better supported than alternative models (ZINB, ΔAIC = 33.39; ZIP, ΔAIC = 152.47; Poisson, ΔAIC = 535.02). Data transformation changes the discrete CopS distribution into a continuous one, which remains, however, non-normally distributed (Shapiro-Wilk Test: log(CopS+1), W = 0.648, P<0.001; log(CopS+0.5), W = 0.659, P<0.001; log(CopS+0.1), W = 0.662, P<0.001; CopS^0.5, W = 0.635, P<0.001).

Fig 2. Frequency distribution of number of copulations achieved by each buck (CopS) before (upper left panel) and after transformation.

The continuous red line shows the theoretical normal curve for reference.

Linear and Generalized Linear Models

The AIC and BIC values associated with LMs with untransformed response variables and GLMs are reported in Table 1. LM1 and LM1r have considerably higher AIC and BIC than GLMs. Among the different GLMs, GLM2,r exhibits the lowest AIC and BIC values, while the corresponding full model, GLM5,r has higher AIC and BIC values. GLM4,r presents the same AIC value as GLM2,r, but a higher BIC value. As expected, MAMs show lower fit indexes than the corresponding full models, except for the Hurdle model. The different models identify different sets of significant variables, and the unstandardized coefficients for all models are given in S4 Table. In summary, among the eight variables considered, only HS and CourtS (except in GLM5,r) are always detected as significant, whereas TotS, Ds, and LA1 were only detected by some of the GLMs. Note, however, that their estimates are nonsensical since they are always negative, whereas positive values are expected. This is an example of Simpson’s paradox, which Pearl (e.g. [47]) has discussed as a common problem with non-SEM studies.

Table 1. AIC and BIC values associated with linear (untransformed) and GLM models.

The linear models with transformed response variables (Table 2) have erratic AIC and BIC values, varying from a minimum for LM4,r (AIC = -67.6 and BIC = -59.3) to a maximum for LM5 (AIC = 347.7 and BIC = 372.6). AIC and BIC values vary in an unpredictable way depending on the value of the constant added to the transformed variable (or on the calculation of the maximum likelihood in the case of the square root transformation). Due to the complete unreliability of data transformations, this approach will not be considered further in this paper.

Table 2. AIC and BIC values associated with linear models with transformed response variables.

Structural Equation Models

The variance-covariance/correlation matrix used in SEM and GSEM is reported in S5 Table.

To select the appropriate distribution of CopS for GSEM, we first identified the discrete distributions available in both Mplus and STATA. Only two of these distributions, Poisson and negative binomial, were supported. According to the results of Table 1, we first tested the negative binomial distribution, but the model did not converge in either software. Thus we were forced to use the Poisson distribution.

If we implement the SEM for MDH with Mplus, convergence is not achieved, because the residual covariance matrix is not positive definite [72] and the residual variances associated with LA1 have negative values. Note that the AIC values yielded by Mplus are therefore biased. Indeed, in GSEM the convergence of the MDH model is only achieved by fixing the path coefficients for Dom, LA1, and HS to a predefined value. The MDH model (SEM or GSEM) does not converge with STATA. Bearing these convergence problems in mind, GSEM was always better than SEM (ΔAIC = 433.4 and ΔBIC = 436.2). On the contrary, the FCH converges using both SEM and GSEM. Even for FCH, GSEM provided a better fit than SEM (ΔAIC = 438.9, ΔBIC = 441.7). In summary, this analysis shows that FCH is always preferred to MDH, having lower AIC and BIC values when fitted using either SEM or GSEM (ΔAIC and ΔBIC >140 always). Due to these results, the MDH model will not be considered in the following analyses. Path coefficients for GSEM-FCH models are shown in Table 3. All coefficients are highly significant (P<0.003). Notably, the path coefficient for ASST is positive and not negative as expected.

Table 3. Standardized path coefficients, SE, and p-value for FCH in GSEM.

Model comparisons

The comparison of the models is reported in Table 4. The precision of MAM models for LMs and GLMs is clearly higher than that of the corresponding full models. Considering the median CV values, the two least precise models are GLM4 (median CV = 1.089) and LM1 (median CV = 0.828), while the most precise models are GLM4,r (median CV = 0.162) and GLM2,r (median CV = 0.148). LMs and GLMs were clearly outperformed by both the SEM (median CV = 0.079) and, to a larger extent, by GSEM (median CV = 0.059), whose coefficient CV values range from 0.02 to 0.319.

Comparable results are obtained when analysing the distribution of residuals (Table 5, Fig 3). In LMs, the variance is very large, and the distribution is strongly leptokurtic with heavy tails (Fig 3). As a comparison, statistics of the distribution of residuals for LMs with transformed response variables are shown in S6 Table. These distributions are characterised by large variances and kurtosis, and none is centred on zero.

Fig 3. Model validation graph.

a) Distribution of standardized residuals of GLMs, SEM, and GSEM models. For LMs and GLMs, both full (a) and reduced models (b) are shown. Models are in Table 1. The respective descriptive statistics of the different distribution models considered in this paper are reported in Table 5.

Table 5. Mean, variance, and kurtosis for residual distributions of the different models considered in this paper.

GLMs perform better than LMs (Table 5): distributions remain leptokurtic, but variances are smaller, and the mean is slightly biased low (Fig 3a and 3b). The residuals associated with SEM-FCH (actually with the relationship between CopS and η2), although their mean is close to 0, have a strongly leptokurtic distribution and a variance much larger than that of GLMs (but not LMs). Finally, the residuals associated with GSEM-FCH have a low variance and the lowest kurtosis among the studied models.
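The residual summaries used in this comparison (mean, variance, and kurtosis) can be computed as in the sketch below. The kurtosis convention is an assumption: the function returns Pearson's kurtosis, for which a normal distribution gives 3, and the text does not state whether Table 5 reports this or excess kurtosis. The two residual vectors are hypothetical.

```python
def residual_summary(residuals):
    """Return (mean, sample variance, Pearson kurtosis) of a residual vector."""
    n = len(residuals)
    mean = sum(residuals) / n
    squared = [(r - mean) ** 2 for r in residuals]
    variance = sum(squared) / (n - 1)         # unbiased sample variance
    m2 = sum(squared) / n                     # second central moment
    m4 = sum(s ** 2 for s in squared) / n     # fourth central moment
    kurtosis = m4 / m2 ** 2                   # equals 3 for a normal distribution
    return mean, variance, kurtosis


# Hypothetical residuals: one compact set and one with a single extreme value.
compact = [-0.5, -0.2, 0.0, 0.1, 0.2, 0.4]
heavy_tailed = [-0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 5.0]

for label, res in (("compact", compact), ("heavy-tailed", heavy_tailed)):
    mean, variance, kurtosis = residual_summary(res)
    print(f"{label}: mean = {mean:.3f}, variance = {variance:.3f}, "
          f"kurtosis = {kurtosis:.2f}")
```

A single large residual, as produced by one highly successful buck, is enough to push both the variance and the kurtosis far above those of the compact set.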

Interestingly, the number of regression coefficients that are significant is maximal in SEM and GSEM (Table 4). Since results indicate that GSEM-FCH is the most appropriate model for our data (lower AIC/BIC, lower variance of residuals, and lower median CV), it is interesting to investigate total effects (cf. S3 Text) for this model (Table 6). Notably, the impact of ξ1 and η1 on CopS is similar in size to that of η2, while ξ1 and η1 have much smaller effects on CourtS or HS than η2, which suggests a remote causation for CopS. ξ1 affects both ASST and TotS, but to different degrees: its impact is stronger on ASST than on TotS.
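To make the notion of total effect concrete, the sketch below computes direct, indirect, and total effects in a small acyclic path model by summing, over all directed paths, the product of the path coefficients. The path structure and coefficient values are hypothetical and are not those of Table 3 or Table 6. The calculation holds for linear paths; with a Poisson indicator such as CopS the effects would be on the scale of the linear predictor.

```python
# Hypothetical standardized path coefficients: (cause, effect) -> coefficient.
edges = {
    ("xi1", "eta1"): 0.6,   # phenotypic quality -> lek attendance
    ("eta1", "eta2"): 0.5,  # lek attendance -> courtship
    ("xi1", "eta2"): 0.2,   # direct path from quality to courtship
    ("eta2", "CopS"): 0.9,  # courtship -> copulatory success
}


def total_effect(source, target, edges):
    """Sum of coefficient products over every directed path (acyclic graphs)."""
    if source == target:
        return 1.0
    return sum(coef * total_effect(child, target, edges)
               for (parent, child), coef in edges.items()
               if parent == source)


direct = edges.get(("xi1", "eta2"), 0.0)
total = total_effect("xi1", "eta2", edges)
indirect = total - direct                    # effect transmitted through eta1
total_on_cops = total_effect("xi1", "CopS", edges)

print(f"direct = {direct:.2f}, indirect = {indirect:.2f}, total = {total:.2f}")
print(f"total effect of xi1 on CopS = {total_on_cops:.2f}")
```

In this toy model ξ1 has no direct path to CopS, yet its total effect is not zero, which is the sense in which a variable can act as a remote cause.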


Discussion

The data collected at Castelporziano on the mating behaviour of fallow bucks represent a typical example of the many studies performed on the leks of this species [11, 12, 23, 34] and of other vertebrate species [9, 12, 29]. These behavioural studies are important not only to identify the proximate causes of mate selection, but also for determining the intensity of sexual selection and understanding the evolution of exaggerated traits in males.

A literature review (cf. S2 Text and S2 Table) allowed us to select the most popular methods used in previous research and to contrast them with the innovative GSEMs. Using the same dataset to compare different statistical methodologies is useful for evaluating their relative efficiency in data fitting. In general, LMs appear to be severely biased. GLMs may improve the reliability of the results, but they overlook several important effects, and the estimated coefficients still have low precision, which severely jeopardizes their predictive capacity. It is worth stressing that data transformation is not an appropriate way to normalize the data distribution, since the results appear extremely sensitive to the specific function used. This problem is exacerbated by the large number of zeros in the distribution of male copulatory success.

The introduction of GSEMs in the analysis of lek mating appears to represent a relevant step forward in the field. Our study provided evidence of several advantages of GSEMs compared to GLMs. First, the collinearity of predictors is no longer a nuisance, provided that an appropriate measurement model is built, so we retain part of the information collected in the field, which is usually discarded in GLMs to reduce variance inflation [36]. Second, GSEMs are a flexible tool, since they allow contrasting different causal models (e.g. using AIC, BIC, or other fit indexes), which must be formulated a priori. In comparison to both LM and GLM, a proactive model formulation improves the awareness of the biological significance of the mechanism to be tested and allows scholars to modify a basic theoretical construct by introducing specific paths which are known or thought to be relevant in each particular study condition. This feature of SEMs allows us to include both general theoretical statements and specific conditions in the same model, which are then evaluated together. Publishing the variance-covariance matrix allows other scholars to replicate the results easily and to propose different theoretical models pertinent to the system of interest, which improves the transparency of the research and the full reproducibility of the results. However, the availability of raw data can be useful to adjust the standard errors. Finally, SEM/GSEM help to control for measurement errors, a much neglected flaw in most quantitative analyses.
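
Contrasting a priori causal models by information criteria reduces to a simple calculation once each model's maximized log-likelihood and number of free parameters are known. The sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L; the model names, log-likelihoods, parameter counts, and sample size are all hypothetical and do not correspond to the values reported in this paper.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for k free parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k free parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fitted models (values invented for illustration only).
candidates = {
    "model_A": {"log_lik": -410.0, "k": 18},
    "model_B": {"log_lik": -418.0, "k": 16},
}
n_obs = 120  # hypothetical sample size

scores = {
    name: {
        "AIC": aic(m["log_lik"], m["k"]),
        "BIC": bic(m["log_lik"], m["k"], n_obs),
    }
    for name, m in candidates.items()
}

# Lower values indicate the better-supported model.
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])
print(scores, best_by_aic, best_by_bic)
```

Because BIC penalizes each parameter by ln(n) rather than 2, the two criteria can disagree when the richer model improves the likelihood only slightly.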

GSEM represents a bridge between the descriptive approach developed in LM and GLM and experimental tests with manipulative treatments; indeed the consistency of alternative causal paths can be tested, and when possible, the results can be used to develop more stringent experiments.

The importance of using GSEMs is well illustrated by the between-method comparisons reported in this study. First, we were able to show that, compared with GLMs and even more so with LMs, GSEMs suggest the potential influence of a larger number of predictors; in other words, more informative models can be developed. This may have a strong impact on the interpretation of the study. For instance, both LMs and GLMs (except for the Poisson models) were unable to detect any effect of the predictors referring to male dominance, which are nevertheless present, albeit with a small effect. Indeed, several authors in the literature were unable to detect these effects at all (e.g. [8, 10, 14, 73]).

The second relevant aspect of GSEMs is the increased precision of the estimates of the regression coefficients. For several predictors, GLMs yielded CV values >50%, which are clearly unacceptable, while with GSEMs, CVs were often <10%, a precision we consider “acceptable” for a field study. The analysis of residuals in GSEMs and GLMs confirmed that the former fit the data better than the latter.
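
The precision thresholds above can be checked directly from a table of estimates and standard errors. The sketch below assumes that the CV of a coefficient is its standard error expressed as a percentage of the absolute value of the estimate; this definition, the predictor names, and the numbers are assumptions made for illustration and are not taken from the tables of this paper.

```python
from statistics import median


def coefficient_cv(estimate, std_error):
    """CV (%) of a coefficient: standard error relative to |estimate|."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * std_error / abs(estimate)


# Hypothetical (estimate, standard error) pairs.
fitted = {
    "predictor_a": (0.80, 0.06),
    "predictor_b": (-0.25, 0.15),
}

cvs = {name: coefficient_cv(est, se) for name, (est, se) in fitted.items()}
median_cv = median(cvs.values())
print(cvs, median_cv)
```

With these invented values, the first coefficient falls below the 10% threshold and the second exceeds 50%; the median CV summarizes precision across all coefficients of a model.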

While these results are not meant to disprove the available results about lek breeding of fallow deer based on linear models, the analysis of our dataset illustrates some advantages of using GSEMs for discrete responses. SEMs are more flexible and have more parameters than GLMs, and so may fit the data of interest better. Indeed, the formal definition of contrasting working hypotheses, such as FCH and MDH in this study, illustrates the potential of SEM for hypothesis testing. On the other hand, compared with LMs and GLMs, SEMs are data hungry; Shipley [41] gives a rule of thumb for deciding how many parameters can be safely estimated for a given sample size.

The practical use of GSEM presents several difficulties. The main problem is that the likelihood of SEMs with latent variables is generally multimodal, and there is a need for a general algorithm to locate the global maximum. Moreover, the algorithm sometimes does not converge to a proper solution and this usually suggests that the model is not identifiable (at least in some parts). A partial remedy is to include reasonable identifiability constraints. In path analysis or with GLMs, the problems of non-convergence are generally absent.
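
The difficulty posed by a multimodal likelihood can be illustrated with a generic multi-start strategy: run a local optimizer from several starting values and keep the best solution. The sketch below is not the algorithm used by Mplus or STATA. It uses a hypothetical one-parameter toy objective with two local maxima in place of a real SEM likelihood, and a simple step-halving hill climber in place of a proper optimizer.

```python
def toy_log_lik(theta):
    """Hypothetical bimodal objective: local maxima near -0.96 and +1.04."""
    return -(theta ** 2 - 1.0) ** 2 + 0.3 * theta


def hill_climb(f, start, step=0.5, tol=1e-8):
    """Local search: move to a better neighbour, otherwise halve the step."""
    x, fx = start, f(start)
    while step > tol:
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc > fx:
                x, fx = candidate, fc
                break
        else:
            step /= 2.0  # no neighbour improved: refine the search
    return x, fx


# Multi-start: repeat the local search from several starting values.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
solutions = [hill_climb(toy_log_lik, s) for s in starts]
best_theta, best_value = max(solutions, key=lambda sol: sol[1])

trapped_theta, trapped_value = hill_climb(toy_log_lik, -1.0)
print(best_theta, best_value, trapped_theta, trapped_value)
```

A single run started at −1.0 stops at the lower of the two maxima, whereas the multi-start search recovers the higher one. Multiple starts raise the chance of finding the global maximum but do not guarantee it, and they do not cure a model that is not identifiable.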

One drawback that may limit a wider diffusion of GSEM is that the modelling of non-normal variables is not yet implemented in widespread statistical packages such as SAS, R, or S-plus. In this paper, GSEMs were implemented in Mplus and STATA. We recommend using both packages, because they have complementary advantages and disadvantages. For instance, STATA provides case-specific residuals, which Mplus does not report, whereas Mplus returns the standardized path coefficients and the total, direct, and indirect effects, which STATA does not compute. Mplus requires caution because, to achieve convergence, it automatically constrains the value of some path coefficients to one. In STATA, constraints have to be applied explicitly, which makes the user more aware of them. In our experience, STATA is well documented but much slower than Mplus, and in some cases STATA, unlike Mplus, failed to converge (e.g. with MDH). Moreover, despite its greater flexibility in model specification, STATA implements only a limited GSEM procedure; for example, it does not support ZIP or ZINB distributions.

The analyses in this paper were developed under a frequentist approach. A Bayesian analysis of our data with GSEM is outside the scope of the present study and would require further research especially as far as the choice of priors is concerned. For an introduction to Bayesian SEMs see Kaplan & Depaoli [74].

The importance of this study lies in the fact that, to our knowledge, it is the first comparative study of SEM and GSEM models. We believe that past work should be reviewed in the light of the results obtained here. Specifically, the results from studies using LMs should be considered with great caution, particularly in those cases where assumptions were clearly violated and transformations were applied to normalise non-normal variables. Interestingly, Grace et al. [75] analysed species richness-productivity relationships using SEM and showed that an integrative model has a higher explanatory power than traditional linear models, since SEM allows us to integrate competing hypotheses into a single model. Furthermore, SEMs help to resolve Simpson’s paradox [47]. Finally, it is important to stress that the use of GSEMs can be extended to other behavioural and ecological contexts characterised by non-normal distributions of variables. SEMs are gaining traction in behavioural studies and in ecology. According to the Web of Science (WOS, accessed on 13/5/2016), the number of ecological and zoological papers using SEM is increasing by 7% per year. Thus, GSEM can find ever wider opportunities for application. In particular, the possibility of using SEMs to test competing hypotheses and to investigate both remote and proximate effects is of particular interest in ecological and evolutionary studies. The present study can therefore stimulate the application of GSEM to different study cases.

Supporting information

S1 Text. Details on data, validation and measures computation.


S2 Text. A review of pertinent papers about leks in mammals and birds.


S3 Text. A description of SEMs modelling including the total effects.


S4 Text. Mplus and STATA codes used to generate SEM and GSEM.


S5 Text. Maximum likelihood estimates from the log- and square-root-transformed models, including R code.


S2 Table. Table of the reviewed papers about leks in mammals and birds.


S3 Table. Complete list of the variable names used in the models and the respective path coefficients.


S5 Table. Variance-covariance and correlation matrix used in SEM and GSEM for FCH and MDH models.


S6 Table. Comparison of the residual statistics for LMs with a transformed response variable.


S1 Fig. Study area in Castelporziano (Rome, Italy).


S2 Fig. Phenological characters of fallow deer buck (Dama dama) in Castelporziano (Rome, Italy).


S1 Dataset. Data set to implement SEM and GSEM with STATA and Mplus.



We thank the Segretariato alla Presidenza della Repubblica for permission to work at Castelporziano, A. Demichelis, A. Tinelli, the Preserve and National Forest rangers, the Corpo Forestale dello Stato (C.F.S.), Marco Festa-Bianchet for useful comments, A. M. De Marinis, E. Pecchioli, students, and researchers for their important contributions.


  1. Davies NB, Krebs JR, West SA. An introduction to behavioural ecology, 4th ed. Wiley-Blackwell; 2012.
  2. Wiley HR. Lekking in birds and mammals: behavioural and evolutionary issues. Advances in the Study of Behaviour. 1991; 20: 201–291.
  3. Höglund J, Alatalo RV. Leks. Princeton University Press; 1995.
  4. Bradbury JW. The evolution of Leks. Alexander R.D. and Tinkle D. ed. Natural Selection and Social Behaviour, New York: Chiron Press; 1981. pp 138–169.
  5. Apollonio M, Festa-Bianchet M, Mari F, Mattioli S, Sarno B. To Lek or not to Lek: mating strategies of male fallow deer. Behavioural Ecology. 1992; 3: 25–31.
  6. Bradbury JW, Vehrencamp SL, Gibson RM. Leks and unanimity of female choice. In: Greenwood P. J., Harvey P. H. and Slatkin M. ed. Evolution. Essays in honour of John Maynard Smith. Cambridge: University Cambridge; 1985. pp. 301–314.
  7. Clutton-Brock TH, Deutsch JC, Nefdt RJC. The evolution of ungulate leks. Animal Behaviour. 1993; 46: 1121–1138.
  8. Rintamaki PT, Hoglund J, Alatalo RV, Lundberg A. Genetic and behavioural estimates of reproductive skew in male fallow deer. Ann. Zool. Fennici. 2001; 38: 99–109.
  9. Sardell RJ, Kempenaers B, Duvall EH. Female mating preferences and offspring survival testing hypotheses on the genetic basis of mate choice in a wild lekking bird. Molecular Ecology. 2014; 23: 933–946. pmid:24383885
  10. Kervinen M, Alatalo RV, Lebigre C, Siitari H, Soulsbury DC. Determinants of yearling male lekking effort and mating success in black grouse (Tetrao tetrix). Behavioral Ecology. 2012; ars104, 1–9.
  11. Clutton-Brock TH, Green D, Hiraiwa-Hasegawa M, Albon SD. Passing the buck: resource defence, lek breeding and mate choice in Fallow deer. Behav. Ecol. Sociobiol. 1988; 23: 281–296.
  12. Fiske P, Rintamaki PT, Karvonen E. Mating success in lekking males: a meta-analysis. Behavioural Ecology. 1998; 9: 328–338.
  13. Lukas D, Clutton-Brock T. Costs of mating competition limit male lifetime breeding success in polygynous mammals. Proc. R. Soc. B. 2014; 281: 20140418. pmid:24827443
  14. Apollonio M, Festa-Bianchet M, Mari F. Correlates of copulatory success in a fallow deer Lek. Behav. Ecol. Sociobiology. 1989; 25: 89–97.
  15. Thirgood SJ. Alternative mating strategies and reproductive success in fallow deer. Behaviour. 1990; 116: 1–10.
  16. Mackenzie A, Reynolds JD, Brown VJ, Sutherland WJ. Variation in Male Mating Success on Leks. The American Naturalist. 1995; 145: 633–652.
  17. Kokko H, Brooks R, Jennions MD, Morley J. The evolution of mate choice and mating. Proc. R. Soc. Lond. B. 2003; 270: 653–664.
  18. Ryder TB, Parker PG, Blake JG, Loiselle BA. It takes two to tango: reproductive skew and social correlates of male mating success in a lek-breeding bird. Proc. R. Soc. B. 2009; 276: 2377–2384. pmid:19324732
  19. Clutton-Brock TH, Hasegawa MH. Mate choice on fallow deer leks. Nature. 1989; 340: 463–465. pmid:2755506
  20. McElligott AG, Mattiangeli V, Mattiello S, Verga M, Reynolds CA, Hayden TG. Fighting tactics of fallow bucks (Dama dama, Cervidae): reducing the risks of serious conflict. Ethology. 1998; 104: 789–803.
  21. Ciuti S, Apollonio M. Ecological sexual segregation in fallow deer (Dama dama): a multispatial and multitemporal approach. Behav. Ecol. Sociobiol. 2008; 62: 1747–1759.
  22. Clutton-Brock TH, McAuliffe K. Female mate choice in mammals. Quarterly Review of Biology. 2009; 84: 3–27. pmid:19326786
  23. Apollonio M, De Cena F, Bongi P, Ciuti S. Female preference and predation risk models can explain the maintenance of a fallow deer (Dama dama) lek in its ‘handy’ location. PLoS One. 2014; 9: 1–11.
  24. Say L, Naulty F, Hayden TJ. Genetic and behavioural estimates of reproductive skew in male fallow deer. Molecular Ecology. 2003; 12: 2793–2800. pmid:12969481
  25. Vannoni E, McElligott AG. Low frequency groans indicate larger and more dominant fallow deer (Dama dama) males. PLoS One. 2008; 3: e3113. pmid:18769619
  26. Farrell ME, Briefer E, McElligott AG. Assortative mating in fallow deer reduces the strength of sexual selection. PLoS One. 2011; 6: 1–9.
  27. Jennings DJ, Elwood RW, Carlin CM, Hayden TJ, Gammell MP. Vocal rate as an assessment process during fallow deer contests. Behav. Processes. 2012; 91: 152–158. pmid:22820323
  28. Pitcher BJ, Briefer EF, Vannoni E, McElligott AG. Fallow bucks attend to vocal cues of motivation and fatigue. Behavioral Ecology. 2014; 25: 392–401.
  29. Kokko H, Lindstrom J, Alatalo RV, Rintamaki PT. Queuing for territory positions in the lekking black grouse (Tetrao tetrix). Behavioral Ecology. 1998; 9: 376–383.
  30. McElligott AG, O’Neill KP, Hayden TJ. Cumulative long-term investment in vocalization and mating success of fallow bucks, Dama dama. Animal Behaviour. 1999; 57: 1159–1167. pmid:10328804
  31. Fričová B, Bartoš L, Bartošová J, Panamá J, Šustr P, Jozífková E. Comparison of reproductive success in fallow deer males on lek and single temporary stands. Folia Zool. 2008; 57: 269–273.
  32. Bro-Jørgensen J. Queuing in space and time reduces the lek paradox on an antelope lek. Ecology & Evolution. 2011a; 25: 1385–1395.
  33. Bro-Jørgensen J. The impact of lekking on the spatial variation in payoff to resource-defending topi bulls, Damaliscus lunatus. Animal Behaviour. 2008; 75: 1229–1234.
  34. Focardi S, Tinelli A. A structural-equations model for the mating behaviour of bucks in a lek of fallow deer. Ecology & Evolution. 1996b; 8: 413–426.
  35. McElligott AG, Gammell MP, Harty HC, Paini DR, Murphy DT, Walsh JT et al. Sexual size dimorphism in fallow deer (Dama dama): do larger, heavier males gain greater mating success? Behavioral Ecology and Sociobiology. 2001; 49: 266–272.
  36. Zuur AF, Ieno EN, Elphick CS. A protocol for data exploration to avoid common statistical problems. Methods in Ecology & Evolution. 2010; 1: 3–14.
  37. Ciuti S, De Cena F, Bongi P, Apollonio M. Benefits of a risky life for fallow deer bucks (Dama dama) aspiring to patrol a lek territory. Behaviour. 2011; 148: 435–460.
  38. Sokal RR, Rohlf FJ. Biometry: The Principles and Practices of Statistics in Biological Research. 3rd ed. New York: W. H. Freeman and Co; 1995.
  39. O’Hara RB, Kotze DJ. Do not log-transform count data. Methods in Ecology and Evolution. 2010; 1: 118–122.
  40. Zuur AF, Saveliev AA, Ieno EN. Zero Inflated Models and Generalized Linear Mixed Models with R. Highland Statistics Ltd; 2012.
  41. Shipley B. Cause and Correlation in Biology: A User’s Guide to Path Analysis, Structural Equations and Causal Inference. 4th ed. Cambridge: Cambridge University Press; 2000.
  42. Navidi W. Probabilità e statistica per l’ingegneria e le scienze. McGraw-Hill; 2006.
  43. McDonald JH. Handbook of biological statistics, 3rd ed. Sparky House Publishing. Baltimore, Maryland, U.S.A: University of Delaware; 2014.
  44. Kendall BE. In: Fox GA, Negrete-Yankelevich S, Sosa VJ. Ecological Statistics: Contemporary Theory and application. 1st ed. Oxford: Oxford University Press; 2015.
  45. Shipley B. Testing causal explanations in organismal biology: causation, correlation and structural equation modelling. Oikos. 1999; 86: 374–382.
  46. Jiguet F, Bretagnolle V. Manipulating Lek Size and Composition Using Decoys: An Experimental Investigation of Lek Evolution Models. The American Naturalist. 2006; 168: 758–768. pmid:17109318
  47. Pearl J, Glymour M, Jewell NP. Causal Inference in Statistics: A Primer. Wiley; 2016.
  48. Sih AD, Lauer M, Krupa JJ. Path analysis and relative importance of male-female conflict, female choice and male-male competition in water striders. Animal Behaviour. 2002; 63: 1079–1089.
  49. Grace JB, Schoolmaster DR, Guntenspergen GR, Little AM, Mitchell BR, Miller KM et al. Guidelines for a graph-theoretic implementation of structural equation modeling. Ecosphere. 2012; 3: 1–44.
  50. Agresti A. Categorical data analysis. 1st ed. New York: Wiley; 1990.
  51. Muthén B, Asparouhov T. Causal Effects in Mediation Modeling: An Introduction With Applications to Latent Variables. Structural Equation Modeling: A Multidisciplinary Journal. 2015; 22: 12–23.
  52. Bianco PM, Lucchese F, Tescarollo P. La vegetazione della Tenuta Presidenziale di Castelporziano: la flora. In AA.VV. Il sistema ambientale della Tenuta Presidenziale di Castelporziano. Accademia Nazionale delle Scienze detta dei XL, scritti e documenti XXVI. 2001; II: 641–676.
  53. Focardi S, Franzetti B, Ronchi F, Imperio S, Montanaro P, Aragno P et al. Monitoring populations of a guild of ungulates: implications for the conservation of a relict Mediterranean forest. Rend. Fis. Acc. Lincei. 2015.
  54. Imperio S, Focardi S, Santini G, Provenzale A. Population dynamics in a guild of four mediterranean ungulates: density-dependence, environmental effects and inter-specific interactions. Oikos. 2012; 121: 1613–1626.
  55. Clutton-Brock TH, Albon SD, Gibson RM, Guinness FE. The logical stag: adaptive aspects of fighting in red deer (Cervus elaphus L.). Animal Behaviour. 1979; 27: 211–225.
  56. Gammell MP, De Vries H, Jennings DJ, Carlin CM, Hayden TJ. David’s score: a more appropriate dominance ranking method than Clutton-Brock et al.’s index. Animal Behaviour. 2003; 66: 601–605.
  57. Møller AP. Fluctuating Asymmetry in male sexual ornaments may reliably reveal male quality. Animal Behaviour. 1990; 40: 1185–1187.
  58. Møller AP, Saler JJ, Zamora–Munoz C. Horn asymmetry and fitness in gemsbok, Oryx g. gazella. Behavioural Ecology. 1996; 3: 247–253.
  59. Crawley MJ. Statistical Computing. An Introduction to Data Analysis using S-Plus. John Wiley and Sons, New York; 2002.
  60. Murtaugh P. In: Forum: P values, hypothesis testing, and model selection: it’s déjà vu all over again. In defense of P values. Ecology. 2014; 95: 611–617.
  61. R Core Team. R: A Language and Environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2016; URL
  62. Bollen KA, Pearl J. Eight myths about causality and Structural Equation Models. In Morgan S.L. ed. Handbook of Causal Analysis for Social Research, Springer; 2013. pp. 301–328.
  63. Cameron AC, Trivedi PK. Microeconometrics using Stata. College Station, TX: Stata Press; 2009.
  64. Muthén B, Muthén L. Mplus 7 ver. 7.4, 2012–2015; 2015.
  65. STATA StataCorp. Stata Statistical Software: Release 14.1. College Station, TX: StataCorp LP, USA; 2015.
  66. Burnham KP, Anderson DR. Model selection and multimodel inference. A practical information-theoretic approach. 2nd ed. New York: Springer; 2002.
  67. Weiss J. Statistical Methods in Ecology. University of North Carolina. 2010; [accessed 24 May 2016].
  68. Skrondal A, Rabe-Hesketh S. Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. CRC Press; 2004. pp. 267.
  69. Bentler PM, Bonett DG. Significance tests and goodness of fit in the analysis of covariance structure. Psychological Bulletin. 1980; 88: 588–606.
  70. Hooper D, Coughlan J, Mullen MR. Structural Equation Modelling: Guidelines for Determining Model Fit. Electronic Journal of Business Research Methods. 2008; 6: 53–60.
  71. Lande R. On Comparing Coefficients of Variation. Systematic Zoology. 1977; 26: 214–217.
  72. Kolenikov S, Bollen KA. Testing negative error variances: Is a Heywood case a symptom of misspecification? Sociological Methods and Research. 2012; 41: 124–167.
  73. Loyau A, Gomez D, Moureau B, Théry M, Hart NS, Jalme MS et al. Iridescent structurally based coloration of eyespots correlates with mating success in the peacock. Behavioural Ecology. 2007; 18: 1123–1131.
  74. Kaplan D, Depaoli S. Chapter 38 in Handbook of Structural Equation Modeling. Edited by Hoyle RH. The Guilford Press; 2012.
  75. Grace J B, Anderson T M, Seabloom E W, Borer T E, Adler P B, Harpole W S et al. Integrative modelling reveals mechanisms linking productivity and plant species richness. Nature. 2016; 529: 390–393. pmid:26760203