Use of multivariate analysis as a tool in the morphological characterization of the main indigenous bovine ecotypes in northeastern Algeria

Sustainability in livestock farming requires monitoring of autochthonous breeds which are well adapted to the local environment. The morphometric measurements seem to be the first approach which can provide useful information on the suitability of animal genetic resources for selection. In this work, thirteen morphometric variables were used for the phenotypic characterization of 130 adult autochthones cattle randomly selected from 30 local farms in Guelma. There were cases from four commonly accepted and traditional ecotypes: Guelmois, Cheurfa, Sétifien and Fawn. The results showed several and significant positive correlations between the different variables. Correlations were analyzed using Varimax orthogonal rotation PCA and three factors were extracted, which explain more than 75% of the total variation in the four ecotypes. Stepwise discriminant analysis showed that 6 of the 13 variables had discriminatory power to define the phenotypic profile of the ecotypes. Canonical discriminant analysis indicated that the Sétifien ecotype is separate from the other three ecotypes. Mahalanobis distances were significant between the different ecotypes except for the distance between the Guelmois and Fawn ecotypes. The cross-validation procedure assigned 91.42% of the Sétifien animals to their genetic group, while the percentages of animals assigned to the Cheurfa, Guelmois and Fawn ecotypes were 80.00%, 65.71% and 53.33% respectively. The multivariate approach has proven to be effective in differentiating the four ecotypes, with clear morphological differences from the Sétifien ecotype that may benefit from a genetic improvement program for more sustainable genetic resources preservation.


Introduction
In order to deal with the effects of globalization, urbanization, mechanization, increase in world population, global warming and climate change, it is urgently needed to transform our  (Fig 2). Local farms included in this study are characterized, on the one hand, by small herds that do not exceed ten animals in 30% of the farms, and on the other hand, the practice of extensive breeding, with low use of feed supplements.

Studied variables
Thirteen body measurements were assessed following the FAO guidelines for phenotypic characterization of animal genetic resources (Table 2) [40]. Animals' age was determined according  to the number of permanent incisors 2, 4, 6 and 8 and for the oldest subjects based on the declaration of the breeders. The body measurements were carried out using a measuring stick, a measuring tape or wooden caliper when the animals were standing on a level surface and maintained in upright posture by their respective owners. All the measurements were recorded by the same operator to avoid effects between operators. All traits were recorded from the left side of each animal placed on flat ground, and held by two assistants.

Statistical analyses
Collected data were processed and analyzed using XLSTAT Version 2016.02, Addinsoft, Paris, France (www.xlstat.com) and the IBM SPSS Statistics for Windows, Version 20 (IBM SPSS Statistics for Windows, IBM Corporation, Armonk, NY). The descriptive statistics (mean, standard deviation and coefficient of variation) were calculated for morphometric variables. Data were analyzed with the Weighted least squares general linear model (WLS-GLM) which uses the least squares method, which included ecotype group, sex (male, female) and age group as fixed effects.
Sex (i) j is the fixed effect of male and female sex (j = 1 for the male, n1 = 30 and j = 2 for the female, n2 = 100).
Animals were reared on farms with almost the same farming and grazing practices. Consequently, the effects of husbandry and localities were confounded with the ecotype and were not included in the model.
Interactions between effects were analyzed and significant effects were reported. For each body measurement and age, comparisons of the least squares means were carried out using the Bonferonni test after examining the effect of significance on the observed variables.
Pearson correlation coefficients were estimated for different body measurements. The Cronbach's alpha coefficient (α) was calculated to assess the homogeneity and internal consistency of a set of variables for a given group.
In order to reduce dimensionality of our data set, principal component analysis (PCA) was performed. According to Everitt et al. [41], PCA is a method of reducing a set of multivariate data variables, X1, X2,. . ., XP, to new variables, y1, y2.. . ., yp (axis or factor), which are not correlated with each other and explain decreasing proportions of the total variance of the initial variables defined as follows: The coefficients being chosen so that y1, y2. . . .., yp take into account decreasing proportions of the total variance of the original variables, x1, x2,. . . ., xp. In order to test the validity of the factor analysis of the data set, the Kaiser-Meyer-Olkin (KMO) sampling adequacy criterion was calculated. The low values of KMO indicate that the correlations between any variable and the other variables are unique, and are not related to the remaining variables outside of every single correlation. Pundir et al. [20] reported that variables with a KMO of less than 0.50 should be eliminated before the factor analysis. Bartlett's sphericity test was calculated to establish the validity of the dataset at a significance level of 1% [42]. Kaiser's rule was used to determine the number of factors retained whose eigen value is greater than 1 [43]. The Orthogonal Rotation Method (Varimax) was used to minimize the number of variables with strong correlations, especially with the first main component. The common variance between each variable and all the components selected was calculated. The commonality for each variable measures the variance of this variable explained jointly by the set of factorial components retained [43].
Stepwise discriminant analysis was carried out to identify the morphological traits having a discriminating power between the studied ecotypes. The importance of these morphological features was assessed according to their partial R 2 value (P <0.05). Linear combinations of body measurements were performed by canonical analysis, which used to perform multivariate analysis that calculated the Mahalanobis distance (D 2 ) between the different cattle ecotypes.
The phenotypic discrimination among breed populations was assessed with a cross-validation test. This required the removal of 1 individual from the first matrix, and then a discriminant analysis was performed with the remaining observations to classify the omitted individual. Performance was evaluated according to the percentage of correctly and incorrectly classified bovine. The morphological distinctness of each ecotype was defined using the percentage of correctly-classified individuals.

Comparison of measured morphometric traits among different ecotypes
A total of 130 animals which raise two different ecotypes (73%), or one ecotype (27%) were used. Differences between the different ecotypes for morphometric variables are presented in S1 Table. The body measurements (CG, BL, HW, PW and PL) showed similar means for the Guelmois and Fawn ecotypes. These results suggest that the Guelmois and Fawn ecotypes have small stature compared to the Cheurfa and Sétifien ecotypes. The later ecotypes have better skeletal and muscular development, which indicates that they are better for meat production.
For the Sétifien ecotype, the average of the chest measurement is equal 175.88 cm which is the largest in comparison with the other ecotypes. These results indicate good conformation and harmonious development of the internal organs. The body measurements observed in Sétifien and Cheurfa cattle are related to the geographic location where they are generally found in the plain while the Guelmois and Fawn ecotypes are often found in mountainous areas. This observation was confirmed by the dactyl-thoracic index which evolves in the opposite direction to the animal walking ability.
The coefficients of variation for the different measures showed a generally low variability (i.e. 70% of the variables studied recorded a coefficient of variation which does not exceed 10% which indicates that the cattle from each ecotype have similar size). These coefficients vary from 3.98% (PL) to 23.15% (HOL) for the Guelmois ecotype, from 3.05% (CG) to 15.10% (HOL) for the Cheurfa ecotype, from 3.38% (PL) to 19.82% (HOL) for the Sétifien ecotype and from 3.25% (PL) to 19.46% (HOL) for the fawn ecotype.
The length of the horns (HOL) represents the most heterogeneous biometric parameter in the ecotypes studied, which is in concordance with results reported by Boujenane [24], who showed that the homogeneity of body measurements observed from the local cattle breeds (Oulmes-Zaer and Tidili) in Morocco might be due to natural selection, favoring particular shape and size that is well adapted to the local harsh environment, which characterized by under scarce feed and fodder. Moreover, this adaptation ability was also observed in Mursi indigenous cattle breed in Ethiopia, which presents a very large morphological variability that can help the breed to survive and produce in its natural environment [44].
The analysis of variance obtained by GLM indicated that "ecotype" has a significant effect (P <0.05) for parameters with economic importance such as CG, HW, HC, PW and PL. These parameters are very sought in genetic improvement programs. However, no significant effect was recorded for parameters which are not important in selection programs, such as HOL, EL, CC and DTI (S1 and S2 Tables). The differences between ecotypes are in favor of "Sétifien" cattle comparing with other ecotypes for CG, HW, MC, HC, PW, PL, HL and MW.
Also, in our experimental conditions, results of all measurements show a significant effect of sexual dimorphism in the four ecotypes, which is in agreement with the data reported by Aguirre-Riofrio et al. [23] for Creole Cattle in Southern of Ecuador. However, these results should be interpreted with caution given the small sample size in male groups.
Except for a few variables studied such as HW, MC and CC "age" has no significant effect on the four ecotypes. This can be explained by the fact that the all the measurements were taken on adult animals.
A total of 78 correlations by ecotype between the different body measurements were calculated, 52 were significant, and 42 of them were positive for the Guelmois ecotype. For the Fawn ecotype, only 3 correlations were negative out of a total of 52 significant combinations. The number of significant combinations for the Sétifien ecotype was 57 with 52 positive correlations. The lowest number of significant combinations was recorded in the Cheurfa ecotype with 44 significant correlations.
As a result, most of the correlations between the variables measured varied from 36.6% to 92.6%, from 33.5% to 92.3, from 36.4% to 96.9% and from 36.7% to 94% for the Guelmois, Sétifien, Cheurfa and Fawn ecotypes respectively.
For the Guelmois ecotype, ear length (EL) was not correlated with BL, HW, MC, HC, and PW, indicating the independence of these variables with EL. This independence was more observed in the Sétifien ecotype. Also, other correlations but not significant were recorded between the length of the horns (HOL) and the other measured variables except for the Muzzle Circumference (MC) which was negatively correlated with the length of the horns (HOL) for the Cheurfa and Fawn ecotypes.
The correlations found in this study are stronger than those obtained in other studies across the world [15,20,24]. Analysis of the relationships between the 13 morphometric variables performed shows the existence of two distinct groups (Fig 3). The first group was formed by variables which define the size of the animal (CG, HW and BL), which are significantly correlated, and the second group was formed by all the others variables.

Factorial analysis in principal components
The results obtained allow us to note an internal consistency of (0.816), (0.816), (0.875) and (0.864) for the Guelmois, Cheurfa, Sétifien and Fawn ecotypes respectively, which indicates better internal coherence, especially of the Sétifien sample. Homogeneity is satisfactory when the value of the coefficient is at least equal to 0.80 [45].
The overall significance of the correlations tested with Bartlett's test of Sphericity for the biometric traits was significant (P <0.0001) which allows a reduction in the dimension of all thirteen morphometric variables.
In this study, for Guelmois cattle, factor 1 represents 46.53% of the total variance, while the sum of the first three factors (1, 2 and 3) represents 75.79% of the total variance. For Cheurfa cattle, factors 1, 2 and 3 represent respectively 47.56%, 15.78% and 11.92%, and 75.27% of the total variance. Also, for the Sétifien and Cheurfa cattle, the first 3 factorial axes explained a cumulative amount of information of 77.48 and 76.47%, respectively (S3 Table).
Tolenkhomba et al. [14] extracted seven factors representing 64.31% of the total variance in local cows from Manipur in India. Pundir et al. [20] found three factors explaining 66.02% of the total variation from eighteen biometric traits of Kankrej cows, as well as Yakubu et al. [28] retained two factors which represent 85.37% of the total variation in White Flauni cattle.
S3 Table shows factor loadings for each principal component of biometry traits. The first factor assigned negative coefficients for MC, HC, EL and HOL for the Guelmois ecotype. The morphometric characters strongly correlated with the first factor (r > 0.7) are MW, CC, DTI and HOL. The second factor attributes positive coefficients to all the morphometric characteristics except for the ear length (EL), and this factor characterizes CG, BL, HW, MC and HC. The third factor provides information about PL; it characterizes the Guelmois ecotype according to the length of the body (BL).
For the other ecotypes, the coefficients show that the highest relative contributions to factors 1, 2 and 3 were HW, PL and MC for the Cheurfa ecotype, CG, DTI and EL, for the Sétifien ecotype and BL, PL, and HOL for the Fawn ecotype. The variables most closely associated with factor 1, regardless of the ecotype studied, were CG, BL, and HW. These variables tend to describe the general size of the animal, while the factors 2 and 3 indicate the shape of the animal, which is in concordance with the literature data [28,46]. Also, the commonality of the variables, which represents the proportion of the variance of each of the 13 variables shared by the factor axes selected, vary from 0.619 (HC) to 0.882 (EL), from 0.449 (HC) to 0.918 (BL), from 0.630 (HOL) to 0.933 (CC) and from 0.519 (HW) to 0.918 (MC) for the Guelmois, Cheurfa, Sétifien and Fawn ecotype respectively (S3 Table).
The lower commonality of certain traits, such as the circumference of hock (HC) and the length of head (HL) in the Guelmois ecotype, the circumference of hock (HC) and the length of the horns (HOL) in the Cheurfa ecotype, the width of muzzle (MW) and the length of the horns (HOL) in the Sétifien ecotype, the height at the withers (HW), the width of muzzle (MW) and the length of the horns (HOL) in the Fawn ecotype indicate that these traits are less effective, to explain the total variation in body conformation compared to the other traits. Similar results have been reported by Tolenkhomba et al. [14]. The three factors obtained can be used to select animals based on a group of variables rather than on isolated traits. This is in agreement with the conclusions of Altarriba et al. [47] who predicted the effect of genetic improvement programs using reduced data on morphological traits which are sensitive to correlated responses to selection.

Stepwise discriminant analysis
The stepwise discriminant analysis showed that 6 of the 13 body measurements have potential discriminatory power. Their partial R 2 varied from 0.261 to 0.095. Then, the other traits (BL, MW, HC, EL, HOL, CC and DTI) were removed from the final model. These traits were the PL, followed by MC, CG, HL, PW and HW in decreasing order of discriminating power (Table 3). In addition, these traits are easy to monitor and can be used to assign the four ecotypes studied in separate populations, thereby reducing selection errors in future genetic improvement programs [27]. Boujenane [24] indicated that the discriminant step-by-step analysis shows that the cannon turn was the most discriminating variable between Oulmes -Zaer and Tidili cattle, their partial R 2 was 0.6. While the rump width was the most discriminating variable between Bunaji and Sokoto Gudali cattle with a partial R 2 of 0.5824 [16].

Canonical discriminant analysis
Canonical discriminant analysis was used to obtain the function of all body measurements necessary for the separation of the four ecotypes. The analysis identified two significant canonical axes (p <0.0001) CAN1 and CAN2 representing respectively 53.55 and 45.46% of the total variation, these axes indicate a great reduction in the sampling space, with little loss (1%) to explain the total change. The third canonical axis (CAN3) was not taken into account because its values were very low (Table 4).
Several studies show a reduction in the sampling space with a very small loss of the total variation by conserving the first two canonical vectors [24,27,48]. The different tests used in multivariate analysis, namely Wilks' lambda λ, Pillai's trace, Lawley-Hotelling trace and Roy's test showed a significant difference between the four ecotypes (P <0.0001).
The values of the first and second canonical axis enabled us to see that certain variables are correlated with the first canonical axis. Indeed, BL, HW, MC and PL are positively correlated (Table 4). This first axis represents the linear combination of the six morphological traits (CAN1 = -0.059 CG + 0.087 HW + 0.226 MC + 0.087 PW + 0.844 PL + 0.037 HL-51.235) which could separate the Sétifien ecotype from other ecotypes. For axis 2, the CG is positively Evaluations of cattle within each ecotype and their interrelations with other ecotypes have been shown in the projection of the four ecotypes in the plane defined by the first and the second canonical axes (Fig 4A). This projection shows a significant overlap between individuals of the Guelmois and Fawn ecotypes, which probably due to the geographic and topography  zone where they are generally found in mountainous areas. The projection of the barycentres of the different ecotypes in the plane from the canonical axes illustrates two very distinct phenotypic groupings which correspond to the Sétifien and Cheurfa ecotypes. According to the first canonical axis, the Sétifien ecotype was clearly separated from the other ecotypes ( Fig 4B). This was supported by the Mahalanobis distance values estimated between the four ecotypes. These distances are estimated across all characters; their comparison was carried out using the F test. The highest value was recorded between the Sétifien and Cheurfa ecotypes, while the Guelmois and Fawn ecotypes were the lowest. From per pairs comparison, significant distance were recorded between all ecotypes (P <0.001) except for the one between the Fawn and Guelmois ecotypes (Table 5).
In order to examine the relationships between the ecotypes, a phylogenetic tree was constructed from morphometric data according to the UPGMA method (Fig 5). The phylogenetic tree showed two distinct clusters, the first one contains the Guelmois, Fawn and Cheurfa ecotype, while the second cluster contains the Sétifien ecotype which is clearly separated from the other ecotypes. The representation of the similarity between the four ecotypes studied is in agreement with the results of the Mahalanobis distance (Table 5).
Using the cross-validation option in assigning each individual to their original ecotype, we show a very significant reclassification error rate (46.67%) in Fawn ecotype, followed by Guelmois, Cheurfa, and Sétifien ecotypes with 34.3%, 20% and 18% respectively (Table 6). Indeed, in the Fawn ecotype, only 53.33% of individuals have been classified in their original population, this is explained by the great heterogeneity of the Fawn ecotype which has a strong connection with the other ecotypes and in particularly the Guelmois ecotype. In addition, the Fawn ecotype cattle are localized in dispersed geographical zones which often encourage anarchic crossings with the other ecotypes. The error in reclassifying the animals is due to the significant genetic mixing between the different ecotypes.
The allocation results are in perfect agreement with results obtained from the molecular characterization of four Algerian cattle breeds (Guelmois, Cheurfa, Tlemcenienne and Zebu) using 22 microsatellite markers [49]. The authors have shown poor genetic differentiation in the cattle populations studied with F ST mean value of (0.039) and high heterozygosity value (0.84). Indeed, the total genetic variation is dominated by differences among individuals (97.10%) while 2.90% among populations.
Currently, several autochthones bovine ecotypes are of mixed origin, which makes it difficult to determine the phenotype of the original breed [10]. Most individuals of the Sétifien ecotype were classified according to their original population (91.42%), indicating a reduction in the genetic flow between this ecotype and the other ecotypes studied. This situation offers a morphometric characteristic specific to the Sétifien ecotype; therefore, most of the individuals of this ecotype cannot be wrongly classified in the other ecotypes. The proportion of correctly reassigned individuals is considered to be a factor in the morphological distinction of the population [32]. Indeed, the morphological differences observed between the Sétifien ecotype and the other ecotypes studied may support the hypothesis that a large part of the morphological variation is under genetic control, indicating that they can be the subject of a genetic improvement program.

Conclusion
In summary, the principal component analysis provided a mean of reducing the number of morphometric traits to be recorded into groups of variables that can be used to explain body conformation for animal selection program. Therefore, it is necessary to exploit specific traits of each ecotype to establish programs for sustainable genetic improvement. The characters HW, CG, MC, BL, PL and HL were the most discriminating variables to separate the four ecotypes (Guelmois, Cheurfa, Sétifien and Fawn). The study showed that the Sétifien ecotype was larger than the other ecotypes for the majority of morphometric traits. The approach used in this study could be very interesting in the assessment, management and conservation of different autochthones bovine populations, where the primary goal is to obtain local genetic resources with a pure phenotype for future selection, which is beneficial for breeding strategies improvement. Further studies, especially in molecular genetic analysis is necessary for the validation of current results.
Supporting information S1