Genotype by environment interaction, correlation, AMMI, GGE biplot and cluster analysis for grain yield and other agronomic traits in sorghum (Sorghum bicolor L. Moench)

Genotype by environment (G×E) interaction is a major factor limiting the success of germplasm selection and the identification of superior genotypes for use in plant breeding programs. As in other crops, G×E interaction complicates the improvement of sorghum, and hence it should be quantified and taken into account in breeding decisions. The present study aimed to assess the G×E interaction and the correlation between traits in order to identify superior sorghum genotypes. Three hundred twenty sorghum landraces and four improved varieties were evaluated in field trials laid out in an alpha lattice design across three environments (Melkassa, Mieso and Mehoni) in Ethiopia. Phenotypic data were collected for days to flowering (DTF), plant height (PH), panicle length (PALH), panicle width (PAWD), panicle weight (PAWT) and grain yield (GY). The results revealed that the variances due to genotype, environment and G×E interaction were highly significant (P < 0.001) for all traits. GY and PAWT were highly affected by environment and G×E interaction, whereas DTF, PALH, PAWD and PH were mainly affected by genotypic variation. Therefore, multi-environment testing is needed to account for G×E interaction when identifying high yielding and stable sorghum landraces. GY and PAWT showed a highly significant positive correlation, indicating that the two traits can be selected for simultaneously. Among the studied populations, South Wello, West Hararghe and Shewa zones had highly diverse genotypes that were distributed across all clusters. Hence, these areas can be considered hotspots for identifying divergent sorghum landraces that could be used in breeding programs. Melkassa was the most representative environment whereas Mieso was the most discriminating. Five genotypes (G148, G123, G110, G203 and G73) were identified as superior across the test environments for grain yield combined with farmer-preferred traits such as plant height. The identified stable and high yielding genotypes are valuable genetic resources that should be used in sorghum breeding programs.

Introduction
[...] simple evaluation of grain yield, and less effort has been given to advanced and more informative analyses of traits using multi-environment trial (MET) data. Although G×E interaction analysis has been performed to assess the stability of improved varieties of sorghum using MET data [5,20–23], information is not available on the stability of sorghum landraces through the application of AMMI and GGE biplot models. Therefore, the objectives of this study were to evaluate the G×E interaction, the performance and stability of sorghum landraces, and the correlation of grain yield with agronomic traits, and to determine the representativeness and discriminating ability of different environments where sorghum is cultivated.

Plant materials
In this study, 324 sorghum genotypes grown in Ethiopia were used, comprising 320 landrace accessions and four improved varieties (S1 Table). Among the 320 landrace accessions, 261 were obtained from Melkassa Agricultural Research Center (MARC), having originally been collected by the Ethiopian Biodiversity Institute (EBI), whereas 59 accessions were newly collected from farmers' fields in drought-prone areas. The four improved varieties (Melkam, Argiti, ESH4 and B35) were obtained from MARC. Hereafter, both landrace accessions and varieties are referred to as "genotypes" for the sake of simplicity.

Study locations
This research was carried out in three locations in Ethiopia, namely Melkassa (MK), Mieso (MS) and Mehoni (MH), during the main crop growing season in 2019 (Table 1, Fig 1). These sites represent moisture stress areas in the country where sorghum is predominantly grown by smallholders.

Experimental design and field management
The experiment was laid out as a 27 × 12 alpha lattice design with two replications across three environments. Each plot had an area of 2.25 m² (3 m × 0.75 m), and seeds were sown in a single 3 m long row per plot. Planting was done manually, followed by thinning to a spacing of 0.20 m between plants. The recommended rate of DAP fertilizer (100 kg ha⁻¹) was applied at planting, and urea (50 kg ha⁻¹) was side dressed 40 days after planting. All necessary agronomic practices were applied following standard procedures for sorghum.

Collecting phenotypic data
All phenotypic data were collected from five randomly selected and tagged plants in each plot. Days to flowering (DTF) were recorded as the number of days from planting to flowering of 50% of the plants on a plot. Panicle length (PALH) and panicle width (PAWD) were measured as the length of the panicle from the base to the tip of the panicle and as the width of the panicle at its widest section, respectively. Plant height (PH) was measured as the height of the plant from the base to the tip of a panicle at maturity. Grain yield (GY) was recorded as the weight of seeds from an individual plant's panicle whereas panicle weight (PAWT) was measured as the weight of the un-threshed panicle.

Data analysis
The phenotypic data collected from the three environments were subjected to a combined analysis of variance (ANOVA) using a mixed linear model in R software [25]. The significance levels of the genotype, environment and G×E interaction effects were then determined. The additive main effects and multiplicative interaction (AMMI) model was used to determine the G×E interaction effect and to assess the adaptability and stability of the sorghum landraces across the three environments. The AMMI stability value (ASV) was calculated as described in Purchase et al. [26] to measure and rank the sorghum genotypes based on their stability. The genotype selection index (GSI) was also calculated as described in Purchase et al. [26] using R software. Singular value decomposition (SVD) of the first two principal components was used to fit the genotype plus genotype by environment (GGE) biplot model [27]. The AMMI and GGE biplot analyses were done using GENSTAT software [28]. For correlation analysis, best linear unbiased predictors (BLUPs) were calculated for all traits across the three environments using META-R software [29], and the Pearson correlation coefficients and graphs were generated in R software. Cluster analysis was performed using DendroUPGMA [30], and the resulting tree was visualized using MEGA X [31].
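As a rough illustration of the AMMI step described above, the sketch below double-centers a genotype × environment table of cell means and applies SVD to the interaction residuals. The function name `ammi_decompose`, the toy table and the symmetric scaling of the scores are assumptions made for illustration; the analysis in this study was run in GENSTAT, not with this code.

```python
import numpy as np

def ammi_decompose(means):
    """Split a genotype x environment table of cell means into AMMI terms."""
    means = np.asarray(means, dtype=float)
    grand = means.mean()
    g_eff = means.mean(axis=1) - grand   # genotype main effects
    e_eff = means.mean(axis=0) - grand   # environment main effects
    # Interaction residuals: what the additive (main effects) model leaves over
    resid = means - grand - g_eff[:, None] - e_eff[None, :]
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    # Symmetric scaling (an assumption): each side gets sqrt(singular value)
    g_scores = u * np.sqrt(s)            # genotype IPCA scores, one column per axis
    e_scores = vt.T * np.sqrt(s)         # environment IPCA scores
    explained = s**2 / np.sum(s**2)      # share of the GxE sum of squares per IPCA
    return grand, g_eff, e_eff, g_scores, e_scores, explained

# Hypothetical mean grain yields (g per plant): 4 genotypes x 3 environments
toy_means = np.array([
    [80.0, 60.0, 70.0],
    [95.0, 50.0, 65.0],
    [60.0, 75.0, 55.0],
    [70.0, 65.0, 90.0],
])
grand, g_eff, e_eff, g_scores, e_scores, explained = ammi_decompose(toy_means)
print("IPCA1 and IPCA2 genotype scores:")
print(g_scores[:, :2])
print("Share of GxE per axis:", explained)
```

With only three environments, the interaction matrix has at most two non-zero singular values, so IPCA1 and IPCA2 together account for the whole G×E sum of squares in a table of this shape.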

Combined analysis of variance
The combined ANOVA showed significant variation due to genotype, environment and G×E interaction for all traits studied (P < 0.001) (Table 2 and Fig 2). The grand mean values were 108 days for DTF, 272.1 cm for PH, 21.3 cm for PALH, 9.6 cm for PAWD, 104.5 g for PAWT and 78.1 g for GY across the three environments (Table 2).

AMMI analysis of variance
The result of AMMI ANOVA (Table 3) showed that the genotype, environment and G×E interaction effects were highly significant (P < 0.001) for DTF, GY, PH, PALH, PAWD and PAWT.

Genotype by environment interaction
Based on the AMMI analysis, the mean value of each trait in each environment, the scores on the first two interaction principal component axes (IPCA1 and IPCA2), and the four top ranking genotypes for each trait in each environment are presented in Table 4. A low IPCA1 score indicates a low contribution to the G×E interaction and a high contribution to genotype stability [32]. In this study, environments contributed differently to genotype stability for different traits. The IPCA1 scores indicated that Melkassa (MK) was the main contributor to the stability of genotypes in terms of panicle length (PALH) and width (PAWD). On the other hand, Mehoni (MH) contributed the most to genotype stability in grain yield (GY) and panicle weight (PAWT). The AMMI2 biplot showed the environment scores on IPCA1 and IPCA2 for grain yield, panicle weight, plant height, panicle length and width (Figs 3 and S1). In the AMMI2 biplot, environments with low IPCA1 and IPCA2 scores, which are placed close to the origin, contribute strongly to the stability of genotypes and little to the G×E interaction. In this study, the AMMI2 biplots indicated that all environments were positioned far from the biplot origin for grain yield, panicle weight, plant height, panicle length and width.

Genotype performance and AMMI stability analysis
Genotype performance and AMMI stability analysis were conducted, and the top and bottom ranking genotypes based on their mean values (Table 5) and genotype selection index (Table 6) are presented. The AMMI analysis indicated that genotypes G306, G239, G313, G201 and G213 had high mean grain yields of 150.2, 136.3, 133.8, 133.5 and 131.2 g, respectively, while G142, G168 and G321 had the lowest grain yield as well as panicle weight. The high yielding genotypes G239 and G306 had higher panicle weights of 176.7 and 174.5 g, respectively. With regard to panicle length, G244 (39.6 cm) and G118 (38.4 cm) had the longest panicles whereas G93 had the shortest (9.9 cm). G12 (21.8 cm) followed by G255 (20.3 cm) had the widest panicles while G157 (5.6 cm) had the narrowest. Similarly, G255 (382.8 cm) followed by G244 (372.1 cm) were the tallest whereas G321 and G322 were the shortest genotypes, at 90 cm and 133 cm, respectively (Table 5). A low AMMI stability value (ASV) indicates high stability of a genotype and low G×E interaction [26]. Genotypes G70, G162 and G254, with mean grain yields of 64.8, 68.8 and 66.6 g, respectively, showed high stability (low ASV; S2 Table) but not high yield, and therefore should not be selected. On the other hand, the following genotypes were identified as having both high stability and high grain yield based on their genotype selection index (GSI): G148, G123, G110, G203 and G73 (Table 6). Among these genotypes, G148 and G73 had high stability and panicle weight. Genotypes G213, G306 and G201 had high mean grain yield but were placed far from the biplot origin, suggesting that they were not stable (Table 5). These genotypes appeared to be specifically adapted to environment MK. The positive interaction of G207, G306, G183 and G313 with environment MH, and of G20, G163, G226 and G30 with environment MS, indicated the specific adaptation of these genotypes for grain yield to the respective environments (Table 4, Fig 3B).

PLOS ONE
Genotype by environment interaction and stability of sorghum landraces using AMMI and GGE biplot analysis

GGE biplot analysis
Which-won-where polygon view of GGE biplot. The polygon view of the GGE biplot showed the interaction patterns between genotypes and environments and visualized the best performing genotypes (Figs 4 and S2). In this GGE biplot, a polygon was drawn by joining the vertex genotypes, which were placed far from the origin, with red straight lines, so that all the other genotypes were enclosed within the polygon. The vertex genotypes for grain yield were G163, G306, G313, G194, G168, G142, G209 and G20, whereas genotypes G163, G306, G313, G194, G168, G142, G262 and G20 formed the vertices for panicle weight (Fig 4). Hence, these two sets of genotypes were the most responsive to environmental interactions for grain yield and panicle weight, respectively. The most responsive genotypes forming the vertices of [...] In the "which-won-where" GGE biplot, lines from the origin divide the biplot into different sectors and create different mega-environments (MGEs) [14,33]. In this study, two MGEs were [...]

Genotype ranking based on their mean performance and stability. Ranking biplots were used to rank the genotypes according to their performance and stability using the average environment coordinate (AEC) [13]. The average environment axis (AEA), represented in the ranking biplot by a single-arrowed line that passes through the origin, points toward higher mean performance. In this study, the ranking biplot showed that genotypes G306, G239, G201, G213, G207 and G105 had high mean GY and genotypes G163, G239, G164, G105 and G213 had high mean PAWT. On the other hand, genotypes G321, G168 and G142 had the lowest grain yield and panicle weight, in that order (Fig 5). For PALH, genotypes G244, G118 and G149 came out on top, whereas G12, G255 and G5 were the top ranking for PAWD. Genotypes with the shortest panicles were G178, G93 and G279, whereas G28, G157 and G199 were the bottom ranking for PAWD. For plant height, G255, G244 and G261 were the top ranking whereas G121, G322 and G32 were the shortest genotypes (S3 Fig). The stability of genotypes was evaluated based on the length of the vector (dotted line in the graph) between the genotype position and the AEA in the ranking biplot (Figs 5 and S3). The best performing and stable genotypes are those that are far from the origin but on the AEA or close to it. Hence, G119, G105, G213, G239 and G207 were the most stable genotypes with high mean GY, having the shortest vectors from the AEA, whereas G20 and G163 were the least stable, having the longest vectors (Fig 5B). For PAWT, G207 and G213 were the most stable whereas G20 was the least stable genotype (Fig 5A). Genotypes G244 and G118 for PALH, G261 and G5 for PAWD, and G255 and G244 for PH were the most stable, with short vectors from the AEA. Genotype G313 was the least stable for both panicle length and width, and genotype G153 was the least stable for PH (S3 Fig).

Evaluation of environments in comparison biplots. The environment-focused scaling of the comparison GGE biplot shows the AEA, the AEC and concentric circles, which help to evaluate the tested environments. The concentric circles on the comparison GGE biplot (Figs 6 and S4) show the distance of the environments from the AEA, the AEC and the biplot origin. The ideal environment is the one closest to the center of the concentric circles. In this study, environment MK was the ideal (most representative) environment for GY, PALH and PAWD, as it was the closest to the center of the concentric circles and had the smallest angle with the AEA. Environments placed far from the biplot origin have high discriminating ability; hence, all three tested environments had strong discriminating ability for all traits.
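The quantities read off the ranking and comparison biplots can be sketched numerically. The example below is a minimal illustration, assuming environment-centered data, symmetric scaling of the first two principal components, and a hypothetical table of means; `gge_scores` and `aec_summary` are illustrative names, and the GENSTAT procedure used in this study may scale the axes differently.

```python
import numpy as np

def gge_scores(means):
    """First two principal components of environment-centered data (G + GxE)."""
    means = np.asarray(means, dtype=float)
    centered = means - means.mean(axis=0)   # remove only the environment main effect
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    geno = u[:, :2] * np.sqrt(s[:2])        # genotype markers on the biplot
    env = vt[:2].T * np.sqrt(s[:2])         # environment vectors on the biplot
    return geno, env

def aec_summary(geno, env):
    """Project genotypes and environments onto the average environment axis."""
    avg_env = env.mean(axis=0)                   # average environment coordinate
    axis = avg_env / np.linalg.norm(avg_env)     # unit vector along the AEA
    perp = np.array([-axis[1], axis[0]])         # direction perpendicular to the AEA
    mean_proj = geno @ axis                      # larger = higher mean performance
    instability = np.abs(geno @ perp)            # larger = less stable
    env_len = np.linalg.norm(env, axis=1)        # longer = more discriminating
    cos_angle = (env @ axis) / env_len           # closer to 1 = more representative
    return mean_proj, instability, env_len, cos_angle

# Hypothetical means: 5 genotypes x 3 environments
gge_means = np.array([
    [80.0, 60.0, 70.0],
    [95.0, 50.0, 65.0],
    [60.0, 75.0, 55.0],
    [70.0, 65.0, 90.0],
    [120.0, 110.0, 115.0],
])
geno, env = gge_scores(gge_means)
mean_proj, instability, env_len, cos_angle = aec_summary(geno, env)
print("Mean-performance projection:", mean_proj)
print("Distance from AEA:", instability)
print("Environment vector length:", env_len)
print("Cosine of angle with AEA:", cos_angle)
```

A genotype with a large projection and a small distance from the AEA corresponds to the "high yielding and stable" reading of the ranking biplot, while an environment with a long vector and a cosine close to 1 is both discriminating and representative.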

Correlation among traits
Significant positive and negative correlations were detected between the traits studied (Fig 7). Grain yield showed a highly significant (P < 0.001) and strong positive correlation with panicle weight (r = 0.91). Significant (P < 0.01) negative correlations were detected between grain yield and panicle length (r = -0.44) and between panicle weight and panicle length (r = -0.50). Grain yield showed non-significant correlations with days to flowering (negative, r = -0.12) and plant height (positive, r = 0.16).

Cluster analysis
The dendrogram was generated from cluster analysis of the 324 sorghum genotypes based on the six traits. The cluster analysis grouped the genotypes into five clusters (Fig 8). Cluster-IV was the largest, consisting of 158 genotypes, followed by Cluster-II, which comprised 67 genotypes. Cluster-I, III and V contained 8, 33 and 58 genotypes, respectively. The cluster analysis showed good correspondence with the stability and performance of genotypes obtained from the AMMI and GGE biplots. For instance, genotypes G163, G306, G239, G105, G119 and G201 in Cluster-I were among the high yielding genotypes according to the AMMI2 and GGE ranking biplots. Furthermore, all four improved varieties were grouped together in Cluster-II. However, the cluster analysis did not clearly group genotypes according to the geographical locations where they were initially collected. Considerable numbers of genotypes from different regions were grouped together, and genotypes from the same regions were placed in different clusters. For instance, nine genotypes originally collected from Shewa were grouped in Cluster-IV, while eight genotypes from the same zone were grouped in Cluster-III. Most of the genotypes from Central Tigray were grouped in Cluster-II whereas genotypes from North Wello, East Hararghe and West Hararghe were grouped in Cluster-IV. South Wello, West Hararghe and Shewa had highly diverse genotypes that were distributed across all clusters (Fig 8).
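For readers who want to reproduce this kind of grouping, the sketch below runs UPGMA (average linkage) clustering with SciPy rather than DendroUPGMA, which was the tool used in this study. Standardizing the traits and using Euclidean distance are assumptions, since the source does not state the distance measure; the function name `upgma_clusters` and the trait values are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def upgma_clusters(traits, n_clusters):
    """Assign genotypes to clusters by UPGMA on standardized trait values."""
    traits = np.asarray(traits, dtype=float)
    # Put traits on a common scale so that, e.g., plant height (cm)
    # does not dominate panicle width (cm) purely because of its range
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    tree = linkage(z, method="average", metric="euclidean")  # UPGMA
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Hypothetical genotypes described by plant height (cm) and grain yield (g)
toy_traits = np.array([
    [100.0, 70.0],
    [102.0, 72.0],
    [200.0, 120.0],
    [198.0, 118.0],
    [300.0, 60.0],
    [302.0, 62.0],
])
labels = upgma_clusters(toy_traits, n_clusters=3)
print(labels)
```

Cutting the same tree at a different number of clusters only requires changing `n_clusters`; the five-cluster solution reported here would correspond to `n_clusters=5` on the full trait table.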

Discussion
Ethiopia is considered the center of origin and diversity of sorghum [34,35], owing to the presence of both its wild and cultivated forms. In this study, the combined and AMMI analyses of variance revealed highly significant variation among the 324 sorghum genotypes (landraces and improved varieties) for the assessed traits. The high genetic variation revealed in this and previous studies of sorghum landraces [36,37] indicates a great opportunity to select and use the landraces in sorghum improvement programs. The genotypic effect made a larger contribution to the total variation in DTF (54.9%), PH (61.4%) and PAWD (53.9%) than the environment and G×E interaction effects. However, the effect of the environment was higher (33.6%) than the genotypic effect for variation in PAWT, and the effect of G×E interaction was higher (31.3%) than the genotypic effect for variation in GY. A higher contribution of environment and G×E interaction to variation in grain yield has been reported in sorghum [5,22] and other crops [38,39]. The significant effect of G×E interaction for the traits implies that different sorghum genotypes responded differently to variation in environmental conditions, making it necessary to identify and select environment-specific genotypes. The higher contribution of G×E interaction than of genotype to variation in grain yield indicates the possible existence of different mega-environments among the testing environments [40,41]. The significant effect of the environment suggests the need to generate MET data that can lead to the identification of stable, top performing genotypes with wide adaptation as well as the selection of genotypes with good adaptation to specific agro-ecologies.
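The percentage contributions quoted above can be thought of as each source's share of a sum of squares. The sketch below is a minimal illustration, assuming a balanced table of genotype × environment cell means and ignoring block and error terms; the source does not state which total was used as the denominator, so the function `ss_shares` and the toy table are hypothetical.

```python
import numpy as np

def ss_shares(means):
    """Percentage of the cell-mean sum of squares due to G, E and GxE."""
    means = np.asarray(means, dtype=float)
    n_g, n_e = means.shape
    grand = means.mean()
    ss_g = n_e * np.sum((means.mean(axis=1) - grand) ** 2)   # genotype
    ss_e = n_g * np.sum((means.mean(axis=0) - grand) ** 2)   # environment
    ss_total = np.sum((means - grand) ** 2)
    ss_ge = ss_total - ss_g - ss_e                           # interaction remainder
    return {
        "G": 100 * ss_g / ss_total,
        "E": 100 * ss_e / ss_total,
        "GxE": 100 * ss_ge / ss_total,
    }

# Hypothetical mean grain yields: 4 genotypes x 3 environments
toy_table = np.array([
    [80.0, 60.0, 70.0],
    [95.0, 50.0, 65.0],
    [60.0, 75.0, 55.0],
    [70.0, 65.0, 90.0],
])
shares = ss_shares(toy_table)
print({name: round(value, 1) for name, value in shares.items()})
```

A trait for which the "GxE" share exceeds the "G" share, as reported here for grain yield, is one where genotype rankings are likely to change between environments.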
The variance due to genotype and G×E interaction helped to select the best genotypes for target traits, and in such cases, minimizing the impact of environmental main effects is important [10]. The AMMI2 model was the best model for understanding genotype stability and performance, genetic variation between genotypes and their association with environments [42]. In the AMMI2 biplot, environments with low IPCA1 and IPCA2 scores (placed close to the origin) contribute strongly to the stability of genotypes and little to the G×E interaction [14]. Thus, environments MH and MK were the top two contributors to the stability of genotypes in GY, PAWT, PALH and PH. Genotypes located far from the center and close to a given testing environment in the AMMI2 biplot are considered well adapted and high performing in that environment [14]. For GY, genotypes G20, G163, G226 and G30 were close to environment MS in this study, indicating that they performed better in, and were better adapted to, this environment than the other two. On the other hand, genotypes G213, G306, G201 and G313 performed better in environment MK for GY. The difference in the relative performance of genotypes in different environments is also a strong indicator of the existence of G×E interaction and of variation in environmental conditions such as temperature, rainfall and soil type. This, therefore, suggests that environment-specific sorghum genotypes should be selected for different agro-ecologies and environmental conditions. High yielding genotypes under specific environments have been previously reported in sorghum [21,43] and barley [44].
Genotypes with low ASV and positioned close to the origin in AMMI2 biplot are generally regarded as highly stable [26]. In line with this, G70, G162 and G254 were identified as stable genotypes in grain yield. However, these genotypes had low mean grain yield, and should not be prioritized for use in breeding programs. In this study, GSI was used for selecting top ranking genotypes both in mean performance and stability [15,16], based on the ASV parameter (accounting for IPCA1 and IPCA2) and genotype mean ranking. This approach identified G148, G123, G110, G203 and G73 as stable genotypes with high grain yield across environments. Interestingly, these stable and high yielding genotypes were also top ranking in other farmer preferred traits such as high panicle weight (125, 122, 133, 118, 136 g) and plant height (309, 301, 308, 317, 279 cm), respectively. Moreover, these genotypes were collected from South Wello, West Hararghe and Shewa zones, which had highly diverse genotypes. Hence, these genotypes should be prioritized for use in sorghum breeding programs for further improvement in grain yield and other desirable traits. This method has been successfully used in other crops [40,45,46].
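The selection logic described above can be sketched as follows. This is a minimal illustration, assuming the commonly used form in which the ASV weights IPCA1 by the ratio of the IPCA1 to IPCA2 sums of squares and the GSI is the sum of a genotype's ASV rank and its mean-yield rank; the source states only that GSI is based on the ASV and the genotype mean ranking. All numbers and the function names `asv`, `rank_ascending` and `gsi` are hypothetical.

```python
import numpy as np

def asv(ipca1, ipca2, ss_ipca1, ss_ipca2):
    """AMMI stability value: weighted distance from the AMMI2 biplot origin."""
    weight = ss_ipca1 / ss_ipca2          # IPCA1 counts more if it explains more GxE
    return np.sqrt((weight * np.asarray(ipca1)) ** 2 + np.asarray(ipca2) ** 2)

def rank_ascending(values):
    """Rank 1 = smallest value (ties broken by order of appearance)."""
    values = np.asarray(values)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

def gsi(asv_values, mean_yield):
    """Genotype selection index: ASV rank plus mean-yield rank (lower is better)."""
    r_asv = rank_ascending(asv_values)                 # most stable gets rank 1
    r_yield = rank_ascending(-np.asarray(mean_yield))  # highest yield gets rank 1
    return r_asv + r_yield

# Hypothetical IPCA scores and mean grain yields (g) for four genotypes
ipca1 = [0.1, 1.5, 0.3, -0.7]
ipca2 = [0.1, -0.5, 0.2, 0.6]
mean_yield = [65.0, 150.0, 125.0, 90.0]

asv_values = asv(ipca1, ipca2, ss_ipca1=60.0, ss_ipca2=30.0)
gsi_values = gsi(asv_values, mean_yield)
print("ASV:", np.round(asv_values, 3))
print("GSI:", gsi_values)
```

In this toy example the first genotype is the most stable but low yielding and the second is the highest yielding but unstable; the third genotype, which is reasonably good on both counts, receives the lowest (best) GSI.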
The GGE 'Which-Won-Where' biplot was used to identify top performing genotypes by interpreting the G×E interaction, MGE clustering and specific adaptation [10,11,14,22]. Genotypes placed far from the biplot origin (vertex genotypes) are either the poorest or the best performing in some or all of the tested environments [47]; they are more responsive to environmental change and are considered specifically adapted. Based on the 'Which-Won-Where' biplot, the testing environments were grouped into two MGEs with different high performing genotypes for GY, PAWT and PAWD. For instance, for GY, MGE1 was represented by the MK and MH environments, with G306, G313 and G194 as the top yielding genotypes, whereas MGE2 contained only environment MS, where G163 and G20 were the top performers in grain yield. This indicates that there were specific adaptations of genotypes to MGEs and hence that the G×E interaction can be exploited positively [40]. Clustering the target environments into meaningful MGEs and selecting different genotypes for different MGEs is the best way to exploit positive G×E interaction [33]. Such clustering of environments into MGEs and identification of the best performing genotypes adapted to a specific MGE have been reported in several crops [22,43,48,49].
The top ranking and stable genotypes can be identified by GGE ranking biplot through AEC [13]. In the present study, the ranking biplot AEC indicated genotypes G306, G239, G163, G201 and G213 as the top ranking in grain yield. However, the high yielding landraces such as G306, G163 and G201 were less stable landraces due to G×E interaction effect. Previous reports on forage and grain sorghum also showed that the high yielding genotypes are not necessarily the most stable [22,43]. A remarkable character of the GGE biplot graph is the visualization of genotypes that combine high mean performance and stability. The best genotypes could have larger projection on AEC (highest mean) along with shorter vector on AEA (high stability) [13,47,50]. Accordingly, genotypes G105, G213, G207, G239 and G119 were identified as high yielding and stable for grain yield. It implies that identification of ideal genotype through GGE biplot analysis is a suitable tool for detecting the most stable and the highest yielding genotypes. By using this method, several authors identified high yielding and stable genotypes in sorghum [21,43] and other crops including barley [40], soybean [41] and wheat [42].
The GGE biplot approach ranked genotypes G163, G239, G164, G105, G119, G207 and G213 on top for grain yield and panicle weight. Among these, genotypes G207 and G213 were identified as desirable for their stability and high mean panicle weight. The selection of the same genotypes for both GY and PAWT is mainly due to the positive association between the traits. On the other hand, different genotypes were identified as having high mean performance and stability for PALH, PAWD and PH. This study clearly indicated that a stable and high performing genotype in one trait does not necessarily mean that it combines stability and high performance in other related traits. This is largely the case because different traits are regulated by different genes and due to differential expression of genes among the genotypes as a response to environmental conditions, such as temperature variation and moisture stress. Similar results were reported in previous studies in sorghum [22,43] and wheat [51].
The AMMI analysis has been shown to be effective in capturing a large portion of the G×E interaction by clearly separating the main and interaction effects using ANOVA and principal component analysis (PCA) [10]. The GGE biplot is an effective statistical model for identifying top ranking and stable genotypes across environments and the best genotypes for adaptation to a particular mega-environment [47]. In the present study, the AMMI and GGE biplot models gave similar results for the discriminating ability of the environments: in both analyses, all environments were placed far from the biplot origin. However, somewhat different results were obtained for the contribution of the environments to genotype stability. The top ranking genotypes were similar in both the AMMI and GGE biplot analyses, whereas the ranking of genotypes by stability differed somewhat between the two. These results are in line with those obtained in some other studies [42]. Such a difference is possible because the IPCAs in AMMI2 and the PCs in the GGE biplot have different statistical bases.
Grain yield showed a highly significant positive correlation with panicle weight (r = 0.91). This is in agreement with previous studies on sorghum [52,53]. Hence, the positive correlation of grain yield with this trait shows the possibility of simultaneous improvement of both traits through effective selection. Grain yield also showed a positive but non-significant correlation with plant height (r = 0.16) and a negative correlation with days to flowering (r = -0.12). Amare et al. [53] also reported a non-significant positive correlation between grain yield and plant height. Similar negative or non-significant correlations between grain yield and days to flowering were reported by Akatwijuka et al. [54]. The negative correlations between grain yield and panicle length and width in this study contrast with other studies in sorghum [54,55]. This is mainly due to the variation in panicle shape and compactness of the sorghum genotypes used in this study.
The present study revealed that the clustering patterns of genotypes did not largely reflect the geographic origin where they were originally collected in Ethiopia. The clustering of sorghum genotypes collected from the same geographical area into different clusters has also been reported in previous studies [36,37,56]. This indicates that genotypes from the same geographical region differ considerably in their agro-morphological traits, reflecting high genetic diversity in sorghum. The clustering of genotypes from different regions in the same groups is likely the result of gene flow across regions through market channels as well as a gradual exchange of seeds among farmers. The four improved varieties included in this study were grouped in the same cluster. These varieties were early maturing and short in height. Similar clustering of improved varieties apart from Ethiopian sorghum landraces was reported in previous studies [37,56]. The clustering of the best genotypes for grain yield identified through the AMMI and GGE biplot analyses in the same group suggests that such genotypes were selected for the same traits (mainly grain yield), which made them more similar to one another and more differentiated from the other genotypes.

Conclusions
This study determined the G×E interaction effect, the stability of genotypes, and the representativeness and discriminating ability of environments for days to flowering, plant height, panicle length, panicle width, panicle weight and grain yield in diverse sorghum genotypes grown in Ethiopia. Grain yield and panicle weight were highly affected by environmental variation and genotype by environment interaction, whereas days to flowering, panicle length, panicle width and plant height were mainly affected by genotypic variation. The results clearly showed that the sorghum landraces are excellent genetic resources that contain high variation in grain yield and farmer-preferred traits such as plant height, and that they should be utilized for developing new high yielding cultivars with various desirable traits. The AMMI and GGE biplot models are effective in visualizing the G×E interaction and identifying stable and high performing genotypes. Among the 324 genotypes, G148, G123, G110, G203 and G73 were the best in terms of providing high and stable grain yield in combination with farmer-preferred traits. Among the studied populations, South Wello, West Hararghe and Shewa zones had highly diverse genotypes, and hence these areas can be considered potential areas for screening genotypes for high yield and other agronomic traits. Environment MK was the most representative environment whereas environment MS was the most discriminating; hence, they should be used for capturing superior genotypes and for identifying high yielding genotypes adapted to specific agro-ecologies.

S1 Fig. AMMI2 biplot of 324 sorghum genotypes and three environments for (A) Panicle length (PALH), (B) Plant height (PH) and (C) Panicle width (PAWD).
Genotypes placed close to a given environment had top performance in that environment. Each vector shows the discriminating power of the environment (the longer the vector, the more discriminating the environment). (TIF)