Phenotypic variation in biomass and related traits among four generations advanced lines of Cleome (Gynandropsis gynandra L. (Briq.))

Gynandropsis gynandra (spider plant) is an African traditional leafy vegetable rich in minerals, vitamins and health-promoting compounds with potential for health promotion, micronutrient supplementation and income generation for stakeholders, including pharmaceutical companies. However, information on biomass productivity is limited, which constrains breeders’ ability to select high-yielding genotypes and end-users’ ability to decide on suitable cultivation and production systems. This study aimed to assess the phenotypic variability in biomass and related traits in a collection of G. gynandra advanced lines to select elite genotypes for improved cultivar development. Seventy-one advanced lines selected from accessions originating from Asia, West Africa, East Africa and Southern Africa were evaluated over two years with two replicates in a greenhouse using a 9 × 8 alpha lattice design. Significant statistical differences were observed among lines and genotype origins for all fourteen biomass and related traits. The results revealed three clusters, with each cluster dominated by lines derived from accessions from Asia (Cluster 1), West Africa (Cluster 2), and East/Southern Africa (Cluster 3). The West African and East/Southern African groups were comparable in biomass productivity and superior to the Asian group. Specifically, the West African group had a low number of long primary branches, high dry matter content and flowered early. The East/Southern African group was characterized by broad leaves, late flowering, a high number of short primary branches and medium dry matter content and was a candidate for cultivar release. The lines’ retention of membership in their group of origin strengthens the hypothesis of a geographical signature in cleome diversity and of a genetic driver of the observed variation. High genetic variance, broad-sense heritability and genetic gains showed the potential to improve biomass yield and related traits.
Significant and positive correlations among biomass per plant, plant height, stem diameter and leaf size showed the potential of simultaneous and direct selection for farmers’ desired traits. The present results provide insights into the diversity of spider plant genotypes for biomass productivity and represent key resources for further improvement in the species.

Introduction
[...] of the European Commission through a PhD scholarship awarded to Aristide Carlos Houdegbe. The scholarship was for academic training and research mobility and a research grant to complete a PhD degree at the University of KwaZulu-Natal (South Africa). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
[...] germination, and treating seeds with gibberellic acid and preheating were found to be effective [39][40][41], as well as storage for three or more months [42]. Improving leaf yield, early flowering, and insect pest resistance can be achieved by developing improved agricultural practices and high-yielding cultivars. Most previous studies focused on establishing the best agronomic practices for improved yield and included optimal planting density, fertilizer type and application rates, planting date, stage of transplanting, harvesting frequency and techniques (cutting, uprooting whole plants, defoliation), deflowering, sowing depth and net cover colour [43][44][45][46][47][48][49]. In contrast, few studies thus far have addressed the genetic improvement of the species [13].
Genetic improvement requires a better understanding of the genetic diversity in the species through morphological and genetic/genomic characterization. Many studies have assessed morphological diversity in G. gynandra using a countrywide collection (e.g., Ghana [50], Burkina-Faso [51], Kenya [52,53]), regionwide germplasm (e.g., Kenya and South Africa [54], East and Southern Africa [4]) and worldwide collection [3,55]. It is worthwhile to highlight that some of these characterization studies were extended to nutritional values, including minerals [4], vitamins [3], and physiological traits [56]. Significant variations were observed among accessions with a strong association between their morphology and geographical origins [3,55]. East-Southern African accessions were observed to have taller plants compared to Asian and West African accessions with shorter plants [3]. Additionally, West African accessions were characterized by small leaves, and Asian and East-Southern African accessions had large leaves [3]. This morphological differentiation was further supported by genomic characterization [57]. Genetic differentiation was also observed between farmer's cultivars and genebank's accessions and advanced lines [58]. The considerable diversity observed represents a valuable resource for a successful breeding program.
However, most studies assessing morphological diversity in G. gynandra did not include leaf biomass yield. Those that included it were limited to regional accessions and advanced lines [4] and countrywide accessions [51,53]. Yet farmers' preferred traits in G. gynandra include high leaf yield and related traits (plant height and the number of leaves), broad leaves, late flowering, good germination and resistance to pests and diseases [32][59][60][61]. Among these traits, yield is the most important for farmers and breeding programs. Considering farmers' preferred traits in a breeding program is vital for the successful adoption of developed cultivars. Given the availability of worldwide collections, it is therefore important to assess the biomass potential of large germplasm collections.
Spider plant is both self- and cross-compatible but predominantly outcrossing [58,62,63], opening the way for developing both inbred/pure lines and hybrid cultivars. Outcrossing is promoted by visits to the crop's flowers by both diurnal insect pollinators (bees, ants and butterflies) [33,62,63] and nocturnal pollinators (e.g. Hippotion spp. and Nephele aequivalens) [64], owing to the flower structure. Spider plant has three types of flowers: staminate with short gynoecium, hermaphrodite with medium gynoecium, and hermaphrodite with long gynoecium, which is characteristic of an andromonoecious plant [63]. Given the predominance of outcrossing in the species, hybrid cultivars could be advantageous over inbred lines through the exploitation of heterosis. In hybrid production, an important step is to develop inbred/pure lines. Various methods (e.g. single seed descent, bulk, pedigree, and doubled haploids) can be used to develop inbred lines, and single seed descent (SSD) has the advantage of allowing rapid development of inbred lines in a greenhouse or off-season [65].
Therefore, this study aimed to assess the phenotypic diversity in biomass yield and related traits among a worldwide collection of Gynandropsis gynandra advanced lines developed by the SSD method, in order to select elite genotypes for breeding programs and large-scale dissemination. Specifically, the present study: (i) assessed the phenotypic variation in biomass and related traits in G. gynandra using advanced lines selected from Asian, West, East and Southern African accessions; (ii) determined the relationship between biomass yield and related traits; and (iii) identified the best-performing genotypes for biomass yield.

Plant material
In this study, seventy-one advanced lines (Table 1) selected from accessions originating from Asia (18), West Africa (19), Eastern Africa (14) and Southern Africa (20) were evaluated. The accessions were obtained from the Laboratory of Genetics, Biotechnology and Seed Science of the University of Abomey-Calavi (Republic of Benin); the World Vegetable Center (Taiwan); the Kenya Resource Center for Indigenous Knowledge (Kenya); the Lilongwe University of Agriculture and Natural Resources (Malawi); the Namibia Botanical Gardens (Namibia); the Wageningen University and Research (Netherlands) and the University of Ouagadougou (Burkina-Faso) (Table 1). Accessions were self-pollinated for four generations to develop the advanced lines using a single seed descent method. Briefly, only one seed was picked from each selfed plant per original accession. The single seed was then planted in the next generation of selfing, and the procedure was repeated until the fourth generation. Seeds of the fourth selfing generation pods were bulked for evaluation.

Experimental design and growth conditions
The advanced lines were evaluated in 2020 (September to December) and 2021 (January to April) under greenhouse conditions at the Controlled Environment Facility (29˚46′ S, 30˚58′ E) of the University of KwaZulu-Natal, Pietermaritzburg Campus, South Africa. Each year, the evaluation was laid out in a 9 × 8 alpha lattice design with two replications. Seeds were pretreated by heating at 40˚C for three days to improve germination before sowing in seedling trays filled with growing media. The seedling trays were established in the greenhouse, and germination was observed three days after planting. Seedlings were grown for four weeks in a nursery and transplanted into 10-litre pots with three plants per pot. Pots were filled with composted pine bark growing media. Basal fertilizer composed of N:P:K (2:3:2) at a dose of 150 kg ha⁻¹ was applied before transplanting, and limestone ammonium nitrate (28% N) was applied as topdressing two weeks after transplanting at a dose of 100 kg ha⁻¹. Automated drip irrigation was used to water the plants with 1 litre per pot daily, while weeds were controlled manually. In 2020, the average temperature and relative humidity were 28˚C day/20˚C night and 78.5%, respectively. In 2021, they were 31˚C day/22˚C night and 77.4%, respectively.

Data collection
Fourteen agronomic traits, including days to 50% flowering (DFlow), stem diameter (StDiam), plant height (PHeight), number of primary branches (NPBr), primary branch length (PBrLeng), central leaflet length (CtLleng), central leaflet width (CtLwid), leaf width (Lwid), petiole length (Ptilleng), leaf area (LfArea), total fresh biomass per plant (FBiom), edible fresh biomass per plant (EDBiom), harvest index (HI) and dry matter content (DM), were assessed four weeks after transplanting. Days to 50% flowering were recorded as the number of days from the sowing date to the day when 50% of the plants in each pot flowered. The central leaflet length (cm), central leaflet width (cm), leaf width (cm) and petiole length (cm) were measured with a ruler on a fully developed primary leaf randomly selected on each plant. The selected leaf was scanned using a Canon PIXMA G2411 scanner (Canon INC; Tokyo, Japan), and the resultant image was used to calculate leaf area using the R package "LeafArea" [66]. Plant height (cm) was measured from the base to the top of the plant with a tape measure, while the stem diameter was measured using a digital Vernier calliper at the plant collar. Each plant was harvested by cutting at a height of 15 cm above the ground, and the resultant biomass was weighed to determine the total fresh biomass per plant (g plant⁻¹). The edible part of the total biomass was separated and weighed to record the edible fresh biomass per plant (g plant⁻¹). The ratio of edible biomass to total fresh biomass was computed and reported as the harvest index (HI). All phenotypic trait measurements were taken on two of the three plants per pot, except days to 50% flowering and dry matter content. An average value from the two individual plants per pot was computed and used in the data analysis. For dry matter content (DM), edible biomass of the plants per genotype in each replicate was bulked, and a sample of 20 g was taken and oven-dried at 65˚C for 72 h.
DM (%) was computed as DM = (dry weight)/(fresh weight) × 100. The phenotypic data are presented in S1 Table.

Data analysis
The quality of data was assessed for outlier detection following Bernal-Vasquez et al. [67] using the Bonferroni-Holm test based on studentized residuals at the significance level of 5%. The mean, minimum, maximum, coefficient of variation and standard deviation were generated to characterize the plant material using the function describe of the R package "psych" [68]. The difference among regions of origin was tested using an analysis of variance or Kruskal-Wallis test, when necessary. Data were first analyzed separately per year by fitting a linear mixed model according to the following statistical model:

$$y_{ikl} = \mu + B_l(R_k) + G_i + \varepsilon_{ikl} \tag{1}$$

in which $y_{ikl}$ was the phenotypic observation of the $i$th line in the $l$th incomplete block within the $k$th replicate, $\mu$ was the overall mean, $B_l(R_k)$ was the random effect of the $l$th incomplete block within the $k$th replicate, $G_i$ was the random effect of the $i$th line, and $\varepsilon_{ikl}$ was the random residual.
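As an illustration of the outlier screening step described above, the following minimal Python sketch applies a Bonferroni-Holm step-down test to studentized residuals. It is not the authors' implementation: the function name `holm_outliers`, the example residuals, and the use of a standard normal approximation for the two-sided p-values are all assumptions made for this sketch.

```python
from statistics import NormalDist


def holm_outliers(studentized_residuals, alpha=0.05):
    """Return indices flagged as outliers by a Bonferroni-Holm step-down test.

    Assumption of this sketch: two-sided p-values are taken from the
    standard normal distribution rather than from a t distribution.
    """
    std_normal = NormalDist()
    p_values = [2.0 * (1.0 - std_normal.cdf(abs(r))) for r in studentized_residuals]

    # Test the smallest p-value first; the threshold relaxes at each step.
    order = sorted(range(len(p_values)), key=lambda idx: p_values[idx])
    m = len(p_values)
    flagged = []
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            flagged.append(idx)
        else:
            break  # step-down procedure stops at the first non-rejection
    return sorted(flagged)


# Hypothetical studentized residuals; only the third value is extreme.
residuals = [0.3, -1.1, 4.8, 0.7, -0.2, 1.5, -0.9, 0.1]
print(holm_outliers(residuals))  # [2]
```

Because the procedure stops at the first non-significant residual, a single extreme observation does not cause moderately large residuals to be discarded as well.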
Variance components across years were estimated by fitting a linear mixed-effect model using the restricted maximum likelihood (REML) implemented in the ASReml-R package version 4.1.0.160 [69] according to the following statistical model:

$$y_{ijkl} = \mu + Y_j + R_k(Y_j) + B_l[R_k(Y_j)] + G_i + GY_{ij} + \varepsilon_{ijkl} \tag{2}$$

in which $y_{ijkl}$ was the phenotypic observation of the $i$th line in the $l$th incomplete block within the $k$th replicate in the $j$th year, $\mu$ was the overall mean, $Y_j$ was the random effect of the $j$th year, $R_k(Y_j)$ was the random effect of the $k$th replicate within the $j$th year, $B_l[R_k(Y_j)]$ was the random effect of the $l$th incomplete block within the $k$th replicate in the $j$th year, $G_i$ was the random effect of the $i$th line, $GY_{ij}$ was the random effect of the interaction between the $i$th line and the $j$th year, and $\varepsilon_{ijkl}$ was the random residual. Heterogeneous variances were assumed for residual effects in different years. The likelihood ratio test [70] was used to test the significance of the variance components for single-year and across-year analyses using the function lrt implemented in the ASReml-R package. Standard broad-sense heritability across years [71] was calculated as follows:

$$H^2 = \frac{\sigma^2_G}{\sigma^2_G + \dfrac{\sigma^2_{G \times Y}}{n} + \dfrac{\sigma^2_e}{nr}}$$

where $\sigma^2_G$ is the genotypic variance of the lines, $\sigma^2_{G \times Y}$ is the line × year interaction variance, $\sigma^2_e$ is the residual variance, $r$ is the number of replications, and $n$ is the number of years. The phenotypic best linear unbiased predictors (BLUPs) were generated from model 2. BLUPs were used because they have better predictive accuracy than the best linear unbiased estimators (BLUEs), owing to their high correlation with the true values and their ability to handle environmental effects, and they have been recommended for phenotypic selection in plant breeding [72][73][74]. The values refer to mean genotypic values and were used in further analyses. Pearson's correlation coefficients among all traits and their level of significance were calculated using the function rcorr from the R package "Hmisc" [75].
Genotypic correlations among traits were estimated using META-R software [76]. Both genotypic and phenotypic correlations were plotted using the "metan" R package [77]. A principal component analysis was performed using the PCA function implemented in the R "FactoMineR" package [78] to assess the relationship among the lines and the biomass and related traits. Furthermore, we performed hierarchical clustering on principal components (HCPC) to group the genotypes based on the measured traits, and the results were visualized using the fviz_cluster and fviz_dend functions of the R package "factoextra" [79] for the factor map and dendrogram, respectively. The significance of differences among cluster means was tested using one-way analysis of variance according to the following statistical model:

$$y_{mi} = \mu + C_m + \varepsilon_{mi}$$

in which $y_{mi}$ was the phenotypic observation of the $i$th line in the $m$th cluster, $\mu$ was the overall mean, $C_m$ was the fixed effect of the $m$th cluster, and $\varepsilon_{mi}$ was the random residual. In addition, the means of clusters were separated using Tukey's honestly significant difference test (Tukey's HSD post hoc test) at the 0.05 probability level using the R package "agricolae" [80]. The genetic advance (GA) for each trait was computed as $GA = i \times H^2 \times \sigma_P$, where $\sigma_P$ was the phenotypic standard deviation, $H^2$ was the broad-sense heritability, and $i$ was the standardized selection differential at the selection intensity of 5% ($i = 2.06$) [81]. Genetic advance over mean (GAM) was further computed as $GAM = (GA/\mu) \times 100$, where $\mu$ was the overall mean and GA was the genetic advance of the trait. Genotypic, phenotypic and error coefficients of variation (GCV, PCV and ECV, respectively) were estimated according to Burton and DeVane [82] as follows:

$$GCV = \frac{\sqrt{\sigma^2_G}}{\mu} \times 100, \qquad PCV = \frac{\sqrt{\sigma^2_P}}{\mu} \times 100, \qquad ECV = \frac{\sqrt{\sigma^2_e}}{\mu} \times 100$$

in which $\sigma^2_G$ was the genotypic variance, $\sigma^2_P$ was the phenotypic variance, $\sigma^2_e$ was the residual variance, and $\mu$ was the overall mean. R software version 4.1.1 [83] was used to perform all statistical analyses.
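The genetic advance and coefficient-of-variation formulas above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the numerical inputs are hypothetical and are not values from this study, and the coefficients of variation are taken as the square root of the relevant variance divided by the overall mean, multiplied by 100.

```python
import math


def genetic_advance(h2, sigma_p, i=2.06):
    """GA = i * H^2 * sigma_P, with i = 2.06 at 5% selection intensity."""
    return i * h2 * sigma_p


def genetic_advance_over_mean(ga, mean):
    """GAM (%) = GA / mean * 100."""
    return ga / mean * 100.0


def coefficient_of_variation(variance, mean):
    """CV (%) = sqrt(variance) / mean * 100; gives GCV, PCV or ECV
    depending on which variance component is supplied."""
    return math.sqrt(variance) / mean * 100.0


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
h2, var_g, var_p, var_e, mean = 0.80, 64.0, 100.0, 36.0, 50.0

ga = genetic_advance(h2, math.sqrt(var_p))   # 2.06 * 0.80 * 10
gam = genetic_advance_over_mean(ga, mean)    # GA as a percentage of the mean
gcv = coefficient_of_variation(var_g, mean)  # genotypic CV
pcv = coefficient_of_variation(var_p, mean)  # phenotypic CV
ecv = coefficient_of_variation(var_e, mean)  # error CV
print(round(ga, 2), round(gam, 2), gcv, pcv, ecv)
```

With these illustrative inputs, GA is about 16.48 trait units and GAM about 32.96%, while GCV, PCV and ECV are 16%, 20% and 12%, respectively.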

Quantitative variation in biomass and related traits
Highly significant variation (p < 0.001) was observed among genotypes for all agronomic traits within each year and across years (Tables 2 and 3). Block effects were not significant for any agronomic trait within or across years, except days to 50% flowering. Similarly, year effects were not significant for any trait except stem diameter and days to 50% flowering. Replicate effects were significant only for plant height, leaf width (in 2020 or 2021 and across years), dry matter content (in 2021 and across years) and central leaflet width (across years). The genotype × year interaction effects were significant for stem diameter, primary branch length, number of primary branches, leaf width and area, petiole length, harvest index and days to 50% flowering (Table 3).
The coefficient of variation ranged between 14.01% and 82.48%. The lowest values were observed for dry matter content and the highest for primary branch length. The average plant total fresh biomass and edible fresh biomass were 67.19 ± 2.67 g and 28.34 ± 1.08 g, respectively. As the second most variable trait, the plant total fresh biomass (CV = 63.59%) ranged from 2.10 g to 248.40 g, while the edible fresh biomass (CV = 61.08%) ranged between 1.20 g and 101.90 g per plant. The harvest index was 0.47 ± 0.01 on average with a range of 0.24-0.91. The spider plant genotypes flowered on average 60.14 ± 0.90 days after sowing, and days to 50% flowering ranged between 32 and 95 days after sowing. The plant height ranged from 13 cm to 117.5 [...] Table). East African genotypes, followed by the Southern African genotypes, outperformed West African and Asian genotypes in stem diameter, number of primary branches, petiole length, total fresh biomass and edible fresh biomass, and days to 50% flowering. The Southern African genotypes had longer central leaflets and broader leaves. In contrast, the West African genotypes had longer primary branches and higher dry matter content, whereas the Asian genotypes had broader central leaflets and a higher harvest index (Fig 1).

Variance components, heritability and genetic gain estimates of biomass and related traits
Significant genotypic variances ($\sigma^2_G$) were observed for all traits (across years, Table 4 and each year, S3 Table), while genotype × year interaction variances ($\sigma^2_{G \times Y}$) were significant for stem diameter, primary branch length, number of primary branches, leaf width and area, petiole length, harvest index and days to 50% flowering (Table 4). For all traits, genotype × year interaction variances were lower than genotypic variances ($\sigma^2_G$). The broad-sense heritability was high for all traits and ranged between 0.64 ± 0.09 (edible biomass per plant) and 0.87 ± 0.03 (petiole length) (Table 4). Genetic gains at 5% selection intensity were variable (Table 4, S3 Table). Estimates of genetic gains over the mean of the current population were low for dry matter content (13.75%) and high for primary branch length (117.36%). Specifically, substantial genetic gains (> 50%) were observed for the number of primary branches, leaf area, and total and edible fresh biomass. Variable genotypic and phenotypic coefficients of variation were observed for all fourteen traits. Dry matter content had low phenotypic and genotypic coefficients of variation (< 10%), while days to 50% flowering, central leaflet width and harvest index had medium phenotypic and genotypic coefficients of variation (ranging between 10 and 20%). Other traits displayed high phenotypic and genotypic coefficients of variation. In comparison, trends in error coefficients of variation for all traits were similar to those of phenotypic and genotypic coefficients of variation (Table 4).
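To make the link between variance components and heritability concrete, the sketch below computes an entry-mean broad-sense heritability across years. It assumes the commonly used form H² = σ²G / (σ²G + σ²G×Y / n + σ²e / (n r)), with n years and r replications; the function name and the variance components in the example are hypothetical and are not taken from Table 4.

```python
def broad_sense_heritability(var_g, var_gy, var_e, n_years, n_reps):
    """Entry-mean broad-sense heritability across years.

    Assumed form: H^2 = var_g / (var_g + var_gy / n + var_e / (n * r)).
    """
    phenotypic_variance = var_g + var_gy / n_years + var_e / (n_years * n_reps)
    return var_g / phenotypic_variance


# Hypothetical variance components for a trial with two years and two replicates.
h2 = broad_sense_heritability(var_g=6.0, var_gy=2.0, var_e=8.0, n_years=2, n_reps=2)
print(round(h2, 3))  # 0.667
```

The example shows why a genotype × year variance that is small relative to the genotypic variance keeps heritability high: the interaction and residual terms are divided by the number of years and replicates before entering the denominator.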

Association among plant biomass and related traits
Significant phenotypic and genotypic correlation coefficients were observed among the fourteen agronomic traits (Fig 2). While the phenotypic correlation coefficients ranged from -0.77 to 0.95, the genotypic correlation coefficients varied between -0.89 and 0.99. Similar trends were observed for the two types of correlation. For instance, a highly significant and positive correlation was observed between edible and total fresh biomass per plant at both phenotypic (r = 0.94, p < 0.001) and genotypic (r = 0.95, p < 0.001) levels (Fig 2). Total and edible biomass per plant had strong and positive correlations with plant height and stem diameter and positive and moderate correlations with all leaf-related traits (central leaflet length, central leaflet width, leaf width, petiole length and leaf area) and primary branch length. There were moderate to strong positive correlations among leaf traits, with leaf area being strongly and positively correlated with central leaflet length, central leaflet width and leaf width. Days to 50% flowering had moderate and positive correlations with the number of primary branches and petiole length but had a strong and negative correlation with the primary branch length and a moderate and negative correlation with dry matter content (Fig 2). The harvest index had negative and significant correlations with most traits, with strong correlations with stem diameter, plant height, and total fresh biomass. Additionally, the harvest index had moderate and negative correlations with edible plant biomass, dry matter content, primary branch length and leaf traits (central leaflet length, leaf width, and leaf area). Dry matter content had moderate and positive correlations with plant height, stem diameter, number of primary branches, total and edible fresh biomass per plant and leaf traits. The number of primary branches had a strong and negative correlation with primary branch length. 
A strong and positive correlation was observed between stem diameter and plant height. In addition, stem diameter and plant height had a moderate to strong positive correlation with leaf traits (Fig 2).
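The phenotypic correlations reported above are Pearson coefficients computed on line means. As a minimal illustration of the statistic itself, the following Python sketch computes Pearson's r from two lists of values; the function name and the example data are hypothetical and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical line means: total fresh biomass (g) and harvest index.
total_biomass = [20.0, 45.0, 70.0, 95.0, 120.0]
harvest_index = [0.70, 0.60, 0.50, 0.40, 0.30]
print(round(pearson_r(total_biomass, harvest_index), 2))  # -1.0
```

In this constructed example the harvest index falls linearly as biomass rises, so r is -1; real data would give values strictly between -1 and 1.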

Multivariate analysis of biomass and related traits in spider plant
To assess the relationship among genotypes, we first performed a principal component analysis. The first two components explained 72.43% of the total variation in the biomass and related traits and correlated with most traits (Fig 3A). Traits significantly associated with the first principal component (explaining 49.79% of the total variation) included stem diameter, plant height, leaf traits (central leaflet length, central leaflet width, leaf width, petiole length and leaf area), biomass (total and edible fresh biomass per plant) and harvest index. Principal component 1 was negatively correlated with harvest index but positively correlated with all other traits. Principal component 2 was positively and significantly associated with days to 50% flowering and the number of branches but negatively correlated with the primary branch length (Fig 3A).
Clustering pattern analysis using hierarchical clustering on principal components classified the lines into three clusters (Figs 3B and 4). A significant difference was observed among the clusters for all traits (Table 5). Cluster 1 (29.58% of all lines) encompassed mainly Asian lines (66% of all Asian lines) with some from other regions and was characterized by less vigorous plants, with a moderate number of short primary branches, low biomass productivity and dry matter content, relatively late flowering time, small leaves, and high harvest index (Table 5). Cluster 2 included mainly lines originating from West Africa (73.68% of all West African lines) and some from other regions. Genotypes in cluster 2 had high dry matter content, long primary branches, high biomass productivity, low number of primary branches, moderate vigor, medium leaf size and flowered early. Cluster 3, mainly composed of lines from East and Southern Africa (88.46% of all lines in the cluster), was characterized by late flowering and vigorous plants, a high number of short primary branches, high biomass productivity, broad leaves, moderate dry matter content and a low harvest index (Table 5).

Discussion
Genetic variation is the foundation of any plant breeding program. Significant and origin-driven variation has been reported in Gynandropsis gynandra for plant morphology [3,55], secondary metabolite concentrations [9], seed germination, mineral composition and morphology [39], leaf vitamin contents [3], antioxidant activity [12], and photosynthesis traits [56]. Morphological traits with significant variation were related to plant architecture (plant height, number of primary branches, plant habit, stem hairiness and colour), leaf size (leaf area, leaflet length and width, petiole length, leaflet shape), leaf colour, days to 50% flowering, germination (percentage and mean time), pod characteristics (pod length and width, number of seeds per pod), seed size (length, width, perimeter, area), 1000-seed weight, flower traits (androphore length, filament length, pedicel length, gynophore length), and biomass (total shoot fresh and dry weight, leaf fresh and dry weight) [3,4,39,55]. In addition, phenotypic differentiation among diverse accessions of G. gynandra was found to be associated with the genetic makeup of the genotypes [57,58]. While Omondi et al. [58] differentiated advanced lines and genebank accessions from farmer cultivars using simple sequence repeat (SSR) markers, Sogbohossou [57] observed genomic differentiation among accessions from West Africa, East/Southern Africa and Asia. Our study revealed that the lines maintained significant variation and membership in their group of origin after four generations of selfing, strengthening the hypothesis of a geographical signature in cleome genetic diversity. We observed highly significant variation among advanced lines for biomass productivity, growth traits, leaf traits and flowering time in Gynandropsis gynandra.
Similar observations for morphological traits have also been reported for worldwide accessions [3,55], East and Southern African accessions and cultivars [4], and accessions from South Africa and Kenya [54], Ghana [50], and Burkina-Faso [51]. This significant variation represents a valuable resource for sustainable and successful breeding programs for the species.
On the other hand, the average and the highest total fresh biomass in the present study were higher than those reported by Omondi et al. [4] in East-Southern African genotypes but slightly lower than those of Kiebre et al. [51] for accessions from Burkina-Faso. The difference might be attributable to the genotypes, agricultural practices, and environment since those authors evaluated their germplasm in the field. For instance, agronomic practices such as planting density, type and fertilizer application rates, planting date, stage of transplanting, harvesting frequency and techniques (cutting, uprooting whole plants, defoliation) significantly affect growth and biomass yield in G. gynandra [43][44][45][46][47][48][49]. Therefore, genotype performance should be investigated under different agricultural practices considering farmers' practices in target environments.
The clustering analysis identified three groups, each dominated by lines derived from accessions originating from different geographical regions. Clusters 1, 2 and 3 were dominated by lines derived from Asian, West African, and East/Southern African accessions, respectively. The clustering results were supported by the significant differences observed among regions of origin for all fourteen investigated traits. These results align with previous reports on the association between the geographical origin and the morphology of the accessions of G. gynandra [3,55]. Specifically, Sogbohossou et al. [3] identified three distinct groups similar to those of this study: East-Southern African accessions (tall plants with broad leaves), Asian accessions (short plants with broad leaves) and West African accessions (short plants with small leaves). Furthermore, the genetic constitution could be the main driver of this clustering, as Sogbohossou [57] reported genomic differentiation between Asian, West African and East/Southern African accessions. This clustering pattern might reflect the local adaptation of the species in response to environmental/climatic factors and different uses by local communities.
Farmers' preferred traits in G. gynandra include high leaf yield and related traits (plant height and the number of leaves), broad leaves and late flowering [32][59][60][61]. We observed that East and Southern African lines combined several farmers' preferred traits such as broad leaves, late flowering and high biomass, while West African genotypes had high biomass and dry matter content. Based on biomass productivity, East, Southern and West African genotypes were similar and outperformed the Asian accessions, which could reflect ancient domestication or more advanced selection for biomass in these regions than in Asia. Intensive utilization of the species as a leafy vegetable has been reported in Africa rather than Asia. In several Asian countries, the species was mainly reported as a weed, rarely cultivated [84,85] and primarily used in traditional medicine [19,22]. In contrast, although the species still grows as a weed, it is cultivated in many African countries for its leaves as a vegetable [29]. In Africa, the semi-cultivated status of G. gynandra was reported as early as the 1950s [86]. The domestication of the species might have first started in Eastern and Southern Africa, as its weed status was quickly converted to that of a cultivated species [87]. West African genotypes had biomass yields similar to those of the East and Southern African genotypes, suggesting West Africa as a secondary domestication hotspot for the species, while domestication and selection are still at an early stage or might not have started in Asia. Feodorova et al. [26] support these findings by suggesting that the speciation event of G. gynandra might have occurred in South Africa. Using genome sequencing, Sogbohossou [57] suggested the African origin of the species, with Asian and West African populations being close and recently diverged from East and Southern African populations.
More investigations are needed to clarify the origin of the species as well as its route of colonization.
Heritability is important in breeding, as it helps in predicting the efficiency of selection. Broad-sense heritability (H²) measures the proportion of the total phenotypic variation attributable to the variance of genetic values [88]. High broad-sense heritability estimates (> 0.60) were observed for all investigated traits, showing that phenotypic variation observed among genotypes is mostly due to genotypic variation. More importantly, we also observed low genotype × year interaction variance compared with genotypic variance. We therefore hypothesize that phenotypes can accurately predict genotypes, but this should be confirmed with multi-environmental trials. Similarly, high broad-sense heritability estimates were reported for stem diameter, plant height, number of primary branches, leaf biomass, leaf area, leaflet length and width, and days to 50% flowering in the species [54,89]. This suggests that high genetic advancement is achievable for biomass and related traits in the species. Accordingly, we observed substantial expected genetic gain at a selection intensity of 5%, showing that significant improvement would be possible through direct phenotypic selection, particularly for total fresh biomass, edible fresh biomass, the number of primary branches and leaf area. These findings concur with earlier reports in G. gynandra for biomass yield and related traits [89]. The low genetic gain observed for dry matter content suggests that selection for this trait might be difficult, as low variability was also observed. More genetic material is needed to broaden the available variability.
Genotype × year interaction variances were significant for stem diameter, primary branch length, number of primary branches, leaf width and area, petiole length, harvest index and days to 50% flowering. This shows that these traits were influenced not only by the genotype but also by the interaction between genotype and year. As agronomic practices were the same during the two years, differences in environmental conditions between 2020 and 2021 could explain the significance of the genotype × year interaction. Environmental factors that might influence these traits include temperature, relative humidity, light intensity and day length (photoperiod). Imbamba and Tieszen [90] found that the photosynthesis rate in spider plant increases with light intensity (from 200 to 2000 μmol m⁻² s⁻¹) and that 2000 μmol m⁻² s⁻¹, which is close to full sunlight, does not saturate photosynthesis in G. gynandra, as it is a C4 plant. Similarly, Kocacinar [91] observed an increase in the net photosynthetic rate and stomatal conductance with increasing light intensity. Specifically, the genotype × year interaction variance was highly significant for flowering time compared to other traits. This is attributable to the day length sensitivity of the species. Zorde et al. [92] observed significant variation in days to flowering between the greenhouse (10-182 days) and field (20-57 days) trials in Arusha due to differences in day length and light intensity. In fact, the plants in the field were grown under 11:52-12:17 hours of daylight compared to 14 hours in the greenhouse. The authors pointed out that light intensity may have contributed further, as differences in light intensity significantly affect flowering time and yield [93] and the field evaluation might have received more intense light.
Leaf temperature also significantly influences the rate of CO2 assimilation, and the species requires high temperatures (30-40 °C) to attain maximum photosynthesis, which plays a key role in its growth and biomass productivity. In addition, the year significantly affected some of these traits, implying that they might vary from year to year. The significant genotype × year interaction indicated that the genotypes' performance was not consistent across environments, and the interaction effect should be considered when selecting genotypes. However, evaluation in additional environments, particularly under field conditions, is required to better decipher the genotype-by-environment interaction in the species.
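As an illustration of what a genotype × year interaction means numerically, the sketch below computes interaction effects from a two-way table of genotype means, using the standard decomposition in which each interaction effect is the cell mean minus the genotype mean, minus the year mean, plus the grand mean. The line names and values are hypothetical and serve only to show a rank change between years; this is not the mixed-model analysis used in the study.

```python
def interaction_effects(means):
    """Genotype x year interaction effects from a table of cell means.

    means: dict mapping genotype name -> list of means, one per year.
    Effect for genotype i in year j: y_ij - y_i. - y_.j + y_..
    """
    genotypes = list(means)
    n_years = len(means[genotypes[0]])
    grand = sum(sum(v) for v in means.values()) / (len(genotypes) * n_years)
    g_mean = {g: sum(v) / n_years for g, v in means.items()}
    y_mean = [
        sum(means[g][j] for g in genotypes) / len(genotypes)
        for j in range(n_years)
    ]
    return {
        g: [means[g][j] - g_mean[g] - y_mean[j] + grand for j in range(n_years)]
        for g in genotypes
    }

# Hypothetical days to 50% flowering for two lines over two years.
flowering = {"line_A": [30.0, 34.0], "line_B": [40.0, 36.0]}
effects = interaction_effects(flowering)
for line, values in effects.items():
    print(line, values)
```

Non-zero effects of opposite sign, as in this toy table, indicate that the difference between lines changes from one year to the next, which is the pattern that makes selection based on a single environment unreliable.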
Understanding the association between traits offers an opportunity for efficient and simultaneous selection. Both phenotypic and genotypic correlations showed similar trends. In the present study, the correlation between total fresh biomass and edible fresh biomass was strong, positive and significant. In addition, these two traits were highly and positively correlated with plant height and stem diameter, suggesting that selection for vigorous and tall plants will lead to high-yielding cultivars. This might be accompanied by broad leaves, given the positive and moderate association between biomass and leaf-related traits (central leaflet length, central leaflet width, leaf width, petiole length, leaf area). Previous findings corroborate these results, as a positive and strong correlation of leaf biomass with plant height, stem diameter, leaf length and width and petiole length was reported [89]. Similarly, Kangai Munene et al. [54] and Mosenda et al. [53] observed a positive and strong association between the number of leaves per plant and plant height. Such a positive association between these traits implies that simultaneous and direct selection for such farmers' desired traits would be possible. This association could result from pleiotropic or linked genes controlling biomass, plant height, stem diameter, and leaf traits in the species. Using an F2 population, Sogbohossou [57] found a single QTL for plant height and two for leaf area; the plant height QTL and one QTL for leaf area were colocalized on the same linkage group, with potential pleiotropic effects of a candidate gene, although the author recommended validation of the QTLs.
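The kind of trait association discussed above can be sketched with a plain Pearson correlation computed on line means. The example below is a minimal illustration in Python; the trait values are hypothetical, and the phenotypic and genotypic correlations in this study would have been estimated from the full trial data rather than from a handful of means.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical means for five lines (illustration only).
total_fresh_biomass = [120.0, 150.0, 180.0, 210.0, 240.0]  # g per plant
plant_height = [55.0, 60.0, 72.0, 80.0, 91.0]              # cm
harvest_index = [0.62, 0.58, 0.55, 0.50, 0.47]             # ratio

r_height = pearson(total_fresh_biomass, plant_height)
r_harvest = pearson(total_fresh_biomass, harvest_index)
print(f"biomass vs plant height:  r = {r_height:.2f}")
print(f"biomass vs harvest index: r = {r_harvest:.2f}")
```

A strongly positive coefficient, as for plant height here, is the pattern that supports indirect or simultaneous selection, whereas a negative coefficient signals a trade-off that selection would need to balance.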
The number of primary branches was positively correlated with days to 50% flowering, suggesting that late flowering plants had more branches. In contrast, primary branch length had a negative and significant correlation with days to 50% flowering and number of branches, showing the existence of a trade-off between the number of primary branches, the primary branch length and days to 50% flowering in the species. After flowering, plants allocate resources to lateral branch growth. Therefore, a plant can achieve higher biomass either by flowering early and developing long branches or by delaying flowering to produce more branches. This might explain why West African genotypes had biomass yields similar to those of East/Southern African genotypes, which are late flowering with a high number of short branches. This calls for an in-depth investigation to understand resource allocation in the species and the genes involved in flowering time, branch development, and plant architecture. To this end, developing mapping populations using genotypes from all clusters will be insightful.
In this study, the harvest index was negatively associated with plant biomass and most other agronomic traits, suggesting that selection for the harvest index might be difficult. However, using appropriate agronomic practices, such as early harvesting, could help improve the harvest index. Frequent harvesting (e.g., every week or two weeks) might increase biomass productivity and extend the harvesting period. This would strongly depend on the regrowth ability of the genotype. An evaluation of the germplasm under different agronomic practices, including harvesting techniques and frequency, is required, as suggested by Houdegbe et al. [43]. Assessing the regrowth ability would be crucial, particularly in West Africa, where cutting is the harvesting technique most frequently employed by farmers and genotypes that tolerate several cuttings are desired [94]. In this case, the ability to predict yield for the subsequent harvest should be investigated through genetic correlation analysis.
Dry matter content is associated with shelf life and determines the vegetable's post-harvest behaviour [95-97]. The moderate and significant association of dry matter content with plant biomass, growth traits and leaf traits suggested that increasing the leaf area might not affect dry matter content in the species. In contrast, the negative association between days to 50% flowering and dry matter content showed that late flowering plants might have low dry matter content and reduced shelf life, suggesting plausible linkage drag between flowering time and dry matter accumulation in the species. Similarly, a negative correlation was observed between dry matter content and days to silking in maize for biogas production [98]. Such an association could be investigated using mapping populations developed between West and East/Southern African genotypes. In addition, the narrow genetic variation for dry matter content needs to be broadened through extensive germplasm collection, introduction and characterization.
Overall, considering farmers' preferred traits, genotypes in cluster 3 and, to a lesser extent, cluster 2 are good candidates for cultivar release and breeding programs. Superior genotypes with multiple improved traits included SA10, SA12, SA17 (edible biomass, stem diameter, number of primary branches, plant height, leaf area and dry matter content), EA12, EA3 (fresh and edible biomasses and days to flowering), AS14, AS16, WA17, WA3 (fresh biomass, dry matter content and primary branch length), EA7, SA2, and EA9 (days to 50% flowering and number of primary branches). An intensive field evaluation of these genotypes through multi-environment trials within each region would help in understanding the genotype-by-environment interaction in the species and in deciding whether to breed for specific or broad adaptation. Furthermore, establishing the link between phenotype and genotype is required to help implement marker-assisted selection in the species. Genome-wide association studies (GWAS) can be implemented to decipher genes associated with functional and farmers' preferred traits and would serve in the validation of the QTLs reported by Sogbohossou [57] for flowering time, plant height, and leaf area. The best genotypes from each cluster could be involved in studies to estimate the narrow-sense heritability and determine the gene action controlling the key traits using mating designs such as diallel and North Carolina designs. In addition, assessing the potential hybrid vigour in the species would help design efficient breeding strategies for ideal cultivar development. The association of these traits with nutritional traits also needs to be assessed. Evaluation of these genotypes under different disease and pest pressures and abiotic stresses is required, particularly in the current changing climate.

Conclusion
The present study revealed that the biomass potential of advanced lines of spider plant was associated with their geographical origin, thus strengthening the hypothesis of a geographical signature in cleome diversity. West and East/Southern African lines had higher biomass productivity than Asian lines, suggesting more advanced selection and domestication for biomass in Africa than in Asia. The significant genetic variation, high broad-sense heritability, genetic gain and positive correlation between plant biomass and related traits provide the opportunity for positive and simultaneous selection, especially for farmers' preferred traits such as biomass yield, leaf size, flowering time and the number of branches. The genotypes SA10, SA12, SA17, EA12, EA3, AS14, AS16, WA17, WA3, EA7, SA2 and EA9 are superior for multiple farmers' desired traits and are good candidates for breeding programs and cultivar release. Further studies should target multi-environment trials to determine the genotype-by-environment interaction effect, determine the genotypes' response to different agronomic practices such as cutting and fertilization considering the locally available resources, identify the gene action and genes controlling farmers' preferred traits, and evaluate the germplasm tolerance to biotic and abiotic stress. Additionally, the association of plant biomass and related traits with key nutritional traits such as minerals needs to be assessed to ensure the quality of the end products for users.