Germplasm of Brazilian winter squash (Cucurbita moschata D.) displays vast genetic variability, allowing identification of promising genotypes for agro-morphological traits

Winter squash fruits (Cucurbita moschata D.) are among the best sources of vitamin A precursors and constitute sources of bioactive components such as phenolic compounds and flavonoids. Approximately 70% of C. moschata seed oil is made up of unsaturated fatty acids, with high levels of monounsaturated fatty acids and components such as vitamin E and carotenoids, which represent a promising nutritional aspect in the production of this vegetable. C. moschata germplasm expresses high genetic variability, especially in Brazil. We assessed 91 C. moschata accessions, from different regions of Brazil, and maintained at the UFV Vegetable Germplasm Bank, to identify early-flowering accessions with high levels of carotenoids in the fruit pulp and high yields of seed and seed oil. Results showed that the accessions have high variability in the number and mass of seeds per fruit, number of accumulated degree-days for flowering, total carotenoid content, and fruit productivity, which allowed selection for considerable gains in these characteristics. Analysis of the correlation between these characteristics provided information that will assist in selection to improve this crop. Cluster analysis resulted in the formation of 16 groups, confirming the variability of the accessions. Per se analysis identified accessions BGH-6749, BGH-5639, and BGH-219 as those with the earliest flowering. Accessions BGH-5455A and BGH-5598A had the highest carotenoid content, with averages greater than 170.00 μg g-1 of fresh mass. With a productivity of 0.13 t ha-1, accessions BGH-5485A, BGH-4610A, and BGH-5472A were the most promising for seed oil production. These last two accessions corresponded to those with higher seed productivity, averaging 0.58 and 0.54 t ha-1, respectively. This study confirms the high potential of this germplasm for use in breeding for promotion of earlier flowering and increase in total carotenoid content of the fruit pulp and in seed and seed oil productivity.

agents of this crop [34], and for its improvement in terms of production [20] and nutritional aspects of fruits and seed oil [12,35]. The potential of this germplasm as a source of genes for the improvement of this crop, along with the possibility of elucidating the genetic mechanisms linked to important production parameters, justifies the continuation of studies on its assessment and use.

This study therefore aimed to: a) agro-morphologically assess some of the C. moschata accessions maintained by BGH-UFV, b) analyse the genetic relationships of the agro-morphological characteristics, and c) analyse the agro-morphological variability, with a view to identifying earlier-flowering genotypes, genotypes with high total levels of carotenoids in the fruit pulp, and those with high potential for seed and seed oil productivity.

Origin of germplasm and preparation of seedlings

In this study, we assessed 95 genotypes, which comprised 91 accessions of C. moschata maintained in the BGH-UFV, and four control genotypes (Table 1).

Seedlings were produced in a 72-cell expanded-polystyrene tray containing commercial substrate. Seedling transplantation and cultural treatments were carried out according to local recommendations for the cultivation of pumpkins [36].

Table 2. Descriptors involving agronomic aspects, the total carotenoid content of fruit pulp and yields of seed and seed oil used in the assessment of the C. moschata germplasm maintained by

Phase/organ Descriptors
Reproductive phase Degree-days accumulated for flowering (DDF).

The letters W, X and Z correspond to the incidence matrices of parameters b, a, and t, respectively, with the data vector y.

The correlations were analysed using a procedure known as correlation network, which allows, based on a specific function, the analysis of all relationships between the variables under study. This procedure also allows the direction and magnitude of the correlations to be distinguished. The direction is denoted by colours; a dark green colour is used for the lines that connect

The analysis of variability was carried out using both quantitative and qualitative information.

For quantitative data, the distance matrix between the genotypes was obtained from the BLUP estimates in the case of accessions and the BLUE estimates in the case of the controls, and was based on the negative average Euclidean distance, with data standardization. The matrix was obtained from negDistMat, a function of the APCluster package [42] implemented in the R program, version 3.5.1 (R Development Core Team, Vienna, AT). The distances d(x, y) between the accession pairs, exemplified here as any two accessions x = (x_1, ..., x_n) and y = (y_1, ..., y_n), were estimated based on the following equation:

in which v corresponds to the number of quantitative descriptors evaluated.
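The calculation described above can be illustrated with a minimal Python sketch. It assumes that the negative average Euclidean distance takes the usual form d(x, y) = -sqrt((1/v) * sum_i (x_i - y_i)^2), computed on descriptors standardised to zero mean and unit standard deviation; this form is an assumption and not necessarily the exact expression used by the authors or by negDistMat. The function names and the toy values are hypothetical.

```python
import math
from statistics import mean, stdev

def standardize(rows):
    """Scale each descriptor (column) to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(val - m) / s for val, m, s in zip(row, mus, sds)] for row in rows]

def neg_avg_euclidean(x, y):
    """Assumed form: minus the square root of the mean squared difference."""
    v = len(x)  # number of quantitative descriptors
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / v)

def neg_distance_matrix(rows):
    """Pairwise negative average Euclidean distances on standardized data."""
    z = standardize(rows)
    return [[neg_avg_euclidean(a, b) for b in z] for a in z]

# Hypothetical values for three accessions and two descriptors (not from the study)
toy = [[600.0, 12.0], [650.0, 15.0], [800.0, 30.0]]
S = neg_distance_matrix(toy)
```

In this sketch, values closer to zero indicate more similar accessions, which is the orientation expected by similarity-based clustering procedures.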

The distance matrix for the qualitative data was obtained using the arithmetic complement of the simple coincidence index. The variability analysis was performed from a single distance matrix, obtained from the sum of the distance matrices of the quantitative and qualitative data. For the sum, the matrices were standardised and each received an equal weight.
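A minimal Python sketch of this step is given below. The complement of the simple coincidence (simple matching) index follows the description above. The standardisation used before the sum is not specified in the text, so scaling each matrix by its largest entry is an assumption, as are the function names, the treatment of both matrices as non-negative dissimilarities, and the toy data.

```python
def simple_matching_dissimilarity(a, b):
    """Arithmetic complement (1 - index) of the simple coincidence index."""
    matches = sum(1 for p, q in zip(a, b) if p == q)
    return 1.0 - matches / len(a)

def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities between qualitative descriptor profiles."""
    return [[simple_matching_dissimilarity(a, b) for b in profiles] for a in profiles]

def scale_to_unit_max(matrix):
    """Assumed standardisation: divide by the largest entry (range 0 to 1)."""
    top = max(max(row) for row in matrix)
    return [[val / top for val in row] for row in matrix]

def combine_equal_weight(d_quant, d_qual):
    """Sum two standardised matrices, giving each an equal weight of 0.5."""
    a, b = scale_to_unit_max(d_quant), scale_to_unit_max(d_qual)
    return [[0.5 * p + 0.5 * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical qualitative profiles for three accessions (not from the study)
qual = [("round", "orange", "smooth"),
        ("round", "yellow", "smooth"),
        ("long", "yellow", "rough")]
d_qual = dissimilarity_matrix(qual)

# Hypothetical quantitative dissimilarities for the same three accessions
d_quant = [[0.0, 0.4, 2.0], [0.4, 0.0, 1.6], [2.0, 1.6, 0.0]]
D = combine_equal_weight(d_quant, d_qual)
```

Scaling both matrices to a common range before the sum prevents the matrix with the larger numerical spread from dominating the combined distances.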

The variability analysis was performed using the procedure known as the affinity propagation method [43]. The grouping was carried out from 100 independent rounds, aiming to assess the consistency of grouping.

The operation of Affinity initially involves the identification, in a set of components, of samples

In the analysis of the present study, the availability was initially established as zero.
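To make the message-passing idea concrete, the following is a minimal pure-Python sketch of affinity propagation with availabilities initialised at zero, as stated above. It is an illustration of the general method and not the APCluster implementation used in the study. The damping factor, the number of iterations, the use of the median similarity as the preference, and the toy data are all assumptions.

```python
from statistics import median

def affinity_propagation(S, damping=0.5, iterations=200):
    """Minimal affinity propagation on a similarity matrix S (list of lists).

    S[k][k] holds the preference of point k to act as an exemplar.
    Returns (exemplars, labels), where labels[i] is the exemplar of point i.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities, initialised at zero
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i, versus rivals
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: accumulated support for k from the other points
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(pos)
            for i in range(n):
                if i == k:
                    new = total - pos[k]
                else:
                    new = min(0.0, R[k][k] + total - pos[i] - pos[k])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels

# Hypothetical one-dimensional values forming two well-separated groups
values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
n = len(values)
S = [[-(a - b) ** 2 for b in values] for a in values]
# Assumed preference: median of the off-diagonal similarities
pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for k in range(n):
    S[k][k] = pref
exemplars, labels = affinity_propagation(S)
```

Repeating such a run several times and comparing the resulting labels is one simple way to assess the consistency of the grouping.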

The greatest contributions from the genotypic variance (σ²g) to the phenotypic (σ²p)

also of fresh pulp mass for PF and TC, respectively. The selection gains for PS and SOP were 0.187 and 0.072 t ha-1, respectively (Table 3).

Phenotypic amplitude for DDF between accessions was 120.0 to 820.4 and the average was 606.642 (Table 3). The amplitude for PF was 0.7 to 44.6 t ha-1 and the average was 12.946 t ha-1. The amplitude for TC was 43.4 to 187.2 µg g-1 of fresh pulp mass and the average was 65.763 µg g-1 of fresh pulp mass. The amplitude for PS was 0.01 to 0.9 t ha-1 and the average was 0.269 t ha-1. The phenotypic amplitude and average of the accessions for SOP were 0.004 to 0.40 t ha-1 and 0.050 t ha-1, respectively (Table 3). The greatest amplitude for the coefficient of genotypic variation (CVg%) was observed between the characteristics SOC and MF, while for the coefficient of phenotypic variation (CVp%), the greatest amplitude was observed between DDF and SOP. The estimates of the residual coefficient of variation ranged from 7.502 to 71.582 for TC and SOP, respectively (Table 3).

The visual pattern of the clustering in heatmap format showed low similarity between the groups formed, something denoted by the predominance of yellow and orange colouring (Figure 2).

Visual analysis of this clustering also shows homogeneity of the distances between groups,

In order to facilitate the visualization of clusters with the most desirable means for each characteristic, a grouping of means of clusters was performed by the Tocher method (Table 5).

(Table 6).

For PF, the selected accessions expressed averages higher than the general average of the accessions (12.95 t ha-1) and the average of the controls (11.85 t ha-1). The new predicted averages for this characteristic among the selected accessions ranged from 15.49 to 29.27 t ha-1.

As for TC, the selected accessions also expressed averages much higher than the general average

(Table 6).

The identification per se of the most promising accessions for seed productivity (PS), seed oil content (SOC) and seed oil productivity (SOP), together with their respective genetic gains and new predicted averages for these characteristics, is shown in Table 7.

As for PS, the new predicted averages for this trait among the selected accessions ranged from 0.33 to 0.58 t ha-1 and the genetic gains from 0.06 to 0.31 t ha-1. Notably, the accessions BGH-4610A, BGH-5485A, and BGH-6590 were the most promising for this characteristic (Table 7).
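The relationship between genetic gain and new predicted average can be checked with a short Python sketch. It assumes that the new predicted average equals the general average of the accessions plus the genetic gain; this relationship is an assumption, although it agrees with the PS values reported here (general average of 0.269 t ha-1). The function name is hypothetical.

```python
def predicted_average(general_average, genetic_gain):
    """Assumed relationship: new predicted average = general average + gain."""
    return general_average + genetic_gain

ps_general_average = 0.269  # t ha-1, general average of the accessions for PS

# Smallest and largest genetic gains reported for the selected accessions
lowest = predicted_average(ps_general_average, 0.06)
highest = predicted_average(ps_general_average, 0.31)

print(round(lowest, 2), round(highest, 2))
```

Rounded to two decimal places, the two results reproduce the reported range of new predicted averages for PS, 0.33 to 0.58 t ha-1.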

The selected accessions displayed small differences between them for SOC; however, their average was higher than that of the controls (16.73%). Finally, for SOP, the new predicted averages ranged from 0.12 to 0.13 t ha-1 and the genetic gains from 0.07 to 0.08 t ha-1. The accessions BGH-5485A, BGH-4610A, and BGH-5472A were the most promising for this characteristic (Table 7).

(Table 3).

The average ratio between the coefficient of genetic variation and the residual coefficient was close to one for most of the characteristics. Although the estimates of the residual coefficient of variation for most characteristics were high, in general they tended to be lower than their corresponding coefficient of genotypic variability, which demonstrates that most of the variability expressed by the germplasm was due to genetic factors (Table 3).

selection for a primary characteristic by means of a secondary one, it is necessary that the heritability of the latter characteristic be greater than that of the former so that the selection is efficient. In view of this, the selection of genotypes with higher MF seems to be a promising alternative for obtaining higher fruit productivity in C. moschata. It should, however, be highlighted that when selecting genotypes with the aim of increasing fruit productivity in C.

Based on these results, the simultaneous consideration of aspects such as higher NFP, higher PF and higher MS/F relationship seems to be a promising alternative for obtaining higher seed productivity in C. moschata. The heritability estimates obtained for these characteristics (>0.42) suggest the feasibility of reasonable gains with the selection for each one of them (Table 3).

The accessions of C. moschata assessed in this study displayed high genetic variability in their agro-morphological characteristics, the total carotenoid content of the fruit pulp, and the productivity of seeds and seed oil, resulting in the formation of 16 clusters (Table 4). There was low similarity between the clusters formed, as shown by the predominance of yellow colour in the hierarchical clustering in heatmap format (Figure 2). The visual analysis of this cluster also indicates the homogeneity of the genetic distances between clusters, which verified the clustering efficiency. As can also be seen in Figure 2, there was uniformity in the yellow colour for the genetic distances between groups, confirming the homogeneity of distances between them.

The analysis of averages of the groups using the Tocher method (Table 5)

accessions, respectively, making them the next largest groups formed.

Regarding TC, the highest average was in group 7, formed by the accessions BGH-5455A and BGH-5598A (Table 4). These accessions were also identified as the most promising for TC in the identification per se, with new predicted averages greater than 170 μg g-1 of fresh pulp mass.

Regarding PS and SOP, the main interest in the assessment of these traits in C. moschata corresponds to the high potential of using oil from its seeds for food purposes. This vegetable has a high oil content, with the lipid fraction of its seeds reaching up to 49% of its composition [78]. The lipid profile of this oil consists of more than 70% unsaturated fatty acids, with a preponderance of fatty acids such as linoleic C18:2 (Δ9,12) and oleic C18:1 (Δ9).

In this regard, Group 16, formed solely by the control Tetsukabuto, displayed the lowest average DDF (Table 5), indicating that this genotype has the earliest flowering period. As can also be seen in this

and PF (Figure 1). The accessions BGH-4681A and BGH-5653 were also identified as the most promising for PF in the per se identification, with averages above 20 t ha-1 (Table 6). These averages were much higher than the world average, estimated at 13.4 t ha-1 [8].

Although the cultivation of C. moschata is primarily intended for fruit production, as already mentioned, the selection of genotypes for greater fruit productivity in this crop must also consider crucial aspects for the acceptability of fruits such as shape and size. In general, winter squash production must currently prioritise the adoption of cultivars with smaller fruits. In

and BGH-6587A, which expressed gains and new predicted averages for fruit productivity above 8 and 20 t ha-1, respectively (Table 6). As can also be seen in this table, these accessions displayed gains and new predicted averages much higher than those of the controls. It should be highlighted that the BGH-5544A accession also expressed high averages for PS and SOP, corroborating the correlations of these characteristics with PF (Figure 1). This indicates the potential for the dual use of this accession to produce fruit and seed oil.

These accessions expressed gains and new predicted averages for TC higher than 108.03 and 173.80 μg g-1 of fresh pulp mass, respectively, much higher than those of the controls. For the characteristics of seed and seed oil, it was found that the accessions BGH-4610A, BGH-5485A, and BGH-6590 were the most promising for PS (Table 7). These accessions expressed gains and new predicted averages for seed productivity of up to 0.31 and 0.58 t ha-1, respectively. The most promising accessions for SOP were BGH-5485A, BGH-4610A, and BGH-5472A, which expressed new predicted averages for seed oil productivity of 0.13 t ha-1. It is worth highlighting that these accessions corresponded to those with higher PS, corroborating the strong correlation between PS and SOP (Figure 1).

The accessions of C. moschata assessed in this study expressed high genetic variability for agro-morphological characteristics and for agronomic aspects related to the production of seeds such as NSF and MSF, for DDF, and for TC and PF, which allowed considerable gains from selection for each of these characteristics.

The network of genetic correlations showed that higher fruit productivity in C. moschata might be achieved from the selection of aspects considered crucial in the production of this crop such as higher NFP, HF and DF. It also showed that greater seed productivity might be achieved with the selection for higher MS/F, NSF and MSF, information that will assist in selection for higher productivity of fruit, seed and seed oil.

The clustering analysis resulted in the formation of 16 groups, with low similarity between the groups, which corroborates the variability of these accessions.

The grouping of the averages of the clusters and the identification per se allowed the recognition of the most promising groups and accessions for each characteristic, an approach that will guide the use of these accessions in breeding programs.