Agro-morphological and molecular diversity in different maturity groups of Indian cauliflower (Brassica oleracea var. botrytis L.)

The present study analysed the molecular and agro-morphological diversity in a set of 92 diverse cauliflower genotypes and two each of cabbage and broccoli. The genotypes were evaluated in the field in a randomized block design (RBD) at two locations (IARI, New Delhi and ICAR-RC-NEH Region, Barapani) during Rabi 2019-20. Genotypes varied for all eight observed traits at both locations, and the differences between the early and snowball groups were distinct. Pusa Meghna, DC-33-8, Pusa Kartiki and CC-14 were the earliest for curd initiation. Genotypes showed higher values for curd traits at Delhi. Molecular diversity was detected with 90 polymorphic simple sequence repeat (SSR) markers. The number of alleles ranged from 1 to 9 with a mean of 2.16, and the polymorphic information content (PIC) was highest for primer BoGMS0742 (0.68), with a mean of 0.18. Cluster analysis using agro-morphological traits substantiated the classification of the genotypes into maturity groups. However, SSR analysis revealed four clusters with a composite pattern of genotype distribution. STRUCTURE analysis also supported admixture and four subpopulations. The study indicates introgression of genetic fragments across the maturity groups and, thereby, potential for use in further genetic improvement and heterosis breeding.


Introduction
Cauliflower (Brassica oleracea var. botrytis L., 2n = 2X = 18) is one of the most important vegetable crops grown worldwide. The genus Brassica comprises around 40 species, including Brassica oleracea. B. oleracea represents the popular group of 'Cole' crops or 'Cole' vegetables. Cauliflower is one of the important crops of this group and is grown across the world on 1.42 million ha, with an annual production of 26.90 million tonnes. Of this, China (40.5%) and India (33.2%) hold the major share [1]. Cauliflower contains glucosinolates, which are responsible for its health-protective properties and sensory attributes such as pungency, aroma and flavour. The economic portion of cauliflower is a prefloral fleshy apical meristem commonly known as the 'curd' [2], which is eaten as a vegetable, as pickle or in various regional culinary recipes. The curd contributes nearly 45% of the gross plant weight [3]. Cauliflower is a thermo-sensitive crop, and temperature plays a key role in regulating curd initiation and development through a group of major genes and modifiers [4]. Cauliflower originated in the Mediterranean region and was introduced into different parts of the world by traders and botanists. Genetic changes took place in the introduced genotypes for adaptive traits such as plant type, curding traits and flowering behaviour, which could have helped in the evolution of eight regional morphotypes of cauliflower [5]. These groups are recognised as Italians or Original (Mediterranean), Cornish (England), Northerns (England), Roscoff (France), Angers (France), Erfurt and Snowball (Germany and Netherlands), and Indian cauliflower (Northern India). Further, cauliflowers are also classified according to their phylogeny as Italian, North-West European biennials, North European annuals, Asian and Australian, highlighting the significance of regional factors and growing habit [6].
Cauliflower is an introduced crop in India and has been categorised into two broad groups, namely (i) European (late or snowball) and (ii) Indian (tropical) types, depending upon their temperature requirement for the curding and reproductive phases. Typical Indian cauliflower (groups 1a and 1b, and group 2) forms curd at a higher temperature (16-27˚C) than the snowball group (10-16˚C), and an intermediate group designated as mid-late (group 3) requires 12 to 16˚C for curd initiation and development [5]. Further, Indian type cauliflower flowers and sets seed profusely in the northern plains during the winter season, while snowball cauliflower does not bolt or set seed in the plains because of its prolonged low temperature requirement. More precisely, Indian cauliflower is grouped into early (group 1), mid-early (group 2) and mid-late (group 3) on the basis of its specific temperature requirement for curd initiation and development: 20-27˚C, 16-20˚C and 12-16˚C, respectively [7]. Deviation from the demarcated temperature range leads to loss in yield and quality, besides the occurrence of various disorders such as bracting, yellowing and loose curds (at higher temperature) and buttoning, fuzziness, riciness and pink colouration (at lower temperature) [8].
The tropical type germplasm of Indian cauliflower has a strong phenotypic affinity with the European Cornish type for vigorous plant type, open growth habit, long stalk and leaves, loose, irregularly shaped cream and yellow curds, and strong curd flavour [5]. However, a few leaf and curd traits, such as the presence of protective jacket leaves and good curds, are also similar to those of Roscoff and Italian cauliflower [9], which was due to planned crossing programmes. These intentional crossing attempts created a new set of germplasm in Indian cauliflower with desirable curd features and semi-erect plant stature [10]. This is reflected well in its present-day cultivars, which have enhanced curd size, white colour and strong compactness [8,11]. Further, a lot of exotic germplasm and locally bred inter-group progenies diversified the Indian cauliflower germplasm for use in heterosis and stress resistance breeding [12]. The shift in behaviour of the genetically governed adaptive and consumer traits was due to changes in the combination of major and minor genes during the evolution of new genotypes. Researchers have reported a good extent of diversity using morphological traits in snowball cauliflower [13][14][15] and in different maturity groups of tropical cauliflower, such as the early [16] and mid group [17]. However, these traits are sensitive to environmental conditions, so such observations are not adequate to establish the level of variation in the genotypes for breeding use. Further, these studies were done at a specific location and with a limited set of germplasm, and none of them used a composite set of genotypes from all the maturity groups at diverse locations. Notably, understanding the level of diversity at the molecular level and corroborating this information with prominent agro-morphological traits will be of immediate use to breeders, since the polymorphism revealed among genotypes by DNA markers is independent of environmental factors.
For molecular diversity analysis, simple sequence repeats (SSRs) are of great use, since they are robust, reliable and codominant DNA markers [18]. They are abundant in the Brassica oleracea cytodeme and its related species [19], with a high extent of cross-transferability in the Brassica group [20]. These markers have been used in diversity analysis in different pools of Brassica oleracea [21][22][23] and also in linkage studies [24]. Ram et al. [25] reported the effectiveness of SSRs in diversity analysis and genotype identification of snowball cauliflower. Therefore, the present study was planned to assess molecular diversity using SSR markers and also to observe the agro-morphological variation in the genotypes at two diverse locations.

Plant material and growing environment
The experimental material comprised a total of ninety-six (96) genotypes, of which 92 were cauliflower, 2 cabbage and 2 broccoli. The cauliflower genotypes were from the core set of germplasm representative of all four maturity groups, namely the early (37), mid-early (25), mid-late (15) and late/snowball (13) groups. The genotypes of the first three groups are developed and maintained by IARI, New Delhi, while the snowball group genotypes were obtained from IARI Regional Station, Katrain, Himachal Pradesh, India, for use as a reference. Besides, two specialty type cauliflowers, 'GPMT-1' (green mustard type leaves) and 'Orange type' (orange curd), along with two each of broccoli ('DC-Brocco-13' and 'Delhi Purple Broccoli-1') and tropical cabbage ('PA-1' and 'PA-2'), were also included in the molecular analysis. The 'Orange type' cauliflower genotype was not shared with Barapani centre due to lack of permission. Seedlings were raised on nursery beds and transplanted (35 days old) at a spacing of 60 × 45 cm (30 plants per plot) in a randomized complete block design (RBD) with three replications at IARI, New Delhi (28˚35' N, 77˚12' E, 228.6 m above mean sea level) and ICAR Research Complex for NEH Region, Barapani, Meghalaya (25˚41' N, 91˚55' E, 960 m above mean sea level) during 2019-20, under plain and mid-hill conditions, respectively. The Delhi location had a wider temperature range (3-36˚C) than the Barapani location (5-27˚C). Similarly, the soil pH of the Delhi site was higher (6.8) than that of Barapani (5.2). Standard crop practices were followed at both locations [7]. Days to 50% curd initiation (DCI), days to 50% curd maturity (DCH), number of leaves per plant, gross plant weight (g), and the curd traits, namely curd length or polar diameter (cm), curd width or equatorial diameter (cm), marketable curd weight (g) and net curd weight (g), were recorded from five random plants in each plot. The curd traits were observed as per the procedure described in [26,27].

DNA extraction and SSR analysis
Genomic DNA was extracted from fresh leaf samples collected from healthy, young, field-grown plants at the Delhi site using the standard CTAB protocol [28]. The DNA was quantified by running it on a 0.8% agarose gel. Additionally, the quality and quantity of the genomic DNA were analysed with a Nanodrop spectrophotometer (Eppendorf), and the DNA was diluted with TE buffer to yield a working stock with a concentration of 25-30 ng/μl.
A set of 100 primers from the Brassica group were screened for polymerase chain reaction (PCR) amplification. The sequence information of the 90 SSRs which generated polymorphic amplicons is given in S1 Table. All the SSRs were amplified by PCR in a 10 μl volume containing 50 ng genomic DNA, 1.0 U Taq DNA polymerase (HiMedia Laboratories, Mumbai, India), 1× PCR assay buffer with 1.5 mM MgCl2, 10 pmol of each primer (forward and reverse) and 100 μM of dNTPs mix (Thermo Scientific). All the primers were amplified using touchdown PCR in an Eppendorf Mastercycler with the following cycling programme: initial denaturation at 94˚C for 5 min, followed by 30 cycles of denaturation at 94˚C for 1 min, primer annealing at 55-65˚C for 1 min (varied with primer) and primer extension at 72˚C for 2 min, and a final extension at 72˚C for 10 min. The programme was set to hold the samples at 4˚C until they were collected and stored at -20˚C.
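The dilution to a working concentration follows the usual C1 × V1 = C2 × V2 relation. The sketch below is only illustrative: the function name, the stock concentration of 250 ng/μl and the final volume of 100 μl are assumptions and are not values reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, buffer_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. All argument values are supplied by the user;
    nothing here is taken from the study itself.
    """
    if stock_ng_per_ul <= 0 or target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds the stock concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    buffer_ul = final_volume_ul - stock_ul
    return stock_ul, buffer_ul


# Hypothetical example: a 250 ng/ul stock diluted to 25 ng/ul in 100 ul.
stock_ul, te_ul = dilution_volumes(250, 25, 100)
print(f"{stock_ul:.1f} ul DNA stock + {te_ul:.1f} ul TE buffer")
```

With these assumed inputs, 10 μl of stock would be mixed with 90 μl of TE buffer.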

Electrophoresis and fragment detection
The amplified PCR products were mixed with 1 μl of 1X loading dye (Bromophenol blue) and resolved on 3% agarose gels in 1X TAE buffer and stained with ethidium bromide. The bands were visualised under UV light in a gel documentation unit (Alpha Imager, Cell biosciences, Santa Clara, CA).
Amplicons were scored manually against a 50 bp ladder. The data matrix was made using amplicon size. Power Marker v3.25 software was used for analysis of polymorphism information content (PIC), major allele frequency and cluster analysis [29]. The similarity index was calculated using Nei's formula [30]. The UPGMA (unweighted pair group method with arithmetic mean) procedure of the NTSYS software version 2.02i [31] was used to generate the corresponding dendrogram. Population structure analysis was done with STRUCTURE version 2.3.4. Models were tested for K-values ranging from 2 to 10 with 10 independent runs each and 100,000 Markov chain Monte Carlo (MCMC) iterations. The most likely number of clusters was chosen by plotting the Ln P(D) values against ΔK values, with the best K-value selected according to the Evanno test [32] using Structure Harvester (http://taylor0.biology.ucla.edu/structureHarvester/).
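The Evanno criterion used for choosing K can be sketched as follows. This is a minimal illustration, not the Structure Harvester implementation: it takes ΔK as the absolute second-order difference of the mean Ln P(D) across K divided by the standard deviation of Ln P(D) among replicate runs at that K. The function name and the Ln P(D) values in the example are invented for demonstration and are not results of this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Evanno's delta K from replicate Ln P(D) values.

    lnp_by_k maps each tested K to a list of Ln P(D) values, one per run.
    delta K is undefined for the smallest and largest K tested, and for any
    K whose replicate runs have zero standard deviation.
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in avg or (k + 1) not in avg:
            continue  # needs both neighbours for the second difference
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # division by zero: delta K undefined
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        delta[k] = second_diff / sd
    return delta


# Invented Ln P(D) values (two runs per K) purely for illustration.
example_runs = {
    2: [-5000.0, -5010.0],
    3: [-4800.0, -4820.0],
    4: [-4500.0, -4502.0],
    5: [-4490.0, -4510.0],
    6: [-4480.0, -4520.0],
}
delta_k = evanno_delta_k(example_runs)
best_k = max(delta_k, key=delta_k.get)
print(best_k, delta_k)
```

In this made-up data set the largest ΔK falls at K = 4, because the likelihood rises steeply up to K = 4 and then plateaus while run-to-run variance stays small.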

Statistical analysis
The data on agro-morphological traits were subjected to analysis of variance (ANOVA) using the online OPSTAT software [33] (http://14.139.232.166/opstat/). The mean and standard deviation of the observations were calculated using Microsoft Excel 2019. DARwin 6 software was used for diversity analysis and for generating the dendrogram of the cauliflower genotypes.
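For readers unfamiliar with the RBD analysis, the partitioning of variance for one trait at one location can be sketched as below. This is a simplified illustration and not the OPSTAT procedure: it omits the pooled analysis over locations and the p-value, and the function name and the small data set are assumptions made only for demonstration.

```python
def rbd_anova(data):
    """One-trait ANOVA for a randomized block design.

    data[i][j] is the plot mean of genotype i in replication (block) j.
    Returns sums of squares, degrees of freedom and the F ratio for genotypes.
    """
    t = len(data)      # number of genotypes (treatments)
    r = len(data[0])   # number of replications (blocks)
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / (t * r)

    ss_total = sum(x * x for row in data for x in row) - correction
    ss_genotype = sum(sum(row) ** 2 for row in data) / r - correction
    block_totals = [sum(data[i][j] for i in range(t)) for j in range(r)]
    ss_block = sum(b * b for b in block_totals) / t - correction
    ss_error = ss_total - ss_genotype - ss_block

    df_genotype, df_block = t - 1, r - 1
    df_error = df_genotype * df_block
    ms_genotype = ss_genotype / df_genotype
    ms_error = ss_error / df_error
    return {
        "ss_genotype": ss_genotype,
        "ss_block": ss_block,
        "ss_error": ss_error,
        "df_genotype": df_genotype,
        "df_block": df_block,
        "df_error": df_error,
        "f_genotype": ms_genotype / ms_error,
    }


# Invented values: 3 genotypes x 3 replications.
example = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 22.0, 21.0],
]
print(rbd_anova(example))
```

The F ratio for genotypes would then be compared with the tabulated F value at (t − 1) and (t − 1)(r − 1) degrees of freedom to judge significance.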

Agro-morphological diversity
Significant variations were observed between and within the maturity groups of cauliflower for all eight observed agro-morphological traits (Table 1). A location effect was evident on the observed traits in the 91 common genotypes. Overall, the mean values of the traits of each maturity group were significantly higher at the Delhi location than at Barapani centre (Fig 1). The DCI (days to 50% curd initiation) of the cauliflower genotypes ranged from 48.0 to 93.3 days at Delhi and from 16.7 to 123.0 days at Barapani (Table 1). The highest gross plant weight was recorded in KT-22 (2883.3 g) and the minimum in P-903 (475.0 g). The descending order was DC-310>BR-2>DC-310-22>KT-25>KT-17. Gross plant weight was significantly lower at Barapani centre, where it ranged from 129.2 g (DC-3025-5) to 1210 g (Pusa Snowball K-1). The descending order was Pusa Snowball K-1>KT-22>KT-6>KT-20>KT-2. Marketable curd weight ranged from 208.3 g (P-903) to 1416 g (BR-2) with a grand mean of 690.7 g at the Delhi location. In Barapani condition, it ranged from   Table 2). The genotypic effects were significant for all the traits, indicating considerable divergence in the cauliflower germplasm. Location influenced all the traits under study, and the genotype × location effects were also significant for all eight traits.
Clustering of the 91 genotypes with DARwin 6 software, using the observations on eight agro-morphological traits from the Delhi condition, revealed two main clusters (Fig 2A). Each cluster had two sub-clusters. Sub-cluster 1a had a majority of genotypes from the early and mid-early groups of cauliflower, while sub-cluster 1b consisted of only one genotype, DC 903. In cluster 2, most of the genotypes were of the mid-late and snowball groups, along with some genotypes of the mid-early group such as DC-308, DC-309, DC-CCM � HR, DC-310, DC-310-22 and PCF-373. In the Barapani condition, there were two distinct clusters (Fig 2B). Of these, cluster 1 consisted of all the early, mid-early and mid-late cauliflower along with one genotype from snowball group

Molecular diversity
Molecular diversity was assessed in 96 diverse genotypes, including 91 of white cauliflower, one of orange cauliflower and two each of broccoli and tropical cabbage. For this, 232 genomic SSR primers from the Brassica oleracea group were screened with the 96 genotypes. Among them, 95 showed good amplification in almost all genotypes. Of them, 59 were polymorphic. The amplification of selected SSR markers is shown in Fig 3A-3C. Information on the different polymorphic primers, with their major allele frequency, number of genotypes amplified, number of alleles, gene diversity and polymorphic information content (PIC) value, is presented in Table 3. The number of bands amplified by the 90 primers ranged from 1 to 9, with a mean of 2.16.
The highest number of alleles (9) was generated by BoGMS0742, followed by BOSF1004 (7), O10B02 (5) and OI10D03 (5). Four alleles were observed for the FITO348, BOESSR371, BOSF1613, OI13C12 and OI12F02 markers. Eighteen primers amplified 3 alleles, 31 primers amplified 2 alleles and 32 primers amplified only one allele. Two gene-based STS markers, Myb28A09 and Myb28B1, linked to glucosinolate content in Brassica juncea [34,35], were also analysed in the germplasm; the Myb28B1 marker was found to be monomorphic, while Myb28A09 amplified one band in 92 genotypes.
The polymorphic information content (PIC) was highest for BoGMS0742 (0.68) followed by OI10D03 (0.56) and BoGMS (0.    population consisting of four sub-populations. These subpopulation groups were denoted as G1, G2, G3 and G4 and comprised 24, 25, 34 and 13 genotypes, respectively (Fig 5). Almost 52 genotypes, including 17 of G1, 9 of G2, 20 of G3 and 6 of G4, showed no admixture. Almost all subpopulations contained genotypes from different groups of curd maturity. These results indicate that the grouping of genotypes on the basis of SSR markers was inconsistent with the traditional curd maturity-based grouping of the genotypes (r² = 0.00031). However, main cluster II had all the genotypes from the late/snowball group. Sub-clusters in main cluster III also had genotypes predominantly from the mid-early group, namely DC 402, DC 309, DC 385, DC 310 and DC 321. Similarly, the mid-early genotypes DC 522, DC 383, DC 325 and DC 308 also made a single sub-cluster. Two of the four newly developed genotypes from open market materials (DC 3030-2 and DC-3003-1) formed part of one sub-cluster. The early maturity group genotypes DC 23000 (Pusa Kartiki), DC 754, 30-Early and DC 351aa also clustered in the same group. However, most of the sub-clusters had genotypes from different maturity groups.

Agro-morphological diversity
Cauliflower is a thermo-sensitive crop, and four groups of cauliflower are recognised on the basis of the temperature requirement for curding traits, namely early (20-27˚C), mid-early (16-20˚C), mid-late (12-16˚C) and late or snowball (10-16˚C) [15]. Small curds (200-600 g) are associated with the early group and large curds (>1000 g) with the late or snowball group [8], owing to the growing days and temperature during the curd initiation and development stages. Genotypes of different maturity groups varied for phenological traits, namely curd initiation, curd maturity and curd size, at the two locations, which demonstrated the influence of temperature on the expression of these traits. The climate of Barapani centre was cooler and damper than that of Delhi, and cool temperature and high relative humidity favour early curd initiation. Curd maturity is positively correlated with curd traits, indicating that a prolonged growth period increases curd weight. Accurate identification and characterization of germplasm is important for cultivar development and the protection of breeders' rights [36]. In the Delhi and Barapani conditions, only two main clusters were observed from the eight agro-morphological traits. An earlier study [13] examined 52 genotypes of snowball cauliflower and reported 10 clusters. Tonguç and Griffiths [24] studied diversity in 45 genotypes of cauliflower from the early (3), mid-early (7), mid-late (8) and snowball (7) groups, besides 20 collections from other countries, namely China (6), Russia (5) and Netherlands (8). They used only a few genotypes from the early and mid groups, which were not sufficient to depict the available diversity in Indian cauliflower. The present study analysed a large set of diverse germplasm including the early (37), mid-early (25), mid-late (15) and snowball (13) groups of cauliflower. Morphological variation in the genotypes was in line with previous reports on Indian cauliflower [16,17,37,38], Irish cauliflower collections [39] and snowball cauliflower [13,14].
We found some promising genotypes at both locations for marketable curd weight namely CC-13, Pusa Meghna and DC-903 in early groups, DC-BR-36, DC-476, BR-2 and DC-18-19 in mid-late group, KT-6, KT-20, PSBK-1 and KT-2 in late or snowball group. Of them, CC-13 is a self-incompatible line hence can be of immediate use for hybrid breeding.

Molecular diversity
Molecular markers are useful tools to estimate the extent of genetic diversity present in germplasm. SSR markers are still widely used and preferred over other marker systems such as RAPD, RFLP and AFLP for their procedural requirements, robustness and reproducibility [40]. SSR markers have been well employed in understanding genetic variation in Brassica species for diversity studies [20,23,41,42]. Earlier, RAPD, ISSR and SSR markers were used in cauliflower for characterization of self-incompatible lines [43], genetic diversity analysis [44] and linkage analysis with downy mildew [45,46] and black rot [47]. SSRs with high polymorphism are useful for parent selection, mapping of specific traits and their introgression. In the present study, 90 polymorphic markers revealed a good amount of diversity in 92 genotypes of cauliflower and two each of broccoli and cabbage. However, these SSRs could not separate broccoli and cabbage from the cauliflower lines. This might be because cabbage, cauliflower and broccoli evolved from a common source, Brassica sylvestris, with major and minor mutations causing physiological arrests at different stages of growth and development. Further, the SSRs used in the present study were not specific to the chromosome regions which demarcate the differences among these three crops. Similar results were also presented by Lowe et al. [48] while explaining the use of SSRs in Brassica species. Slipped-strand mispairing and the occurrence of insertions and deletions during the evolution of these crops might have contributed to such morphological changes [41], and the SSRs used in the present study were not targeted at these mutations.
The SSR-based dendrogram revealed that (i) all four main clusters and most of their sub-clusters had a mixed set of genotypes from three or more maturity groups; (ii) sub-cluster IIa and a few nodes in sub-cluster IVa had genotypes particular to one maturity group, namely sub-cluster IIa (KT-17, KT-13-01, KT-25 and KT-22 of the late group) and three nodes of sub-cluster IVa (one: DC-321, DC-310, DC-385 and DC-309; two: DC-402, CCM � HR and DC-3023-2; three: DC-522, DC-383, DC-325 and DC-308); (iii) in one sub-cluster, the genotypes of the mid-late and late groups were placed together; and (iv) the genotypes of broccoli and cabbage were grouped along with cauliflower and did not form separate clusters. Similar results were earlier reported by Vanlalneihi et al. [44] while studying genetic diversity using 26 SSR markers in 48 genotypes of three maturity groups of Indian cauliflower. The SSR markers could corroborate the agro-morphological grouping to some extent, with respect to Indian and snowball cauliflower only. However, the distinction was not as clear as expected, which could be due to the limited extent of variation at the genetic level: most growth and developmental traits were common to both groups, although the level and intensity of their expression varied.
The polymorphic information content (PIC) value of a primer is an important indicator of the level of polymorphism it offers for molecular studies. A PIC value > 0.5 indicates high polymorphism, 0.25-0.5 moderate polymorphism and < 0.25 low polymorphism [49]. The PIC values in the present investigation indicated a low to moderate level of diversity. Only three markers, namely BoGMS0742, OI10D03 and BoGMS0162, had a PIC value higher than 0.50 and can be considered highly polymorphic, while 31 SSRs had PIC values in the moderate range. The observations agree with the earlier report of an average PIC value of 0.316 by Zhu et al. [22], who studied diversity in 165 cauliflower inbred lines primarily derived from southeast China. However, our values were lower than the PIC value of 0.571 reported by El-Esawi et al. [21] in a collection from Ireland, which could be due to their use of only 12 SSRs and 25 genotypes from a diverse pool.
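The PIC of a marker and the classification above can be illustrated with a short sketch. It assumes the commonly used Botstein-type estimator, PIC = 1 − Σp_i² − ΣΣ_{i<j} 2p_i²p_j², where p_i is the frequency of allele i; the function names and the example frequencies are assumptions for demonstration and are not taken from Table 3.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    total = sum(freqs)
    return 1.0 - sum((f / total) ** 2 for f in freqs)


def pic(freqs):
    """Polymorphic information content from allele frequencies (or counts)."""
    total = sum(freqs)
    p = [f / total for f in freqs]  # normalise so that counts also work
    homozygosity = sum(x * x for x in p)
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - homozygosity - cross_term


def classify_pic(value):
    """Apply the thresholds cited in the text: >0.5, 0.25-0.5, <0.25."""
    if value > 0.5:
        return "high"
    if value >= 0.25:
        return "moderate"
    return "low"


# Hypothetical markers: two equally frequent alleles, and a monomorphic one.
for name, freqs in {"marker_A": [0.5, 0.5], "marker_B": [1.0]}.items():
    value = pic(freqs)
    print(name, round(value, 3), classify_pic(value))
```

A marker with a single allele has a PIC of zero, which helps explain why a marker set containing many one-allele primers yields a low mean PIC even when a few markers are highly polymorphic.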
We observed a good extent of diversity, which was reflected in the grouping of the cauliflower genotypes; the results do not establish that the extent of diversity in the investigated genotypes was narrow. Tonguç and Griffiths [24] demonstrated the least SSR diversity in cauliflower, probably because they used genotypes of a narrow gene pool. Astarini et al. [50] reported diversity in cauliflower genotypes from Lembang in Western Java and showed their relatedness with current cultivars from India and Australia.
Therefore, it is suggested that a greater number of markers, well distributed over all nine chromosomes, be used to make a conclusive study of the extent of diversity in the cauliflower germplasm in India. Overall, the effectiveness of SSR markers in assessing the extent of diversity in cauliflower agreed with the earlier reports of Astarini et al. [50], Plieske and Struss [51] and Li et al. [52].
The SSR marker-based subpopulation structure of cauliflower is not consistent with the agro-morphological groups. STRUCTURE analysis of the genotyping data produced four sub-groups with prominent admixture. It revealed that all three groups of Indian cauliflower had genetic regions from each other and also from the snowball group. Similarly, regions in the snowball group also matched those of the Indian types. Thus, the present study could reveal that the present-day Indian cauliflower germplasm evolved as a result of intentional or natural intermixing between typical Indian types and introduced improved varieties/lines. The SSR marker-based analysis showed a varied level of heterozygosity in the tested genotypes of cauliflower. Further, these markers could not place the two genotypes each of cabbage and broccoli in separate groups, indicating the presence of sufficient genetic variation in these genotypes. These results are in conformity with the findings of Zhu et al. [22], who investigated 165 cauliflower inbred lines from southeast China using 43 SSR markers and found inconsistency between STRUCTURE-based subpopulations and grouping based on agro-morphological traits (curd maturity, curd solidity or geographical origin). The information from the 90 SSR markers is only partially consistent with the functional grouping of a few genotypes for curd initiation and maturity, since curding is a genetically complex trait influenced by environmental factors [5,53]. Admixture in the genotypes could be due to introgression of genomic fragments in genotypes of the F6 to F15 generations. Further, snowball group genotypes have introgression from European cauliflower, and a few genotypes of Indian cauliflower had genomic regions from exotic tropical types. They are under purification or at advanced breeding stages, as also revealed by the genomic SSR markers.
This first detailed analysis of genotypes from all four diverse maturity groups at two distinct locations revealed that (i) curd initiation and development in cauliflower depend critically on temperature; (ii) according to the agro-morphological observations, grouping cauliflower genotypes on the basis of the temperature requirement for curd initiation and development is effective; (iii) the SSR markers revealed genetic diversity across the groups, but only to a small extent; and (iv) the present-day germplasm of Indian cauliflower has admixture from other maturity groups and also from snowball/European types. From this, we suggest that a greater number of SSR markers be used to identify desired and novel alleles, which will assist in DNA fingerprinting, genome mapping, linkage map construction and gene tagging. Pyramiding these novel genes from different genotypes of cauliflower would increase diversity and help obtain valuable hybridization combinations for future marker-assisted breeding of cauliflower.