Phylogeny and genetic structure in the genus Secale

Secale L. is a small but important genus that includes cultivated rye. Although genetic diversity of cultivated rye is high, patterns of genetic diversity in the whole genus, and potential factors affecting the distribution of genetic diversity remain elusive. The population structure and distribution of genetic variation within Secale, and its correlation with taxonomic delimitation, cultivation status or spatial distribution in relation to geography and climate zones were analyzed in this study. A collection of 726 individual plants derived from 139 different accessions representing Secale cereale, S. vavilovii, S. strictum, and S. sylvestre were investigated using SSR analysis and sequence diversity analysis of a nuclear EST region. Our results indicated that perennial S. strictum subspecies are genetically divergent from annual forms of the genus. Existence of two distinct clusters within the annual taxa was observed, one corresponding to samples from Asia, and a second to those outside of Asia. No clear genetic structure was observed between different annual species/subspecies, indicating introgression between these taxa. The analysis of cultivated rye revealed that landrace populations from the Middle East have the highest genetic diversity, supporting the idea of the area being the center of origin for cultivated rye. Considering high adaptive potential of those populations, Middle Eastern landraces should be regarded as genetic resources reservoirs for new niches and future breeding programs.


Introduction
Sustainable food production is a vital environmental issue in the context of global climate change. Elevated temperatures and accompanying alterations in precipitation regimes are expected to decrease yields significantly. At the same time, the global requirement for food is expected to increase by 60% by 2050. The adaptive capacity of plant populations under stress conditions is positively related to the degree of genetic diversity maintained in those populations [1]. Genetic diversity of modern varieties (cultivars) of crop plants is quite low due to genetic erosion stemming from domestication syndrome and modernization bottlenecks. On the other hand, wild relatives of crop plants and unimproved varieties known as 'landraces' are genetically diverse [2,3] and contain many adaptive alleles in their gene pools.
[...] (S1 Table). Among these, 584 samples from 100 accessions of S. cereale, 46 samples from nine accessions of S. vavilovii, 89 samples from 23 accessions of S. strictum, two samples from two accessions of S. sylvestre, and five hybrid samples were used (S1 Table). In terms of cultivation status, 137 genotypes belonged to wild accessions, 51 genotypes to weedy accessions, 343 genotypes to landrace accessions and 195 genotypes to cultivated accessions. The accessions were provided by the United States Department of Agriculture Germplasm Resources Information Network (USA) and the Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (Germany). Two accessions were collected from farms in Turkey in 2010. In order to confirm the taxonomic delimitations of the accessions, seeds were planted in trial fields from December 2010 to June 2011, and the samples were regularly evaluated for certain phenotypic characters during all developmental stages following [7]. Total DNA was extracted according to the method described by Doyle and Doyle [32].
SSR analysis. Initially, 20 nuclear SSR primers previously used in the genus Secale [33,34] were screened in eight individual plants, representing four Secale species in terms of PCR amplification success and peak profiles. Among these, a set of ten microsatellite primers yielding good PCR products and scorable peaks were selected (REMS1187, REMS1254, REMS1323, REMS1264, REMS1205, REMS1238, REMS1160, REMS1303, REMS1259 and SCM 180) and used for the analysis of 721 samples (S2 Table). All PCR reactions were performed as described by Khlestkina et al. [33] and Saal and Wricke [34]. Amplification success was checked and successful PCR products were read on an ABI 3100 capillary sequencer with GS400HD size standard (Applied Biosystems).
The alleles were automatically binned using FlexiBin [35] and checked manually. Genotyping errors stemming from null alleles, large allele dropout or the scoring of stutter peaks, which can potentially lead to deviations from Hardy-Weinberg proportions, were detected using Micro-checker version 2.2 [36]. Based on the results, three markers (REMS1303, REMS1259 and SCM 180) were excluded, and seven SSR markers (REMS1187, REMS1254, REMS1323, REMS1264, REMS1205, REMS1238, REMS1160) were used for the subsequent analyses. The mean polymorphism information content (PIC) was calculated for each marker using MolKin v. 3.0 software [37].
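As an illustration of the PIC statistic mentioned above, the following minimal Python sketch computes PIC for a single locus from allele frequencies. It assumes the standard definition of Botstein et al., PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The function name `pic` and the example frequencies are hypothetical, and whether MolKin applies exactly this form is an assumption rather than something stated in the text.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus (assumed Botstein form).

    allele_freqs may be frequencies or raw counts; they are normalised first.
    """
    total = sum(allele_freqs)
    p = [f / total for f in allele_freqs]
    homozygosity = sum(x * x for x in p)
    # Correction for matings between identical heterozygotes.
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with four equally frequent alleles.
example_pic = pic([0.25, 0.25, 0.25, 0.25])
```

A locus with more alleles at more even frequencies yields a PIC closer to 1, while a monomorphic locus yields 0.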
We analyzed the whole data set, consisting of 721 samples, excluding the hybrids, in three different categories on the basis of (1) taxonomic identity, (2) cultivation status and (3) climatic conditions of geographical origin. In the first category, all of the genotypes were pooled into 11 groups based on their taxonomic identity, at the species and subspecies level, in order to understand the distribution of genetic diversity in different taxonomic groups. In the second category, all genotypes were grouped as wild, weedy, landrace and cultivated varieties, to evaluate the effect of cultivation status on the distribution of genetic diversity. In the third category, all genotypes excluding two samples of unknown geographical origin were assigned to 18 climate subgroups belonging to five main climate groups, as determined by Köppen-Geiger classification system, which is based on classifying the mean climate conditions on geographic areas around the globe using different climatic variables [31], in order to understand whether climatic conditions of geographic origins affect distribution of genetic diversity.
In addition to these three categories, patterns of genetic diversity in cultivated rye, i.e. S. cereale subsp. cereale genotypes (both landrace and cultivated varieties), were analyzed separately. In this analysis, a total of 533 genotypes from 83 different accessions originating from various geographical regions, representing 10 main gene pools (Africa, Australia, Europe, Balkans, Caucasus, East Asia, South and Central Asia, Middle East, North America and South America), were used. These samples were analyzed using the same seven microsatellite primers, as described above.
Allelic frequencies were tested for deviations from Hardy-Weinberg equilibrium (HWE) using an exact test with a Markov chain (10,000 steps) and 1,000 dememorisation steps in Genepop version 4.0.10 [38,39]. Linkage disequilibrium was also tested between all loci using Genepop version 4.0.10 [38,39]. Genetic diversity parameters were computed for each group using GenAlEx v6.4 [40]. Sample sizes differed among the accessions and regions used in the study. Therefore, to compensate for this sampling bias, which may lead to inaccurate comparisons of allelic richness between loci, allelic richness (RS) and private allele richness (PR) independent of sample size were computed by a rarefaction method as implemented in HP-RARE version 1.0 [41]. In addition to these genetic diversity parameters, the overall gene diversity (HT), the within-population genetic diversity (HS), the amount of gene diversity among populations (DST), and the coefficient of genetic differentiation between populations (GST) were calculated with FSTAT version 2.9.3 [42]. To avoid any misinterpretation stemming from sampling bias, DST, HT and GST values were also calculated independently of sample size, using the same program. Pairwise FST values between each pair of populations were calculated using GenAlEx version 6.4 [40]. Population structure was also analyzed using a Bayesian clustering algorithm, as implemented in STRUCTURE version 2.3.3 [43]. The admixture model of ancestry and correlated allele frequencies were allowed. The LOCPRIOR model was also applied using population information as a prior, to assist clustering [44]. The length of the burn-in was set to 30,000, and data were collected over 300,000 Markov Chain Monte Carlo (MCMC) replications in each run (K = 1-5). The optimum number of clusters (K) was determined as described by Evanno et al. [45].
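The delta K criterion of Evanno et al. [45] can be sketched as follows. This is a minimal Python illustration, not the procedure actually run in the study: the function name `delta_k`, the dictionary layout and the log-likelihood values are hypothetical, and the number of replicate runs per K is not given in the text. The sketch uses a common simplification, taking the second-order difference of the mean ln P(D) across replicates and dividing it by the standard deviation of ln P(D) at that K.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from replicate ln P(D) values.

    lnp_by_k maps each K to a list of ln P(D) values, one per replicate run.
    Delta K is only defined for K values that have both neighbours K-1 and K+1.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # the smallest and largest K have no second-order difference
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # delta K is undefined when all replicates are identical
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / spread
    return result


# Hypothetical replicate log-likelihoods for K = 1..4 (two runs each).
example = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-880.0, -884.0],
    4: [-875.0, -885.0],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
```

Because delta K needs both neighbouring values, a run over K = 1-5 can only rank K = 2, 3 and 4 by this criterion.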
Each individual with an ancestry value equal to or larger than 0.7 was assigned to the corresponding cluster, while the individuals with a smaller ancestry value were considered to have mixed ancestry following Coulon et al. [46]. The correspondences of obtained groups were evaluated for taxonomic identity, cultivation status, geographical origin, and climatic zones, as mentioned above. Finally, an Unweighted Pair Group Method with Arithmetic Mean (UPGMA) tree was constructed using Poptree2 [47] based on Nei's genetic distance (DA) [48] with 10,000 bootstrap iterations.
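The assignment rule described above can be sketched in a few lines of Python. The function name `assign_cluster` and the example membership vectors are hypothetical; only the 0.7 threshold comes from the text.

```python
def assign_cluster(memberships, threshold=0.7):
    """Assign an individual to a cluster from its ancestry coefficients.

    memberships is a sequence of ancestry values (one per cluster, summing to 1).
    Returns the zero-based index of the cluster whose value is >= threshold,
    or None when no cluster reaches it (treated as mixed ancestry).
    """
    best = max(range(len(memberships)), key=lambda i: memberships[i])
    return best if memberships[best] >= threshold else None


# Hypothetical ancestry vectors for three individuals at K = 3.
individuals = [
    [0.85, 0.10, 0.05],  # clearly cluster 0
    [0.10, 0.70, 0.20],  # exactly at the threshold, assigned to cluster 1
    [0.45, 0.35, 0.20],  # mixed ancestry
]
assignments = [assign_cluster(q) for q in individuals]
```

With a threshold above 0.5, at most one cluster can qualify per individual, so the rule never produces ambiguous assignments.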

Sequence diversity analysis of nuclear EST markers
Varshney et al. [49] had shown that existing barley nuclear expressed sequence tag (EST)-derived DNA markers could be employed in sequence diversity analysis in rye. Four of these markers were tested (S2 Table), and GBS0551, which gave the best results, was selected and used in this study. A total of 61 samples representing four species of Secale and five hybrid samples were included in the analysis. The PCR reactions were performed as described by Varshney et al. [49]. The amplified fragments were commercially sequenced at Macrogen Europe, and the sequences were edited visually and aligned using Sequencher version 4.5 (Gene Codes Corp). However, the discrimination of the alleles of heterozygote samples, especially those with multiple differences, was not straightforward. Therefore, these sequences were processed in DnaSP version 5.0 [50] using the coalescent-based Bayesian algorithm of the PHASE software [51], which resolves haplotype phases and infers haplotypes. A maximum-likelihood (ML) tree was constructed using MEGA 5 [52], and the reliability of the phylogenetic relationships was tested by bootstrapping (1000 replicates).

Informativeness of the SSR markers
The number of alleles per locus ranged between 9 and 22 with an average value of 14. Polymorphism information content (PIC) values ranged from 0.605 (REMS1264) to 0.882 (REMS1160), with an average value of 0.718 (Table 1).

Distribution of SSR genetic diversity in different categories
Genetic allelic patterns were calculated for each group, in the three categories created based on taxonomic identity, cultivation status, and climatic conditions of geographical origin of the samples (Table 2). In the taxonomy-based groups, observed heterozygosity was higher than the expected heterozygosity in all taxa. Expected heterozygosity was the highest in S. strictum subsp. strictum (0.731) and the lowest in S. cereale subsp. afghanicum (0.579), excluding S. cereale subsp. dighoricum, S. strictum subsp. irmanuso and S. sylvestre, which had small sample sizes. The highest and lowest differentiation based on FST was observed between S. cereale subsp. afghanicum and S. sylvestre (FST = 0.181), and S. cereale and S. vavilovii (FST = 0.007), respectively (S3 Table). S. cereale subsp. afghanicum and S. sylvestre were found to be the most divergent from the rest of the taxa analyzed.
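For readers unfamiliar with the two heterozygosity measures compared above, a minimal Python sketch for one locus follows. It assumes diploid genotypes given as allele pairs and uses the uncorrected expected heterozygosity, He = 1 - sum(p_i^2). The function name `heterozygosity` and the toy genotypes are hypothetical; GenAlEx also reports a sample-size-corrected variant that is not shown here.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes is a list of (allele_a, allele_b) pairs, one per diploid individual.
    """
    n = len(genotypes)
    # Observed: proportion of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: 1 minus the sum of squared allele frequencies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    expected = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical SSR genotypes (allele sizes in bp) for four plants.
toy = [(180, 184), (180, 184), (180, 180), (184, 184)]
ho, he = heterozygosity(toy)
```

When the observed value exceeds the expected value, as reported for all taxa here, there are more heterozygotes than random mating at the estimated allele frequencies would predict.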
The assessment of genetic diversity by cultivation status in 137 wild, 51 weedy, 343 landrace, and 190 cultivated plants showed that expected heterozygosity was the highest in wild accessions (0.735) and the lowest in cultivated varieties (0.675). Comparison of pairwise FST values revealed no significant differentiation between different groups.
In terms of the climate subgroups, the highest expected heterozygosity was observed in the Warm-summer Mediterranean subgroup (0.73), and the lowest in the Tropical monsoon climate subgroup (0.46) (Table 2). Pairwise FST comparisons revealed the Tropical monsoon climate subgroup to be the most different from the remaining climate subgroups, with the highest genetic distance when compared to the Mild tundra climate populations (0.19) (S4 Table).
STRUCTURE and UPGMA results. The Bayesian clustering analysis based on the distribution of 98 alleles at seven SSR loci among 721 samples revealed the presence of three separate clusters (Fig 1). The primary division at K = 2 was observed mainly between perennial S. strictum and the remaining annual taxa. At K = 3, the S. strictum cluster remained fairly intact, while the annual taxa (S. cereale and S. vavilovii) grouped within two different clusters. The first cluster contained a total of 25 samples of S. strictum subsp. strictum and one S. strictum subsp. anatolicum sample. Among these, 20 samples originated from Iran, and the remaining samples originated from other parts of the Middle East (Fig 2A). [...] and the Middle East (Fig 2C). The remaining 408 samples could not be assigned to any of these three clusters, and were considered to have mixed ancestry. Except for the first cluster, which consisted of wild S. strictum samples, a weak correlation between clustering and cultivation status was noted. The structuring exhibited no significant correlation with major agro-climatic zones as described by Kottek et al. [31]. The UPGMA dendrogram constructed using subspecies of S. cereale, subspecies of S. strictum, S. vavilovii and S. sylvestre revealed a clear separation between S. sylvestre and the rest of the genus (Fig 3A). S. cereale subsp. afghanicum separated from the other subspecies of S. cereale and S. strictum. S. strictum subsp. kuprijanovii also diverged from the other remaining subspecies at a relatively basal position in the tree topology. S. cereale subsp. cereale, S. cereale subsp. segetale, S. vavilovii, S. strictum subsp. anatolicum and S. strictum subsp. strictum constituted a group, while S. cereale subsp. ancestrale remained outside of this cluster. S. cereale subsp. cereale and S. cereale subsp. segetale were more closely related to S. vavilovii than to their conspecifics S. cereale subsp. afghanicum and S. cereale subsp. ancestrale.
As the STRUCTURE analysis revealed a clear separation of the S. strictum subsp. strictum samples originating from Iran, these populations were grouped separately in a next step (S. strictum subsp. strictum clade 1). The remaining S. strictum populations were also grouped together (S. strictum subsp. strictum clade 2), and the dendrogram was rebuilt using these separated groups (Fig 3B). Branching off of S. strictum subsp. strictum clade 2 with a high bootstrap value (99%) revealed its significant divergence. Except for this difference, both trees reflected nearly identical topologies.

Nuclear sequence diversity of the genus Secale
The general topology of the maximum-likelihood (ML) tree constructed using a 667 bp fragment of nuclear sequences in 61 samples (GenBank accession numbers: MH421898-MH421958), representing four species in the genus Secale, and five hybrid samples showed that there were two main lineages (Fig 4). However, these groups did not correspond to taxonomic or spatial delimitations. S. vavilovii accessions were dispersed within S. cereale subspecies in both groups. The two S. sylvestre samples clustered together in a subgroup, rather than forming a separate lineage. S. strictum subspecies clustered together forming two and one [...] (Table 3). Furthermore, comparison of pairwise FST differentiation [53,54] showed the African gene pool to be the most different from the remaining gene pools, having the highest genetic distance when compared to the South-Central Asian populations (FST = 0.96) (S5 Table). The genetic differentiation among other gene pools was insignificant.
STRUCTURE analysis showed the presence of two separate clusters, with the first one composed of 333 samples, 72.02% of which were landraces that originated from the Middle East, and South and Central Asia. Except for two samples, all of the Australian cultivars clustered in this group. The second cluster was composed of 136 samples, mainly originating from Europe, the Balkans and South America. The proportion of Middle Eastern and South-Central Asian samples in this group was only 6.25%.
In the PCA analysis conducted to explore pattern of relationship between cultivated rye populations from different geographical regions with the microsatellite data, the first, second and third components explained 45.44%, 23.37% and 12.54% of the variance, respectively. First and second components of the PCA analysis revealed two clusters (Fig 5A), while three distinct clusters were observed based on the first and third components (Fig 5B). The first cluster was dominated by samples from the Middle East, whereas the other clusters contained samples from diverse geographical areas. PCA clustering did not reflect cultivation status of the samples.

Distribution of SSR diversity in Secale
In this study, genetic diversity within Secale was evaluated using seven SSR markers. A worldwide collection of 721 samples belonging to 11 taxonomic units included wild, weedy, landrace and cultivated materials from diverse climatic zones. All of the SSR markers employed in the study had PIC values higher than 0.6 and are considered to be highly informative. The genetic diversity of Secale at a global scale was relatively high compared to other crops like sorghum [55] and maize [56], which can be attributed to the outcrossing nature of many species in the genus Secale, and its wind-pollinated reproduction.
In our study, relative genetic diversity co-varied with the cultivation status: the highest diversity was observed in wild accessions, followed by weedy and landrace accessions, and lowest in cultivated varieties. Furthermore, wild and landrace populations had private alleles which were not detected in the cultivated gene pools, indicating that these forms offer a richer source of alleles, and high potential for crop improvement. High genetic diversity and presence of private or rare alleles in wild and weedy forms can be explained by the lack of a domestication bottleneck, see below.

Genetic clustering
Existence of three distinct clusters in the STRUCTURE analyses of the whole genus indicated the presence of three different gene pools: (1) perennial S. strictum subsp. strictum, (2) annual taxa that originated from Asia (Middle East and South-Central Asia), (3) annual taxa that originated from outside of Asia (mainly Europe). The clear separation between perennial S. strictum and the annual forms has been shown previously [11,57,58] and can be explained by restricted gene flow between annual and perennial taxa, possibly due to differences in life-history traits such as timing of reproduction. Further separation of annual taxa was based on geographic origin, rather than taxonomic identity. This was also supported by the maximum-likelihood tree constructed using nuclear sequences, where all of the S. cereale subspecies and S. vavilovii were grouped together. Recently, Hagenblad et al. [11] showed that there was no clear taxonomic structuring among annual forms of the genus. Previous studies have also shown a lack of morphological [59] and molecular [8,9,10,57] differences between annual forms (S. vavilovii, S. cereale subsp. ancestrale, S. cereale subsp. afghanicum, and S. cereale subsp. segetale) belonging to different taxa. Genetic similarity between annual wild and weedy forms and the cultivated subspecies S. cereale subsp. cereale supports the hypothesis that S. cereale is of relatively recent origin, dating back to only a few centuries ago [60]. It is likely that there was insufficient time for the evolution of isolation mechanisms or barriers between cultivated rye and its wild and weedy relatives, and hence the lack of structuring among annual taxa can be explained by introgression between sympatric populations of cultivated rye and wild and weedy forms. As a result, morphological differences between the subspecies cannot be explained by genetic differentiation, indicating a lack of taxonomic boundaries at the subspecies level.
Interbreeding between different taxa, except for S. sylvestre and subsequent formation of hybrids with high pollen and seed fertility is very common in the Secale genus [6,20,61,62].
The further structuring of annual taxa based on geographic origins of samples suggests that each of the two annual clusters detected in the study originated from two distinct gene pools. Subsequently, the two distinct lineages retrieved in this study were initially separated, probably due to restriction of gene flow because of geographical isolation. Consistent with our findings, Hagenblad et al. [11] also showed that geographic clustering was evident among annual taxa, which is reflected by a separation between Asian and European accessions. Furthermore, Bolibok-Bragoszewska et al. [63] noted divergence of the Near Eastern and European accessions. On the other hand, some other studies reported that genetic structuring among different taxa corresponded to cultivation status [58,64,65,66], and no geographical structuring was found in these studies. These conflicting results might stem from the low discriminatory power of the markers used in these studies. It should be noted that despite the existence of the clear geographical structuring among annual taxa mentioned above, no significant correlation with major agro-climatic zones was detected. Similarly, Hagenblad et al. [11] also reported a limited correlation between genetic structuring and agro-climatic conditions of the sampling localities in geographically structured rye populations. This can be explained by the observed structure stemming from geographical proximity and related pollen dispersal, rather than ecological and climatic adaptations.
The nominotypic S. strictum subsp. strictum has been previously shown to be significantly different from its subspecies [8,9,10,67], indicating that it has been evolving independently of other S. strictum subspecies [13]. In our study, S. strictum subsp. strictum samples from the northwest and west of Iran were divergent from the rest of the S. strictum accessions. In the UPGMA dendrogram, the ancestral position of this group was observed, when compared to the rest of the S. strictum subsp. strictum samples. The same dendrogram also revealed that the rest of the S. strictum subsp. strictum samples (i.e. other than the Iranian clade), originating from diverse areas, were genetically more similar to S. cereale accessions. This is compatible with the hypothesis that cultivated rye evolved from S. strictum [17,20,60]. In addition, in the dendrogram constructed based on microsatellite data, S. strictum subsp. anatolicum and S. strictum subsp. strictum accessions originating from outside of Iran were found to be closer to S. cereale subspecies compared to S. strictum subsp. kuprijanovii samples and S. strictum subsp. strictum accessions originating from Iran. This suggests the existence of gene flow between S. cereale subspecies and S. strictum subspecies originating from outside of Iran.
It was also interesting that S. strictum subsp. strictum samples that originated from Iran were basal to the clade that included the rest of S. strictum subspecies and all of the annual taxa, except for S. cereale subsp. afghanicum and S. sylvestre. This observation is also consistent with previous studies that show S. strictum being the most ancestral species, which the rest of the taxa have originated from [4,6,20,68,69]. This finding also underpins the hypothesis that Northeastern Turkey and the adjacent area including Armenia and northwestern Iran could be the center of origin for the genus [5,18].
The taxonomic position of S. vavilovii has also been a point of discussion: in some studies, S. vavilovii was considered to be a distinct species close to S. cereale [6,8,15,57,58,70], while some other researchers postulated that S. vavilovii should be classified as a subspecies of S. cereale [7,9,10,11,13,67]. In our study, SSR and nuclear sequence diversity analysis did not reveal significant differentiation between S. cereale and S. vavilovii. Therefore, it is concluded that S. vavilovii should be considered as a synonym and a subspecies of S. cereale.
Our study affirmed that S. sylvestre is genetically the most divergent species, which is consistent with the general agreement that S. sylvestre is the first species that diverged from S. strictum during the Pliocene, and is morphologically and genetically the most distinct species [12,13,71].

Genetic diversity and structuring of cultivated rye
Genetic diversity. Cultivated rye is a wind pollinated allogamous species with a highly developed self-incompatibility system. As a result, high genetic diversity has been previously noted not only between different accessions [24,72,73], but also within the same cultivar [9,10,65]. Consistently, our study showed high degrees of genetic diversity in cultivated rye from all over the world. Moreover, due to its high tolerance of different environmental conditions, rye has a global geographic distribution which may also have contributed to its high levels of genetic diversity.
In the scope of the present study, genetic diversity levels of different gene pools were compared. Our results showed that landraces are genetically more diverse than cultivars. Similar results were previously reported in other studies on rye [23,61]. It is well established that current breeding practices narrow the gene pool and lead to a reduction of genetic diversity [74]. Such a reduction in genetic diversity results in the loss of many important alleles, and this may have significant negative effects on the adaptive capacity of plants. On the other hand, landraces are cultivated by traditional agricultural practices through many generations of selection, and they have become locally adapted to various environments by accumulating new alleles [75]. Therefore, compared to cultivars, the genetic diversity of landraces is high. Our findings highlight that landraces should be regarded as a source of genetic variation, and should be integrated into rye breeding programs to compensate for the genetic diversity lost during modern breeding processes.
Second, we analyzed the distribution of genetic diversity in different geographic regions. Genetic diversity of cultivated rye was affected by geographic origins of the samples and found to be higher in the Middle East region (Turkey, Iran and Israel) compared to other regions. Although sample size of the region is larger than the others, to avoid any bias due to sample size, corrected genetic diversity measures (independent of sample size) were also used. The degree of genetic diversity was found to be highest in the Middle East for the corrected parameters, as well. Therefore the obtained results likely reflect real genetic diversity patterns of the region, rather than being a sampling artifact. Vavilov [17] proposed that genetic diversity of crop species on interspecific and intraspecific level is not evenly distributed: the genetic diversity in the center of origins is higher. Based on this assumption, our results indicate that the most likely center of origin for the genus is the Middle East or Caucasus. This is consistent with the idea that all Secale taxa have originated somewhere in the Middle East or South-Central Asia [5,18] that also covers the geographical area known as "Fertile Crescent", the center of origin for many crop species like wheat, barley, pulses, pea and flaxes [18]. Taking into account that many wild and weedy forms of the genus Secale are found in the area between northeastern Turkey and northwestern Iran, gene flow between wild forms and cultivated forms by introgression is quite possible, resulting in an increase in genetic diversity. High genetic diversity observed in the region can also be explained by most of the populations in this region being landraces, rather than genetically more-or-less uniform cultivars.
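The sample-size correction discussed above rests on rarefaction. A minimal Python sketch of rarefied allelic richness is given here, assuming the standard hypergeometric formulation: the expected number of distinct alleles in a random subsample of g gene copies drawn without replacement. The function name `rarefied_richness` and the allele counts are hypothetical, and the sketch is not a reimplementation of HP-RARE [41].

```python
from math import comb


def rarefied_richness(allele_counts, g):
    """Expected number of distinct alleles in a subsample of g gene copies.

    allele_counts holds the number of copies observed for each allele at one
    locus in one population; g must not exceed the total number of copies.
    """
    n = sum(allele_counts)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the total number of gene copies")
    total_draws = comb(n, g)
    # For each allele: 1 minus the probability that it is absent from the subsample.
    return sum(1 - comb(n - c, g) / total_draws for c in allele_counts)


# Hypothetical populations of unequal size, compared at the same g.
large_pop = rarefied_richness([40, 30, 20, 8, 2], g=10)
small_pop = rarefied_richness([6, 4], g=10)
```

Comparing populations at a common g, set by the smallest sample, keeps a large sample from appearing more diverse merely because more gene copies were examined.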
Genetic diversity of the accessions originating from Africa and East Asia was comparably low, probably due to a potential genetic bottleneck during introduction of cultivated rye to these regions. Besides, in comparison to South American and European samples, genetic diversity was lower in North America that contains populations from Mexico, USA and Canada. This probably stems from extensive use of genetically uniform cultivars in these regions. On the other hand, genetic diversity of the Balkan group (that contained samples from European part of Turkey (Thrace), Montenegro, Serbia, Macedonia, Yugoslavia, and Bosnia and Herzegovina) and the Caucasus gene pool (that consisted of two accessions from Georgia and East Azerbaijan) was high. Finally, the European gene pool sampled in this study contained accessions from a wide geographical range containing Germany, Switzerland, UK, Poland and Sweden, and their genetic diversity levels were found to be moderately high. Although agricultural systems of many countries in Europe favor the genetically uniform cultivars [76], the relatively high levels of genetic diversity observed could be explained by European cultivars having been developed using different genepools.
Origins of the ryes from different continents. The separation of genotypes originating from Asia (Middle Eastern, and south and central Asian) and from out of Asia (mainly Europe, Balkans and South America) was consistent with previous studies reporting a clear separation between the Middle East and European genepools [11,23,63].
Based on our findings it can be speculated that each of the two clusters obtained in the study originated from two distinct gene pools. The two main distinctive lineages retrieved in this study were initially separated, probably due to restriction of gene flow because of geographical separation. Considering that in crop plants geographical distribution patterns usually reflect prevailing human-mediated selection pressures in a particular environment [77], another explanation for this separation could be that cultivated rye was introduced into new geographical ranges in which climatic and environmental conditions are quite different from those in the center of origin. This was possibly followed by anthropogenic selection of phenotypes adaptable to the conditions in those regions, leading to adaptive divergence. The Middle Eastern samples were observed in all three clusters, indicating their potential ancestral position, and supporting the conclusion (based on the genetic diversity levels above) that the Middle Eastern populations are the likely progenitors of cultivated rye, and that they recently expanded globally due to human-mediated distribution and long-distance gene flow. Similarly, Einkorn wheat, emmer wheat, barley and lentil [78] were domesticated in the Middle East, more specifically in the Fertile Crescent, and subsequently radiated to Europe [79] and the rest of the world.
In the context of the study, the origin of the samples collected from outside of Asia and Europe was also investigated. Samples from South America grouped together with European samples into second cluster. This is consistent with the idea that many crop plants dispersed to South America from Europe, after the voyages of Columbus [80]. On the other hand, samples from Australia and North America grouped into the first cluster, indicating that cultivated rye was possibly introduced into these areas from the Middle East or South-Central Asia.
Furthermore, genetic differentiation among geographical regions revealed a significant differentiation between the African gene pool and the remaining gene pools. Considering climatic conditions of the region being relatively unique and that the region is physically separated from remaining gene pools by geographical barriers, it can be concluded that rye became locally adapted to this continent and remained separated. This is consistent with the idea that S. cereale subsp. cereale evolved as an isolated population in Africa [5]. Similarly, based on AFLP data, Chikmawati [73] previously reported that African populations of cultivated rye were genetically more distant when compared to other populations.

Conclusion
The global-scale analysis of genetic diversity and phylogenetic relationships of the genus Secale shows a clear separation between perennial S. strictum subspecies and annual taxa. Further separation of annual taxa belonging to different species or subspecies into two groups was based on geographical origin, rather than taxonomic identity. Separation of the Middle Eastern and South-Central Asian accessions from the remaining accessions confirmed the previous findings revealing partitioning between Asian and European accessions, and the existence of two different gene pools. The lack of any structuring within different species or subspecies belonging to annual taxa can be explained by recent separation of the species or subspecies, insufficient time having passed for the evolution of isolation mechanisms, and consequent continuation of gene flow even between species. In addition, the lack of a clear genetic separation between S. cereale and S. vavilovii led us to conclude that S. vavilovii, rather than being a distinct species, should be classified as a subspecies of S. cereale. The phylogenetic relationships of different species in the genus should also be investigated in greater detail using high-resolution molecular markers, such as RAD-seq.
The evaluation of genetic diversity of cultivated rye populations led us to conclude that high levels of genetic variation exist in cultivated rye. The highest allelic variation and genetic diversity was found in the Middle Eastern landrace populations. This finding supports the idea that the area could be the center of origin for the genus. Nearly all of the populations examined in Near East are locally adapted landraces that have not been exposed to intense artificial selection pressures. Therefore, in contrast to modern crop varieties that have undergone genetic bottlenecks associated with the process of domestication, resulting in a decrease in genetic diversity, landraces constitute a large pool of genetic variation and contain many interesting traits, like strong tolerance to abiotic and biotic stress [81]. Considering that high genetic diversity in crop plant populations is directly related to adaptive potential of those populations to changing environmental conditions, landraces should be regarded as genetic resources reservoirs for new niches and future breeding programs. From a conservation point of view, the results obtained from the study suggest that an immediate action plan is required for in-situ conservation of the ancestral and highly diverse Middle Eastern landrace populations.
Supporting information S1