Genetic Structure of Modern Durum Wheat Cultivars and Mediterranean Landraces Matches with Their Agronomic Performance

A collection of 172 durum wheat landraces from 21 Mediterranean countries and 20 modern cultivars were phenotyped in 6 environments for 14 traits including phenology, biomass, yield and yield components. The genetic structure of the collection was ascertained with 44 simple sequence repeat markers that identified 448 alleles, 226 of them with a frequency lower than 5%, and 10 alleles per locus on average. In the modern cultivars all the alleles were fixed in 59% of the markers. Total genetic diversity was HT = 0.7080 and the genetic differentiation value was GST = 0.1730. STRUCTURE software allocated 90.1% of the accessions to five subpopulations, one including all modern cultivars and the other four containing landraces grouped according to their geographic origin: eastern Mediterranean, eastern Balkans and Turkey, western Balkans and Egypt, and western Mediterranean. Mean yield of subpopulations ranged from 2.6 t ha-1 for the western Balkan and Egyptian landraces to 4.0 t ha-1 for modern cultivars, with the remaining three subpopulations showing similar values of 3.1 t ha-1. Modern cultivars had the highest number of grains m-2 and harvest index, and the shortest cycle length. The diversity was lowest in modern cultivars (HT = 0.4835) and highest in landraces from the western Balkans and Egypt (HT = 0.6979). Genetic diversity and AMOVA indicated that variability between subpopulations was much lower (17%) than variability within them (83%), though all subpopulations had similar biomass values in all growth stages. A dendrogram based on simple sequence repeat data matched the clusters obtained by STRUCTURE, improving this classification for some accessions that have a large admixture. Landraces included in the subpopulation from the eastern Balkans and Turkey were separated into two branches in the dendrogram drawn with phenotypic data, suggesting a different origin for the landraces collected in Serbia and Macedonia.
The current study shows a reliable relationship between genetic and phenotypic population structures, and the connection of both with the geographic origin of the landraces.


Introduction
Durum wheat (Triticum turgidum L. var. durum) is a traditional Mediterranean crop. It originated in the Fertile Crescent (10,000 BP) and spread over the northern side of the Mediterranean, reaching the Iberian Peninsula in about 7000 BP [1] from both Italy and North Africa [2]. During this migration, natural and human selection processes resulted in the development of local landraces that were widely cultivated until the middle of the 20th century. From then, as a consequence of the Green Revolution, the cultivation of local landraces was progressively abandoned and they were replaced by the improved, more productive and genetically uniform semi-dwarf cultivars. The plant height (PH), general lateness and low harvest index (HI) of landraces have restricted their current cultivation to a few marginal areas or to the framework of organic farming, discouraging wheat breeding programmes from evaluating and using them extensively as parents in crossings.
Nevertheless, scientists are convinced that local landraces may provide new alleles for the improvement of commercially valuable traits [3]. Introgression of these alleles into modern cultivars can be very useful, especially in breeding for suboptimal environments. In the Mediterranean Basin durum wheat is mostly cultivated in rainfed environments, in areas where the amount and occurrence of rains fluctuate drastically between years and between locations within a year, resulting in major yield variations. Therefore, improving yield under water-limited conditions is one of the major challenges for wheat production, particularly in the current scenario of climate change. Mediterranean durum wheat landraces represent a particularly important group of genetic resources that are useful for breeding because of a number of suitable characteristics: good adaptation to the regions where they are grown, huge genetic diversity [4], a documented resilience to abiotic stresses [5], and resistance to pests and diseases [6]. An increase in the available genetic variation through the use of landraces in breeding programmes therefore seems possible in terms of adaptation to harsh environments and end-product quality, given the high level of polymorphism found between and within landraces for traits of commercial importance [3][7][8][9].
Knowledge of genetic diversity is essential for understanding the relationships between cultivars, facilitating their classification and characterization with the aim of defining new selection strategies and crosses in breeding programmes. Although several markers have been used in the last few decades for genetic studies [10], molecular markers based on microsatellite repeats (simple sequence repeats, SSR) have been the ones most used in wheat during the last few years because of their wide distribution in the genome, their codominance, their high polymorphism and reproducibility, and their simplicity of analysis. A number of studies have confirmed SSR markers as an efficient tool for evaluating the genetic diversity of wheat germplasm collections and assessing subpopulation structure [11][12][13][14][15][16][17][18][19][20][21][22].
However, fine phenotyping is a major challenge for the improvement of cultivars, creating a bottleneck in the breeding process, especially for the quantitative traits that are the major determinants of abiotic stress resistance. Therefore, accurate phenotyping is essential to minimize the experimental errors due to uncontrolled environmental and experimental variability, and to reduce the genotype-phenotype gap.
To date, few studies have examined the relationship between genetic population structure and agronomic performance in wheat. Previous works [23,24] using collections of 30 bread wheat and 24 durum wheat accessions, respectively, revealed little correlation between phenotypic traits and genetic diversity based on molecular markers. More recently [14], using a set of 191 elite durum wheat genotypes representative of the genetic diversity present in the Mediterranean durums, the authors suggested that genotypic proximity corresponded to agronomic performance in only a few cases. Good correlation between phenotypic and molecular structures was found for accessions related to the CIMMYT hallmark founder 'Altar 84', for ICARDA accessions adapted to dryland areas, and for the reduced set of landraces used in that study.
The aims of this study were: 1) to determine the diversity existing in a durum wheat collection of 20 modern cultivars and 172 landraces representative of the variability existing in the species within the Mediterranean Basin, 2) to ascertain the genetic structure of the collection, and 3) to study the relationship between the genetic and geographic structures and the cluster based on the agronomic performance of the collection across six environments.

Plant material
The plant material included a collection of 172 durum wheat landraces and old varieties from 21 Mediterranean countries, and 20 modern cultivars used as reference, which were previously selected by Royo et al. [25] (S1 Table). Landraces were selected from a larger collection comprising 231 accessions of different origin based on genetic variability determined by 33 SSR markers in order to represent the genetic diversity of ancient local durums from the Mediterranean Basin [4]. Landraces provided by public gene banks (Centro de Recursos Fitogenéticos INIA-Spain, ICARDA Germplasm Bank and USDA Germplasm Bank) were bulk purified to select the dominant type (usually with a frequency above 80% of the bulk) and the seed was increased in plots planted in the same field in the years before each experiment to ensure a common origin for seeds of all lines. The modern set included Spanish, Italian and French cultivars, as well as the US desert durum cultivar 'Ocotillo' (S1 Table).

Molecular profiling
DNA was isolated from leaf samples following the method reported by Doyle and Doyle [26]. Forty-four SSR markers widely distributed along the genome and amplifying polymorphic alleles in previous studies [27][28][29] were chosen. SSR primer sequences and amplification conditions were obtained from the GrainGenes database (http://wheat.pw.usda.gov). The forward primer of each marker was 5'-labelled with a fluorescent tag and allele sizes were determined using an ABI Prism 3130xl Genetic Analyser with the GeneMapper software version 4.0 (Applied Biosystems).

Field experiments
Experiments were carried out in the 2007, 2008 and 2009 harvesting seasons in Lleida (41°40'N, 0°20'E, 260 m.a.s.l), northeastern Spain, and Granada (37°15'N, 3°46'W, 680 m.a.s.l), southern Spain. Soil analyses were performed before sowing. Experiments were carried out in a non-replicated modified augmented design with three replicated checks (the cultivars 'Claudio', 'Simeto' and 'Vitron') and plots of 6 m2 (8 rows, 5 m long with a 0.15 m spacing). Sowing density was adjusted to 250 viable seeds m-2. Meteorological data (S2 Table) were recorded by weather stations placed in the experimental fields. Experiments were conducted under rainfed conditions, but the lack of rain after sowing in 2007 made irrigation necessary to allow seed germination. Weeds and diseases were controlled according to the standard cultural practices of each site.
Zadoks growth stages (GS) [30] 21 (beginning of tillering), 33 (mid-jointing), 45 (booting), 55 (heading), 65 (anthesis), and 87 (physiological maturity) were determined in each plot. Samples of the plants in a 0.5-m-long row were pulled up in a central row of each plot at GS21, GS33 and GS65, and a 1-m-long row from a central row of each plot was taken at GS87. In the laboratory, the number of plants, stems and spikes in each sample were counted, and the aerial portion was weighed after being oven-dried at 70°C for 48 h. Crop dry weight (CDW, g m-2) was then calculated for each sample as the product of average dry weight per plant and the number of plants m-2, as described by Royo et al. [25]. The number of spikes per square metre (NSm2) and the number of grains per square metre (NGm2) were measured at GS87. HI was calculated as the ratio between grain and aerial biomass weight on a whole sample basis. PH was measured at anthesis in ten main stems per plot from the tillering node to the top of the spike, excluding the awns. Plots were mechanically harvested at ripening and grain yield (t ha-1) was expressed on the basis of 12% moisture. Thousand kernel weight (TKW) was estimated as the mean weight of three sets of 100 g per plot.
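The two derived traits defined above, CDW and HI, are simple ratios, and a minimal Python sketch may make the arithmetic explicit. The function names, argument names and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def crop_dry_weight(sample_dry_weight_g, plants_in_sample, plants_per_m2):
    """CDW (g m-2): average dry weight per plant times plants per square metre."""
    if plants_in_sample <= 0:
        raise ValueError("the sample must contain at least one plant")
    return (sample_dry_weight_g / plants_in_sample) * plants_per_m2


def harvest_index(grain_weight_g, aerial_biomass_g):
    """HI: grain weight divided by total aerial biomass of the same sample."""
    if aerial_biomass_g <= 0:
        raise ValueError("aerial biomass must be positive")
    return grain_weight_g / aerial_biomass_g


if __name__ == "__main__":
    # Hypothetical sample: 20 plants weighing 30 g in total, 250 plants m-2.
    print(crop_dry_weight(30.0, 20, 250))
    # Hypothetical maturity sample: 40 g of grain in 100 g of aerial biomass.
    print(harvest_index(40.0, 100.0))
```

Both functions operate on a whole-sample basis, as in the description above, so no per-plant grain weights are needed.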

Data analysis
The following variables were estimated from the SSR marker data using the GenAlEx software version 6.502 [31]: number of alleles per locus (Na); expected heterozygosity (He, calculated as $1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$th allele) [32]; observed heterozygosity (Ho, calculated as the number of heterozygous genotypes divided by the total number of genotypes); and fixation index (F = 1-Ho/He) [33] (Table 1). Putative population structure was estimated using the STRUCTURE software version 2.1 [34], adopting an admixture model and correlated alleles, with a burn-in of 10,000 cycles followed by 100,000 MCMC cycles. A continuous series of K values from 2 to 11 was tested in seven independent runs. The most likely number of subpopulations was calculated according to Evanno's test (ΔK) [35]. Genetic diversity was estimated with the total diversity (HT) [32] using POPGENE version 1.32 [36]. The coefficient of genetic differentiation (GST), i.e. the proportion of total variation that is distributed between populations, was calculated as $G_{ST} = (H_T - H_S)/H_T$, where $H_S$ is the mean genetic diversity within populations. Genetic distances between groups were calculated according to Nei's genetic distance [37], and cluster analysis of the different populations was carried out using the unweighted pair-group method (UPGMA) with DARWin software version 6.0.11 [38]. Analysis of molecular variance (AMOVA) was used to assess the variance between and within populations from different geographical origins with the GenAlEx software version 6.502 [31]. Phenotypic data were fitted to a linear mixed model considering the check cultivars, the row and column number and accessions as fixed in the model for each environment. Restricted maximum likelihood was used to estimate the variance components and to produce the best linear unbiased estimates (BLUEs) for the phenotypic data of each accession within each environment using Genstat software version 17 (VSN International).
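As an illustration of the per-locus statistics listed above, the following Python sketch computes Na, He, Ho and F from diploid-style genotype calls at a single locus. It assumes the usual definition of expected heterozygosity as one minus the sum of squared allele frequencies; it is not the GenAlEx implementation, and the function name and the example genotypes (allele sizes in base pairs) are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Na, He, Ho, F) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per accession.
    He is computed as 1 - sum(p_i^2) (assumed standard definition).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Count every allele copy to obtain allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_copies = 2 * n
    he = 1.0 - sum((c / total_copies) ** 2 for c in counts.values())
    # Observed heterozygosity: share of accessions carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Fixation index; undefined when the locus is monomorphic (He = 0).
    f = 1.0 - ho / he if he > 0 else float("nan")
    return len(counts), he, ho, f


if __name__ == "__main__":
    # Hypothetical calls: three homozygous accessions and one heterozygote.
    example = [(150, 150), (150, 150), (152, 152), (150, 152)]
    na, he, ho, f = locus_stats(example)
    print(na, round(he, 4), round(ho, 4), round(f, 4))
```

With mostly homozygous accessions, as expected for a self-pollinating crop, Ho stays close to zero and F approaches one, which is the pattern the fixation index is meant to capture.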
Correlations between traits were calculated from the mean values of the BLUEs using Genstat software version 17. Analyses of variance (ANOVA) were performed for each phenotypic trait, considering the genotype (G) and the environment (E) (combination of year and location) as the sources of variation using the SAS Enterprise Guide software version 4.2 (SAS Institute Inc, Cary, NC, USA).
Diversity analysis between durum wheat accessions was conducted using both molecular and phenotypic data. Genetic distances between durum wheat accessions were determined using the simple matching coefficient [39] and phenotypic relationships were determined from the Euclidean distances calculated with the standardized mean phenotypic data across environments implemented in the DARWin software version 6.0.11 [38]. Un-rooted trees were calculated using the neighbour-joining clustering method [40].
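The two dissimilarity measures named above can be sketched in a few lines of Python. This is a simplified illustration, not the DARWin implementation: it assumes one allele call per locus per accession (i.e. homozygous lines), standardizes each trait to zero mean and unit sample standard deviation, and uses hypothetical function names and data.

```python
import math
from statistics import mean, stdev


def simple_matching_dissimilarity(profile_a, profile_b):
    """1 minus the proportion of loci at which two accessions share the allele.

    Simplifying assumption: one allele call per locus (homozygous accessions).
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return 1.0 - matches / len(profile_a)


def standardize_columns(rows):
    """Scale each trait (column) to zero mean and unit sample standard deviation.

    A trait with no variation would give a zero standard deviation and must be
    removed beforehand.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def euclidean(u, v):
    """Euclidean distance between two standardized trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Hypothetical SSR profiles (allele sizes at three loci).
    print(simple_matching_dissimilarity([150, 200, 98], [150, 202, 98]))
    # Hypothetical trait means: yield (t ha-1) and plant height (cm).
    z = standardize_columns([[3.0, 100.0], [4.0, 80.0], [5.0, 90.0]])
    print(euclidean(z[0], z[1]))
```

Standardizing before computing Euclidean distances prevents traits measured on large scales, such as grains per square metre, from dominating those measured on small scales, such as harvest index.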

Molecular analyses
The analysis conducted using the STRUCTURE software [34] showed that 172 of the 192 accessions could be grouped into five subpopulations ranging from 20 to 73 members each when the estimate of lnPr(X/K) reached a minimum stable value [35] (Fig 1a and Table 1). The inferred population structure for K = 5 showed that 67% of the accessions have a membership coefficient (qi) to one of the subpopulations higher than 0.8, while the rest could be considered as admixed (qi < 0.8). Nineteen accessions (9.9%) were not included in any of the subpopulations. Within each subpopulation the percentage of accessions with qi>0.8 ranged from 57% for subpopulation 1 to 95% for subpopulation 2, the last including only modern accessions. According to the frequency in each subpopulation of accessions collected in a given country (Fig 1b), the subpopulations could be classified according to their geographic origin (Fig 1c). The 44 SSR markers identified 448 alleles, with 10 alleles per locus on average (Table 1). The allelic frequencies (p) ranged from 0.003 to 0.857, with a mean of 0.098. A total of 226 alleles might be considered rare as they have p<0.05. The mean genetic diversity values estimated were Ho = 0.14 and He = 0.71. Wright's fixation index (F), which compares He with Ho to estimate the degree of allelic fixation, ranged from -0.65 for wms601 to 0.99 for wmc486, with a mean value of 0.79 for the whole set of cultivars. Taking into account the mean values for the different subpopulations, modern cultivars showed lower heterozygosity and higher fixation index mean values, with 59% of the markers (26) having all the alleles fixed (Ho = 0 and F = 1) (Table 1).
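The 0.8 membership threshold used above to separate clearly assigned from admixed accessions can be expressed as a short Python sketch. The function names and the membership vectors are hypothetical; the rule used by the authors to leave some accessions outside every subpopulation is not described in this section and is therefore not modelled here.

```python
def classify_accession(q, threshold=0.8):
    """Return (subpopulation, status) for one vector of membership coefficients.

    q: membership coefficients for K subpopulations (expected to sum to about 1).
    The subpopulation is numbered from 1; status is "assigned" when the largest
    coefficient exceeds the threshold and "admixed" otherwise.
    """
    if not q:
        raise ValueError("membership vector is empty")
    best = max(range(len(q)), key=lambda k: q[k])
    status = "assigned" if q[best] > threshold else "admixed"
    return best + 1, status


def percent_assigned(q_matrix, threshold=0.8):
    """Percentage of accessions whose largest coefficient exceeds the threshold."""
    flags = [classify_accession(q, threshold)[1] == "assigned" for q in q_matrix]
    return 100.0 * sum(flags) / len(flags)


if __name__ == "__main__":
    # Hypothetical membership vectors for K = 5.
    q_matrix = [
        [0.05, 0.90, 0.02, 0.02, 0.01],
        [0.35, 0.05, 0.05, 0.05, 0.50],
    ]
    for q in q_matrix:
        print(classify_accession(q))
    print(percent_assigned(q_matrix))
```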
The five durum wheat subpopulations showed a total genetic diversity (HT) ranging from 0.4835 for the modern cultivars (subpopulation 2) to 0.6979 for subpopulation 5, including western Balkan and Egyptian accessions (Table 2). The high value for the genetic diversity among all the accessions (HT = 0.7080) and the lower value for the genetic diversity among subpopulations (DST = 0.1225) resulted in a genetic differentiation value (GST) of 0.1730, indicating that genetic variation was relatively low between subpopulations (only 17.3% of the variability), while most of the diversity lies within the subpopulations (82.7%).
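The relationship between the three diversity values reported above can be checked with a few lines of Python. The sketch assumes Nei's standard partition, in which the between-subpopulation diversity equals total diversity minus mean within-subpopulation diversity and GST is the between-subpopulation diversity divided by the total; the function name is hypothetical, and the within-subpopulation value it prints is derived here rather than reported in the text.

```python
def genetic_differentiation(h_t, d_st):
    """Return (H_S, G_ST) from total diversity H_T and between-group diversity D_ST.

    Assumes Nei's partition: D_ST = H_T - H_S and G_ST = D_ST / H_T.
    """
    if h_t <= 0:
        raise ValueError("total diversity must be positive")
    h_s = h_t - d_st
    return h_s, d_st / h_t


if __name__ == "__main__":
    # Values reported for the whole collection.
    h_s, g_st = genetic_differentiation(0.7080, 0.1225)
    print(round(h_s, 4), round(g_st, 4))
    # Share of diversity between and within subpopulations, in percent.
    print(round(100 * g_st, 1), round(100 * (1 - g_st), 1))
```

Under this assumption the reported values are mutually consistent: 0.1225 divided by 0.7080 gives approximately 0.173, matching the stated GST and the 17.3% versus 82.7% split.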
A UPGMA cluster was obtained from the genetic distance matrix [37] (S3 Table) of the five subpopulations (Fig 2). The genetic distance matrix revealed that the smallest genetic distance was between subpopulation 1 and subpopulation 5 (0.2489) and the greatest (0.4886) between subpopulation 3 and subpopulation 4. The dendrogram distributed the subpopulations into three groups, with subpopulation 4 showing the maximum genetic distance and separated in a distinct group. Analysis of molecular variance (AMOVA) was used as a further measure of the genetic diversity within the durum wheat accessions. The results of AMOVA indicated that most of the genetic variation (69%) between the 172 accessions structured in 5 subpopulations could be explained by variation within subpopulations, while the variation between them was 13%. Finally, variation within accessions represented 18% of the total variance (Table 3).

Phenotypic data
The ANOVA of phenotypic data revealed that except for yield and TKW, the site effect was more important in the phenotypic expression of traits than the year effect (Table 4). The environment (combination of year and site) accounted for between 8.7% (for PH) and 81.1% (for days at booting) of total variation. The genotype effect explained the largest variation for PH, NGm2 and yield, while it accounted for less than 10% of total variation for CDW in all growth stages. The partitioning of the sum of squares of the genotype effect into differences between and within subpopulations indicated that differences between subpopulations were significant for all traits except CDW, and that statistically significant variation existed within subpopulations for NGm2 and PH. In general, interaction effects accounted for a low percentage of total variance.
Mean values of phenotypic traits across environments for the five subpopulations are shown in Table 5. Mean yield of subpopulations ranged from 2.6 t ha-1 in subpopulation 5 to 4.0 t ha-1 in subpopulation 2, with the remaining three subpopulations showing similar values of 3.1 t ha-1. Modern cultivars had the highest NGm2 and HI and the shortest cycle length. For landraces, the highest NSm2 were recorded in subpopulations 3 and 4 and the lowest in subpopulation 1. The NGm2 was highest in subpopulation 4. Landraces from the eastern Balkan Peninsula and Turkey (subpopulation 3) and the western Mediterranean (subpopulation 1) had the heaviest grains, whereas those from the western Balkans and Egypt (subpopulation 5) and the eastern Mediterranean (subpopulation 4) had the lightest (Table 5). The latter subpopulations also showed the lowest HI and the shortest plants. Landraces from the eastern Mediterranean Basin were the earliest and those including Balkan accessions had the longest cycle length.

Relationship between genetic and phenotypic structures
Cluster analyses were performed based on SSR markers and phenotypic data (Fig 3). The dendrogram generated with SSR data has five major clusters that are mainly in agreement with the five subpopulations given by STRUCTURE (Fig 3A). Cluster A1 included most of the western Mediterranean cultivars (subpopulation 1) together with the modern cultivars (subpopulation 2). The cluster includes one Portuguese cultivar from subpopulation 5 ('Alentejo') that can be considered admixed from subpopulation 1 (q1 = 0.34) and subpopulation 5 (q5 = 0.50), and four cultivars that were not assigned to any subpopulation, from Portugal, Cyprus, Morocco and Tunisia. Cluster A2 was divided into two branches, the first one including nine cultivars not assigned to any subpopulation, all of them from the eastern Mediterranean Basin, and the second one grouping the cultivars from the eastern Balkans and Turkey (subpopulation 3). This branch included the Greek cultivar 'Rapsani' (subpopulation 1, q1 = 0.56; q3 = 0.32) and the Egyptian cultivar '31' (subpopulation 5, q5 = 0.64; q3 = 0.27). Cluster A3 corresponds with subpopulation 5 (western Balkans and Egypt). The main group of the cluster included all but two of the accessions belonging to this subpopulation and two cultivars from subpopulation 1, the Italian cultivar 'Balilla Falso' (q1 = 0.72) and the French cultivar 'Rubio enlargado d'Atlemteje' (q1 = 0.71). A second group within the cluster included five cultivars, one from subpopulation 1, another from subpopulation 4 and three unstructured. Cluster A4 is divided into two branches: the first grouped cultivars from the western Mediterranean Basin (subpopulation 1) and the second grouped all but one accession from subpopulation 4 (eastern Mediterranean). Finally, cluster A5 included the rest of subpopulation 1 together with three unstructured cultivars.
Cluster analysis using the mean values of the 14 phenotypic traits resulted in an un-rooted tree with 6 main clusters (Fig 3B). As reported previously for SSR data, most of the cultivars were grouped in clusters corresponding to the classification defined by STRUCTURE. Cluster B1 grouped most of the western Mediterranean cultivars, including three Portuguese accessions not included in subpopulation 1: 'Caxudo de sete espigas' (subpopulation 5, q1 = 0.36; q5 = 0.63) and the unstructured accessions 'Marques' and 'Lobeiro de grao escuro'. Cluster B2 was divided into three branches, each representing a different subpopulation. The first branch clustered accessions from subpopulation 1 (western Mediterranean); the second branch the modern cultivars (subpopulation 2), including the Italian accession 'Capeiti'; and the third branch subpopulation 3 (eastern Balkans and Turkey). Cluster B3 basically corresponds to the fourth cluster obtained using SSR data with the inclusion of the Portuguese cultivar 'Dezassete' (subpopulation 5, q1 = 0.41; q5 = 0.50) within the subpopulation 1 accessions. Cluster B4 was divided into two groups: the first one a mix of accessions from subpopulation 1 (3) and subpopulation 5 (2) from the western Mediterranean Basin and one unstructured accession; and the second one eight unstructured accessions corresponding to one of the branches of cluster A2 in the SSR dendrogram. Cluster B5 included most of the subpopulation 5 cultivars, including two subpopulation 1 accessions, as reported above for the main group of cluster A3 for molecular data. Finally, cluster B6 grouped the rest of subpopulation 1, 3, 5 and unstructured accessions.

Genetic structure and diversity
The analysis of the population structure showed a noticeable division into landraces and modern cultivars and a clear classification of landraces according to their geographical origin. Excluding the modern cultivars, the Bayesian-based analysis without a priori assignment of accessions to populations classified 152 landraces into four subpopulations categorized according to their geographical origin. Accessions showed a strong structure with an eastern-western geographical pattern formed by four clearly defined groups: eastern Mediterranean (subpopulation 4), eastern Balkans and Turkey (subpopulation 3), western Balkans and Egypt (subpopulation 5) and western Mediterranean (subpopulation 1). This structure agrees with the pattern of dispersal of wheat from east to west in the Mediterranean Basin [1]. Therefore, the low genetic distance estimated between subpopulations 1 and 5 may be explained by geographic proximity. However, this explanation is not valid for elucidating the distance between subpopulations 3 and 4, which were close geographically but showed the largest genetic distance. This finding, jointly with the splitting of landraces from the Balkan Peninsula into two different genetic subpopulations, supports the hypothesis of two different origins of the Balkan durum wheats, as suggested by previous results [4,41]. Moreover, Dedkova et al. [42] demonstrated that T. dicoccum accessions from the former Yugoslavia, Bulgaria and Russia do not carry the 7A:6B translocation that is common in the dicoccum accessions from western Mediterranean countries; they proposed a division of European T. dicoccum into two groups: western European and Volga-Balkan. In agreement with this theory, the results of the population structure obtained in the current study show that landraces of subpopulation 3 may have a different origin than those of subpopulation 5.
The larger genetic distance between subpopulations 3 and 4 than between subpopulations 4 and 5 allows us to hypothesize that subpopulation 3 may be the one including landraces from the Volga region.
The results of the present study showed the suitability of the groups resulting from the SSR marker analysis for depicting the genetic structure and diversity of durum wheat landraces across the Mediterranean Basin. Moreover, the population structure ascertained in this study may be very useful for improving the reliability of future association-mapping studies. It is well known that population structure influences linkage disequilibrium due to the presence of population stratification and an unequal distribution of alleles within groups, which can result in spurious associations.
Between 57% and 76% of the accessions in the subpopulations had a membership coefficient higher than 0.8, suggesting the presence of admixture. The admixture in tetraploid Mediterranean wheat accessions could result from the incorporation in landraces of alleles from more than a single gene pool due to the spread of wheat from more than a single ancestral population, as has been suggested [18]. An alternative reason could be that gene flow between different cultivars may have occurred in the past through the introduction of new genotypes into fields. The exchange of germplasm between different Mediterranean regions due to the expansion of the Arabian Empire during the Middle Ages has been suggested as a possible cause of admixture [13].
Genetic diversity in wheat was increasingly narrowed down during the second part of the 20th century due to the wide adoption of improved semi-dwarf wheat cultivars. A number of studies consider collections of wheat landraces as sources of putatively lost variability that are able to provide new favourable genes/alleles to be introgressed into modern cultivars [3 and references therein]. Although Mediterranean landraces have been shown to be particularly valuable due to their huge genetic diversity and the presence of accessions with high resilience to abiotic stresses, resistance to pests and diseases and high grain quality [3,4], the large genetic distance estimated between modern cultivars and all landrace subpopulations shows the low use of durum landraces by durum wheat breeding programmes.
Exploiting the variability of wheat landraces requires previous knowledge of their genetic diversity. In this study, we used 44 SSR markers to quantify the genetic diversity existing in a set of 172 durum wheat landraces from the Mediterranean Basin and 20 modern cultivars. The number of alleles identified in this study, and the value estimated for the genetic diversity of the collection (0.71), were higher than the values reported by previous studies involving durum wheat collections (He values between 0.55 and 0.68) [43][44][45][46][47], and also than those found in bread wheat collections (He values between 0.54 and 0.63) [15-17, 20, 48]. The high level of genetic diversity found in this study may be due to the presence of many unique alleles in landraces from different areas of the region, so it is essential to assess the genetic structure of the population. The coefficient of gene differentiation (GST) is directly proportional to the amount of variation among populations [49]. As a consequence of the low value of the genetic diversity between subpopulations (DST) obtained in the current study, GST was also low, showing that only 17% of the variability was due to differences between subpopulations, while the remainder was a consequence of the genotypic variability within each subpopulation. The results of the analysis of molecular variance were in agreement with the low value of GST, as only 13% of the variation in the durum wheat accessions was due to variation between subpopulations. These results indicate that, though landraces of different geographic origin were polymorphic enough to trace a consistent geographical pattern, the genetic variability within the set of genotypes of a common origin was much wider. The presence of a large number of unique putative alleles in this collection, which have already been identified for glutenin subunits [9], may significantly contribute to this large variability.
As expected, gene diversity was the lowest for the modern cultivars due to the selection pressure applied by breeders in the last few decades. Several authors [1,50] have postulated that durum wheat spread across the Mediterranean Basin from the Fertile Crescent (10,000 BP) via Turkey (8500 BP), the Balkan Peninsula, Greece and Italy (8000 BP), and from there to North Africa and the Iberian Peninsula (7000 BP). Within the landrace subpopulations, gene diversity was lower in the eastern Mediterranean group, indicating that the diversity of wheat increased during its dispersal from its area of domestication to the western Mediterranean Basin. In agreement with these results, Ren et al. [51], using a worldwide collection of durum wheat, found that the Middle East region showed moderate levels of genetic diversity, lower than those from South America, North America, and Western Europe. The authors concluded that the centres of diversity were not confined exclusively to their centres of origin. More recently, a study of Ethiopian landraces [52] found a higher level of diversity than in a set of Mediterranean landraces, suggesting that the evolutionary history of wheat in East Africa is different.

Agronomic performance
The ANOVA showed the large effect of environmental conditions on the phenotypic expression of agronomic traits. The environmental conditions during the three years of field experiments were typical of a Mediterranean climate with a pattern of increasing temperatures during the spring and uneven distribution of rainfall [25]. The sum of the environmental effects (year and site) accounted for a larger variation than the genotype for most traits, particularly the number of days to different phenological stages (from 69% to 81% depending on the growth stage) and CDW (from 50% to 75%). Modern cultivars had a shorter cycle length than the landraces, a finding that has been explained in previous studies by the introduction of dwarfing genes [53]. Landraces from the Balkan Peninsula (subpopulations 3 and 5) took 4 to 7 days more from sowing to reach the different growth stages than those from the eastern Mediterranean Basin, which showed the shortest periods to any phenological stage among the landraces. The cooler climate of the Balkan Peninsula may have resulted in a lengthening of the growth cycle of the landraces that originated in this area [25]. On the other hand, the high temperatures and low rainfall of the southeastern Mediterranean Basin may have reduced time to heading [25,54] as an adaptive physiological mechanism for terminal drought escape.
Genotypic differences in CDW were only significant at anthesis, but variability was not found between or within subpopulations. The low variability for biomass in durum wheat has been reported in previous studies involving semi-dwarf durums [55]. In addition, the lack of statistically significant differences for CDW between subpopulations at physiological maturity and the superior HI of modern cultivars suggest that the plant weight of the landraces compensated for the higher weight allocated in the grains of modern cultivars, leading to similar CDW at maturity. Most of the phenotypic variability in PH was explained by the genotype effect, in agreement with the high heritability of this trait [56], previously associated with the presence of the dwarfing gene Rht-B1b [57]. A similar result was reported using a collection of 191 durum wheat elite accessions [14].
The genotype effect explained 16% to 37% of total variation for yield, yield components and HI, while differences between subpopulations accounted for 15% to 56% of variation. Variability within subpopulations was only significant for the number of grains. The high number of spikes recorded in landraces from the eastern Mediterranean Basin was in agreement with the findings of previous studies, which demonstrated that durum wheat yield under warm and dry environments is determined mostly by the number of spikes per unit area, whereas kernel weight predominantly influences grain production in colder and wetter environments [2,58,59].
As expected, the HI of Mediterranean landraces was lower than that reported for modern semi-dwarf cultivars. Among the landrace subpopulations, the highest HI was found within the eastern Mediterranean landraces. Moragues et al. [2], using a collection of 52 durum wheat landraces classified according to the dispersal of durum wheat across the Mediterranean Basin (northern and southern dispersal), showed that HI was higher within the southern landraces coming from drier and warmer areas. These authors suggested that the southern landraces probably had a higher capacity to allocate biomass into grains and a better ability to set grains under stress, which is in agreement with the large NGm-2 recorded in subpopulation 4. The greater HI of eastern Mediterranean accessions found in this study may also indicate that they were more efficient in using water during the later stages of development than landraces from cooler areas, which is a sign of adaptation to drought environments.

Relationship between genetic structure and phenotypic performance
Classification of the accessions using the neighbour-joining clustering method based on SSR and phenotypic dissimilarities showed an evident correspondence with results obtained in the analysis carried out using the STRUCTURE software. The clearest case was that of the 20 modern cultivars that were always grouped together. The clustering of the Italian cultivar 'Capeiti' (subpopulation 4) jointly with modern cultivars (subpopulation 2) in the phenotypic tree may be due to its extensive use by Italian breeders as a hallmark founder for the development of drought-tolerant cultivars [60].
The clustering of accessions in dendrograms based on SSR and phenotypic data was essentially coincident. However, in cases of accessions with large admixture, the dendrograms concurred in locating them in branches corresponding to subpopulations different from those assigned by STRUCTURE. This was the case of the Greek LR 'Rapsani' and the Egyptian LR '31'. Interestingly, 5 of the 19 accessions not assigned to any subpopulation by STRUCTURE were located in dendrograms in branches close to accessions of similar geographic origin. This was the case of the landraces 'Marques' and 'Lobeiro de grao escuro' from Portugal, 'Hamira' from Tunisia, and 'Haj Mouline' from Morocco, which were all located jointly with western Mediterranean accessions in both dendrograms.
In addition, some of the admixed cultivars from STRUCTURE that originated in Israel, Jordan, Lebanon and Syria were placed close to the cluster defined by DARWin with molecular data containing accessions from the eastern Balkans and Turkey (subpopulation 3).
Accessions included in subpopulation 3 (eastern Balkans and Turkey) by STRUCTURE were separated into two different branches in the dendrogram drawn with phenotypic data. One of them contained the accessions from Cyprus and Turkey 'Vroulos', 'IG-82549', 'BGE-018192', 'BGE-018353', 'BGE-019262' and 'BGE-019264', and the other clustered all Serbian and Macedonian landraces included in subpopulation 3. This result not only supports the hypothesis that subpopulation 3 contains accessions from the Volga region, but also identifies them on the basis of their agronomic performance.
A previous study [14], which used a collection of 191 durum accessions consisting mainly of semi-dwarf elite materials and a limited number of founders, found a weak relationship between molecular clustering and phenotypic structure. Only accessions closely related to the CIMMYT hallmark founder 'Altar 84', the ICARDA accessions adapted to continental-dryland areas and the landraces were clustered in both the genetic dissimilarities tree and the tree obtained using Euclidean distances based on standardized phenotypic data across environments. Comparing that study with the current one, it seems that the similarities between genetic distances and adaptive responses are greater for landraces than for modern cultivars, likely due to the lack of genetic improvement of the landraces and the incorporation in modern cultivars of genes associated with traits different from those related to the adaptive response of genotypes.
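The phenotypic distances mentioned above, Euclidean distances on standardized trait data, can be sketched as follows. This is an illustrative example only: the accession labels and trait values are hypothetical, the helper names are assumptions, and tree building (neighbour-joining in the analyses discussed here) is not reproduced; only the distance matrix that such a tree would be built from is computed.

```python
import math
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each trait so that traits in large units do not dominate."""
    scaled = []
    for col in columns:
        mu, sd = mean(col), pstdev(col)
        if sd == 0:
            raise ValueError("trait has no variation and cannot be standardized")
        scaled.append([(x - mu) / sd for x in col])
    return scaled


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between accessions (one row each)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Hypothetical accessions: days to heading, plant height (cm), yield (t/ha).
accessions = ["A", "B", "C", "D"]
traits = [
    [120.0, 85.0, 4.0],   # A: short, early, high-yielding
    [122.0, 88.0, 3.9],   # B: similar to A
    [135.0, 130.0, 3.0],  # C: tall, late, lower-yielding
    [137.0, 128.0, 3.1],  # D: similar to C
]

columns = [list(c) for c in zip(*traits)]               # one list per trait
z_rows = [list(r) for r in zip(*standardize(columns))]  # back to one row per accession
dist = euclidean_matrix(z_rows)

for name, row in zip(accessions, dist):
    print(name, [round(d, 2) for d in row])
```

Standardizing before computing distances matters here because days, centimetres and tonnes per hectare are on very different scales; without it, plant height would dominate the distance in this toy data set.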

Conclusions
The current study aimed to understand the genetic, phenotypic and geographic structures of a collection of Mediterranean durum wheat cultivars and the relationships between them. The results demonstrated the usefulness of the methodologies used to determine the structure of Mediterranean durums. Landraces and modern cultivars were split into two different groups using either molecular or phenotypic data. Based on data from 44 SSR markers, STRUCTURE software assigned 90% of the accessions to five subpopulations, with the four that contained landraces showing a geographic structure. Unexpectedly, the genetic diversity was greater within subpopulations than between them, which denotes the large variability existing in landraces of a common geographic origin. The study identified a large number of alleles (448), about half of which had a very low frequency and were therefore rare alleles, and an average number of 10 alleles per locus. Gene diversity increased from the eastern to the western Mediterranean Basin, in agreement with the dispersal pattern of wheat from its domestication area.
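The diversity statistics summarized above can be illustrated with a small calculation. This is a minimal sketch, assuming the standard definitions of Nei's gene diversity (H = 1 - sum of squared allele frequencies) and GST = (HT - HS) / HT with equally weighted subpopulations, and the 5% frequency threshold for rare alleles. The allele labels and frequencies are hypothetical, and the software and weighting actually used in the study may differ.

```python
def gene_diversity(freqs):
    """Nei's gene diversity at one locus: H = 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(subpop_freqs):
    """GST = (HT - HS) / HT at one locus, weighting subpopulations equally."""
    k = len(subpop_freqs)
    alleles = set().union(*subpop_freqs)
    pooled = {a: sum(f.get(a, 0.0) for f in subpop_freqs) / k for a in alleles}
    ht = gene_diversity(pooled)                           # total diversity
    hs = sum(gene_diversity(f) for f in subpop_freqs) / k  # mean within-group diversity
    return (ht - hs) / ht


def rare_alleles(freqs, threshold=0.05):
    """Alleles present at a frequency below the threshold."""
    return sorted(a for a, p in freqs.items() if 0.0 < p < threshold)


# Hypothetical allele frequencies at one SSR locus in two subpopulations.
sub1 = {"150": 0.9, "152": 0.1}
sub2 = {"150": 0.1, "152": 0.9}

print(round(gene_diversity(sub1), 2))  # within-subpopulation diversity
print(round(gst([sub1, sub2]), 2))     # differentiation between the two
print(rare_alleles({"150": 0.97, "152": 0.03}))
```

A GST close to zero means that most of the diversity is found within subpopulations rather than between them, which is the situation described in the paragraph above.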
The un-rooted neighbour-joining dendrogram based on SSR data coincided in essence with the clusters obtained by STRUCTURE, but complemented the information provided by it when accessions showed large admixture. The strong coincidence between dendrograms based on molecular and phenotypic data indicates i) the suitability of the phenotypic traits used in the current study for differentiating groups of accessions with similar field performance, and ii) the robust relationship between the phenotypic expression of traits and the genetic background underlying them. Within the subpopulation that included durums from the eastern Balkans and Turkey, the assessment of phenotypic traits based on yield, yield components and crop phenology was also useful for separating those from Macedonia and Serbia from those from Turkey, Cyprus and Greece, which very likely had a different origin. This is the first study using durum wheat Mediterranean landraces and modern cultivars that shows a reliable relationship between genetic and phenotypic population structures, and the connection of both with the geographic origin of landraces. The results of the current study demonstrate that, when markers appropriate in number and distribution are used and phenotyping is adequately conducted, high similarities may be found between genetic distances and the adaptive response of durum wheat.
Supporting Information S1