Characterization of genetic diversity in Turkish common bean gene pool using phenotypic and whole-genome DArTseq-generated silicoDArT marker information

Turkey presents a great diversity of common bean landraces in farmers’ fields. We collected 183 common bean accessions from 19 different Turkish geographic regions and 5 scarlet runner bean accessions to investigate their genetic diversity and population structure using phenotypic information (growth habit, seed weight, flower color, bracteole shape and size, pod shape, and leaf shape and color), geographic provenance and 12,557 silicoDArT markers. A total of 24.14% of the markers were novel. For the entire population (188 accessions), the expected heterozygosity was 0.078 and overall gene diversity, Fst and Fis were 0.14, 0.55 and 1, respectively. Using marker information, model-based structure, principal coordinate analysis (PCoA) and unweighted pair-group method with arithmetic means (UPGMA) algorithms clustered the 188 accessions into two main populations, A (predominant) and B, and 5 unclassified genotypes, representing 3 meaningful heterotic groups for breeding purposes. Phenotypic information clearly distinguished these populations: population A had larger seeds (>40 g/100 seeds) and population B smaller seeds (<40 g/100 seeds). The unclassified population was pure and contained only climbing genotypes with 100-seed weight 2–3 times greater than populations A and B. Clustering was mainly based on (i) seed weight, (ii) growth habit, (iii) geographical province and (iv) flower color. Mean kinship was generally low, but population B was more diverse than population A. Overall, a useful level of gene and genotypic diversity was observed in this work and can be used by the scientific community in breeding efforts to develop superior common bean strains.


Introduction
Common bean (Phaseolus vulgaris L.) is one of the most ancient grain legumes of America [1-2-3], and is an important proteinaceous staple food for more than 300 million people [4]. It is a self-pollinated crop with a small genome of 587 Mb [5]. The genus Phaseolus contains more than 70 species, five of which (P. vulgaris L., P. lunatus L., P. acutifolius A. Gray, P. coccineus L. and P. dumosus) are the most cultivated [6]. P. coccineus, commonly known as scarlet runner bean, is a close relative of common bean [7] and the third most economically important bean species after common bean and P. lunatus [8]. Runner bean is a climbing perennial crop but is often grown as an annual crop for green pod production [6].
The geographical origin of common bean remained unresolved until Bitocchi et al. [9] proposed Mesoamerica as the center of origin on the basis of an analysis of five loci. Domestication is a complex and important process that involves the modification of wild plants into a crop [10]. Some scientists found multiple domestication events in common bean [11-12-13-14], while others, such as Kwak and Gepts [15] and Rossi et al. [16], favored a single domestication event. Domestication of common bean resulted in the formation of two diverse gene pools, the Mesoamerican and the Andean [17]. The Andean gene pool extends from Southern Peru to Northwestern Argentina, while the Mesoamerican gene pool extends between Colombia and Northern Mexico [15]. The Andean gene pool contains three races (Peru, Nueva Granada, and Chile), while Mesoamerica, Durango, and Jalisco are the three races of the Mesoamerican gene pool [18].
Europe is considered a secondary diversification center of common bean. The species was introduced into Europe over several years, in connection with Columbus's first voyage and Pizarro's voyage during the 15th and 16th centuries. The Mesoamerican gene pool arrived first in Europe in 1506, followed by the Andean gene pool in 1528 [19][20]. The further spread of this crop to other European countries was very complex, with various introductions from different American regions and combined exchange involving Mediterranean and European countries [19]. Currently, common bean is grown worldwide for its edible dry seeds or unripe fruit, either as individual gene pools or as hybrid forms between the two gene pools [21].
Turkey is considered as one of the world's biodiversity hotspots and the center of origin for many crops [22][23]. Asian traders were responsible for the introduction of common bean into Turkey from Europe. Since then Turkey hosts hundreds of local landraces of common bean in different geographical provinces [24]. Common bean has now secured a unique place in Turkish agricultural and culinary systems. The annual common bean production was 215,000 tons in 2014 [25], which highlights its importance in Turkish economy and diet, with the Turkish northeast Anatolian region contributing the major part of the production.
One of the world's major concerns in the twenty-first century is the ability to produce enough food for current and future generations while confronting climate change and adverse environmental factors [26] associated with biotic and abiotic stresses. To mitigate these problems, there is a need to identify novel sources of useful genetic variability. The investigation of genetic diversity is one means to that end, as it provides an important tool for assessing populations [22][23][24][25][26][27] that can be harnessed through breeding and in the process of cultivar development. Common bean landraces are naturally adapted to local environments, are inherently heterogeneous, and can provide sufficient genetic diversity to sustain crop improvement endeavors [28].
Various genetic diversity studies have been conducted on Turkish common bean landrace germplasm, but they suffered from poor sampling of this species' genome. These studies provided fragmented information and showed important weaknesses, including a small number of accessions, a low number of sampled geographical locations, or a small number of markers that did not cover the whole genome. For instance, Khaidizar et al. [29] studied 38 Turkish common bean landraces and reported a mean genetic similarity of 0.585, but this study focused only on a specific part of Turkey, the Northeast Anatolian region. Another study reported the genetic diversity of 30 genotypes from two districts of Van province, Ercis and Gevas [30], while Nemli et al. [31] used iPBS markers for the characterization of 67 Turkish common bean landraces and, in their recent work, Nemli et al. [32] used SNP markers to determine diversity in Turkish common bean.
The objective of this study was therefore to comprehensively investigate the level of genetic diversity and the population structure of Turkish common bean germplasm using a larger germplasm collection and a greater number of high-throughput whole-genome markers than previous works. To achieve this goal, we used a large number of silicoDArT markers detected by the DArTseq approach, together with phenotypic data, in a mini-core collection of Turkish common bean landrace accessions collected from 19 different geographical regions throughout the Turkish territory.

Plant material
A diversity panel was assembled consisting of natural populations of 177 common bean landraces and 6 commercial cultivars, together with 5 scarlet runner bean landraces, collected by a group of researchers (Baloch FS and Çiftçi V) from various farmers' fields in different geographical provinces across Turkey. The sampling sites covered a wide range of natural eco-geographical locations (Table 1) at different latitudes and under variable ecological conditions, i.e. soil type, rainfall, temperature, and water availability. A mini core collection of the common bean population was established and grown at the Abant İzzet Baysal University, Turkey. A single plant was selected from each accession, and the selections were grown under field conditions in an augmented design for two consecutive years, 2014 and 2015, applying single plant selection and selfing. To increase seed for further trials and genotyping purposes, all selected single plants were grown in 2016 in 2 m long single rows 50 cm apart, with 10 cm between plants within a row. In all trials, local standard agronomic practices were applied. The commercial cultivars have been used as a control group in earlier studies [29]; they were developed from landraces through single plant selection and represent different regions of Turkey. Growth habit and seed weight of each accession were determined as suggested by Singh et al. [12], and used for phenotypic characterization. Similarly, various morphological characters such as flower color, bracteole size, bracteole shape, pod shape and degree of curvature, leaf shape and leaf color were recorded according to the IBPGR descriptors for Phaseolus [33].

DNA extraction
Genomic DNA was extracted from 2-week-old seedlings derived from the 188 selfed mini core selections, according to the CTAB protocol of Doyle and Doyle [34] with some modifications by Baloch et al. [35]. The quality and quantity of the extracted genomic DNA were checked using a DS-11 FX series spectrophotometer/fluorometer (Denovix, Wilmington, DE, USA) and further confirmed by agarose gel electrophoresis (0.8% agarose gel). DNA extraction was repeated for some samples until high-quality DNA was obtained. High-quality DNA was then diluted to a final concentration of 50 ng/μl. The DNA samples were processed at Diversity Array Technology Pty, Ltd, Australia (http://www.diversityarrays.com/) for DArTseq analyses using a genotyping-by-sequencing platform.

DArTseq analysis
DArTseq represents a combination of a complexity reduction method and sequencing of the resulting representations on next generation sequencing platforms [36-37-38]; it facilitates the selection of genome fractions corresponding to various active genes [39] that are associated with traits of interest in plants. Optimization of this technology for common bean was achieved by considering both the fraction of the genome selected and the size of the representation. The PstI-MseI enzyme combination was used in this complexity reduction method. Processing of DNA samples was performed in digestion/ligation reactions principally following Kilian et al. [37]. Amplification of mixed fragments (PstI-MseI) was performed in 30 rounds of PCR using the following reaction conditions: (I) 94°C for 1 min, (II) 94°C for 20 s, (III) ramp 2.4°C/s to 58°C, (IV) 58°C for 30 s, (V) ramp 2.4°C/s to 72°C, (VI) 72°C for 45 s, (VII) repeat steps II to VI 29 times, (VIII) 72°C for 7 min, (IX) hold at 10°C [37]. After PCR, equimolar amounts of the amplified product from each sample of the 96-well microtiter plate were taken and bulked. The amplified and bulked product was then applied to c-Bot (Illumina) bridge PCR followed by sequencing on an Illumina HiSeq 2000. A total of 77 cycles were run for the sequencing (single read). The resulting sequences from each lane were processed using proprietary DArT analytical pipelines [39]. In the primary pipeline, the fastq files are first processed to filter out poor-quality sequences, applying more stringent selection criteria to the barcode region than to the rest of the sequence. In that way, the assignments of the sequences to specific samples carried out in the "barcode split" step are very reliable. Around 4,000,000 sequences per barcode/sample were investigated and used in marker calling. Finally, identical sequences were collapsed into "fastqcall files". These files are used in the secondary pipeline for DArT PL's proprietary SNP and SilicoDArT (presence/absence of restriction fragments in representation) marker calling algorithms (DArTsoft14).

Genetic diversity analyses
Genetic distances among the evaluated common bean materials were calculated from the proportion of shared alleles obtained from silicoDArT markers by using Euclidean genetic distance coefficients. In addition, a number of other diversity-relevant metrics were computed. Expected heterozygosity (Hs), overall gene diversity (Ht), and inbreeding coefficient (Fis) were computed using the hierfstat R package [40] following the algorithms of Goudet et al. [41] and Yang [42]. Pairwise kinship coefficients were derived from the genomic relationship matrix following the first method described in VanRaden [43]. Genetic structure was assessed using principal coordinate analysis (PCoA), UPGMA, and model-based Bayesian clustering algorithms. The PCoA is an eigen-analysis of a distance or dissimilarity matrix, and was performed in the R software environment [40] by running a multidimensional scaling algorithm on the silicoDArT-based Euclidean distance matrix. The UPGMA trees were constructed in R [40] using the hclust algorithm, with the UPGMA-relevant agglomeration method, on the pairwise silicoDArT-based Euclidean distance matrix among common bean landraces and modern commercial cultivars; the resulting tree was visualized and edited in iTOL (http://itol.embl.de/; [44]). The Bayesian model-based clustering was implemented in the STRUCTURE software (version 2.3.4; Pritchard et al. [45]) following the methodology developed by Evanno et al. [46]. Numbers of clusters (K) ranging from 1 to 8 were evaluated using the admixture model and shared allele frequencies. Ten independent runs were set for each K value, and for each run, the initial burn-in period was set to 500 with 500,000 MCMC (Markov chain Monte Carlo) iterations with no prior information on the origin of individuals. The true value of K was estimated using both the posterior probability of the data for a given K (Pritchard et al. [45]) and the Evanno et al. [46] method.
To determine a suitable number of clusters (K; number of subpopulations), we plotted ΔK, the change in log probability between successive K values relative to its standard deviation, against the number of clusters, as explained by Evanno et al. [46]. For coherence, the populations resulting from UPGMA and PCoA were named and assigned colors based on the clusters identified with the model-based Structure algorithm. Such importance was given to Structure because this algorithm showed more robustness in previous works [47][48]. Genetic differentiation and significance levels were assessed by calculating pairwise FST (a measure of genetic structure) values using the hierfstat R package [40] following the algorithms of Goudet et al. [41] and Yang [42]. Analysis of variance for phenotypic data was performed by fitting an appropriate linear model with fixed effects, where i = 1, ..., s indexes clusters, j = 1, ..., n indexes genotypes within cluster i, y_ij is the response variable of genotype j in cluster i, expressed as the best linear unbiased estimate from the 2-year augmented design described above, and e is the residual. Mean comparisons were done by performing Tukey's test, or bootstrapping with 1000 resamples and Student's t-test, as appropriate. Computations were executed using R software.
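The ΔK criterion mentioned above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses in this work were run in STRUCTURE and R). It assumes the commonly used form ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean ln P(D) over replicate runs; the function name and the log-probability values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} from {K: [ln P(D) of each replicate run]}.

    deltaK is only defined for K values that have both neighbours
    (K-1 and K+1) and a non-zero standard deviation across runs.
    """
    avg = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in avg and k + 1 in avg:
            sd = stdev(lnp_by_k[k])
            if sd > 0:
                second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
                delta[k] = second_diff / sd
    return delta

# Hypothetical ln P(D) values for three replicate runs per K (toy data).
toy_runs = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -795, -785],
    4: [-788, -780, -796],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

In this toy input the log probability rises sharply from K = 1 to K = 2 and then plateaus, so ΔK peaks at K = 2, which is the pattern one would look for when two main populations are present.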

silicoDArT markers discovery by GBS
Whole genome DArTseq profiling of Turkish common bean germplasm was performed, and yielded 15,608 silicoDArT markers (Fig 1) from 3,694 unique sequences. The sequencing company provided the positions of 11,839 markers on the 11 common bean chromosomes according to the reference genome. These 11,839 silicoDArT markers were distributed on all chromosomes of common bean (Table 2). Marker reproducibility was 1 (100% reproducibility). The original silicoDArT marker dataset was filtered to retain 12,557 high-quality markers with less than 5% missing data, PIC value greater than 0.10, call rate greater than 0.90, and 100% reproducibility, for use in further analyses in this work.
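The filtering step can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in this work: markers are scored 1 (present), 0 (absent) or None (missing); PIC is computed with a common two-state form, 1 − (f² + (1 − f)²), where f is the frequency of the present state (the exact PIC computation of the DArT pipeline is not stated in the text); and reproducibility, which requires technical replicates, is not modelled. All names and scores are hypothetical.

```python
def call_rate(scores):
    """Fraction of accessions with a non-missing score."""
    return sum(s is not None for s in scores) / len(scores)

def pic_two_state(scores):
    """Two-state PIC, 1 - (f^2 + (1 - f)^2); ranges from 0 to 0.5."""
    called = [s for s in scores if s is not None]
    if not called:
        return 0.0
    f = sum(called) / len(called)  # frequency of the 'present' state
    return 1.0 - (f * f + (1.0 - f) * (1.0 - f))

def filter_markers(markers, min_call_rate=0.90, min_pic=0.10):
    """Keep markers exceeding both thresholds (as stated in the text)."""
    return {
        name: scores
        for name, scores in markers.items()
        if call_rate(scores) > min_call_rate and pic_two_state(scores) > min_pic
    }

# Toy presence/absence scores for 10 accessions (hypothetical).
toy_markers = {
    "m1": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],        # informative, complete
    "m2": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],        # monomorphic, PIC = 0
    "m3": [1, 0, None, None, 1, 0, 1, 0, 1, 0],  # call rate 0.8
    "m4": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],        # rare absence, PIC = 0.18
}
kept = sorted(filter_markers(toy_markers))
print(kept)
```

Under this two-state form PIC cannot exceed 0.5, which is consistent with the upper bound of the PIC range reported for the markers in this work.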

Genetic diversity and population structure in Turkish common bean
The Bayesian clustering model implemented in the STRUCTURE software divided the evaluated bean accessions into two main groups: 112 landraces (59.57%) in group A (red) and 71 landraces (37.76%) in group B (blue) (Fig 3). Five genotypes (Hakkari-59, Bilecik-8, Bolu-Goynuk, Moralaca, and Van-29) on the right-most end of the graph were successfully discriminated from the rest of the evaluated population, displaying membership coefficients equal to 50% for either population, and were therefore considered an unclassified population (black), as suggested by Habyarimana [49]. The UPGMA-based tree clearly divided the 188 accessions into the same two main populations, A (red) and B (blue) (Fig 4), similar to the clustering by model-based structure. Here also, five unclassified landraces were identified and clustered separately. The first two axes of the PCoA, explaining 89% of total diversity, divided the landraces into two main populations, A and B (Fig 5), a clustering comparable to the pattern obtained with UPGMA and model-based structure; the unclassified genotypes again formed a separate group.
In order to statistically describe the importance of genetic structure and diversity in the evaluated Turkish common bean germplasm, different diversity metrics such as genetic distance, expected heterozygosity (Hs), overall gene diversity (Ht), pairwise kinship (F), F-statistic (Fst) and inbreeding coefficient (Fis) were computed. For the entire population (188 accessions), the expected heterozygosity was 0.078 and overall gene diversity, Fst and Fis were 0.14, 0.55 and 1, respectively. The expected heterozygosity values for populations A and B were 0.039 and 0.150, respectively (Table 3). The average Euclidean genetic distance for the entire population was 72.59, with a maximum value of 106.35 between the Bingol-7 and Hakkari-28 landraces. The average Euclidean genetic distance in population A and population B was 44.62 and 51.40, respectively. The maximum genetic distance within population B was 82.39 between the Mus-46 and Bingol-6 landraces, while the minimum genetic distance (16.97) within this population was found between landrace Bingol-6 and commercial cultivar Akman. For population A, the maximum genetic distance was 88.09 between landraces Bilecik-8 and Mus-42, while the minimum genetic distance of 7.01 was found between the Erzincan-5 and Bitlis-48 landraces. Mean pairwise kinship values are given in Table 4.
The analysis of variance showed significant differences between clusters (population A, population B, and the unclassified group) in terms of plant height and 100-seed weight (Table 5). Using Tukey's test at the 5% probability level, the unclassified genotypes were comparably tall to population B, and both were taller than population A. In terms of seed size (100-seed weight), the unclassified population (black) landraces displayed the biggest seeds (2-3 times bigger than populations A and B), followed by population A, while population B had the smallest seeds.
In this study, the 6 known commercial cultivars (Karacesehir, Akman, Goksun, Akdag, Onceler, and Goynuk) were used as controls to guide the characterization of the clusters assigned to landraces in terms of growth habit and seed weight. Karacesehir, Akman and Goksun displayed climber to prostrate growth habit with 100-seed weight less than 40 g, while Akdag, Onceler, and Goynuk had indeterminate bushy growth habit with 100-seed weight greater than 40 g. Commercial cultivars Karacesehir, Akman, and Goksun clustered with the B group, while Akdag, Onceler, and Goynuk clustered with the A group (Figs 3, 4 and 5), as expected. Within population A, 76% of the accessions displayed 100-seed weight values greater than 40 g (ranging from 40 to 68 g), while 24% of the accessions had less than 40 g (Table 6). Within population B, 75% of the accessions displayed 100-seed weight values lower than 40 g, while 25% of the accessions showed values greater than 40 g (ranging from 40 to 61 g). All the unclassified landraces showed uniquely big seeds, with 100-seed weight ranging from 110.96 to 167.21 g. Various morphological characters were also observed to explore the level of diversity more comprehensively: white was the dominant flower color, and bracteoles were predominantly small and intermediate in shape (Table 6). The dominant pod curvature shape was concave, with a weak degree of curvature. The terminal leaflet was mostly triangular, and light green was the dominant leaf color in Turkish common bean germplasm.
After seed weight, growth habit, geographical province and flower color contributed most to clustering. Population A contained indeterminate bush, prostrate and climbing genotypes, while population B grouped landraces having only indeterminate and climber growth habits; in both populations, genotypes from different provinces mostly clustered with genotypes of similar growth habit. Geographical province also played a role in clustering, and genotypes belonging to the same province generally clustered together in both populations. However, genotypes of the same provenance were also observed to cluster in different groups. For instance, landraces from Hakkari, Mus and Van provinces displayed the phenotypic characteristics of population A, but some of the landraces from these provinces grouped with population B. On the other hand, Elazig, Bitlis and Balikesir provinces provided landraces reflecting the phenotypic characteristics of population B, but some landraces from these provinces clustered in population A. Flower color also differentiated the genotypes: white-flowered genotypes clustered in population A, whereas purple, white and lilac flowers were all found in population B.

DArTseq-generated silicoDArT markers as a genotyping tool
Investigation of genetic diversity is very important because it can provide insight into sources of novel alleles to be used in breeding programs. The use of molecular markers to assess genetic diversity represents a significant breakthrough [50-51-52]. Various types of molecular markers have been employed to assess the genetic diversity of common bean [9-19-53]. However, the DArTseq-generated marker system has emerged as a marker of choice for its high throughput and whole-genome coverage [38][39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54], and because it can be an alternative genotyping tool for research laboratories with limited financial support. In this work, a greater number (12,557) of highly polymorphic silicoDArT markers were used relative to previous studies [55][56] in order to produce more reliable results. For example, Cichy et al. [55] used 84 DArT and 494 SNP markers to investigate QTLs for seed color traits in common bean. Valdisser et al. [56] used 6286 DArTseq-generated SNP markers to assess diversity in Brazilian common bean. In another study, Valdisser et al. [57] identified a total of 23,748 RAD-SNPs, of which 3357 were found adequate for common bean genotyping. The distribution of the silicoDArT markers used in this work was generally homogeneous along each common bean chromosome, but the number of markers per chromosome was variable. Chromosome 2 had 11.43% of the markers, while 7.21% were found on chromosome 6. On average, 1,076.27 (6.89%) markers were identified per chromosome. These results are supported by earlier studies [32-56-58] reporting relatively more silicoDArT markers on common bean chromosome 2 and fewer markers on chromosome 6 [58]. However, our results disagree with Schroder et al. [59], who found the largest number of SNP markers on chromosome 8. This may be due to the use of a different marker system.
The identification of 24.14% novel markers with unknown positions is an important finding of this study, particularly from a breeding perspective. Novel markers can be used in genome-wide association studies (GWAS) for the discovery of genetic factors of interest. We are conducting multi-location/year morphological experiments, and these newly identified markers will be used for common bean GWAS in upcoming years. A good range of PIC values (0.10-0.5) was found in this study, which is in line with previous works on common bean. Valdisser et al. [56] found PIC values from 0.23 to 0.5 using DArTseq markers, Blair et al. [60] found 0.32 using SNPs, Wani et al. [61] obtained 0.22-0.49 with SSRs and Kumar et al. [62] obtained 0.22-0.30 using AFLP markers. The mean PIC value (0.4) obtained in our study was much higher than that obtained by Nemli et al. [32] in their recent work on the characterization of Turkish common bean germplasm with DArTseq-generated SNP markers. The wider PIC range obtained in this work suggests a greater level of variation, probably deriving from the use of a larger number of high-quality markers in a larger and more diverse population.

Genetic structure and diversity in Turkish common bean
Whole-genome silicoDArT molecular markers, growth habit, and 100 seed weight were used to characterize the Turkish common bean germplasm. More importance was given to molecular markers as suggested in Habyarimana, [49], because they provide higher clustering accuracy. The three clustering algorithms, model-based structure, UPGMA, and PCoA were implemented and showed a good level of agreement. Three genetic groups, population A, population B, and a group of unclassified population, were identified and represented heterotic groups from which parental lines can be fetched to conduct crossing blocks in the process of common bean genetic improvement. Genomic inbreeding and kinship coefficient were computed as part of diversity metrics. Within a population, individual genomic inbreeding represents the probability that two alleles at a randomly chosen locus are identical by state, whereas pairwise kinship measures the relatedness represented by the probability that two alleles, one sampled at random from each individual, are identical by state. Therefore, kinship predicts the future level of inbreeding which represents the repository for future genetic diversity.
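The genomic relationship matrix from which pairwise kinship is derived can be sketched as follows. This is a minimal illustration of the widely used first VanRaden estimator, G = ZZ' / (2 Σ p(1 − p)), and not the authors' R code. It assumes genotypes coded as allele counts 0/1/2 (fully inbred lines would carry only 0 or 2); how the dominant presence/absence silicoDArT scores were coded for this step is not stated in the text, so the coding and the toy data are hypothetical.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix, G = Z Z' / (2 * sum p_j (1 - p_j)).

    genotypes: list of individuals, each a list of allele counts (0/1/2).
    Z is the genotype matrix centred on twice the allele frequency.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    denom = 2 * sum(pj * (1 - pj) for pj in p)
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / denom for zk in z] for zi in z]

# Three hypothetical inbred lines scored at four markers.
toy_genotypes = [
    [2, 0, 2, 0],
    [2, 0, 2, 0],  # identical to the first line
    [0, 2, 0, 2],  # opposite alleles at every marker
]
G = vanraden_g(toy_genotypes)
for row in G:
    print([round(x, 3) for x in row])
```

In the toy data the two identical lines share a relationship equal to their own diagonal values, while the line carrying the opposite alleles has a negative relationship with both, that is, it is less related to them than a random pair drawn from this sample.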
The overall gene diversity and the range of Euclidean distance (Table 3) were higher in population B, while the pairwise kinship (Table 4) was higher in population A, confirming the higher level of genetic diversity in the population B which can be used with advantage for common bean genetic improvement. The overall gene diversity over the entire germplasm was lower than in Gioia et al. [19] working on the 256 European and 56 American landraces using chloroplast microsatellite (cpSSRs) and nuclear markers (phaseolin and Pv-shatterproof1). However gene diversity in the population B was higher than obtained by Nemli, et al. [32] using DArTseq generated SNP markers in 173 accessions mainly from Turkey. The results in this study are in good agreement with previous findings [9-63-64] in terms of relative importance of the gene diversity of common bean accessions.
The Fst statistic obtained in this work is much higher than in earlier reports by McClean et al. [65] working on the USDA core collection. These differences can be attributed to the use of a higher number of markers, the genotyping of more diverse materials from different locations [66], and the use of different sampling approaches in this work. Overall and in both genetic populations, the inbreeding coefficient (Fis) was found to be 1 in this study, which was expected given the self-pollinating reproduction system of common bean.
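Why Fis equals 1 under complete selfing can be seen from a Nei-type definition of the inbreeding coefficient. The expression below is a sketch that assumes Fis is computed from observed heterozygosity (H_O, a symbol not used in the text) and within-population expected heterozygosity (H_S); it is not taken from the original analysis.

```latex
F_{IS} = 1 - \frac{H_O}{H_S},
\qquad
H_O = 0 \;\Rightarrow\; F_{IS} = 1 - \frac{0}{H_S} = 1
\quad (H_S > 0).
```

In selfed single-plant selections no heterozygotes are expected, so H_O is zero and Fis reaches its maximum regardless of the value of H_S, whether 0.039 (population A), 0.150 (population B) or 0.078 (entire population).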
Euclidean distance is a quantitative measure of genetic divergence and can be calculated between populations, species or individuals at the DNA sequence level or the allele frequency level [67]. Information about the existence of genetic variation and relationships in common bean landraces is very useful for the selection of parents to develop new gene recombinations and for a more effective germplasm characterization for diverse agriculture [68]. The maximum genetic distance in Turkish germplasm was 106.35, between the Bingol-7 and Hakkari-28 landraces. Bingol-7 belongs to population B and has small seeds and a bushy growth habit, while Hakkari-28 is a climber with large seeds and clustered in population A. These landraces can therefore be good candidate parents for the development of improved common bean varieties.
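As an illustration of how pairwise Euclidean distances identify the most divergent and most similar accessions, consider the following Python sketch. It is not the R code used in this work; the accession names and marker scores are hypothetical, and skipping marker positions with a missing score in either accession is an assumption, since the handling of missing data in the distance computation is not described in the text.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two presence/absence profiles.

    Positions where either score is missing (None) are skipped.
    For 0/1 scores the squared distance equals the number of mismatches.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))

# Hypothetical accessions scored at nine markers.
toy_profiles = {
    "acc1": [1, 0, 1, 1, 0, 0, 1, 0, 1],
    "acc2": [1, 0, 1, 1, 0, 0, 1, 0, 0],
    "acc3": [0, 1, 0, 0, 1, 1, 0, 1, 0],
}
dist = {
    pair: euclidean(toy_profiles[pair[0]], toy_profiles[pair[1]])
    for pair in combinations(toy_profiles, 2)
}
farthest = max(dist, key=dist.get)  # candidate parents for wide crosses
closest = min(dist, key=dist.get)   # near-duplicate accessions
print(farthest, dist[farthest], closest, dist[closest])
```

The same pairwise table, computed on the real marker matrix, is the input to both the UPGMA tree and the PCoA described in the methods.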
Landraces Bilecik-8 and Mus-42 showed the highest genetic distance within the population A. As Turkish people like seeds with medium to large size in their diet, these landraces can be candidate parents for the development of common bean variety having favorable traits for the consumer. Similarly, we found Erzincan-5 and Bitlis-48 landraces most genetically similar to each other in population A, as they showed minimum pairwise genetic distance. Within the population B, the maximum genetic distance was found between Mus-46 and Bingol-6 landraces, while Bingol-6 and commercial cultivar Akman showed higher levels of similarity. The genetically distinct landraces identified in this work (Bingol-7, Hakkari-28, Bilecik-8, Mus-42, Mus-46 and Bingol-6) within and between both populations represent a great common bean breeding potential for the scientific community to develop improved cultivars according to farmer and consumer interest.
Structure analysis divided the common bean accessions into 2 main populations: population A was the dominant population with 112 accessions, and population B was smaller and contained 71 genotypes. Clustering algorithms were in good agreement with statistical inferences made on phenotypic data (growth habit and seed size). Plant height was greater in population B than in population A (Table 5), because population B contained only genotypes with prostrate and climber growth habit, while population A contained landraces with indeterminate bush growth habit in addition to the other two growth habits. Overall, a higher proportion of climber growth habit (61.70% of total accessions) and a prevalence of large seed size were obtained in this study (Table 6), which is in agreement with Blair et al. [18], who reported a higher proportion of climber growth habit in large-seeded genotypes, and with Angioi et al. [53] and Bitocchi et al. [9], who showed the predominance of large-seeded accessions in Europe. The commercial cultivars used in this study were also evaluated by Ceylan et al. [69], who documented their various phenotypic attributes. Akdag, Onceler and Goynuk have 100-seed weight above 40 g and grouped in population A, while Akman, Goksun and Karacasehir have 100-seed weight below 40 g and grouped in population B, in agreement with Ceylan et al. [69].
On average, landraces clustered in population A had seed size greater than 40 g/100 seeds, and landraces in population B had seed size less than 40 g/100 seeds (Table 6). More precisely, 76% of accessions in population A had seed size greater than 40 g/100 seeds, while 74% of accessions in population B had seed size smaller than 40 g/100 seeds. According to Singh et al. [12], genotypes with seed size above 40 g belong to the Andean gene pool and those with seed size below 40 g are considered Mesoamerican gene pool accessions. The occurrence of two gene pools in the Turkish bean pool was confirmed in a previous study by Nemli et al. [32]. Similarly, Nemli et al. [32], along with other scientific groups, also reported the prevalence of the Andean gene pool in Turkey and in Europe, which strongly supports the findings of the present work. These results are in line with previous studies [9-19-70] showing that the European common bean germplasm originated from both gene pools.
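The 40 g threshold rule attributed to Singh et al. [12] can be written as a small Python sketch that also computes the share of a cluster matching its expected gene pool. The function names and seed weights are hypothetical, and treating a weight of exactly 40 g as undetermined is an assumption, because the text only defines the cases above and below the threshold.

```python
def gene_pool_by_seed_weight(weight_g, threshold_g=40.0):
    """Assign a putative gene pool from 100-seed weight in grams."""
    if weight_g > threshold_g:
        return "Andean"
    if weight_g < threshold_g:
        return "Mesoamerican"
    return "undetermined"  # exactly at the threshold: not defined in the text

def share_matching(weights_g, expected_pool):
    """Fraction of accessions whose seed weight matches the expected pool."""
    labels = [gene_pool_by_seed_weight(w) for w in weights_g]
    return labels.count(expected_pool) / len(labels)

# Hypothetical 100-seed weights (g) for accessions of two marker-based clusters.
cluster_a_weights = [45.0, 52.3, 38.1, 61.0]
cluster_b_weights = [28.4, 33.0, 41.5, 36.2]

share_a = share_matching(cluster_a_weights, "Andean")
share_b = share_matching(cluster_b_weights, "Mesoamerican")
print(share_a, share_b)
```

Accessions whose seed weight disagrees with the pool expected for their marker-based cluster are the ones the discussion treats as possible hybrids.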
In this study, 5 genotypes (Hakkari-59, Bilecik-8, Bolu-Goynuk, Moralaca, and Van-29) did not cluster with either population because of their membership coefficients (equal to 0.5); they were considered unclassified genotypes, as suggested by Habyarimana [49], and were present on the rightmost end of the structure plot. These genotypes had an average plant height of 145.09 cm and 100-seed weights ranging from 110.96 to 167.21 g, which confirms their uniqueness in both phenotypic and genotypic terms. Santalla et al. [8] stated that beans with 100-seed weight above 100 g belong to scarlet runner bean, and that scarlet runner beans are mostly climbers. In line with this, the genotypes in the unclassified population had 2-3 times higher seed weight, and much greater plant height, than both common bean populations. These genotypes were also distinct in the structure, UPGMA and PCoA analyses, where they formed a separate population. Therefore, we considered these genotypes to be scarlet runner bean genotypes on the basis of the information provided by Santalla et al. [8]. Scarlet runner bean has proven to be a good source of variation for the improvement of common bean, and these five runner bean accessions can be used for developing new superior common bean cultivars in the near future.
Various morphological characteristics were also recorded following the IBPGR descriptors for Phaseolus [33]. White, lilac and purple flowers were present in Turkish common bean: 72.67% of genotypes had white flowers, 23.49% had purple flowers and 3.82% had lilac flowers (Table 6). In addition, 56.83% of genotypes had small-sized bracteoles and 43.71% had intermediate-shaped bracteoles. Genotypes in population A of the structure, UPGMA and PCoA analyses mainly had white flowers with small, intermediate-shaped bracteoles. Genotypes in population B were diverse in flower color, with white, purple and lilac flowers, and had small to medium-sized bracteoles with a lanceolate shape. A concave pod curvature was observed in 55.73% of genotypes, and 32.40% had very slight or no curvature. Genotypes in population A had concave pods with a weak degree of curvature. Population B was more diverse, clustering genotypes with concave and convex pods with a medium to strong degree of curvature. Leaf shapes were triangular, circular and quadrangular, and triangular (39.34%) was the most common shape of the terminal leaf. Population A contained genotypes with triangular and quadrangular leaves, whose color varied from pale green to medium green. Population B was the more diverse population, containing all leaf shapes and colors, although most of its genotypes had medium to dark green leaves. Among the 5 unclassified genotypes, only the Moralaca genotype had red flowers; the remaining genotypes had white flowers with large, ovate bracteoles.
The 24% of individuals in population A with a seed size below 40 g/100 seeds and the 26% of individuals in population B with a seed size above 40 g/100 seeds can possibly be considered hybrids. A good proportion of landraces were collected from the Van, Bitlis and Hakkari provinces of Turkey, and these provinces showed a higher level of hybridization than the other provinces. Turkey is not a center of origin for common bean; therefore, a possible reason for the larger number of hybridization events in these provinces is their closeness to each other, which favored gene flow in the direction that best satisfied the interests of the farmer and the taste of the consumer. The presence of hybrids in this study is also supported by the recent studies of Nemli et al. [32] and Ceylan et al. [69], who treated hybrids as a separate group. The possible hybridization found in this work was higher than that reported earlier by Carović-Stanko et al. [70] for Croatian landraces. This work also disagreed with the findings of Angioi et al. [53] and Gioia et al. [19], which showed a higher level of hybridization in European common bean germplasm containing landraces from nearly all European countries. One possible reason for the higher proportion of hybrids in those studies of European common bean germplasm is the use of different molecular marker systems. Angioi et al. [53] used chloroplast microsatellites combined with two nuclear loci, Pv-shatterproof1 and phaseolin type, while Gioia et al. [19] used nuclear and chloroplast microsatellite markers; whole-genome silicoDArT markers were used in the present work.
The findings of this study showed high genetic diversity in both phenotypic and genotypic information. Genetic relatedness was generally low, and heterotic groups were identified that can be used for breeding purposes. Morphological information clearly reflected the existence of variation in Turkish common bean germplasm and supported the genotypic information. Seed weight was the main factor in clustering, while growth habit, geographical province and flower color also played an active role. There is now a need to initiate hybridization programs in common bean in order to develop varieties that respond to end-user preferences. One promising breeding objective is the development of a high-yielding dwarf common bean ideotype, which is generally preferred by farmers because it requires less labor and is easier to harvest. A good number of dwarf accessions were identified in this study. Further dwarf common bean cultivars can be developed through hybridization between distant parents with desired traits. In this study, all genotypes of the unclassified population showed phenotypic and genotypic uniqueness and will be used to start various common bean breeding activities in the near future. Endeavors in this direction are underway at the Abant Izzet Baysal University, Bolu, Turkey, where a common bean germplasm mini-core is being expanded by collecting more landraces within Turkey and from countries harboring good common bean diversity, such as Mexico and other Latin American nations. Currently, we are conducting multi-year, multi-location experiments with this mini-core, and the markers produced in this study will be used to perform genome-wide association studies and marker-aided common bean breeding. In the future, standard Andean and Mesoamerican checks will be integrated into field trials to confirm the results presented herein.
We will use the information collected to identify molecular markers linked with yield and yield component traits. The appropriate silicoDArT markers for various traits of interest will be cloned and converted into kompetitive allele specific PCR (KASP) markers. Anyone wishing to initiate a knowledge-based breeding program who is interested in Turkish common bean genetic resources, based on the information generated in this study, can contact us.