Isolation and Characterization of Microsatellite Markers and Analysis of Genetic Diversity in Chinese Jujube (Ziziphus jujuba Mill.)

Chinese jujube (Ziziphus jujuba Mill., 2n = 2× = 24, Rhamnaceae) is an economically important Chinese native species. It has high nutritional value, and its medicinal properties have led to extensive use in traditional oriental medicine. The characterization of genotypes using molecular markers is important for genetic studies and plant breeding. However, few simple sequence repeat (SSR) markers are available for this species. In this study, 1,488 unique SSR clones were isolated from Z. jujuba ‘Dongzao’ using enriched genomic libraries coupled with a three-primer colony PCR screening strategy, yielding a high enrichment rate of 73.3%. Finally, 1,188 (80.87%) primer pairs successfully amplified products of the expected size in ‘Dongzao’. A total of 350 primer pairs were further selected and evaluated for their ability to detect polymorphisms across a panel of six diverse cultivars; among these, 301 primer pairs detected polymorphisms, and the polymorphism information content (PIC) value across all loci ranged from 0.15 to 0.82, with an average of 0.52. An analysis of 76 major cultivars employed in Chinese jujube production using 31 primer pairs revealed comparatively high genetic diversity among these cultivars. Within-population differences among individuals accounted for 98.2% of the observed genetic variation. Neighbor-joining clustering divided the cultivars into three main groups, none of which corresponded to major geographic regions, suggesting that the genetics and geographical origin of modern Chinese jujube cultivars might not be linked. This work is the first report of large-scale development of Chinese jujube SSR markers. The development of these markers and their polymorphic information represent a significant improvement in the available Chinese jujube genomic resources and will facilitate both genetic and breeding applications, further accelerating the development of new cultivars.


Introduction
Chinese jujube (Ziziphus jujuba Mill., 2n = 2× = 24, Rhamnaceae) is an economically important Chinese native species. Chinese jujube has been cultivated for at least 3,000 years, and archaeological evidence indicates that it was utilized 7,700 years ago in China [1]. Its fruit has a high nutritional value because it contains high levels of vitamin C, abundant phenolic compounds, high carbohydrate and mineral (particularly potassium and iron) content, and the highest level of cyclic AMP among higher plants [2,3]. It can be consumed as fresh, dried or processed fruit. In addition, the fruit has interesting medicinal properties and has been extensively used in traditional oriental medicine for its analeptic, palliative and antibechic properties [4,5].
Chinese jujube is well adapted to various climate and soil conditions (pH 5.5-8.5); however, well drained, sandy or loamy soils combined with high levels of sunlight produce high yield and fruit quality [1]. Chinese jujube is distributed throughout China except for the most northern part, i.e., Heilongjiang province. The total growing area is estimated at more than 1.5 million hectares, and average annual production amounts to 3.5 million tons (dried weight) [6]. The top six provinces ranked in order of production are Xinjiang, Shaanxi, Shanxi, Hebei, Shandong and Henan; together, these provinces account for 90% of the entire yield. This plant has been introduced to more than 30 countries, among which only South Korea has engaged in commercial production.
During the long-term process of natural evolution and artificial selection, the Chinese jujube has developed a wide range of variation, and more than 800 cultivars have been reported [7]. These cultivars are distributed throughout China and are propagated vegetatively either by grafting onto rootstock or as rooted cuttings [8]; the origin of most accessions is obscure because of the frequent exchange of plant material among different cultivation areas, and the lack of cultivar history documentation. The naming of jujube cultivars and types is confusing, particularly the use of homonyms and synonyms. For example, there are more than 20 different local names for the cultivar 'Dongzao,' a cultivar that provides the best fresh fruit quality. Therefore, there is an urgent need for accurate germplasm and cultivar identification.
Previously, the classification of cultivars had been based mainly on morphological characteristics and their usages [1]. However, traditional morphological identification has a number of limitations, including low polymorphism, low heritability, late expression and vulnerability to environmental influences [9]. The advent of molecular markers offers a promising tool for Chinese jujube cultivar identification. Several molecular markers, such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLPs) and sequence-related amplified polymorphisms (SRAPs), have been used to differentiate various Chinese jujube cultivars [10,11]. However, the use of these marker systems is laborious and produces complex patterns that are inconvenient for database building [12]. SSR, also known as microsatellite DNA, is widely distributed in eukaryotic genomes [13]. Due to their abundance, high polymorphism, codominance, stability and suitability for automated analysis, SSR markers are widely used in germplasm conservation, genetic diversity analysis, the study of genetic relationships, genetic mapping, DNA fingerprinting and molecular marker-assisted breeding [14].
Furthermore, the genetics of Chinese jujube lags behind that of many other fruit crops mainly due to a lack of both segregating populations and efficient markers. Because jujube flowers are small and difficult to manipulate and the seeds have a low kernel rate, no controlled cross has yet succeeded in producing structured pedigrees with sufficient offspring. Lu et al. [15] developed the only F1 segregating population in Chinese jujube using an AFLP marker to distinguish full-sib seedlings from open-pollinated seedlings. These researchers subsequently constructed the first genetic map of jujube using AFLP markers [16]; however, no co-dominant marker was integrated into the map, limiting its further utilization. As proposed in the 'breeding without breeding' (BWB) method, molecular markers such as SSRs can be employed to obtain structured pedigrees from open-pollinated progenies [17]. Thus, the development of SSR markers will also benefit the development of more structured pedigrees for the breeding and genetic analysis of important traits for outcrossing Chinese jujube.
Several strategies are available for identifying new SSRs from a genome [18]. Ma et al. [19] developed 25 SSR primers using the selectively amplified microsatellite (SAM) approach. EST-SSRs can be mined from expressed sequence tags (ESTs) and cDNAs [20]. Recently, Liu et al. [21] developed 119 EST-SSRs from a Chinese jujube fruit cDNA library. However, the number of available markers remains insufficient for genetic and breeding research of Chinese jujube. Therefore, additional SSR markers must be developed for genetic studies of important traits in Chinese jujube, such as fruit quality and disease resistance. Unfortunately, publicly available DNA sequence data are limited for Chinese jujube. The use of traditional SSR-enriched libraries remains one of the methods of choice for the large-scale development of SSR markers for less-studied species [18,22].
In this study, we report the identification and characterization of 1,188 unique SSR markers developed from SSR-enriched genomic libraries of Chinese jujube. A set of 350 SSR markers was evaluated for their ability to detect polymorphisms across a panel of six diverse cultivars. Since the genetic base of the main cultivars currently employed in the production of Chinese jujube in China is still not clear, a total of 31 polymorphic primer pairs were utilized to determine the range of genetic diversity among them and analyze their genetic relationships.

SSR isolation by SSR-enriched libraries
Six SSR-enriched DNA libraries were constructed using a combination of three restriction enzymes and two types of probes; from these libraries, 6,720 colonies were screened using the three-primer method [23], as shown in Table 1. In total, 2,030 positive colonies were obtained. After sequencing, 1,854 (91.3%) sequences were found to harbor SSR loci, confirming the efficiency of the three-primer strategy. After eliminating redundant sequences, 1,488 unique SSR clones remained, corresponding to an enrichment rate of 73.3% across all libraries. A set of 368 clones was excluded because the sequences flanking the SSR motifs were too short to design both forward and reverse primers. Finally, 1,120 unique SSR clones were identified and used to design primers. The enzyme HaeIII produced the highest average percentage of efficient clones (61.5%); AluI (57.6%) produced the second highest, and RsaI (46.8%) produced the lowest. No difference in the efficiency of the two probes, (AC)15 and (AG)15, was observed.
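The percentages in this paragraph follow directly from the reported colony counts. The short Python sketch below recomputes them as a consistency check; the variable names and the `pct` helper are illustrative only and are not part of the original analysis.

```python
# Colony and clone counts as reported in the text above.
positive = 2030       # positive colonies that were sequenced
with_ssr = 1854       # sequences found to harbor SSR loci
unique = 1488         # unique SSR clones after removing redundant sequences
short_flank = 368     # clones whose flanking sequences were too short


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


ssr_rate = pct(with_ssr, positive)           # sequenced clones carrying an SSR
enrichment_rate = pct(unique, positive)      # unique SSR clones per sequenced clone
redundancy = pct(with_ssr - unique, with_ssr)  # redundant share of SSR sequences
usable_clones = unique - short_flank         # clones left for primer design

print(ssr_rate, enrichment_rate, redundancy, usable_clones)
```

The redundancy figure computed here is the one quoted later in the discussion of marker development efficiency.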

Primer design and evaluation
From the 2,128 identified loci, 1,469 primer pairs were designed and validated for 'Dongzao'. Finally, 1,188 (80.87%) primer pairs successfully amplified products of the expected size (Table S1). Among these primer pairs, 1,039 (87.5%) contained a single motif, and the remaining 149 (12.5%) contained complex motifs. The markers that were successfully amplified had repeat numbers ranging from 3 to 37. A set of 350 primer pairs with the clearest banding patterns was selected to evaluate polymorphisms in six Chinese jujube cultivars (Table S2). Of the 350 primer pairs, 301 detected polymorphisms. The polymorphism information content (PIC) values ranged from 0.15 to 0.82 (Table S2), with an average of 0.52; 157 primer pairs had PIC values greater than 0.50. Among the 301 polymorphic primer pairs, 270 contained dinucleotide repeats exhibiting a mean PIC value of 0.52, 12 contained trinucleotide repeats exhibiting a mean PIC value of 0.46, and the remaining 19 pairs with repeats longer than three nucleotides exhibited a mean PIC value of 0.50. The correlation coefficient between the PIC and SSR length was 0.12 (p = 0.0206).
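As a quick check that the per-class figures agree with the overall values, the sketch below recomputes the amplification rate, the polymorphism rate, and the count-weighted mean PIC from the numbers given in this paragraph. The dictionary layout and names are illustrative assumptions.

```python
# Counts reported in the paragraph above.
designed, amplified = 1469, 1188
tested, polymorphic = 350, 301

amplification_rate = round(100 * amplified / designed, 2)
polymorphism_rate = round(100 * polymorphic / tested, 1)

# (number of polymorphic primer pairs, mean PIC) for each repeat class.
classes = {
    "dinucleotide": (270, 0.52),
    "trinucleotide": (12, 0.46),
    "longer": (19, 0.50),
}

total_pairs = sum(n for n, _ in classes.values())
# Mean PIC over all polymorphic pairs, weighting each class by its size.
weighted_pic = sum(n * p for n, p in classes.values()) / total_pairs

print(amplification_rate, polymorphism_rate, total_pairs, round(weighted_pic, 2))
```

The weighted mean rounds to the reported overall average because the dinucleotide class dominates the set.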

Genetic diversity analysis of the main Chinese jujube cultivars
To elucidate the genetic diversity of the cultivars used in Chinese jujube production, we analyzed a set of 76 major cultivars using 31 SSR loci. A total of 178 SSR marker alleles were detected, with an average of 5.7 alleles per locus (Table 4). The BFU0308 primer pair produced the highest number (15) of alleles, whereas BFU0584 and BFU0614 produced the lowest number (2 alleles). The effective number of alleles per locus, which reflects the evenness of allelic frequencies, varied from 1.314 to 7.415, with an average of 3.148. The observed heterozygosity (Ho) ranged from 0.250 to 1.000, with an average of 0.678. The expected heterozygosity (He) ranged from 0.239 to 0.865, with an average of 0.621. The PIC value ranged from 0.229 for BFU0521 to 0.851 for BFU0308. The overall fixation index (F) was -0.081, indicating a slight excess of heterozygotes.
Based on their geographical distribution, the 76 cultivars were divided into seven populations (Pop1-Pop7) (Table 5). The number of alleles per locus ranged from 3.097 in Pop1 to 4.548 in Pop6. Private SSR alleles were present in all populations, with the highest numbers occurring in Pop4 (4), Pop5 (6) and Pop6 (7) (Table 6). AMOVA revealed that a very low percentage of variation was partitioned among the populations (Table 7). Of the total genetic variance, 98.20% was ascribed to differences within populations.

Genomic SSR marker development efficiency
Various types of molecular markers have been developed since the advent of RFLP technology; SSRs and single nucleotide polymorphisms (SNPs) have been the principal markers utilized in plant genetic analysis and marker-assisted breeding [24,25]. However, it is costly to develop SNPs for a less-studied plant such as Chinese jujube, for which only a few SSR markers have been reported [19]. Therefore, the development of a large number of SSR markers for Chinese jujube is urgently needed for use in genetic and breeding research. SSR-enriched genomic libraries have been successfully applied in many plants, including Robusta coffee [26], Levant cotton [27] and peach [28]. The three-primer colony PCR screening strategy eliminates the need for colony hybridization to detect the desired inserts, resulting in substantial time and cost savings by minimizing the sequencing of inserts that do not contain the proper motif [23]. In this study, of the 2,030 sequenced positive colonies in the enriched libraries, 1,854 (91.3%) contained SSRs, similar to the results obtained from sunflower using a colony hybridization strategy (89%, [29]). This percentage is higher than those obtained for switchgrass (83.5%, [30]), eggplant (81.7%, [22]) and bunching onion (34.4%, [31]).
The percentage of unique clones in this study was 73.3%, higher than that reported in jute (67.3%, [32]), switchgrass (51.1%, [30]) and Italian ryegrass (25.6%, [33]). The proportion of redundant sequences was 19.7%, lower than those obtained in sorghum (33%, [34]) and chickpea (25.2%, [35]). Furthermore, the rate of successful amplification by the primer pairs (80.87%) obtained in this study is higher than that obtained in eggplant (75.3%, [22]). These results indicate that the current enrichment procedure coupled with three-primer PCR screening efficiently generated a large number of SSR markers in Chinese jujube.

SSR marker polymorphism
SSR markers exhibit much higher polymorphism than RFLPs, RAPDs, AFLPs and ISSRs [38]. In the present study, the rate of polymorphic markers (86%) was higher than those obtained in sorghum (80.9%, [39]) and yarrow (53.3%, [40]). Botstein et al. [41] defined a locus with a PIC greater than 0.5 as highly polymorphic. In this study, 157 primer pairs met this criterion, providing an important tool for the evaluation of Chinese jujube genetic variability. We observed that markers derived from sequences containing dinucleotide repeats were generally more polymorphic than those containing trinucleotide repeats, in agreement with previous results for grape [42] and switchgrass [29]. However, adjacent alleles are more easily separated and distinguished from one another when the repeat unit is longer than a dinucleotide [42]. In humans, long nucleotide repeats have been adopted for fingerprinting [43]. Studies of plant crops such as grape have also begun to employ SSR markers with long nucleotide repeats for fingerprinting [42]. The present study provided 35 SSR markers with repeats of three or more nucleotides, which will be valuable for use in the construction of fingerprints for Chinese jujube germplasm. The level of polymorphism of an SSR is thought to be related to the number of repeats, as observed in Pinus radiata [44], eggplant [45] and peach [28]. However, we observed a very weak correlation (r = 0.12) between SSR length and PIC value, consistent with the results obtained in olive tree [46], bean [47] and Cucumis [38]. Thus, selecting loci with a large number of repeats is not necessary to ensure the detection of higher polymorphism in Chinese jujube.

Genetic diversity analysis of the main Chinese jujube cultivars
Scores of cultivars have been employed in the production of Chinese jujube; however, the level of genetic diversity has not yet been evaluated. Therefore, 31 primers with high PIC scores were selected to determine the level of genetic diversity among the major cultivars of Chinese jujube. All 76 Chinese jujube cultivars were uniquely identified, demonstrating a high efficiency of the primers in differentiating the cultivars.
In SSR data analysis, loci exhibiting two bands were scored as heterozygous at a single locus. A recent comprehensive study by Barthe et al. [48] confirmed the complex origin of genetic variation in the size and sequences of amplified microsatellites. If the observed bands correspond to duplicated DNA amplifications instead of variants of the same locus, the observed heterozygosity (Ho) and the expected heterozygosity (He) may be overestimated. Eighteen of the SSRs employed in this study were confirmed to follow Mendelian segregation (Pang et al., unpublished data), ruling out the possibility of duplicate loci. The population genetic parameters and structure obtained using these 18 SSR loci were similar to those obtained using 31 SSR loci; for this reason, we have reported the results obtained using the entire data set. The average values of the allele number (Na), the effective allele number (Ne), Ho and He were 5.7, 3.148, 0.678 and 0.621, respectively, which are higher than those obtained in other horticultural plant species, including apple [49], peach [28] and tomato. With the exception of BFU0277, BFU0249, BFU0478, BFU0479 and BFU0614, the fixation indices for the 31 primers were significantly less than zero, indicating an excess of heterozygotes. Moreover, the high average number of alleles amplified per locus (5.7) and the average observed heterozygosity values of 0.678 suggest that SSR diversity is comparatively high within Chinese jujube.
Cultivars from all seven populations were scattered among the three groups, and no population formed a distinct group in the dendrogram. These populations were defined on the basis of geography and thus might not reflect underlying genetic relationships. Most bootstrap values were less than 50% in the NJ clustering dendrogram, indicating that Chinese jujube has a complex genetic background resulting from frequent cultivar [...] [50] and tree peony (Paeonia suffruticosa Andrews) [51]; these low values were hypothesized to result from the large number of hybrid genotypes in the data set and possible recombination among cultivar groups, respectively. This hypothesis is further supported by the observation of a slight excess of heterozygotes. These results suggest that the recorded location distribution of many Chinese jujube cultivars may not represent their real origin.
From the NJ dendrogram, it is apparent that 'Changjixinzao' and 'Pingshunjunzao,' 'Dongzao100' and 'Dongzao70,' 'Xiangzao45' and 'Xiangzao10,' 'Muzaokanglie' and 'Zhongyangmuzao,' and 'Wuhezao' and 'Jinsixiaozao' have high genetic similarity, indicating a close genetic relationship. In a previous study, 'Damaya' and 'Huluchanghong' could not be differentiated by 113 SRAP fragments [11]. In the present study, 12 of the 31 SSR markers could be used to distinguish the two cultivars, demonstrating the discriminating power of the markers. 'Wuhezao' is possibly a sport of 'Jinsixiaozao'; only two differences (BFU0561 and BFU0308) were observed between these cultivars. The clustering results confirmed the close relationship between these cultivars, with a high bootstrap value of 83%. 'Muzaokanglie' has been described as a cultivar selected from 'Zhongyangmuzao' [52]; we observed differences in twelve SSRs between these two cultivars. Notably, several cultivars with similar names (which were expected to indicate a similar origin) did not cluster together in the dendrogram, including 'Maya,' 'Damaya' and 'Beijingmaya,' 'Huizao3' and 'Huizao154,' and 'Changjixinzao' and 'Changjixinzao10.' Cultivars that have the same usage also did not cluster together, consistent with a previous result obtained using SRAP markers [11]. Taken together, the results of this study indicate that the microsatellite markers we have developed for Chinese jujube exhibit a high level of polymorphism, thus providing a powerful tool for genetic diversity studies and cultivar identification in germplasm collections.

Conclusions
We report the development of 1,188 SSR primer pairs from six enriched genomic SSR libraries of Chinese jujube. A set of 301 polymorphic SSRs was obtained using six Chinese jujube cultivars, 31 of which were employed to reveal a high level of genetic diversity among the major cultivars. The large-scale SSR markers developed here, together with their polymorphic information, represent a significant improvement in the available Chinese jujube genomic resources. These markers and their polymorphic information will be beneficial for both genetic and breeding applications to facilitate Chinese jujube improvement and accelerate the development of new cultivars.

Plant materials and genomic DNA isolation
All the plant materials were acquired with permissions from the National Key Base for Improved Chinese Jujube Cultivar, Cangzhou, China abiding by the laws in China. The plant materials used in this study did not involve endangered or protected species.
Fresh healthy leaves of 76 Chinese jujube cultivars were collected (Table 5). Z. jujuba 'Dongzao' was used for the construction of all genomic libraries. SSR primers were screened for polymorphism across a set of six cultivars that included 'Dongzao', 'Muzao', 'Xiaoyazao', 'Lizao', 'Lingbaodazao' and 'Lajiaozao', which were previously shown to be highly diverse [11]. Total genomic DNA was extracted from the leaves using a modified CTAB method [53]. DNA quality was assessed by 1.0% agarose gel electrophoresis.

Construction and sequence analysis of SSR-enriched genomic libraries
'Dongzao' was used to isolate microsatellites using magnetic bead enrichment as described in Nunome et al. [54]. Genomic DNA (20 µg

Sequence checking and primer design
Sequences containing SSRs of 12 or more bases were identified using the MISA program (Microsatellite Identification Tool) from http://pgrc.ipk-gatersleben.de/misa. All sequences containing SSRs were analyzed for redundancy using the ClustalW program (http://www.genome.jp/tools/clustalw/). Only unique SSR clones with sufficiently long flanking sequences were used for primer design using the Primer3 program (http://frodo.wi.mit.edu/). The sequences of the clones have been deposited in GenBank. The GenBank accession number for each clone sequence has been included in Table S1. All primers were designed using the following parameters: (1) product size from 100 to 350 bp; (2) primer size from 18 to 24 bp with an optimum size of 20 bp; (3) annealing temperature from 55 to 60°C with an optimum of 58°C; (4) GC content from 45 to 50%. An M13-tagged sequence (5′-TGT AAA ACG ACG GCC AGT-3′) was added to the 5′ end of the forward primer to enable detection with a universal fluorescently labeled M13 primer [55]. All primers were synthesized by GENEWIZ Biological Technology Co., Ltd. (Beijing, China).
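To illustrate the two sequence-level steps described here (flagging SSRs of at least 12 bases and adding the M13 tail to a forward primer), a minimal Python sketch follows. It is a simplified regular-expression scan, not the MISA algorithm; the motif lengths of 2 to 6 bases, the exclusion of mononucleotide runs, and the function names are assumptions made for illustration.

```python
import re

# M13 tag from the text, written without spaces.
M13_TAIL = "TGTAAAACGACGGCCAGT"
MIN_SSR_LEN = 12  # minimum SSR length in bases, as stated in the text

# A motif of 2-6 bases (assumed range) followed by one or more exact copies.
_SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1+")


def find_ssrs(seq, min_len=MIN_SSR_LEN):
    """Return (start, motif, repeat_count) for perfect SSRs of >= min_len bases."""
    hits = []
    for match in _SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # assumption: ignore mononucleotide runs such as AAAA...
        span = match.group(0)
        if len(span) >= min_len:
            hits.append((match.start(), motif, len(span) // len(motif)))
    return hits


def add_m13_tail(forward_primer):
    """Prefix the forward primer with the M13 tag at its 5' end."""
    return M13_TAIL + forward_primer.upper()


# Hypothetical clone fragment: three flanking bases, (AC)8, then more flank.
example = "GGG" + "AC" * 8 + "TTGCA"
print(find_ssrs(example))
print(add_m13_tail("acgtacgtacgtacgtacgt"))
```

Primer design itself (product size, primer length, annealing temperature, GC content) was done in Primer3 and is not reproduced in this sketch.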

Polymerase chain reaction and fragment analysis
A third primer (M13F) labeled with a fluorescent dye (FAM, HEX, ROX or TAMRA) was used in the PCR reactions. PCR amplifications were performed using a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, California, USA) in a 10 µl reaction volume containing 10-15 ng of template DNA, 2× Taq PCR mix (Biomed-Tech, Beijing, China), 1.6 pmol each of the reverse primer and the universal fluorescently labeled M13 primer, and 0.4 pmol of the forward primer. The PCR amplification program was as follows: 94°C for 5 min; 30 cycles of 94°C for 30 s, 55°C for 40 s, and 72°C for 40 s; 8 cycles of 94°C for 30 s, 53°C for 40 s, and 72°C for 40 s; and a final extension at 72°C for 10 min [55]. The PCR products were subsequently detected using an ABI 3730XL DNA Analyzer with a GeneScan-500LIZ size standard (Applied Biosystems) and GeneMarker software (SoftGenetics LLC, USA).

Primer evaluation
To evaluate the efficiency of the microsatellites, 350 primer pairs exhibiting clear banding patterns were selected to evaluate polymorphism in six divergent cultivars. The number of alleles per locus and the PIC were calculated. PIC was calculated according to $\mathrm{PIC} = 1 - \sum_{i} P_i^{2}$, where $P_i$ is the frequency of the $i$th allele among the total number of alleles in the sample [41].
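The PIC formula as stated in this section, one minus the sum of squared allele frequencies, can be sketched in a few lines of Python. The function name and the example allele sizes are hypothetical; the input is simply the pooled list of allele calls at one locus.

```python
from collections import Counter


def pic(alleles):
    """PIC as defined in the text: 1 minus the sum of squared allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical locus scored in six diploid cultivars (12 allele calls, sizes in bp).
calls = [150, 152, 150, 154, 152, 152, 150, 150, 154, 156, 150, 152]
print(round(pic(calls), 3))
```

A locus with a single allele gives a value of 0, and the value rises as alleles become more numerous and more evenly distributed.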

Analysis of genetic diversity
To determine the level of genetic diversity among the 76 Chinese jujube cultivars, which were divided into seven populations according to their possible geographical origin (Table 5), 31 primers exhibiting high PIC values were selected for analysis. The amplification bands were corrected using FlexiBin v2 [56]. GenAlEx version 6.4, Microsatellite tools [57] and Cervus 3.0 were used to measure the variability in Na, Ne, Ho, He, Shannon's informative index (F) and PIC at each locus. The partition of observed genetic variation among and within populations and genetic groups was characterized using an analysis of molecular variance (AMOVA) as implemented in the program Arlequin version 3.5 [58].
Figure 1. Dendrogram of the genetic relationships among 76 Chinese jujube cultivars based on SSR polymorphism. The dendrogram was generated using a simple matching coefficient based on 31 polymorphic primer pairs. Cluster analysis was performed using the neighbor-joining method. Bootstrap values obtained from 1000 replicate analyses higher than 50% are indicated on the nodes. doi:10.1371/journal.pone.0099842.g001
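For orientation, the per-locus statistics named in this section (Na, Ne, Ho, He and the fixation index F reported in the results) can be sketched from diploid genotypes using their standard textbook definitions. This is an illustrative Python sketch, assumed to correspond to what the cited programs report, and not a reimplementation of GenAlEx or Cervus; the function name and example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarize one locus from a list of diploid genotypes (allele_a, allele_b)."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)  # expected homozygosity
    ho = sum(1 for a, b in genotypes if a != b) / n
    he = 1.0 - sum_sq
    return {
        "Na": len(counts),       # number of distinct alleles
        "Ne": 1.0 / sum_sq,      # effective number of alleles
        "Ho": ho,                # observed heterozygosity
        "He": he,                # expected heterozygosity
        "F": 1.0 - ho / he if he > 0 else float("nan"),  # fixation index
    }


# Hypothetical genotypes (allele sizes in bp) for four cultivars at one locus.
example = [(100, 102), (100, 102), (100, 100), (102, 102)]
print(locus_stats(example))
```

A negative F, as reported for most loci in this study, corresponds to Ho exceeding He, that is, an excess of heterozygotes.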

Genetic distance estimation and cluster analysis
Following the strategy employed by Federici et al. [59] and Pang et al. [60], SSR allelic data were transformed into presence or absence codes (A or T, respectively) using DataTrans1.0 [61]. The genetic distance was calculated as the nucleotide p-distance using MEGA 6.05, which is equivalent to that estimated using a simple matching coefficient, i.e., the proportion of shared A's and T's subtracted from 1. Furthermore, a neighbor-joining (NJ) dendrogram was constructed, and the robustness of the genetic relationships was evaluated using bootstrap analysis with 1,000 re-samplings using MEGA 6.05 [62].
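The coding step and the distance described above can be sketched as follows. This is a hedged Python illustration of the idea only: it is not the DataTrans1.0 or MEGA implementation, and the data layout, function names, cultivar labels and allele sizes are hypothetical.

```python
def encode(genotypes):
    """Code each (locus, allele) column as 'A' (present) or 'T' (absent).

    genotypes maps cultivar -> {locus: (allele, allele)}.
    Returns cultivar -> pseudo-sequence string, with columns in sorted order.
    """
    columns = sorted(
        {(locus, allele)
         for loci in genotypes.values()
         for locus, alleles in loci.items()
         for allele in alleles}
    )
    return {
        name: "".join(
            "A" if allele in loci.get(locus, ()) else "T"
            for locus, allele in columns
        )
        for name, loci in genotypes.items()
    }


def p_distance(seq1, seq2):
    """Proportion of differing sites, i.e. 1 minus the simple matching coefficient."""
    if len(seq1) != len(seq2):
        raise ValueError("pseudo-sequences must have equal length")
    return sum(x != y for x, y in zip(seq1, seq2)) / len(seq1)


# Hypothetical two-locus data for two cultivars.
data = {
    "cv1": {"L1": (100, 102), "L2": (200, 200)},
    "cv2": {"L1": (100, 104), "L2": (200, 202)},
}
coded = encode(data)
print(coded, p_distance(coded["cv1"], coded["cv2"]))
```

The resulting pairwise distance matrix is what a neighbor-joining program would take as input; tree construction and bootstrapping were done in MEGA and are not reproduced here.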