Genome-Wide Characterization of Simple Sequence Repeat (SSR) Loci in Chinese Jujube and Jujube SSR Primer Transferability

Chinese jujube (Ziziphus jujuba), an economically important species in the Rhamnaceae family, is a popular fruit tree in Asia. Here, we surveyed and characterized simple sequence repeats (SSRs) in the jujube genome. A total of 436,676 SSR loci were identified, with an average distance of 0.93 Kb between the loci. Most of the SSRs had mononucleotide, dinucleotide, or trinucleotide repeat motifs, which accounted for 64.87%, 24.40%, and 8.74% of all repeats, respectively. Among the mononucleotide repeats, A/T was the most common, whereas AT/TA was the most common dinucleotide repeat. A total of 30,565 primer pairs were successfully designed and screened using a series of criteria. Moreover, 725 of 1,000 randomly selected primer pairs were effective among 6 cultivars, and 511 of these primer pairs were polymorphic. Sequencing the amplicons of two SSRs across three jujube cultivars revealed variations in the repeats. Transferability tests of the jujube SSR primers showed that 35 of 64 SSRs could be transferred across family boundaries. Clustering of 15 species based on jujube SSR primers was highly consistent with the Angiosperm Phylogeny Group (APG III) system. The genome-wide characterization of SSRs in Chinese jujube is very valuable for whole-genome characterization and marker-assisted selection in jujube breeding. In addition, the transferability of jujube SSR primers could provide a solid foundation for their further utilization.


Introduction
Microsatellites, or simple sequence repeats (SSRs), are tandem iterations of 1- to 6-bp nucleotide motifs. These sequences have been detected in the genomes of numerous organisms and are distributed throughout the entire genome in both coding and non-coding regions [1,2]. Given their many desirable attributes, including wide genomic distribution, co-dominant inheritance, multi-allelic nature, and a high level of polymorphism, SSRs are highly favored molecular markers [3][4][5]. SSRs are also very useful in genetic analysis, molecular marker-assisted breeding, genetic mapping, and varietal identification [6][7][8]. Moreover, SSRs are easily assayed by PCR [9]. As next-generation sequencing technology has developed, de novo genome sequencing has greatly accelerated SSR discovery, and numerous additional SSR loci can now be identified using genome-wide sequence analysis.
The Chinese jujube (Ziziphus jujuba Mill.), which belongs to the Rhamnaceae family, originated in China and has been cultivated for more than 7,000 years [10]. This tree has been introduced into approximately 50 countries throughout the world, including Japan, Korea, India, Tunisia, Italy, the United States, and Australia. Approximately 900 Chinese jujube cultivars are available [11]. Previously reported jujube SSR markers have proved very useful in the genetic analysis of jujube and wild jujube [12][13][14], and more SSR markers should be developed. In addition, no reports have characterized SSRs throughout the Chinese jujube genome.
We recently sequenced the genome of the Chinese jujube de novo [15]. The objectives of this study included the following: (a) to perform genome-wide characterization of SSRs in the jujube genome, (b) to develop and evaluate jujube SSR primers, and (c) to determine the transferability of jujube SSR primers to a wide range of angiosperm families. To our knowledge, this is the first report characterizing genome-wide SSRs in the Chinese jujube and the transferability of jujube SSR primers. This study will provide a foundation for the further utilization of jujube SSR primers.

Characterization of jujube SSRs
Using the MISA program to analyze 396.18 Mb (approximately 90.00%) of the estimated jujube genome [13], 70.83% of the 3,027 scaffold sequences were found to contain SSR loci. A total of 436,676 SSR loci were identified, with an average distance of 0.93 Kb between the loci. Over two-thirds (67.62%) of the scaffold sequences contained more than one SSR. Among the 480 types of motifs that were identified, mononucleotide and dinucleotide repeats were the most common in the intronic, UTR and non-genic regions, and trinucleotide repeats were the most prevalent type in the exonic region (S1 Table). Among the different types of repeats, mononucleotide repeats (283,301) were the most common, accounting for 64.87% of all repeats, followed by dinucleotides (24.40%), trinucleotides (8.74%), tetranucleotides (1.64%), pentanucleotides (0.21%), and hexanucleotides (0.14%).
Numerous SSRs were identified in the jujube genome, and their basic characteristics were consistent with those of many other plant genomes, such as apple [16] and grape [17]. Species with a large number of short repeat-type SSR loci generally exhibit a higher genomic mutation rate [18][19][20]. The high proportion of short repeat-type SSR loci in the jujube genome suggests that this genome has a long evolutionary history or a high mutation rate.
The mononucleotide repeats exhibited a strong bias toward A/T motifs (98.48%) compared with C/G repeats (Table 1). The AT/AT motif (74.33% of all dinucleotide repeats) was the most common dinucleotide type, whereas CG/CG was present at very low levels (0.02%). Among the other types of repeats, the most prevalent were AAT/ATT (64.17%), AAAT/ATTT (79.45%), AAAAT/ATTTT (51.23%), and AAAAAG/CTTTTT (52.96%). Jujube SSR repeat motifs exhibited a preference for A and T, which is consistent with the results from a previous study of a small region of the jujube genome (8.4 Mb) [21]. That study also indicated that hexanucleotide repeats were the most abundant, which is inconsistent with our results. This difference suggests that whole-genome sequencing is necessary for SSR characterization.
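Motif classes such as A/T, AG/CT, or AAT/ATT pair a motif with its reverse complement, and any cyclic rotation of either strand belongs to the same class (TA and AT describe the same repeat read from a different starting base). The following is a minimal sketch of that grouping, not the code used in this study; the function names are illustrative, and labelling each class by its alphabetically smallest member is an assumption chosen because it reproduces the labels used in the text.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def rotations(motif):
    """All cyclic rotations of a motif, e.g. 'AAT' -> ['AAT', 'ATA', 'TAA']."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class(motif):
    """Label a motif by its class, written 'forward/reverse' (e.g. 'AG/CT').

    Assumption: the class representative is the alphabetically smallest
    string among all rotations of the motif and of its reverse complement.
    """
    motif = motif.upper()
    forward = min(rotations(motif) + rotations(reverse_complement(motif)))
    return forward + "/" + reverse_complement(forward)


for example in ["T", "TA", "GA", "TTA", "CTTTTT"]:
    print(example, "->", motif_class(example))
```

Under this convention the motifs TA, GA, and TTA fall into the AT/AT, AG/CT, and AAT/ATT classes, respectively.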

Comparison of genomic SSRs from jujube with other species
The jujube genome is smaller than the apple, pear, and grape genomes, but it contains more SSRs (S2 Table). In this comparison, from which mononucleotide repeats were excluded, the average distance between SSRs varied among the species, with the smallest distance (2.65 Kb) noted in jujube and the largest (7.52 Kb) in apple (S2 Table). The jujube genome exhibited the highest SSR density (387 SSRs/Mb), followed by mulberry (281 SSRs/Mb), peach (219 SSRs/Mb), and Prunus mume (211 SSRs/Mb). Overall, the jujube genome contains significantly more SSR loci than the other seven species.
The predominant SSR motifs differ among the different species (S3 Table). AT/AT was the primary dinucleotide motif in jujube, grape, and mulberry, whereas the AG/CT motif was the most common in peach, strawberry, and Prunus mume. In addition, a large proportion of both motifs were noted in the apple and pear genomes. The predominant trinucleotide repeats included AAT/ATT in jujube, grape, and mulberry; AAG/CTT in strawberry; AAC/GTT and AAG/CTT in apple; and AAT/ATT and AAG/CTT in pear, peach, and Prunus mume. Among the tetranucleotide repeats, AAAT/ATTT was the most common motif in all tested species. Although the most common SSR motifs varied among species, all of the dominant repeat motifs were A/T rich, which is consistent with previous results regarding eukaryotic SSR loci [18]. This phenomenon might result from the conversion of cytosine to thymine [22]. In monocotyledons, CCG/CGG is the primary trinucleotide repeat motif, but this motif is very rare in dicotyledons [23]. It is possible that trinucleotide repeats have a high GC content in monocotyledons [24]; alternatively, this phenomenon is potentially caused by a preference for certain bases [25].
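The density reported for jujube can be related to the counts given elsewhere in the text. The sketch below assumes that density is the number of non-mononucleotide SSRs divided by the analyzed assembly size (396.18 Mb); that the published value was derived in exactly this way is an assumption, although the arithmetic reproduces 387 SSRs/Mb.

```python
def ssr_density_per_mb(ssr_count, assembly_mb):
    """SSR loci per megabase of analyzed sequence."""
    return ssr_count / assembly_mb


total_ssrs = 436_676        # all SSR loci identified in jujube
mononucleotide = 283_301    # excluded from the cross-species comparison
non_mono = total_ssrs - mononucleotide

# Assumption: density is computed over the 396.18 Mb of analyzed sequence.
density = ssr_density_per_mb(non_mono, 396.18)
print(non_mono, round(density))
```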

Screening of jujube SSR primers
In total, 283,301 mononucleotide repeats were excluded, and the remaining 153,375 SSRs were used to design primers. Primer pairs were successfully designed for 78,928 SSRs. Selecting the primer pairs with product sizes between 120 and 280 bp reduced this number to 46,314. For SSR loci with repeats > 60 bp, the amplification efficiency of the primers and the proportion of polymorphisms were significantly reduced compared with those for SSR loci ≤ 60 bp (Table 2). Theoretically, longer SSR loci are more polymorphic, but our results indicate that SSR locus length is not positively correlated with the proportion of polymorphisms. This result is supported by previous studies, which also reported that the proportion of polymorphisms does not necessarily rise with increasing SSR locus length [12,26,27]. We therefore focused on SSR loci with repeats ≤ 60 bp, and 30,565 primer pairs were retained. Among them, dinucleotide and trinucleotide repeats were the most common, accounting for 81.48% and 13.39% of the amplicons, respectively. Furthermore, 1,000 primer pairs were randomly selected from these 30,565 pairs (S4 Table) and were evaluated using 6 jujube cultivars and wild types. Of the initial 1,000 primer pairs, 725 were shown to be effective, and 511 were polymorphic (S5 Table). The most common motifs were dinucleotide (243 primers) and trinucleotide repeats (232 primers).

Table 1. Number of SSR loci by motif and number of repeat units.
Motif   5  6  7      8      9      10     11     12     13     14     15     >15    Total
A/T     -  -  -      -      -      69067  39804  30597  25125  21449  18879  74082  279003
C/G     -  -  -      -      -      676    510    446    414    337    305    1610   4298
AT/AT   -  -  10521  12065  14460  14498  11745  7287   3854   2016   1031   1715   79192
AG/CT   -  -  3253   2604   2250   2065   1651   1384   1144   1013   763    203    19583
AC/GT   -  -  1391   1201   1010   846    662    533    422    342    262    1068   7737
doi:10.1371/journal.pone.0127812.t001

Primers for different types of motifs would be expected to differ in the proportion of polymorphisms and in amplification efficiency. Because only a limited number of tetranucleotide, pentanucleotide, and hexanucleotide repeats were identified, we focused on the amplification effects of primers for dinucleotide and trinucleotide repeats. A higher proportion of effective primers was observed for dinucleotide repeats, and a higher proportion of polymorphic amplicons for trinucleotide repeats (Table 2). The lower proportion of polymorphisms in dinucleotide repeats suggests that this type of SSR locus is more highly conserved in the genome, which may help maintain species stability. The higher polymorphic rate in trinucleotide repeats was similar to that in a previous study [13]. It might be attributed to the fact that three bases code for one amino acid, so that changes in trinucleotide repeat number preserve the reading frame, whereas frameshift mutations could effectively prevent other SSR loci from changing. From the perspective of SSR evolutionary analysis, these changes in the SSR loci could be explained by a stepwise mutation model (SMM), under which the length of an SSR locus changes by the addition or removal of whole repeat units of the motif rather than of single bases [28]. This view is consistent with the sequencing results (Fig 1).
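The successive screening steps described in this section (exclusion of mononucleotide loci, a product-size window of 120 to 280 bp, a repeat length of at most 60 bp, and a random draw of primer pairs for testing) can be sketched as a simple filter. This is an illustrative sketch only: the PrimerPair record, its field names, and the example loci are hypothetical, and treating the size limits as inclusive is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerPair:
    """Hypothetical record for one designed primer pair (illustrative fields)."""
    locus_id: str
    motif_length: int      # 2 = dinucleotide, 3 = trinucleotide, ...
    ssr_length_bp: int     # total length of the repeat tract
    product_size_bp: int   # expected amplicon size


def screen(pairs, min_product=120, max_product=280, max_ssr_bp=60):
    """Keep non-mononucleotide loci that pass the amplicon and repeat-length limits."""
    return [
        p for p in pairs
        if p.motif_length >= 2
        and min_product <= p.product_size_bp <= max_product
        and p.ssr_length_bp <= max_ssr_bp
    ]


def sample_for_testing(pairs, n, seed=0):
    """Draw up to n primer pairs at random; a fixed seed keeps the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(pairs, min(n, len(pairs)))


candidates = [
    PrimerPair("locus_a", 2, 24, 180),   # passes every filter
    PrimerPair("locus_b", 1, 14, 150),   # mononucleotide: excluded
    PrimerPair("locus_c", 3, 21, 300),   # amplicon too long
    PrimerPair("locus_d", 2, 72, 200),   # repeat tract longer than 60 bp
    PrimerPair("locus_e", 3, 18, 120),   # passes (limits treated as inclusive)
]
kept = screen(candidates)
print([p.locus_id for p in kept])
```

In the study, the corresponding draw was 1,000 pairs from the 30,565 that passed screening.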
The sequencing and alignment of the jujube SSR alleles verified the presence of SSR loci and revealed a high degree of conservation of the regions flanking the SSRs (Fig 1). The sequencing results also indicated that the loci had diverse alleles, which may contribute to the diversity of the jujube germplasm. The allelic diversity was mainly due to variations in microsatellite repeat lengths combined with point mutations within the flanking regions.

Polymorphic verification of jujube SSR markers
Among the 511 polymorphic SSR primers, 16 pairs were randomly selected and used to analyze the genetic relationships among 20 jujube cultivars (Fig 2, Table 3). The primers amplified 68 polymorphic bands, an average of 4.25 polymorphic fragments per primer pair. The polymorphism information content (PIC) values of the loci were between 0.51 and 0.72, with an average of 0.61. Loci with PIC > 0.5 are considered highly polymorphic [29]; therefore, all 16 primer pairs were highly informative. The genetic relationships among the 20 jujube cultivars were represented in a dendrogram constructed using Numerical Taxonomy System of Multivariate Programs (NTSYS) cluster analysis (Fig 3). Both 'Daliganweibazao' and 'Daliyuanzao' are from Dali County, and these cultivars exhibit similar botanical characteristics and fruit shapes. The two cultivars were not separated in the dendrogram, indicating that they are either the same cultivar or very closely related. The genetic similarity coefficient between 'Zunyitianzao' (from Southern China) and the other 19 cultivars (from Northern China) was quite low, which is consistent with their distant geographical locations. A related study of sour jujube populations also provided valuable information about genetic diversity and geographical distances [14].
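For readers unfamiliar with PIC, the sketch below computes it from allele frequencies using the widely used formula of Botstein et al., PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. That this exact formula underlies the values reported here is an assumption, and the allele frequencies in the example are invented for illustration.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    allele_freqs: frequencies of the alleles at the locus; they must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1 - homozygosity - cross_term


# Hypothetical loci: two equally frequent alleles versus four.
print(pic([0.5, 0.5]))                 # below the 0.5 threshold
print(pic([0.25, 0.25, 0.25, 0.25]))   # above the 0.5 threshold
```

With equally frequent alleles, a two-allele locus cannot exceed the 0.5 threshold, whereas a four-allele locus can, which is why multi-allelic SSR loci tend to score as highly polymorphic.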

The transferability of jujube SSR primers
To test the transferability of the primers to 15 species from 8 angiosperm families, 64 pairs of jujube SSR primers (S6 Table) were randomly selected from the 1,000 pairs mentioned above, and 35 pairs of SSR primers yielded polymorphisms (S7 Table, Fig 4). [...] 29.69% (19/64) of the 64 primers successfully produced amplification products in the Rosaceae family and the Vitales, Brassicales, Solanales, Malvales, Laurales, and Poales orders, respectively. The 15 tested species grouped into two main clusters, one with the 13 dicotyledonous species and the other with the two monocotyledonous species (Fig 5). The clustering results indicated that wild jujube was closely related to the 4 species in the Rosaceae family. Moreover, the similarity coefficient between jujube and peach (0.69) was higher than that between jujube and the other species of Rosaceae. This result indicated that the jujube has a closer relationship with the peach, which is supported by genomic data analysis [15]. The results were also highly consistent with the widely accepted Angiosperm Phylogeny System (http://www.mobot.org/MOBOT/Research/APweb/welcome.html) and provided evidence supporting the transferability of the jujube SSR primers.
[Figure caption fragment: "... Table 3, and the 20 jujube cultivars are listed in Table 4."]
SSR primers from the apple genome are transferable to pear; both species belong to the Rosaceae family [16,30]. Fan et al. verified the transferability of SSR primers from the pear genome to other species in the Rosaceae family [31]. The transferability of SSR primers to other species, such as loquat [32], grape [33], strawberry [34], citrus [35], sweet cherry [36], and lychee [37], has also been reported. However, the transferability of plant SSR primers is typically studied within a single family. Our study was among the first to demonstrate the transferability of SSR primers across families, which provides evidence for the wider application of plant SSR primers.

Conclusion
In this study, we conducted genome-wide characterization of SSRs in jujube and used SSR markers to determine the transferability of jujube SSR primers to a wide range of angiosperm families. By analyzing the distribution of SSRs in the jujube genome and comparing the SSR pattern among jujube and other related species, we concluded that the jujube genome is significantly enriched for SSR loci compared with seven other species. Jujube SSR primers are valuable for marker-assisted selection in breeding, and their wide transferability would also provide a foundation for their further utilization.

[...] (Table 4) were used to verify the efficiencies of the primarily screened SSR primers. A total of 15 angiosperm species from 8 families and 7 orders (Table 5) were used to explore the transferability of the jujube SSR primers. All leaf [...]

DNA extraction and analysis
Genomic DNA was extracted from young leaves of different jujube cultivars using an improved cetyltrimethyl ammonium bromide (CTAB) method [38]. After extraction, 5-10 μl of DNA solution was loaded on a 1.0% agarose gel to assess the sample quality. Then, the DNA quality and concentration were further assessed using a NanoDrop2000.

SSR identification and primer design
Identification and localization of the SSRs in the Chinese jujube genome were performed using the MISA (MIcroSAtellite) software, a Perl script. The following minimum search criteria were applied: 10 repeat units for mononucleotides, 7 repeat units for dinucleotides, and 5 repeat units for tri-, tetra-, penta-, and hexanucleotides. Interrupted compound SSRs were also selected when the bases interrupting the two SSRs were 10 repeat units. Primer pairs were
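To make the repeat-number thresholds concrete, the following sketch scans a sequence for perfect SSRs using the same minimum repeat counts (10 for mononucleotides, 7 for dinucleotides, 5 for tri- to hexanucleotides). It is a simplified stand-in written for illustration, not MISA itself: it does not handle compound or interrupted SSRs, and the function names and the test sequence are hypothetical.

```python
import re

# Minimum number of repeat units per motif length (from the search criteria).
MIN_REPEATS = {1: 10, 2: 7, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. not 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each perfect SSR, sorted by start (0-based)."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One copy of the motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while pos < len(seq):
            match = pattern.match(seq, pos)
            if match and is_primitive(match.group(1)):
                hits.append((match.start(), match.group(1), len(match.group(0)) // unit_len))
                pos = match.end()
            else:
                pos += 1
    return sorted(hits)


# Hypothetical test sequence: (A)12, (AT)8 and (AAG)5 separated by short spacers.
demo = "GGC" + "A" * 12 + "CGC" + "AT" * 8 + "GCC" + "AAG" * 5 + "TTC"
for start, motif, count in find_ssrs(demo):
    print(start, motif, count)
```

The primitive-motif check prevents a run such as (A)12 from also being reported as a dinucleotide repeat of AA.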

Comparison of genomic SSRs between jujube and other plants
Genomic data from apple [39], pear [40], peach [41], strawberry [42], Prunus mume [43], mulberry [44], and grape [45] were downloaded from the NCBI database. The genomic SSRs of other species were searched using the same criteria as employed for Chinese jujube and were then compared with the SSRs from the jujube genome. Mononucleotides were generally not very informative [46] and thus were not included in this analysis.

Polymerase chain reaction and fragment analysis
Polymerase chain reaction (PCR) was performed in a total volume of 12.5 μl containing 0.5 μl of 50 ng/μl genomic DNA, 6.3 μl of 2×Taq Master Mix (CWBIO), and 0.5 μl of 10 μmol/L each of forward and reverse primers. The reactions were performed using the following conditions: 94°C for 3 min; 30 cycles of 94°C for 30 s, 50-60°C for 30 s, and 72°C for 30 s; and a final step at 72°C for 10 min. Then, 3 μl of the PCR product and a 50-bp molecular size marker were loaded onto an 8% denaturing polyacrylamide (PAGE) gel in 1×TBE buffer, run at 200 V, and visualized using silver staining. SSR analysis was performed at least twice to confirm primer amplification.
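The reaction composition above implies a few derived quantities that are easy to check. The short calculation below uses only the stated volumes and stock concentrations; treating the unaccounted-for volume as water is an assumption, because the text does not say what makes up the remainder.

```python
REACTION_VOLUME_UL = 12.5

# Volumes stated in the protocol (microlitres per reaction).
components_ul = {
    "genomic DNA, 50 ng/ul": 0.5,
    "2x Taq Master Mix": 6.3,
    "forward primer, 10 umol/L": 0.5,
    "reverse primer, 10 umol/L": 0.5,
}

# Assumption: the remaining volume is made up with water.
water_ul = round(REACTION_VOLUME_UL - sum(components_ul.values()), 2)

dna_ng = 0.5 * 50                                      # template mass per reaction
primer_final_umol_per_l = 0.5 * 10 / REACTION_VOLUME_UL  # each primer after dilution

print(water_ul, dna_ng, primer_final_umol_per_l)
```

Each reaction therefore contains 25 ng of template DNA and each primer at a final concentration of 0.4 μmol/L.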

Sequencing of SSR PCR products for SSR locus verification
To verify both the presence and allelic variation of the SSR loci, PCR amplification products from two primer pairs (JSSR293 and JSSR314) were resolved on a 2.0% agarose gel and purified using the SanPrep Column DNA Gel Extraction Kit. The products were ligated into the pMD19-T vector and sequenced by Sangon Biotech.

Data analysis
Only the fragments that could be clearly scored were used in the data analysis. The genotypic data were analyzed using the unweighted pair-group method with arithmetic averaging (UPGMA) clustering using NTSYS [47]. The numbers of alleles per locus and PIC values were calculated [48].
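The clustering step can be illustrated with a small, self-contained UPGMA implementation applied to scored bands (1 = present, 0 = absent). This is a sketch rather than a reproduction of the NTSYS analysis: the simple-matching distance is an assumption, because the similarity coefficient is not named in the text, and the four band profiles are invented.

```python
from itertools import combinations


def simple_matching_distance(a, b):
    """1 minus the fraction of band positions at which two 0/1 profiles agree."""
    return 1 - sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """Cluster samples by UPGMA; returns a list of (cluster_a, cluster_b, distance) merges.

    profiles: dict mapping sample name -> list of 0/1 band scores of equal length.
    """
    sizes = {name: 1 for name in profiles}
    dist = {
        frozenset((a, b)): simple_matching_distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    merges = []
    while len(sizes) > 1:
        closest = min(dist, key=dist.get)
        a, b = sorted(closest, key=repr)          # fixed order for repeatable labels
        merged = (a, b)
        merges.append((a, b, dist[closest]))
        # Average linkage: weight each old distance by the size of its cluster.
        updated = {}
        for other in sizes:
            if other == a or other == b:
                continue
            d_a = dist[frozenset((a, other))]
            d_b = dist[frozenset((b, other))]
            updated[frozenset((merged, other))] = (
                sizes[a] * d_a + sizes[b] * d_b
            ) / (sizes[a] + sizes[b])
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(updated)
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return merges


# Hypothetical band profiles for four samples scored at six band positions.
bands = {
    "cv1": [1, 1, 0, 1, 0, 1],
    "cv2": [1, 1, 0, 1, 0, 1],
    "cv3": [1, 0, 0, 1, 1, 1],
    "cv4": [0, 0, 1, 0, 1, 0],
}
for left, right, height in upgma(bands):
    print(left, right, round(height, 3))
```

Samples with identical band profiles merge at distance zero and cannot be separated in the resulting dendrogram.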
Supporting Information S1