Tibetan annual wild barley is rich in genetic variation. This study aimed to develop new SSRs by data mining for genetic diversity and phylogenetic analysis of wild barley. We developed 49 novel EST-SSRs and confirmed 20 genomic SSRs in 80 Tibetan annual wild barley and 16 cultivated barley accessions. A total of 213 alleles were generated from 69 loci with an average of 3.14 alleles per locus. Trimeric repeats were the most abundant motifs (40.82%) among the EST-SSRs, while the majority of the genomic SSRs were dinucleotide repeats. The polymorphic information content (PIC) ranged from 0.08 to 0.75 with a mean of 0.46, and the expected heterozygosity (He) ranged from 0.0854 to 0.7842 with an average of 0.5279. Overall, the polymorphism of genomic SSRs was higher than that of EST-SSRs. Furthermore, the number of alleles and the PIC of wild barley were both higher than those of cultivated barley (3.12 vs 2.59 and 0.44 vs 0.37, respectively), indicating that more polymorphism existed in Tibetan wild barley than in cultivated barley. The 96 accessions were divided into eight subpopulations based on the 69 SSR markers, and the cultivated genotypes could be clearly separated from the wild barleys. A total of 47 SSR-containing EST unigenes showed significant similarities to known genes. These EST-SSR markers have potential for application in germplasm appraisal, genetic diversity and population structure analysis, facilitating marker-assisted breeding and crop improvement in barley.
Citation: Zhang M, Mao W, Zhang G, Wu F (2014) Development and Characterization of Polymorphic EST-SSR and Genomic SSR Markers for Tibetan Annual Wild Barley. PLoS ONE 9(4): e94881. https://doi.org/10.1371/journal.pone.0094881
Editor: Xianlong Zhang, National Key Laboratory of Crop Genetic Improvement, China
Received: January 19, 2014; Accepted: March 19, 2014; Published: April 15, 2014
Copyright: © 2014 Zhang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The project was supported by National Natural Science Foundation of China (31171488), the National 863 Program (2012AA101105), and the Key Research Foundation of Science and Technology Department of Zhejiang Province of China (2012C12902-2). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Barley (Hordeum vulgare L.) is the fourth most important cereal crop worldwide. With the rapid development of the beer and feed industries, the demand for barley keeps increasing. However, during the long-term domestication of cultivated barley, especially after modern breeding and intensive cultivation, genetic variation has declined significantly, resulting in the loss of many genes, including some rare alleles. The narrow genetic background of cultivated barley has become a bottleneck for effective breeding, while the abundant diversity of wild barley can provide a pool of alleles for barley breeding and improvement. Morphological, archaeological, cytogenetic and isozyme data revealed that wild barley on the Qinghai-Tibet Plateau is different from the Fertile Crescent wild barley. Research so far has shown even richer genetic diversity in Tibetan wild barley than in Ethiopian barley. Novel germplasm tolerant to drought, salinity and aluminum toxicity has been identified from the Tibetan wild barley.
Increasing the number of efficient molecular markers would be valuable for diversity analyses, resource conservation and the exploitation of beneficial alleles in wild barley. Comprehensive sets of expressed sequence tag (EST) sequences have been generated in many plants (http://www.ncbi.nlm.nih.gov/dbEST). The growing availability of sequence databases enables the identification of functional genes with similar sequences in related species. EST-based SSR markers (EST-SSRs) have been widely employed as powerful molecular genetic tools in a large number of cereal crop species due to their high level of transferability, close association with genes of known function, codominant inheritance, and low development cost, since they can be mined from public databases. Jaikishan et al. used 25 EST-SSRs and 25 genomic SSRs to predict grain yield heterosis; multiple EST-SSRs were generated for wheat (Triticum aestivum L.), and these markers showed high transferability between wheat and other crops, such as barley, maize, rice, and sorghum. To date, polymorphic EST-SSRs have been identified to establish the evolutionary relationships of Hordeum chilense, and new EST-SSRs and genomic SSRs have been added to the published Australian barley genetic maps. However, to our knowledge, little work has been performed to develop EST-SSRs and apply them to population structure analysis in Tibetan wild barley.
In the present study, with the objective of exploiting new SSRs from EST databases and confirming published genomic SSRs in Tibetan wild and cultivated barley accessions, 49 EST-SSRs and 20 genomic SSRs were developed and characterized. These markers can be utilized to evaluate the genetic variation and phylogenetic relationships of 96 barley genotypes. Furthermore, polymorphism and genetic diversity in the Tibetan wild barley accessions were evaluated, which would be particularly useful for the identification of novel genes for traits of interest and for marker-assisted breeding in barley.
Materials and Methods
A total of 96 barley accessions were used in this study, including 80 Tibetan annual wild barley accessions from the Qinghai-Tibet Plateau, provided by the Huazhong Agricultural University barley germplasm collection, and 16 cultivars from China, which were stored at the Institute of Crop Science, Zhejiang University, Hangzhou, China (Table S1). These accessions were collected on public land, and no specific permits were required for the collection. Seeds were surface-sterilized with 3% H2O2 for 30 minutes and thoroughly rinsed with distilled water, followed by germination in nutrient-rich soil in an incubator (22/18°C, day/night) for 10 days. Total genomic DNA was extracted from barley leaves using the Plant Genomic DNA Kit (TianGen, Beijing, China).
Sequence screening and primer designing
A total of 525,999 barley ESTs were acquired from the EST database of GenBank (up to September 2012) (http://www.ncbi.nlm.nih.gov/Genbank/). Redundant sequences were removed from these ESTs using CD-HIT-EST (http://cd-hit.org) with an identity parameter of 95%. The presence of SSRs was screened using the Simple Sequence Repeat Identification Tool (SSRIT) software (http://www.gramene.org/gramene/searches/ssrtool). The minimum numbers of repeat units for di-, tri-, tetra-, and penta-nucleotide motifs were 10, 7, 5, and 4, respectively. A total of 188 EST-SSRs were randomly selected, and primers were designed using Primer5.0 with primer lengths of 18–22 bp and product sizes of 100 to 300 bp. The reverse primer of each pair was labeled with 6-FAM or HEX fluorescent dye at the 5′ end. Based on previous studies of barley, 41 genomic SSR markers were selected, and their primers were designed with the same criteria as mentioned above.
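The repeat-unit thresholds described above can be illustrated with a regular-expression scan. This is a minimal sketch in Python and not the SSRIT implementation; the function name `find_ssrs` and the example sequence are hypothetical, and SSRIT may differ in how it reports overlapping or compound repeats.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text:
# di- >= 10, tri- >= 7, tetra- >= 5, penta- >= 4.
MIN_REPEATS = {2: 10, 3: 7, 4: 5, 5: 4}


def find_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) for SSRs in seq.

    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{n}) captures one motif; \1{m,} requires m further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are a shorter unit repeated (e.g. AAAA or AGAG),
            # so a dinucleotide run is not also reported as a tetranucleotide.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical EST fragment containing (AG)10 and (CTT)7.
example = "GGCC" + "AG" * 10 + "TTCA" + "CTT" * 7 + "GG"
ssrs = find_ssrs(example)
```

Each hit gives the 0-based start position, the repeat motif, and the number of repeat units, which is enough to choose flanking regions for primer design.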
PCR amplification and sequencing
PCR amplification was performed in a total of 20 µL reaction mixture that contained 1 µL of genomic DNA, 1 U ExTaq DNA polymerase (Takara Inc.), 2 µL of 10×Ex Taq Buffer (Mg2+ Plus), 0.2 mM dNTPs mix, 0.05 µM forward primers, 0.1 µM reverse primers and fluorescent primers (FAM or HEX).
The PCR protocol used was as follows: initial denaturation for 5 min at 94°C, followed by 5 cycles of denaturation for 30 s at 94°C, annealing for 30 s at 50°C, and extension for 30 s at 72°C, subsequently followed by 32 cycles of denaturation for 30 s at 94°C, annealing for 30 s at 55°C, extension for 30 s at 72°C, with a final extension for 10 min at 72°C and a 4°C holding temperature. PCR products were diluted and tested on a MegaBACE 1000 DNA analysis system (Amersham Biosciences, Piscataway, NJ) at the Center of Analysis and Measurement in Zhejiang University. The lengths of PCR fragments were calculated using the ET550-R size standard and Genetic Profiler version 2.2.
Calculation of polymorphism
EST- and genomic SSR alleles were scored as present (1) or absent (0) in the 96 accessions. Alleles with a frequency of less than 5% in the population (rare alleles) were treated as missing data for the polymorphism calculation and population structure analysis. Genetic diversity was evaluated by the number of alleles (Na), the effective number of alleles (Ne), observed heterozygosity (Ho), and expected heterozygosity (He) using POPGENE v.1.31. Polymorphism information content (PIC) was calculated using PIC_CALC version 0.6.
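For readers unfamiliar with these statistics, the sketch below computes them for one locus from the standard textbook definitions: He = 1 − Σp², Ne = 1/Σp², and PIC = 1 − Σp² − ΣΣ 2p_i²p_j² (i < j). It assumes codominant diploid genotypes given as allele-size pairs, which is an assumption of this example and not the scoring format described above. The function names are hypothetical, and POPGENE or PIC_CALC may apply sample-size corrections that are not included here.

```python
def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes such as [(152, 152), (152, 158)].

    Genotypes containing None (missing data) are skipped.
    """
    counts = {}
    for a, b in genotypes:
        if a is None or b is None:
            continue
        for allele in (a, b):
            counts[allele] = counts.get(allele, 0) + 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no scored genotypes at this locus")
    return {allele: n / total for allele, n in counts.items()}


def diversity_stats(genotypes):
    """Return Na, Ne, Ho, He and PIC for one locus (uncorrected estimators)."""
    freqs = list(allele_frequencies(genotypes).values())
    scored = [(a, b) for a, b in genotypes if a is not None and b is not None]
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity                      # expected heterozygosity
    ne = 1.0 / homozygosity                      # effective number of alleles
    ho = sum(1 for a, b in scored if a != b) / len(scored)
    # Correction term of the PIC formula, summed over allele pairs i < j.
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return {"Na": len(freqs), "Ne": ne, "Ho": ho, "He": he, "PIC": he - cross}


# Hypothetical locus: two alleles at equal frequency, no heterozygotes.
stats = diversity_stats([(152, 152), (152, 152), (158, 158), (158, 158)])
```

A locus with low Ho relative to He, as in this toy example, is what one would expect in a predominantly self-pollinating species such as barley.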
Population structure was assessed using the STRUCTURE software v2.3.3 based on the admixture model. Models were tested for numbers of clusters (k) from 1 to 15, each with ten independent runs and 100,000 MCMC (Markov Chain Monte Carlo) iterations. The most likely number of clusters (k) was indicated by Δk, a statistic based on the rate of change of the estimated log probability of the data (LnP[D]) between successive values of k.
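The Δk statistic can be sketched as follows. This is an illustrative Python version of the Evanno-style calculation, Δk = |L(k+1) − 2L(k) + L(k−1)| / sd(L(k)), computed here from the mean LnP(D) across runs and the sample standard deviation at each k. The function name and the LnP(D) values are hypothetical, and published tools may differ in details such as the standard deviation estimator.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_runs):
    """Compute delta-k for each k that has both neighbours k-1 and k+1.

    lnp_runs maps k -> list of LnP(D) values from independent STRUCTURE runs.
    """
    delta = {}
    for k in sorted(lnp_runs):
        if k - 1 not in lnp_runs or k + 1 not in lnp_runs:
            continue  # delta-k is undefined at the smallest and largest k
        sd = stdev(lnp_runs[k])
        if sd == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_diff = abs(mean(lnp_runs[k + 1])
                          - 2 * mean(lnp_runs[k])
                          + mean(lnp_runs[k - 1]))
        delta[k] = second_diff / sd
    return delta


# Hypothetical LnP(D) values, two runs per k (not data from this study).
runs = {
    1: [-1000.0, -1002.0],
    2: [-800.0, -804.0],
    3: [-780.0, -782.0],
    4: [-775.0, -779.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

Because the statistic needs both neighbouring values, testing k from 1 to 15 yields Δk only for k = 2 to 14.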
Gene function blast
Unigene sequences associated with the polymorphic EST-SSRs were searched against the GenBank non-redundant (nr) protein database using BLASTX (http://www.ncbi.nlm.nih.gov/BLAST) with an expected value (E-value) threshold of 10^-10 to infer the functions of the polymorphic EST-SSRs.
Characterization of polymorphic SSRs
In total, 69 SSR primer pairs, including 49 EST-SSRs (26% of 188) and 20 genomic SSRs (49% of 41) (Tables 1 and 2), showed polymorphism among the 96 accessions. A total of 213 alleles were generated from 69 loci with an average of 3.14 alleles per locus. The EST-SSR repeat motifs were not equally distributed: di-, tri-, tetra-, and penta-nucleotides accounted for 16.32%, 40.82%, 26.53%, and 16.32%, respectively, whereas most of the selected genomic SSRs were composed of dinucleotide repeats. According to the POPGENE results for the 69 SSRs, the observed number of alleles per locus (Na) ranged from 2 to 6 (mean = 3.14) and the effective number of alleles per locus (Ne) varied from 1.09 to 4.54 (mean = 2.30). The average Na was 3.12 and 2.59 for wild and cultivated barley, respectively (Table 3). In addition, the polymorphic information content (PIC) ranged from 0.08 to 0.75 with a mean of 0.46, and the PIC of wild barley was higher than that of cultivars (0.44 vs 0.37). The expected heterozygosity (He) ranged from 0.0854 to 0.7842 with an average of 0.5279, while the observed heterozygosity (Ho) ranged from 0 to 0.766 with an average of 0.1677. As an indicator of genetic diversity, the average He was 0.5098 in wild barley accessions and 0.4333 in cultivated accessions.
Gene functions of the 49 unigene sequences containing polymorphic EST-SSRs
The functions of the 49 polymorphic EST-SSRs were examined, and 47 unigenes showed significant similarities to known genes (Table 4), for instance, zinc finger protein MAGPIE, transcription factor LAF1, photosystem II reaction center PSB28 protein, xyloglucan endotransglycosylase (XET), and protein kinase APK1B. In addition, the results revealed that the largest share of the annotated proteins came from Triticum urartu (17, 36.2%), while Hordeum vulgare and Aegilops tauschii each accounted for the same percentage (11, 23.4%).
Population structure and genetic distance
To detect the population structure in the 96 barley genotypes, we ran the STRUCTURE program for Bayesian clustering analysis using the 69 SSR markers, assuming that the number of populations (K) ranged from 1 to 15. The highest Δk value was at K = 8 (Figure 1A), indicating that the most suitable number of subpopulations was eight. The membership frequency of each accession in each subpopulation is shown in Table S1. When the membership threshold was set at 0.5, only six accessions were defined as admixed. When the threshold was raised to 0.7, about 80% of the accessions could still be assigned to a subpopulation. The output of the structure analysis demonstrated that wild and cultivated barleys were assigned to different subpopulations (Figure 1B). Most of the cultivated barleys were classified into subpopulation 4, except for A74, Tadmor, B1342 and B1031. Fifty percent of the wild barley accessions studied were assigned to subpopulation 1.
Figure 1. Estimation of the likelihood of clusters (k) for the most appropriate subpopulations (Δk) (A), and the population structure of 96 barley accessions in k = 8 clusters (B).
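The threshold rule used to separate assigned from admixed accessions can be sketched as below. This is a minimal Python illustration. The function name and the membership vector are hypothetical, and treating a value equal to the threshold as assigned is an assumption, since the text does not state how ties with the threshold were handled.

```python
def assign_subpopulation(q_values, threshold=0.5):
    """Return the 1-based subpopulation with the largest membership
    coefficient, or None if that coefficient is below the threshold
    (the accession is then treated as admixed).
    """
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    if q_values[best] >= threshold:
        return best + 1
    return None


# Hypothetical membership coefficients for one accession at K = 8.
q = [0.62, 0.10, 0.08, 0.05, 0.05, 0.04, 0.03, 0.03]
at_050 = assign_subpopulation(q, threshold=0.5)
at_070 = assign_subpopulation(q, threshold=0.7)
```

The same accession can therefore count as assigned under the 0.5 threshold and as admixed under the stricter 0.7 threshold, which is why the proportion of assigned accessions falls as the threshold rises.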
Based on the genetic distances among the eight subpopulations, we constructed a dendrogram showing their genetic relationships by UPGMA clustering analysis (Figure 2). The dendrogram showed that subpopulation 3 was closest to the cultivated barleys (subpopulation 4), with a genetic distance of 132.188. Subpopulation 7 had the largest genetic distance (165.167) from the cultivated subpopulation.
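As a reminder of how UPGMA builds such a dendrogram, the sketch below repeatedly merges the two closest clusters and recomputes distances as size-weighted averages. It is a toy Python illustration with hypothetical labels and distances, not the software or the distance matrix used in this study. Merge heights are reported as half the distance between the merged clusters, one common convention.

```python
def upgma(distances):
    """Run UPGMA on a dict {(a, b): distance} covering every pair of labels.

    Returns the merges in order as (cluster_a, cluster_b, height).
    """
    d = {frozenset(pair): dist for pair, dist in distances.items()}
    sizes = {name: 1 for pair in d for name in pair}
    merges = []
    while len(sizes) > 1:
        closest = min(d, key=d.get)          # pair with the smallest distance
        a, b = sorted(closest)
        height = d[closest] / 2
        new = "(%s,%s)" % (a, b)
        # Distance from the merged cluster to each remaining cluster is the
        # average of the old distances, weighted by cluster size.
        for other in sizes:
            if other in closest:
                continue
            d[frozenset((new, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])
        # Drop every distance that still refers to the two merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges


# Toy distance matrix with hypothetical labels and values.
toy = {("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 8.0}
tree = upgma(toy)
```

In the toy example, A and B merge first because they are closest, and C joins at the average of its distances to A and B.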
In recent years, different kinds of molecular markers have been used widely for applications including marker-assisted breeding, the study of genetic relationships between populations, and the screening of candidate genes associated with target traits. Simple sequence repeats (SSRs) are increasingly important due to their high polymorphism and convenient techniques. EST-SSRs have advantages over genomic SSRs because they are derived from transcribed sequences and are well suited to cross-species application. In the present study, we developed 49 EST-SSR and 20 genomic SSR markers for wild barley. These novel EST-derived markers will be a valuable resource for tagging and mapping genes related to agronomic and stress-resistance traits of interest. In addition, because they are genic, these markers are advantageous for identifying the functional diversity of uniquely adapted germplasm.
In many plants, di- and tri-nucleotide repeat motifs are the major types, but the predominant motifs differ among species. In our research, trimeric repeats were the most abundant motifs (40.82%), followed by tetrameric repeats (26.53%), and the dimeric and pentameric repeat motifs occurred at the same frequency (16.32%). The polymorphism of SSRs can be divided into three degrees: high (PIC>0.5), medium (0.5>PIC>0.25) or low (PIC<0.25). In our study, the genetic diversity of genomic SSRs was higher than that of the EST-SSRs, with mean PIC values of 0.57 (high) and 0.41 (medium), respectively, resulting in medium polymorphism overall (mean = 0.46). This finding was in line with previous results, and the lower level of polymorphism of EST-SSRs might be due to selection against variation in the conserved regions containing the EST-SSRs. Moreover, the expected heterozygosity at EST-SSRs was also not as high as that at genomic SSRs, ranging from 0.0854 to 0.697 vs 0.3899 to 0.7842. Pompanon et al. attributed heterozygosity deficiency to primer problems, the dropout of alleles, and the appearance of null alleles at the primer annealing sites.
Studies of genetic variation in barley have suggested that Tibetan wild barley shows higher polymorphism than cultivated barley. The results of our study were consistent with these previous studies. The number of alleles and the PIC of wild barley were both higher than those of cultivated barley (3.12 vs 2.59 and 0.44 vs 0.37, respectively). The expected heterozygosity (He) showed the same trend, with 0.5098 and 0.4333 for wild and cultivated barley, respectively. The rich genetic diversity in Tibetan wild barley may be a source of novel genes contributing to tolerance of biotic and abiotic stresses, which is important in barley breeding.
BLASTX analysis indicated that 47 (96%) of the 49 unigenes containing EST-SSRs could be matched to at least one protein in the NCBI nr protein database. In further studies, candidate genes of interest can be sought via association analysis, with reference to the functions of the markers in metabolic pathways. Furthermore, these EST-SSR markers can be used in comparative studies of related species, for example, Triticum urartu and Aegilops tauschii.
In the present investigation, the population structure analysis demonstrated that the developed EST-SSRs and genomic SSRs could clearly distinguish between the cultivated and wild barley genotypes. The 96 genotypes were divided into eight subpopulations. Subpopulation 3 (XZ161, XZ163, XZ165, XZ168) was most closely related to the cultivated barley (subpopulation 4), and subpopulation 7 (XZ120, XZ151, XZ153) and the cultivated barleys were the two most genetically distant populations. The genetic relationships among the subpopulations suggested that subpopulation 3 contained the most domesticated genotypes among the studied wild barley. Furthermore, the other subpopulations of wild barley, especially subpopulation 7, may be important germplasm resources for the improvement of cultivars tolerant to abiotic and biotic stresses. These results were consistent with recent clustering studies of Tibetan wild barley genotypes using DArT markers and SNPs. This indicates that cluster analysis using EST-SSR and SSR markers is an effective way to determine population structure and can provide a solid foundation for the study of genetic variation.
The 49 novel EST-SSRs and 20 genomic SSR markers characterized in 96 barley genotypes were polymorphic and could be employed to examine genetic diversity, evolution, linkage mapping, comparative genomics, and population structure. The Tibetan wild barley showed higher genetic variation than cultivated barley, and the cultivated subpopulation could be clearly separated from the wild barley. In further studies, these markers could be useful for identifying trait-marker associations of interest in marker-assisted breeding programs in barley.
Conceived and designed the experiments: FBW MZ. Performed the experiments: MZ. Analyzed the data: MZ WHM. Contributed reagents/materials/analysis tools: FBW GPZ WHM. Wrote the paper: MZ FBW GPZ WHM.
- 1. Russell J, Booth A, Fuller J, Harrower B, Hedley P, et al. (2004) A comparison of sequence-based polymorphism and haplotype content in transcribed and anonymous regions of the barley genome. Genome 47: 389–398.
- 2. Nevo E, Apelbaum-Elkaher I, Garty J, Beiles A (1997) Natural selection causes microscale allozyme diversity in wild barley and lichen at ‘Evolution Canyon’ Mt Carmel Israel. Heredity 78: 373–382.
- 3. Dai F, Nevo E, Wu DZ, Comadran J, Zhou MX, et al. (2012) Tibet is one of the centers of domestication of cultivated barley. Proc Natl Acad Sci USA 109: 16969–16973.
- 4. Ren XF, Nevo E, Sun DF, Sun GL (2013) Tibet as a Potential Domestication Center of Cultivated Barley of China. PloS One 8: e62700.
- 5. Zhang QF, Dai XK (1992) Comparative assessment of genetic variation at 6 isozyme loci in barley from two centers of diversity: Ethiopia and Tibet. Acta Genet Sin 19: 236–243.
- 6. Zhao J, Sun HY, Dai HX, Zhang GP, Wu FB (2010) Difference in response to drought stress among Tibet wild barley genotypes. Euphytica 172: 395–403.
- 7. Dai HX, Shan WN, Zhao J, Zhang GP, Li CD, et al. (2011) Difference in response to aluminum stress among Tibetan wild barley genotypes. J Plant Nutr Soil Sc 174: 952–960.
- 8. Wu DZ, Qiu L, Xu LL, Ye LZ, Chen MX, et al. (2011) Genetic variation of HvCBF genes and their association with salinity tolerance in Tibetan annual wild barley. PloS One 6: e22938.
- 9. Michalek W, Weschke W, Pleissner KP, Graner A (2002) EST analysis in barley defines a unigene set comprising 4,000 genes. Theor Appl Genet 104: 97–103.
- 10. Varshney RK, Graner A, Sorrells ME (2005) Genic microsatellite markers in plants: features and applications. Trends Biotechnol 23: 48–55.
- 11. Zeng SH, Xiao G, Guo J, Fei ZJ, Xu YQ, et al. (2010) Development of a EST dataset and characterization of EST-SSRs in a traditional Chinese medicinal plant, Epimedium sagittatum (Sieb. Et Zucc.) maxim. BMC Genomics 11: 94–104.
- 12. Li M, Zhu L, Zhou CY, Lin L, Fan YJ, et al. (2012) Development and characterization of EST-SSR markers from Scapharca broughtonii and their transferability in Scapharca subcrenata and Tegillarca granosa. Molecules 17: 10716–10723.
- 13. Jaikishan I, Rajendrakumar P, Ramesha MS, Viraktamath BC, Balachandran SM, et al. (2010) Prediction of heterosis for grain yield in rice using ‘key’ informative EST-SSR markers. Plant Breeding 129: 108–111.
- 14. Mohan A, Goyal A, Singh R, Balyan HS, Gupta PK (2006) Physical mapping of wheat and rye expressed sequence tag-simple sequence repeats on wheat chromosomes. Crop Sci 47(S_1): S3–S13.
- 15. Tang J, Gao L, Cao Y, Jia J (2006) Homologous analysis of SSR-ESTs and transferability of wheat SSR-EST markers across barley, rice and maize. Euphytica 151: 87–93.
- 16. Li L, Wang J, Guo Y, Jiang F, Xu Y (2008) Development of SSR markers from ESTs of gramineous species and their chromosome location on wheat. Prog Nat Sci 18: 1485–1490.
- 17. Castillo A, Budak H, Varshney RK, Dorado G, Graner A (2008) Transferability and polymorphism of barley EST-SSR markers used for phylogenetic analysis in Hordeum chilense. BMC Plant Biol 8: 97.
- 18. Willsmore KL, Eckermann P, Varshney RK, Graner A, Langridge P (2006) New eSSR and gSSR markers added to Australian barley maps. Crop Pasture Sci 57: 953–959.
- 19. Breseghello F, Sorrells ME (2006) Association mapping of kernel size and milling in wheat (Triticum aestivum L.) cultivars. Genetics 172: 1165–1177.
- 20. Yeh FC, Yang RC, Boyle T (1999) POPGENE (Version 1.31): Microsoft Windows-based freeware for population genetic analysis, University of Alberta and the Centre for International Forestry Research.
- 21. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155: 945–959.
- 22. Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14: 2611–2620.
- 23. Gupta PK, Rustgi S (2004) Molecular markers from the transcribed/expressed region of the genome in higher plants. Funct Integr Genomics 4: 139–162.
- 24. Mian MA, Saha MC, Hopkins AA, Wang ZY (2005) Use of tall fescue EST-SSR markers in phylogenetic analysis of cool-season forage grasses. Genome 48: 637–647.
- 25. Varshney RK, Thiel T, Stein N, Langridge P, Graner A (2002) In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cell Mol Biol Lett 7: 537–546.
- 26. Kumpatla SP, Mukhopadhyay S (2005) Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species. Genome 48: 985–998.
- 27. Xie WG, Zhang XQ, Cai HW, Liu W, Peng Y (2010) Genetic diversity analysis and transferability of cereal EST-SSR markers to orchardgrass (Dactylis glomerata L.). Biochem Syst Ecol 38: 740–749.
- 28. Scott KD, Eggler P, Seaton G, Rossetto M, Ablett EM, et al. (2000) Analysis of SSRs derived from grape ESTs. Theor Appl Genet 100: 723–726.
- 29. Pompanon F, Bonin A, Bellemain E, Taberlet P (2005) Genotyping errors: causes, consequences and solutions. Nat Rev Genet 6: 847–859.
- 30. Ellis RP, Forster BP, Robinson D, Handley LL, Gordon DC, et al. (2000) Wild barley: a source of genes for crop improvement in the 21st century. J Exp Bot 51: 9–17.
- 31. Jin XL, Cai SG, Han Y, Wang J, Wei K, et al. (2011) Genetic variants of HvGlb1 in Tibetan annual wild barley and cultivated barley and their correlation with malt quality. J Cereal Sci 53: 59–64.
- 32. Sun DF, Ren WB, Sun GL, Peng JH (2011) Molecular diversity and association mapping of quantitative traits in Tibetan wild and worldwide originated barley (Hordeum vulgare L.) germplasm. Euphytica 178: 31–43.