Medicinal plant trade is important for local livelihoods. However, many medicinal plants are difficult to identify when they are sold as roots, powders or bark. DNA barcoding involves using a short, agreed-upon region of a genome as a unique identifier for species, ideally as a global standard.
What is the functionality, efficacy and accuracy of barcoding for identifying root material, using medicinal plant roots sold by herbalists in Marrakech, Morocco, as a test dataset?
In total, 111 root samples were sequenced for four proposed barcode regions rpoC1, psbA-trnH, matK and ITS. Sequences were searched against a tailored reference database of Moroccan medicinal plants and their closest relatives using BLAST and Blastclust, and through inference of RAxML phylograms of the aligned market and reference samples.
Sequencing success was high for rpoC1, psbA-trnH, and ITS, but low for matK. Searches using rpoC1 alone resulted in a number of ambiguous identifications, indicating insufficient DNA variation for accurate species-level identification. Combining rpoC1, psbA-trnH and ITS allowed the majority of the market samples to be identified to genus level. For a minority of the market samples, the barcoding identification differed significantly from previous hypotheses based on the vernacular names.
Endemic plant species are commercialized in Marrakech. Adulteration is common and this may indicate that the products are becoming locally endangered. Nevertheless the majority of the traded roots belong to species that are common and not known to be endangered. A significant conclusion from our results is that unknown samples are more difficult to identify than earlier suggested, especially if the reference sequences were obtained from different populations. A global barcoding database should therefore contain sequences from different populations of the same species to assure the reference sequences characterize the species throughout its distributional range.
Citation: Kool A, de Boer HJ, Krüger Å, Rydberg A, Abbad A, Björk L, et al. (2012) Molecular Identification of Commercialized Medicinal Plants in Southern Morocco. PLoS ONE 7(6): e39459. https://doi.org/10.1371/journal.pone.0039459
Editor: Robert DeSalle, American Museum of Natural History, United States of America
Received: March 5, 2012; Accepted: May 21, 2012; Published: June 27, 2012
Copyright: © 2012 Kool et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was funded through a SRL-MENA grant from the Swedish Research Council, Sida and Formas. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have the following interest: The bulk of the laboratory equipment for the plant DNA lab at the University of Marrakech was generously donated by VWR International Sweden. This does not alter the authors’ adherence to all the PLoS ONE policies on sharing data and materials, as detailed online in the guide for authors.
1.1 Marrakech Medicinal Plant Trade and the Moroccan Herbal Pharmacopoeia
Traditional medicine has played an important role in many North African societies, and continues to do so today. This is evident not least in the Moroccan city of Marrakech, situated at a crossroads of trade routes between the High Atlas Mountains and surrounding coastal plains.
The traditional equivalent of the doctor in Moroccan medicine is the herbalist, a profession that continues to be practiced in Marrakech, as shown by the herbalist-owned drug stores that line the market districts of the medina, or old town (Fig. 1). In these shops, Marrakech herbalists stock a variety of plant parts and plant-derived products, sold either separately or in mixtures. In general, these plant parts are harvested in the wild by specialized collectors and reach the herbalists through middlemen and wholesalers.
An important part of the plant inventory of Moroccan herbalists consists of barks and roots, which typically possess few physical characteristics that enable accurate morphology-based identification. All herbalists are able to provide information about the local name of a plant product, its medicinal uses and origins, but this information may be imprecise or insufficient for species identification, especially considering that herbalists often have no knowledge of medicinal plants in the wild. Some medicinal products have multiple synonymous names, and in other cases the same vernacular name is applied to multiple plant species. In other words, confirming the identity of a root sample bought from these herbalists has so far presented a challenge. In addition, since the collection of roots usually requires the whole plant to be dug up, the trade of medicinal roots has a large impact on natural plant populations.
The identity of the plants being sold in these markets has conservation as well as medical implications. For example, rare or endangered species could inadvertently be collected if they are easily confused with their more abundant relatives. Likewise, increasing demand for medicinal products may lead to local over-harvesting and extinction of otherwise non-threatened plant species. Misidentified collections could also lead to the introduction of toxic or otherwise unsuitable species to the market, with potential health risks to end-users. For example, Chinese star anise (Illicium verum Hook f.) is commonly used in herbal teas, whereas Japanese star anise (I. anisatum L.) causes neurotoxic effects in infants when used as a substitute for Chinese star anise. In all cases, appropriate measures could be taken if a reliable method for species identification of medicinal plant products existed.
1.2 Molecular Identification
Species identification on the basis of DNA sequences has been done for some time, e.g. in fungi, animals and plants. Hebert et al. proposed to use the mitochondrial gene CO1 as the standard barcode for all animals, and this was readily adopted by the scientific community. Assessments have since shown that CO1 can be used to distinguish over 90% of species in most animal groups. In recent years barcoding research has grown substantially, and worldwide efforts coordinated by the Consortium for the Barcode of Life (CBOL) are now being focused on retrieving barcode sequences from all organisms.
Barcoding in other major groups, such as plants, has developed at a markedly slower pace. Early on, it became clear that the mitochondrial genome evolves far too slowly in most plants to allow it to distinguish between species. Various genes and non-coding regions in the plastid genome have been put forward as alternatives. In addition to being sufficiently fast evolving, a molecular barcode must also be flanked by conserved regions that can function as universal primer binding sites for PCR reactions. A single barcoding locus combining these two traits has not been found for plants, and it appears that a combination of two or more, probably plastid, loci will be required to approach the level of species discrimination and universality that CO1 provides for animals. In 2009, CBOL proposed matK and rbcL combined as a universal barcode for land plants, but with the option to supplement it with one or two other markers, for example psbA-trnH or ITS.
Most species concepts agree on species being evolving metapopulation lineages, but delimiting species is often more problematic. The important role of hybridization in plant speciation makes species delimitation in plants much more complicated than in animals. Species delimitation based on molecular data in the light of coalescent theory is being developed but requires many accessions as well as many loci. In an ideal situation, studies at the population genetic level would have to be done for all species in a DNA barcoding database; this is far from being achieved at present, and instead a more or less arbitrary cut-off value for sequence divergence is often used. The main methodological problem with DNA barcoding remains that it is often impossible to tell the difference between interspecific and intraspecific sequence variation. Notably, however, difficulties in distinguishing between intra- and interspecific variation are a widespread problem in morphological species delimitation as well.
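The cut-off problem described above can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences and uses the uncorrected p-distance (proportion of differing sites); the sequences and species labels are invented toy data, and the helper names `p_distance` and `barcode_gap` are hypothetical, not part of the study's pipeline. A fixed divergence threshold can only work when the largest intraspecific distance is smaller than the smallest interspecific distance.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (max intraspecific distance, min interspecific distance).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return (max(intra) if intra else None, min(inter) if inter else None)


# Invented toy alignment: two accessions each of two hypothetical species.
toy = [
    ("Species A", "ACGTACGTAC"),
    ("Species A", "ACGTACGTTC"),
    ("Species B", "ACGTACGTTC"),
    ("Species B", "ACGAACGTTC"),
]
max_intra, min_inter = barcode_gap(toy)
# If min_inter <= max_intra, no single cut-off separates the two species.
no_gap = min_inter <= max_intra
```

In this toy case one accession of each species shares an identical sequence, so no threshold can separate them, which mirrors the situation of identical sequences for different species.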
Even in animals molecular barcoding is problematic, since approximately 88% of the estimated 7.8 million animal species lack a formal description, and adopting an arbitrary cut-off value for pairwise sequence divergence to speed up cataloguing these undescribed species would be disastrous for existing taxonomic treatments in animals. Also in fungi, another group in which the vast majority of the taxa are undescribed, an arbitrary sequence divergence threshold for the nuclear ribosomal ITS region proved not to be feasible. The fields of molecular identification, DNA barcoding, and DNA taxonomy are still very much in development, and are certainly not without practical or theoretical problems.
Despite these problems, DNA barcoding has been applied to a broad range of problems, including taxonomic studies of cryptic taxa or species complexes, e.g. skipper butterflies. Barcoding has also been used in ecological studies to survey animal diets through the analysis of plant remains in faeces, in identifying plant species from wood samples, and as a tool to control the cross-border trade of aquarium fish. In addition, molecular identification has been used in several studies on traditional medicine. Barcoding lends itself particularly well to these forensic applications where only a small tissue sample from the organism is available for identification, or where the sample is degraded or has been processed.
Methods for matching an unknown query sequence with a reference database tend to be based either on sequence similarity, like BLAST and Blastclust, or on tree-based criteria. Several other alignment-free methods, e.g. DNA-BAR/DEGENBAR and ATIM, have been proposed, but these are reported to perform as well as BLAST. Sequence similarity methods require a decision on a threshold at which a sequence is considered to belong to a certain taxon, which can be somewhat subjective and may be applicable to certain taxa but not to others. Tree-based methods, in which a query sequence is considered to belong to a certain taxon if it is found in a clade consisting of reference sequences for that taxon, have the clear advantage that no cut-off value is necessary, but they do require an alignment of the query and reference sequences combined, which can be problematic for highly variable sequences. Nonetheless, the success of any method used to assign sequences to a certain taxon is ultimately dependent on the taxonomic coverage of the reference database.
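As a rough illustration of the similarity-based approach, the following Python sketch assigns a query to species or genus level from a list of hits. The function name `assign_taxon`, the 98% identity threshold, and the identity values are assumptions made for illustration only; the study does not report using this rule.

```python
def assign_taxon(hits, min_identity=98.0):
    """Assign a query to a taxon from its similarity hits.

    hits: list of (species_name, percent_identity) tuples for one query.
    min_identity: hypothetical threshold, chosen here only for illustration.
    Returns (rank, name) where rank is 'species', 'genus' or 'unidentified'.
    """
    if not hits:
        return ("unidentified", None)
    best = max(identity for _, identity in hits)
    if best < min_identity:
        return ("unidentified", None)
    # All species that share the best score are equally good candidates.
    top_species = sorted({name for name, identity in hits if identity == best})
    if len(top_species) == 1:
        return ("species", top_species[0])
    # Tied species: fall back to genus if they all share one genus name.
    genera = {name.split()[0] for name in top_species}
    if len(genera) == 1:
        return ("genus", genera.pop())
    return ("unidentified", None)


# Invented identity values, for illustration only.
unique_hit = assign_taxon([("Daucus crinitus", 99.5), ("Thapsia sp.", 96.0)])
tied_hit = assign_taxon([("Silene sp. 1", 99.0), ("Silene sp. 2", 99.0)])
weak_hit = assign_taxon([("Daucus crinitus", 90.0)])
```

The tied case shows why identical reference sequences for several species limit identification to genus level, and the weak case shows how the outcome depends on where the threshold is set.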
A wide variety of studies assess the efficacy of molecular identification techniques by analysing the sequence variation within a large number of known samples, or by identifying query sequences from the same dataset as the reference sequences. Studies that use a separate query dataset to investigate the identification success of a certain marker or marker combination are less common. Gonzalez et al. used a reference database created for a lowland rainforest area in French Guiana to identify saplings from the same area and reported a significantly lower identification success rate (70%) than most other studies, due to low sequence variation in a few species-rich clades. A study on ingredients of commercial teas showed that rbcL and matK could identify roughly 70% of the ingredients in tea, but that sequence variation between closely related tea ingredients was of the same order of magnitude as sequencing error.
In this study we investigate which medicinal roots are commercialized in the souks of Marrakech using a regional reference database approach and sequence data from the plastid genome (matK, psbA-trnH, and rpoC1) as well as the nuclear genome (ITS). RbcL, albeit one of the standard plant DNA barcodes, was not included as its sequence variation is comparable to that of rpoC1. We compare BLAST combined with additional data on the occurrence of the plant in Morocco against Blastclust and a RAxML analysis of the aligned query and reference sequences. We were able to identify roughly half of the samples to species level and an additional third of the samples to genus level.
2.1 DNA Extraction, PCR and Sequencing Success
The standard extraction protocol worked for approximately 75% of the market and all but one of the reference samples. However, for 28 out of 111 market samples the extraction method consistently failed to yield PCR products.
Amplification of matK yielded PCR products for less than 30% of the reference specimens and matK was subsequently excluded as a potential barcode in this study, as was also done by Piredda et al. and Sass et al. Sequencing success rates for the other three loci (rpoC1, psbA-trnH, and ITS) for both reference and market samples are detailed in Table 1, and most roots were successfully sequenced for at least two of the regions (Data S1). RpoC1 sequence lengths ranged from 409 to 545 bp, psbA-trnH sequence lengths from 141 to 658 bp, and ITS sequence lengths from 194 to 748 bp. The reference samples (Data S2), which were extracted from herbarium vouchers collected mainly in Morocco (Data S3), were consistently easier to sequence than the market samples.
A total of nine ITS sequences obtained from the market samples and ten of the reference ITS sequences turned out to have fungal contamination. Twenty-nine ITS sequences of the market samples and fourteen of the reference samples could not be used because of polymorphisms.
The extended reference databases, obtained by downloading all sequences that yielded an E-value of 0.0 in the initial BLAST searches, consisted of 1864 (rpoC1), 2332 (psbA-trnH), and 3168 (ITS) sequences. The aligned rpoC1 dataset consisted of 652 aligned positions, and the aligned psbA-trnH and ITS datasets of 706 and 1327 aligned positions, respectively. All three alignments contained insertion-deletions (indels), but the aligned matrix of the coding region (rpoC1) contained significantly fewer indels than the ITS and psbA-trnH matrices. The RAxML phylograms (Data S4, S5, S6) and Blastclust output (Data S7, S8, S9) for all three datasets are presented in the supplemental data.
The identification success was dependent on marker, identification method and taxonomic group (Fig. 2, Data S3). Blastclust analysis of the psbA-trnH data yielded the fewest identifications (24 of 83 sequences identified to either species or genus level), whereas BLAST analysis of the rpoC1 data was most successful (64 of 83 sequences identified to either species or genus level). The identification success was somewhat higher for monocots than for eudicots using rpoC1 or ITS, whereas eudicots were more readily identified using psbA-trnH.
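The E-value filter used to build the extended reference databases could be sketched as follows in Python. This assumes that the BLAST results are available in the standard 12-column tabular layout (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score); the text does not state which output format was actually used, and the IDs and values in the example are invented.

```python
def subjects_with_zero_evalue(tabular_text):
    """Collect unique subject IDs whose hit has an E-value of exactly 0.0.

    tabular_text: BLAST tabular output, one tab-separated hit per line,
    with the subject ID in column 2 and the E-value in column 11.
    """
    keep = []
    seen = set()
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        sseqid, evalue = fields[1], float(fields[10])
        if evalue == 0.0 and sseqid not in seen:
            seen.add(sseqid)
            keep.append(sseqid)
    return keep


# Invented example hits for two hypothetical root samples.
example = "\n".join([
    "root01\tref_A\t99.80\t500\t1\t0\t1\t500\t1\t500\t0.0\t918",
    "root01\tref_B\t91.00\t300\t27\t0\t1\t300\t1\t300\t2e-110\t400",
    "root02\tref_A\t99.60\t500\t2\t0\t1\t500\t1\t500\t0.0\t912",
])
selected = subjects_with_zero_evalue(example)
```

The returned subject IDs would then be the accessions to download and add to the reference set; duplicates across queries are collapsed so each reference is fetched once.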
The identification of the market samples and how these identifications differ from those based on the pharmacopoeia are presented in Table 2 and discussed in Data S10. In total 15 (18%) of the samples were identified as belonging to a different species than the one mentioned in the pharmacopoeia. Of these, ten belonged to a different genus than earlier hypothesized and five to a different family.
3.1 Analyses and Role of Markers, Methods, and Taxonomic Group
The main advantage of the rpoC1 chloroplast region is its high amplification success rate, as confirmed here: 88% of all reference samples were successfully sequenced (Table 1). This is consistent with many other studies, which show that this locus typically scores the highest in this respect. On the other hand, rpoC1 exhibits a slower rate of evolution than non-coding plastid regions and some plastid genes such as matK. In this study, just under half (45%) of all root samples yielded species level identifications and 37.5% yielded genus level identifications for rpoC1 (Fig. 2). The relatively low number of species level identifications is probably due to identical sequences for different species. Such cases would probably increase in frequency if the reference database were larger and contained more species and more diverse genera.
Sequencing success for psbA-trnH, although lower than that of rpoC1, was relatively high for reference sequences (81.4%) and moderate for root sequences (74.4%). Sequencing success was particularly low for monocots: only 50% of the market samples and 66% of the reference samples yielded a psbA-trnH sequence. Discriminatory power was somewhat superior to that of rpoC1. Almost 60% (59.7%) of the samples that yielded a sequence could be identified to species level and 24.2% to genus level. However, assembling the psbA-trnH trace files into contigs was not always straightforward, as repeats of 10 or more consecutive A's or T's induced Taq-polymerase errors, which made it difficult to accurately assemble the trace files. This resulted in a number of unreliable sequences that could not be used. It has been suggested that this feature of psbA-trnH and other non-coding regions prevents their use in future large-scale barcoding projects, in which manual editing of sequences is necessarily kept to a minimum. Also, although not problematic in this study, psbA-trnH occurs in more than one copy in cycads and in a number of sedges.
ITS proved to be the most useful marker for identifying samples to species level (63.8%) or genus level (29.8%) once a sequence was obtained. However, 45% of the market and reference sequences could not be used, 34% due to polymorphisms and 11% due to fungal contamination. Fungal contamination may in this case have been caused by molds on the final dried medicinal roots or by mycorrhizal fungi that were present in the roots. Chen et al. also reported a very low sequencing success rate for monocots for ITS as a whole, and Gonzalez et al. reported difficulties sequencing ITS in a study on Amazonian forest trees. In a recent study, the China Plant BOL Group found significantly lower levels of polymorphism and fungal contamination after sequencing a large sample of angiosperms. Chen et al. argue for including ITS2 as a standard barcode, but do not discuss polymorphism difficulties, and report no fungal contamination in their samples. A possible explanation for this is that the study uses leaf samples from freshly collected plant material of plants known to be used in Traditional Chinese Medicine as opposed to the processed medicinal products themselves. Their arguments to include a marker from the nuclear genome are legitimate, but we find that polymorphism and fungal contamination (particularly for root material) do cause problems in using ITS as a marker for DNA barcoding.
BLAST in combination with species distribution data as well as critical evaluation of the presence or absence of related species in GenBank was the most successful way to identify the market samples (Fig. 2). Several other studies also indicate that BLAST outperforms other methods like DNABAR, ATIM, Blastclust, neighbor-joining trees, and PWG-distance, the distance method adopted by the CBOL Plant Workgroup.
The tree-based method was relatively successful for the identification of market samples using rpoC1 (51.3% species level identification), which is a coding region that could be readily aligned using MAFFT. The species level identification frequency for ITS was also relatively high (48.9%). PsbA-trnH sequences were more difficult to identify using MAFFT and RAxML (29% species level identification). A possible explanation for the difference in identification success between ITS and psbA-trnH is that the highly conserved 5.8S coding region in ITS facilitated the alignment. Also, the ITS dataset contained roughly one third more sequences than the psbA-trnH dataset, which might have played a role in the alignment process. A clear advantage of tree-based methods is the branch lengths, which provide a visual representation of sequence divergence. The relative success of the coding region in applying tree-based methods supports the idea of using coding plastid regions as universal barcoding markers.
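The tree-based criterion can be sketched in a simplified form: a query is assigned to the taxon of the reference sequences in the smallest clade that contains it. The Python sketch below represents a tree as nested tuples with string leaves; it ignores branch lengths and support values, and the function names, leaf labels and species assignments are hypothetical illustrations rather than the RAxML-based procedure used in the study.

```python
def leaves(tree):
    """Return all leaf labels below a node (a leaf is a str, a clade a tuple)."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def species_in_smallest_clade(tree, query, ref_species):
    """Species of the references in the smallest clade holding the query
    and at least one reference. Returns None if the query is not in the tree."""
    if isinstance(tree, str):
        return set() if tree == query else None
    for child in tree:
        found = species_in_smallest_clade(child, query, ref_species)
        if found is None:
            continue  # the query is not below this child
        if found:
            return found  # a deeper clade already contained references
        # Query found but no reference yet: widen to the current clade.
        return {ref_species[leaf] for leaf in leaves(tree) if leaf in ref_species}
    return None


def identify(tree, query, ref_species):
    """Return (rank, name) with rank 'species', 'genus' or 'unidentified'."""
    species = species_in_smallest_clade(tree, query, ref_species)
    if not species:
        return ("unidentified", None)
    if len(species) == 1:
        return ("species", next(iter(species)))
    genera = {name.split()[0] for name in species}
    if len(genera) == 1:
        return ("genus", genera.pop())
    return ("unidentified", None)


# Invented toy tree: q1 and q2 are market samples, ref1..ref4 are references.
toy_tree = ((("q1", "ref1"), "ref2"), ("q2", ("ref3", "ref4")))
toy_refs = {
    "ref1": "Daucus crinitus",
    "ref2": "Daucus crinitus",
    "ref3": "Silene sp. 1",
    "ref4": "Silene sp. 2",
}
q1_result = identify(toy_tree, "q1", toy_refs)
q2_result = identify(toy_tree, "q2", toy_refs)
```

No similarity threshold appears anywhere in this rule, which is the advantage noted for tree-based methods; its reliability depends entirely on the quality of the alignment and the tree.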
The Blastclust analyses resulted in many unidentified samples for all markers, which either belonged to clusters containing many different reference sequences or to clusters that contained only query sequences (Data S7, S8, S9). Adjusting the similarity threshold had no effect on the number of identifications, probably because different lineages have different evolutionary rates and no single threshold could fit a dataset containing many unrelated taxa, especially if there is no clear distinction between inter- and intraspecific variation.
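The two failure modes described above can be reproduced with a simplified single-linkage clustering sketch in Python. This is not Blastclust itself; the function name `single_linkage_clusters`, the similarity scores and the thresholds are invented for illustration.

```python
def single_linkage_clusters(names, similarity, threshold):
    """Group sequences so that any pair scoring >= threshold shares a cluster.

    names: list of sequence labels.
    similarity: dict mapping (label_a, label_b) to a percent similarity.
    Returns a sorted list of sorted clusters.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Invented similarities: q1, q2 are queries; refA, refB are different species.
toy_names = ["q1", "refA", "refB", "q2"]
toy_similarity = {("q1", "refA"): 99.0, ("refA", "refB"): 97.5, ("q2", "refB"): 80.0}
loose = single_linkage_clusters(toy_names, toy_similarity, threshold=97.0)
strict = single_linkage_clusters(toy_names, toy_similarity, threshold=98.0)
```

At the looser threshold q1 lands in a cluster with references of two species (ambiguous), while q2 sits alone in a cluster without any reference (unidentified), corresponding to the two kinds of unidentified samples reported for Blastclust.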
3.1.7 Role of taxonomic group.
Nineteen of the 83 market samples (23%) yielded a sequence for only one of the markers, of which twelve were rpoC1 sequences, four psbA-trnH, and three ITS. Of these samples one was a basal angiosperm (Aristolochia), ten were monocots and eight were eudicots. This represents all the basal angiosperms, 77% of the monocots, and 12% of the eudicots.
The sequencing success for all markers was clearly higher for eudicots than for monocots (and basal angiosperms) for both market and reference samples (Table 1). This could be due to primer fit problems, secondary metabolites or differences in how well the DNA in these groups tolerates long-term storage as either herbarium vouchers or dried medicinal roots.
Eudicots were on average most successfully identified using ITS (63.8% to species level and 29.8% to genus level) after correction for the number of sequences that were obtained. Species level identification of eudicots was least frequent using rpoC1 (48.3%). Within the eudicots the Apiaceae could be identified to species level twice as often as the Asteraceae despite a higher sequencing success for the Asteraceae. Species level identification was higher for Apiaceae than for Asteraceae for all three markers. Caryophyllaceae could either be identified to species level (in the cases of Corrigiola and Silene, the latter being due to the large number of ITS sequences for this group available in GenBank) or only to family level, showing that even within one family the evolutionary rates can differ enough to cause considerable variation in species identification success using molecular data.
All rpoC1 monocot sequences could be identified to either species or genus level (77.8% and 22.2%, respectively), whereas only 50% and 16.7% of the monocot psbA-trnH sequences could be identified to species and genus level, respectively. Only two ITS monocot query sequences were obtained. It is noteworthy that six of the eight monocot market samples were shown to belong to the same species, Dioscorea communis (L.) Caddick & Wilkin.
The combined analyses did not show improved species level identification as compared to the individually analyzed markers even after we corrected for the missing query sequences (Data S1). This is in part due to the limited reference dataset that was used, but in the individual analyses identification success can often be traced back to one or two specific marker(s) whereas the other marker(s) yielded identical sequences for several species or even genera.
Our study shows a somewhat lower species level identification success rate than several other studies that use the same markers (Table 3). This can in part be explained by the nature of the market samples. Sequencing failure for many of the market samples may be due to post-harvest processing resulting in DNA degradation, such as drying at high temperatures, slow drying under moist conditions or storage in alcohol. Another study targeting medicinal products reports similar difficulties obtaining sequence data from degraded samples. Also, in contrast to most studies testing the efficacy of molecular identification of plant material, our reference database presumably consisted of sequences obtained from different populations than those of the query sequences, an approach that we deem realistic since a global barcoding database would inevitably only contain samples from a fraction of the populations of any given species.
3.2 Ethnobotanical and Environmental Implications
Overall we found that 18% of the samples were misidentified in the pharmacopoeia. The apparent discrepancy between the barcoding identifications and the vernacular names can largely be explained by the lack of a one-to-one correspondence between the vernacular names of plants (or plant products) and biological species. This phenomenon is a feature of virtually all folk classification systems of living organisms. However, adulteration and misidentification play a major role as well.
3.3 Taxonomic Under-differentiation and Product Qualities
Nineteen of the samples analysed, belonging to five plant products, turned out to represent species complexes, that is, groups of species for which the same vernacular name is used. This appears to be due to taxonomic under-differentiation, which is the failure to distinguish between closely related species. In some instances, the species identification for a particular root sample seems to correlate with the “quality” assigned to the root product by the herbalist. The most clear-cut case is ’ud-mserser, of which the samples designated as the highest in quality were identified as Daucus crinitus Desf. (Apiaceae), whereas those designated as secondary quality were found to correspond to closely related Thapsia spp. (Apiaceae) (Table 2). Another example of under-differentiation is nnjem, which is hypothesized to be Cynodon dactylon (L.) Pers. in the pharmacopoeia, but is found to include other grasses as well.
The various types of sargina (6 samples tested, see Table 3) constitute another species complex consisting of plants that belong to the carnation family (Caryophyllaceae), although here it is less clear how the types actually relate to biological entities, if they do at all. In all of these examples, the herbalists treat the species as subtypes of the same vernacular name suggesting that they are believed to share the same medicinal properties and are used to treat the same ailments.
3.4 Taxonomic Over-differentiation
Taxonomic over-differentiation occurs where one biological species is referred to by several vernacular names. For example, frifra, bouzfour, terta and zziyata were all identified as Kundmannia sicula DC. (Apiaceae) in at least one of the samples analysed. The most common vernacular name for this species is zziyata according to Bellakhdar, while frifra and bouzfour usually refer to other members of the family. The latter two cases might therefore have resulted from a misidentification by the collector. On the other hand, terta normally applies to the unrelated Withania frutescens (L.) Pauquy (Solanaceae), which in the wild is very unlikely to be confused with any of the other three species. This is more likely an error on the part of the herbalist due to a mix-up of similar-looking prepared root products. Silene was sold either as sargina or as tigigest, but it should be noted that these names probably refer to two not very similar-looking species of Silene and might in fact not represent a case of taxonomic over-differentiation. Echinops was found to be sold as taskra, besbas and horsef. Only taskra is mentioned as a vernacular name for Echinops by Bellakhdar. The other two product names usually refer to Cynara (horsef) or to Foeniculum (or possibly Anethum foeniculoides, cf. Data S9) in the case of besbas, and Echinops seems to be popular as an adulterant for these products. The names bougoudz and ndkhir are both in use for Dioscorea communis, a plant that is new for the Moroccan traditional pharmacopoeia. In total, taxonomic over-differentiation was inferred to affect 22 samples belonging to roughly one-third (11) of the products.
3.5 Adulteration, Misidentification, and Toxicity
The trade in medicinal plants provides the main source of income for herbalists, and economic constraints may provide incentive for herbalists to substitute cheaper and more readily available species for rare ingredients, misleadingly selling them under the same name. Such cases of deliberate adulteration of coveted ingredients are often difficult to distinguish from cases of under- or over-differentiation or misidentification. Many of the cases mentioned in the previous sections could have occurred either inadvertently (by misidentification) or purposefully.
A clear example of possible adulteration is the sample of bukbuka, which translates as Colchicum autumnale L. This plant has traditionally been used to treat acute arthritis and renal disorders, but Bellakhdar states that it is no longer traded in Morocco owing to its extreme toxicity. Perhaps unsurprisingly, molecular identification showed the vernacular name specified by the herbalist to be misleading. Instead the sample was identified as Bunium sp. (for which bukbuka does not apply), a plant with similar bulbous underground parts, but non-toxic and entirely unrelated to Colchicum. If Bellakhdar’s note that Colchicum is no longer used in the Moroccan pharmacopoeia is correct, then the usage of the name bukbuka is probably intentionally deceptive. Other cases of adulteration or misidentification comprise both samples of l-harmel, which instead of Peganum harmala L. were identified as Carlina brachylepis (Batt.) Meusel & Kästner and a species of grape (Vitis sp.), and two samples of ’aqirqarha that were identified as species of Catananche instead of Anacyclus. ’Aqirqarha is a relatively expensive product and adulteration is therefore profitable.
In total eight samples belonging to six different products were probably adulterated, or at least misidentified. Adulteration and misidentification issues raise concerns of potentially toxic plants being sold to consumers, sometimes without the herbalist being aware of it. However, two of the three products that are known to be highly toxic (bukbuka and l-harmel) are clearly being replaced by less harmful plants. Only Carlina gummifera (L.) Less. is still being sold regularly as addad.
Another plant that raises public health concerns is Arundo donax L., a giant reed that has shown potential for use in phytoremediation of soils with high concentrations of arsenic, cadmium and lead. Significantly elevated concentrations of heavy metals were found in the roots of A. donax grown on polluted soils. Elevated heavy metal concentrations might be a concern when A. donax roots are consumed for medicinal purposes, depending on where the plants are collected.
3.6 Conservation Issues
Several endemic plants are commercialized as medicinal roots (Data S10), for example Acacia gummifera Willd., Silene mentagensis Coss., and possibly Anethum foeniculoides Maire & Wilczek. Endemic plants are not necessarily rare, but they could quickly become critically endangered if they are harvested in an unsustainable way. A number of products that could be identified to genus level belong to genera that contain rare or very rare species. For example, half of the species of Armeria occurring in Morocco are rare and locally or regionally endemic. Additional field studies together with the people collecting these plants, combined with a more taxon-specific barcoding approach, could give insight into whether these endangered species enter the markets as well and whether the plant collectors are aware of the differences in morphology and abundance between these species. The vast majority of the roots that are sold in Marrakech belong to species that are not threatened and that are common, also outside Morocco. Nevertheless, the high level of adulteration may indicate that there are species that are locally overexploited or endangered.
Roughly one fifth of the market samples that were analyzed proved to be something other than what was hypothesized on the basis of the Moroccan pharmacopoeia. There seems to be a trend towards toxic plants being replaced by species that are less dangerous. The analyses showed that several endemic and possibly also endangered plants are being commercialized in Marrakech. Adulteration is common and may indicate that the original products are becoming locally endangered. Nevertheless the majority of the medicinal roots that are sold belong to species that are common, and not known to be endangered.
Sequencing success was highest for rpoC1 and lowest for ITS (Table 1), mainly due to polymorphism, but also due to fungal contamination. Eudicot samples yielded a higher sequencing success than monocots and basal angiosperms. Identification success was highest using BLAST combined with data on species distribution and information on presence or absence of species in the reference database. Tree-based identification, after alignment using MAFFT, was very successful for coding rpoC1, moderately successful for ITS and had low success for psbA-trnH due to alignment problems. Identification success for each marker depended on taxonomic group.
The identification success in our study is somewhat lower than in several other studies that tested the efficacy of molecular identification on the basis of one large dataset, or by using query sequences from the same populations as the reference sequences. This is probably due to a combination of high intraspecific variation and a low number of sequences per species in the reference datasets. A significant conclusion from our results is that unknown samples are more difficult to identify than previously suggested, especially if the reference sequences were obtained from different populations than the unknown material, even when the reference samples were collected in the same country. A global barcoding database should therefore contain a large number of sequences from different populations of the same species to ensure that the reference sequences characterize the species throughout its distributional range.
Although molecular identification often fails to assign individuals to species, our results demonstrate that it is a helpful tool in providing clues for identifying medicinal plant products that lack morphological features for species identification.
Materials and Methods
5.1 Market Samples
A total of 111 market samples of medicinal roots were bought from 10 herbalists in central Marrakech. Of these, 96 samples were initially collected in October and November 2007, and additional samples of 15 products that proved to be difficult to sequence were collected in November 2008. All samples were stored at the herbarium of the Natural History Museum Marrakech and at Uppsala University’s herbarium (UPS). The vernacular name for each sample as communicated by the herbalist was recorded, along with the herbalist’s name and the place and date of purchase. In most cases several samples were collected per vernacular name, so that the collection comprises 37 different medicinal plant products (Table 2, Data S1). Some products are further divided by the herbalists into subtypes specified by modifiers placed after the main noun (e.g. sargina lmsouwsa vs. sargina rrahmania). Putative scientific names were assigned to the material based on the Moroccan vernacular names, using the most recent herbal pharmacopoeia of Morocco. All roots were purchased as single products to avoid mixtures of different plants.
5.2 Reference Database
Reference species were selected based on the putative scientific names of the 37 medicinal plant products. Species known to occur in Morocco were selected according to the Flore pratique du Maroc, the Catalogue des plantes vasculaires du nord du Maroc, the Catalogue des plantes vasculaires rares, menacées ou endémiques du Maroc, and the Flore vasculaire du Maroc, as Morocco is the main origin of the medicinal roots traded in Marrakech. All genera considered candidates for the identity of a certain market sample were comprehensively sampled, while larger genera with 7 or more species were sampled with up to three or four species (Data S2).
The reference database was complemented for market samples that could not be identified using the selection process described above by sequencing the nuclear ITS region. These ITS sequences were then queried against GenBank’s nr-database using the Megablast algorithm with default parameters. The highest-scoring hits from these queries were used as preliminary identifications to select additional reference material (Data S2).
In total, the reference database consisted of plant material from 131 herbarium specimens kept at the Reading University Herbarium (RNG), UK. Most of these voucher specimens were collected in Morocco (Data S3).
5.3 DNA Extraction
Root material was extracted using a slightly modified version of the Carlson/Yoon DNA isolation procedure. About 2 g of each sample was fragmented into coarse grains, if necessary using a scalpel. The sample fragments were transferred to a mortar and dry-ground at room temperature with sterile grinding sand until homogenized. No more than 500 µg of the ground material was transferred to a 2 ml microfuge tube, after which the regular protocol was followed.
Total DNA of leaf material of the reference samples was extracted and purified in the same way as for the market samples, but using a Mini-Beadbeater (BioSpec Products) instead of manual grinding: ca. 0.02 g of plant material was combined with silica beads, 750 µl of CTAB (hexadecyl trimethyl ammonium bromide) and 20 µl mercaptoethanol in a 2 ml tube. The tube was put into the Mini-Beadbeater and shaken for 40 seconds or more, and then incubated at 65°C for 45 min, intermittently mixed by inverting.
Each total DNA extract was further purified using the GE Illustra GFX™ PCR DNA and Gel Band Purification Kit following the manufacturer’s protocol (GE Healthcare).
5.4 PCR and Sequencing
Barcoding loci and primers were selected from the Royal Botanic Gardens Kew Phase 2 Protocols and Update on plant DNA barcoding. These consisted of the ITS primers ITS-4 and ITS-5, the matK primers matK-2.1a and matK-5, the rpoC1 primers rpoC1-2 and rpoC1-4, and the psbA-trnH primers psbA and trnH. PCR amplification of ITS, matK, rpoC1 and psbA-trnH was done on purified total DNA from all reference and market samples.
PCR amplification of purified total DNA was performed in 200 µl reaction tubes with a total reaction volume of 50 µl. Each tube contained 5 µl reaction buffer (ABgene, 10x), 3 µl MgCl2 (25 mM), 1 µl dNTPs (10 µM), 0.25 µl Taq polymerase (ABgene; 5 U/µl), 0.25 µl BSA (Roche Diagnostics), 12.5 µl of each primer (2 mM) and 1 µl template DNA. The PCR conditions for the plastid markers were: an initial 2 min of denaturation at 94°C, followed by 38 cycles of 30 sec denaturation at 94°C, 40 sec annealing at 53°C and 40 sec elongation at 72°C, ending with a final elongation of 5 min at 72°C. Two PCR programs were used for ITS. The first consisted of an initial 5 min of denaturation at 98°C, followed by 35 cycles of 30 sec denaturation at 98°C, 1 min annealing at 55°C and 1 min elongation at 72°C, ending with a final elongation of 10 min at 72°C. The second consisted of an initial 2 min of denaturation at 98°C, followed by 35 cycles of 10 sec denaturation at 98°C, 1 min annealing at 60°C and 1 min elongation at 72°C, ending with a final elongation of 8 min at 72°C.
Following the PCR, we checked for PCR product by running 5 µl of sample with 2 µl of loading buffer on a 1% agarose gel in TAE buffer. The gel was then stained in a bath with 1% ethidium bromide and the fragments were visualized using UV light.
Sequencing was performed by Macrogen Inc. (Seoul, South Korea) on an ABI3730XL automated sequencer (Applied Biosystems). The same primers used in PCR amplification were also used for the sequencing reactions. Trace files were aligned with the programs Gap4 and Pregap4, both modules in the Staden package.
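The listed reagent volumes do not add up to the stated 50 µl total, so the remainder is presumably made up with water. The following minimal sketch works through that arithmetic; the assumption that the balance is water, and all names in the code, are illustrative and not taken from the protocol.

```python
# Per-reaction volumes in µl, as listed in the text above.
REAGENTS_UL = {
    "reaction buffer (10x)": 5.0,
    "MgCl2 (25 mM)": 3.0,
    "dNTPs": 1.0,
    "Taq polymerase": 0.25,
    "BSA": 0.25,
    "forward primer": 12.5,
    "reverse primer": 12.5,
    "template DNA": 1.0,
}
TOTAL_UL = 50.0  # stated total reaction volume


def water_volume(reagents, total):
    """Volume needed to bring the listed reagents up to the total volume.

    Assumption: the balance is water; the text does not say so explicitly.
    """
    listed = sum(reagents.values())
    if listed > total:
        raise ValueError("listed reagents exceed the total reaction volume")
    return total - listed


if __name__ == "__main__":
    print(f"listed reagents: {sum(REAGENTS_UL.values())} µl")
    print(f"balance to {TOTAL_UL} µl: {water_volume(REAGENTS_UL, TOTAL_UL)} µl")
```

Under these assumptions the listed reagents account for 35.5 µl, leaving 14.5 µl to reach the 50 µl total.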
5.5 Data Analyses
All reference sequences were submitted to GenBank. NCBI’s web-based megablast algorithm using the default settings was then used to identify the query sequences. Each identification was made manually taking E-value, maximum identity, number of closely related species represented in the database, as well as distribution of the plant(s) in question into consideration.
All sequences that yielded an E-value of 0.0 in the BLAST searches were then downloaded from GenBank in FASTA format to create an extended reference database for each marker. Sequences longer than 700 bp (plastid markers) or 800 bp (ITS), and sequences with more than 5% unspecified nucleotides (Ns), were removed using BioPerl. The query sequences were then added to the files, and the orientation of the sequences in each file was checked to make sure no reverse complements were used.
Blastclust analyses were done on the MPI Bioinformatics Toolkit webserver for each dataset, using a 98% similarity threshold for the non-coding markers (psbA-trnH and ITS), a 100% similarity threshold for rpoC1, and a 90% minimum length coverage for all three datasets. Query sequences were identified on the basis of the reference sequences with which they formed a cluster. Similarity thresholds were determined using pairwise analysis in SpeciesIdentifier v. 1.7.8.
In addition to these two alignment-free methods, all three datasets were aligned using MAFFT, and phylograms were constructed using RAxML version 7.2.8 under the GTRGAMMA model, with 1000 bootstrap replicates under the GTRCAT model, on the CIPRES Science Gateway. All three phylograms were visualized using Dendroscope (Data S4, S5, S6). The query sequences were identified to the species level as described in Meier et al. (i.e. only if they belonged to a species-specific clade, but not if the query sequence was sister to a species-specific clade), with the exception that branch lengths were taken into account: query sequences that were identical to a reference sequence of a certain species with which they formed a clade were deemed identified to the species level. Other sequences were identified to either genus or family level if they were nested at least one node into a clade consisting of sequences from only a certain genus or family. Support values were not taken into account in the identifications.
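The length and ambiguity filter described above can be sketched in a few lines. The authors used BioPerl; the version below is a plain-Python stand-in written only for illustration, and the function names and the `MAX_LEN` table are hypothetical. Only the cut-offs (700 bp for plastid markers, 800 bp for ITS, 5% Ns) come from the text.

```python
# Maximum sequence length per marker, in bp, as stated in the text.
MAX_LEN = {"rpoC1": 700, "psbA-trnH": 700, "ITS": 800}


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def keep_sequence(seq, max_len, max_n_fraction=0.05):
    """Keep a sequence unless it is too long or has more than 5% Ns."""
    if not seq or len(seq) > max_len:
        return False
    n_fraction = seq.upper().count("N") / len(seq)
    return n_fraction <= max_n_fraction


def filter_records(records, marker):
    """Return only the records that pass the filter for the given marker."""
    return [(h, s) for h, s in records if keep_sequence(s, MAX_LEN[marker])]
```

A call such as `filter_records(read_fasta(text), "ITS")` would then return the retained records for one marker; checking sequence orientation, the next step in the text, is not covered by this sketch.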
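To make the clustering step concrete, the sketch below groups sequences by single linkage over a list of pairwise matches and then reads off, for each query, the reference species that share its cluster. This is not Blastclust itself: the pairwise identity and coverage values are assumed to have been computed elsewhere, and all function and variable names are hypothetical. The thresholds in the usage note are the ones given in the text.

```python
from collections import defaultdict


def single_linkage_clusters(names, hits, min_identity, min_coverage=0.90):
    """Group sequences linked by at least one pairwise match above both thresholds.

    hits: iterable of (name_a, name_b, identity, coverage), with values in 0..1.
    """
    parent = {n: n for n in names}

    def find(x):
        # Walk to the cluster representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in hits:
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for n in names:
        clusters[find(n)].add(n)
    return list(clusters.values())


def identify_queries(clusters, reference_species):
    """Map each query (any name not in reference_species) to the set of
    reference species found in its cluster; an empty set means no match."""
    result = {}
    for cluster in clusters:
        species = {reference_species[n] for n in cluster if n in reference_species}
        for n in cluster:
            if n not in reference_species:
                result[n] = species
    return result
```

With `min_identity=0.98` for psbA-trnH and ITS, or `min_identity=1.0` for rpoC1, a query whose cluster contains references from exactly one species would be a candidate species-level identification, while a cluster containing several species would be ambiguous.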
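The nesting rule for tree-based identification can be illustrated with a toy tree. In the sketch below a tree is a nested tuple of leaf names, and a query counts as identified at a given rank only if a clade at least two levels above it (its grandparent or higher) contains reference sequences from a single taxon at that rank, so that a query merely sister to such a clade is not assigned to it. This is a simplified reading of the rule: it ignores branch lengths, so the exception for identical sequences is not implemented, and the data structure and names are hypothetical.

```python
def leaves(node):
    """Return all leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out


def path_to(node, target):
    """Return the clades from the root down to the parent of target, or None."""
    if isinstance(node, str):
        return [] if node == target else None
    for child in node:
        sub = path_to(child, target)
        if sub is not None:
            return [node] + sub
    return None


def identify(tree, query, taxon_of):
    """Return (rank, taxon) for the most specific rank at which the query is
    nested inside a single-taxon clade, or None if there is no such clade.

    taxon_of maps each reference leaf to a dict with the keys
    'species', 'genus' and 'family'; leaves missing from it are ignored.
    """
    ancestors = path_to(tree, query)
    if ancestors is None:
        raise KeyError(query)
    for rank in ("species", "genus", "family"):
        # Skip the immediate parent: there the query is only a sister group.
        for clade in reversed(ancestors[:-1]):
            taxa = {taxon_of[leaf][rank] for leaf in leaves(clade) if leaf in taxon_of}
            if len(taxa) == 1:
                return rank, next(iter(taxa))
            if len(taxa) > 1:
                break  # larger clades can only contain more taxa
    return None
```

For a toy tree `(((("q1", "a1"), "a2"), ("q2", "b1")), "c1")`, where `a1` and `a2` are references of one species and `b1` belongs to a congeneric species, `q1` would be identified to species level while `q2`, which is only sister to `b1`, would be identified to genus level.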
Blastclust and RAxML analyses were performed on the combined datasets using only the reference data generated in this study. A combined data analysis that also includes GenBank data would have been ideal, but was not feasible since GenBank records often lack information on the voucher specimen, hence making it impossible to combine the extended reference databases for the different markers.
The final identification of each product was made on a case-by-case basis using the outcome of the three methods for each of the three markers (Data S3, S10), taking into account cases where reference sequences from a certain species were present in one or two datasets but not the other(s).
Table with market samples, identifications, and GenBank accession numbers.
Reference samples and GenBank accession numbers.
Map with collection sites of specimens that were used for the reference database.
RAxML phylogram of the rpoC1 extended reference dataset plus the market rpoC1 sequences.
RAxML phylogram of the psbA-trnH extended reference dataset plus the market psbA-trnH sequences.
RAxML phylogram of the ITS extended reference dataset plus the market ITS sequences.
The help of Prof. S.L. Jury of the University of Reading Herbarium (RNG), Mohamed El Haouzi of the Global Diversity Foundation, and Abderrahim Ouarghidi of Cadi Ayyad University is acknowledged. The bulk of the laboratory equipment for the plant DNA lab at the University of Marrakech was generously donated by VWR International Sweden thanks to the gracious and determined assistance of Kerstin Eriksson and Lisa Sundblad. Pravech Ajawatanawong is acknowledged for his help with the bioinformatics; Michelle Soares for the photograph in Figure 1; and two anonymous reviewers for their valuable comments on an earlier version of the manuscript.
Conceived and designed the experiments: AK HJdB. Performed the experiments: AK ÅK AR. Analyzed the data: AK. Wrote the paper: AK HJdB. Conceived and initiated the project: AK HJdB AA LB GJM.
- 1. IUCN Centre for Mediterranean Cooperation (2005) A guide to the medicinal plants of North Africa. Malaga, Spain: IUCN CMC. 256 p.
- 2. El-Hilaly J, Hmammouchi M, Lyoussi B (2003) Ethnobotanical studies and economic evaluation of medicinal plants in Taounate province (Northern Morocco). J Ethnopharm 86: 149–158.
- 3. Bellakhdar J (1997) La pharmacopée marocaine traditionnelle: Médecine arabe ancienne et savoirs populaires. Saint-Etienne, France: Ibis. 764 p.
- 4. Bellakhdar J, Claisse R, Fleurentin J, Younos C (1991) Repertory of standard herbal drugs in the Moroccan pharmacopoea. J Ethnopharm 35: 123–143.
- 5. Marshall NT (1998) Searching for a cure: conservation of medicinal wildlife resources in East and Southern Africa. Nairobi, Kenya: TRAFFIC East Africa. p.
- 6. Cunningham AB (1993) African medicinal plants: setting priorities at the interface between conservation and primary health care. People and Plants working paper 1: 1–50.
- 7. Barthelson RA, Sundareshan P, Galbraith DW, Woosley RL (2006) Development of a comprehensive detection method for medicinal and toxic plant species. Am J Bot 93: 566–574.
- 8. Ize-Ludlow D, Ragone S, Bruck IS, Bernstein JN, Duchowny M, et al. (2004) Neurotoxicities in infants seen with the consumption of star anise tea. Pediatrics 114: e653.
- 9. Gardes M, Bruns TD (1993) ITS primers with enhanced specificity for basidiomycetes - application to the identification of mycorrhizae and rusts. Mol Ecol 2: 113–118.
- 10. Arnason U, Spilliaert R, Pálsdóttir A, Arnason A (1991) Molecular identification of hybrids between the two largest whale species, the blue whale (Balaenoptera musculus) and the fin whale (B. physalus). Hereditas 115: 183–189.
- 11. Tang J, Toè L, Back C, Zimmerman PA, Pruess K, et al. (1995) The Simulium damnosum species complex: phylogenetic analysis and molecular identification based upon mitochondrially encoded gene sequences. Insect Mol Biol 4: 79–88.
- 12. Caldeira RL, Vidigal TH, Paulinelli ST, Simpson AJ, Carvalho OS (1998) Molecular identification of similar species of the genus Biomphalaria (Mollusca: Planorbidae) determined by a polymerase chain reaction-restriction fragment length polymorphism. Mem Inst Oswaldo Cruz 93: 219–225.
- 13. Milinkovitch MC, Caccone A, Amato G (2004) Molecular phylogenetic analyses indicate extensive morphological convergence between the “yeti” and primates. Mol Phylogenet Evol 31: 1–3.
- 14. Garnock-Jones PJ, Timmerman GM, Wagstaff SJ (1996) Unknown New Zealand angiosperm assigned to Cunoniaceae using sequence of the chloroplast rbcL gene. Plant Syst Evol 202: 211–218.
- 15. Hebert PDN, Cywinska A, Ball S, de Waard J (2003) Biological identifications through DNA barcodes. Proc R Soc Lond B 270: 313–322.
- 16. Kerr KCR, Stoeckle MY, Dove CJ, Weigt LA, Francis CM, et al. (2007) Comprehensive DNA barcode coverage of North American birds. Mol Ecol Notes 7: 535–543.
- 17. Smith MA, Poyarkov NA, Hebert PDN (2008) CO1 DNA barcoding amphibians: take the chance, meet the challenge. Mol Ecol Resources 8: 235–246.
- 18. Hollingsworth PM, Forrest LL, Spouge JL, Hajibabaei M, Ratnasingham S, et al. (2009) A DNA barcode for land plants. PNAS 106: 12794–12797.
- 19. Fazekas AJ, Kesanakurti PR, Burgess KS, Percy DM, Graham SW, et al. (2009) Are plant species inherently harder to discriminate than animal species using DNA barcoding markers? Mol Ecol Resources 9: 130–139.
- 20. Cho Y, Mower JP, Qiu Y, Palmer JD (2004) Mitochondrial substitution rates are extraordinarily elevated and variable in a genus of flowering plants. PNAS 101: 17741–17746.
- 21. Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH (2005) Use of DNA barcodes to identify flowering plants. PNAS 102: 8369–8374.
- 22. Kress WJ, Erickson DL (2007) A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS ONE 2: e508.
- 23. Ford CS, Ayres KL, Toomey N, Haider N, Stahl JVANA, et al. (2009) Selection of candidate coding DNA barcoding regions for use on land plants. Bot J Linn Soc. pp. 1–11.
- 24. Chen S, Yao H, Han J, Liu C, Song J, et al. (2010) Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PLoS ONE 5: 1–8.
- 25. Li DZ, Gao LM, Li HT, Wang H, Ge XJ, et al. (2011) Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. PNAS 108: 19641–19646.
- 26. De Queiroz K (2007) Species concepts and species delimitation. Syst Biol 56: 879–886.
- 27. Soltis PS, Soltis DE (2009) The role of hybridization in plant speciation. Ann Rev Plant Biology 60: 561–588.
- 28. Knowles LL, Carstens BC (2007) Delimiting species without monophyletic gene trees. Systematic biology 56: 887–895.
- 29. Blaxter M, Mann J, Chapman T, Thomas F, Whitton C, et al. (2005) Defining operational taxonomic units using DNA barcode data. Philos Trans R Soc Lond B Biol Sci 360: 1935–1943.
- 30. Lefébure T, Douady CJ, Gouy M, Gibert J (2006) Relationship between morphological taxonomy and molecular divergence within Crustacea: proposal of a molecular threshold to help species delimitation. Mol Phylogenet Evol 40: 435–447.
- 31. Nielsen R, Matz M (2006) Statistical approaches for DNA barcoding. Syst Biol 55: 162–169.
- 32. Lahaye R, Van Der Bank M, Bogarin D, Warner J, Pupulin F, et al. (2008) DNA barcoding the floras of biodiversity hotspots. PNAS 105: 2923–2928.
- 33. Chase MW, Fay MF (2009) Barcoding of plants and fungi. Science 325: 682.
- 34. Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How many species are there on earth and in the ocean? PLoS Biology 9: e1001127.
- 35. Will KW, Rubinoff D (2004) Myth of the molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics 20: 47–55.
- 36. Meier R, Shiyang K, Vaidya G, Ng PKL (2006) DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Syst Biol 55: 715–728.
- 37. Nilsson RH, Kristiansson E, Ryberg M, Hallenberg N, Larsson K-H (2008) Intraspecific ITS variability in the kingdom fungi as expressed in the international sequence databases and its implications for molecular species identification. Evol Bioinform Online 4: 193–201.
- 38. Schoch CL, Siefert KA, Huhndorf S, Robert V, Spouge JL, et al. (2012) Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. PNAS 109: 6241–6246.
- 39. Burns JM, Janzen DH, Hajibabaei M, Hallwachs W, Hebert PDN (2008) DNA barcodes and cryptic species of skipper butterflies in the genus Perichares in Area de Conservacion Guanacaste, Costa Rica. PNAS 105: 6350–6355.
- 40. Valentini A, Miquel C, Nawaz MA, Bellemain E, Coissac E, et al. (2009) New perspectives in diet analysis based on DNA barcoding and parallel pyrosequencing: the trnL approach. Mol Ecol Resources 9: 51–60.
- 41. Deguilloux M-F, Pemonge M-H, Petit RJ (2002) Novel perspectives in wood certification and forensics: dry wood as a source of DNA. Proc R Soc Lond B 269: 1039–1046.
- 42. Collins R, Armstrong KF, Meier R, Yi Y, Brown SDJ, et al. (2012) Barcoding and border biosecurity: Identifying cyprinid fishes in the aquarium trade. PLoS ONE 7: e28381.
- 43. Coghlan M, Haile J, Houston J, Murray D, White N, et al. (2012) Deep Sequencing of Plant and Animal DNA Contained within Traditional Chinese Medicines Reveals Legality Issues and Health Safety Concerns. PLoS Genetics 8: e1002657.
- 44. Asahina H, Shinozaki J, Masuda K, Morimitsu Y, Satake M (2010) Identification of medicinal Dendrobium species by phylogenetic analyses using matK and rbcL sequences. J Natural Med 64: 133–138.
- 45. Song J, Yao H, Li Y, Li X, Lin Y, et al. (2009) Authentication of the family Polygonaceae in Chinese pharmacopoeia by DNA barcoding technique. J Ethnopharm 124: 434–439.
- 46. Sucher NJ, Carles MC (2008) Genome-based approaches to the authentication of medicinal plants. Planta Med 74: 603–623.
- 47. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
- 48. Sass C, Little DP, Stevenson DW, Specht CD (2007) DNA barcoding in the cycadales: testing the potential of proposed barcoding markers for species identification of cycads. PLoS ONE 2: e1154.
- 49. Dondoshansky I (2002) Blastclust (NCBI Software Development Toolkit). Bethesda, MD: NCBI. p.
- 50. Piredda R, Simeone MC, Attimonelli M, Bellarosa R, Schirone B (2011) Prospects of barcoding the Italian wild dendroflora: oaks reveal severe limitations to tracking species identity. Mol Ecol Resources 11: 72–83.
- 51. Barrett RDH, Hebert PDN (2005) Identifying spiders through DNA barcodes. Can J Zool 83: 481–491. doi:10.1139/Z05-024.
- 52. Little DP, Stevenson DW (2007) A comparison of algorithms for the identification of specimens using DNA barcodes: examples from gymnosperms. Cladistics 23: 1–21.
- 53. Newmaster SG, Fazekas AJ, Steeves RAD, Janovec J (2007) Testing candidate plant barcode regions in the Myristicaceae. Mol Ecol Notes: 1–11.
- 54. Starr JR, Naczi RFC, Chouinard BN (2009) Plant DNA barcodes and species resolution in sedges (Carex, Cyperaceae). Mol Ecol Resources 9 Suppl s1: 151–163.
- 55. Chase MW, Salamin N, Wilkinson M, Dunwell JM, Kesanakurthi RP, et al. (2005) Land plants and DNA barcodes: short-term and long-term goals. Philos Trans R Soc Lond B Biol Sci 360: 1889–1895.
- 56. Burgess KS, Fazekas AJ, Kesanakurti PR, Graham SW, Husband BC, et al. (2011) Discriminating plant species in a local temperate flora using the rbcL+matK DNA barcode. Meth Ecol Evol 2: 333–340.
- 57. Gonzalez MA, Baraloto C, Engel J, Mori SA, Pétronelli P, et al. (2009) Identification of Amazonian trees with DNA barcodes. PLoS ONE 4: e7483.
- 58. Stoeckle MY, Gamble CC, Kirpekar R, Young G, Ahmed S, et al. (2011) Commercial Teas Highlight Plant DNA Barcode Identification Successes and Obstacles. Scientific Reports 1: 1–7.
- 59. Devey DS, Chase MW, Clarkson JJ (2009) A stuttering start to plant DNA barcoding: microsatellites present a previously overlooked problem in non-coding plastid regions. Taxon 58: 7–15.
- 60. Cotton CM (1996) Ethnobotany: principles and applications. Wiley Chichester, UK. p.
- 61. Downie S, Katz-Downie D, Watson M (2000) A phylogeny of the flowering plant family Apiaceae based on chloroplast DNA rpl16 and rpoc1 intron sequences: towards a suprageneric classification of subfamily Apioideae. Am J Bot 87: 273–292.S. DownieD. Katz-DownieM. Watson2000A phylogeny of the flowering plant family Apiaceae based on chloroplast DNA rpl16 and rpoc1 intron sequences: towards a suprageneric classification of subfamily Apioideae.Am J Bot87273292
- 62. Downie SR, Plunkett GM, Watson MF, Spalik K, Katz-Downie DS, et al. (2001) Tribes and Clades Within Apiaceae Subfamily Apioideae: the Contribution of Molecular Data. Edinb J Bot 58: 301–330.
- 63. Boulos L (1983) Medicinal plants of North Africa. Algonac, Michigan: Reference Publications Inc. 286 p.
- 64. Guo Z, Miao X (2010) Growth changes and tissues anatomical characteristics of giant reed (Arundo donax L.) in soil contaminated with arsenic, cadmium and lead. J Central South Uni Tech 17: 770–777.
- 65. Mirza N, Mahmood Q, Pervez A, Ahmad R, Farooq R, et al. (2010) Phytoremediation potential of Arundo donax in arsenic-contaminated synthetic wastewater. Bioresource Technol 101: 5815–5819.
- 66. Fennane M, Ibn Tattou M, Mathez J, Ouyahya A, El Oualidi J, editors (1999) Flore pratique du Maroc. Volume 1. Rabat, Morocco: Institut Scientifique, Université Mohammed V. 558 p.
- 67. Fennane M, Ibn Tattou M, Ouyahya A, El Oualidi J, editors (2007) Flore pratique du Maroc. Volume 2. Rabat, Morocco: Institut Scientifique, Université Mohammed V. 636 p.
- 68. Valdés B, Rejdali M, Achhal El Kadmiri A, Jury SL, Montserrat JM (2002) Catalogue des plantes vasculaires du nord du Maroc, incluant des clés d’identification. Vol. 1. Madrid, Spain: Consejo Superior de Investigaciones Cientificas.
- 69. Valdés B, Rejdali M, Achhal El Kadmiri A, Jury SL, Montserrat JM (2002) Catalogue des plantes vasculaires du nord du Maroc, incluant des clés d’identification. Vol. 2. Madrid, Spain: Consejo Superior de Investigaciones Cientificas.
- 70. Fennane M, Ibn Tattou M (1998) Catalogue des plantes vasculaires rares, menacées ou endémiques du Maroc. Bocconea 8: 1–243.
- 71. Fennane M, Ibn Tattou M (2005) Flore vasculaire du Maroc: Inventaire et chorologie 1. Rabat, Morocco: Institut Scientifique, Université Mohammed V. 483 p.
- 72. Ibn Tattou M, Fennane M (2008) Flore vasculaire du Maroc: Inventaire et chorologie 2. Rabat, Morocco: Institut Scientifique, Université Mohammed V. 398 p.
- 73. Yoon CS, Glawe A, Shaw PD (1991) A method for rapid small-scale preparation of fungal DNA. Mycologia 83: 835–838.
- 74. RBG-K (2007) Royal Botanic Gardens Kew, DNA Barcoding. Available: http://www.kew.org/barcoding/protocols.html. Accessed 1 January 2007.
- 75. White T, Bruns T, Lee S, Taylor J (1990) Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis M, Gelfand D, Sninsky J, White T, editors. PCR Protocols: A Guide to Methods and Applications. San Diego: Academic Press. pp. 315–322.
- 76. Sang T, Crawford DJ, Stuessy TF (1995) Documentation of reticulate evolution in peonies (Paeonia) using internal transcribed spacer sequences of nuclear ribosomal DNA: implications for biogeography and concerted evolution. PNAS 92: 6813–6817.
- 77. Sang T, Crawford DJ, Stuessy TF (1997) Chloroplast DNA phylogeny, reticulate evolution, and biogeography of Paeonia (Paeoniaceae). Am J Bot 84: 1120–1136.
- 78. Bonfield JK, Smith KF, Staden R (1995) A new DNA sequence assembly program. Nucleic Acids Res 23: 4992.
- 79. Staden R (1996) The Staden sequence analysis package. Mol Biotechnol 5: 233–241.
- 80. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, et al. (2002) The Bioperl toolkit: Perl modules for the life sciences. Genome Res 12: 1611–1618. doi:10.1101/gr.361602.
- 81. Biegert A, Mayer C, Remmert M, Söding J, Lupas AN (2006) The MPI Bioinformatics Toolkit for protein sequence analysis. Nucleic Acids Res 34: W335–9. doi:10.1093/nar/gkl217.
- 82. Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059–3066.
- 83. Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690.
- 84. Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML web servers. Syst Biol 57: 758.
- 85. Miller M, Holder MT, Vos R, Midford P, Liebowitz T, et al. (2010) The CIPRES portals. CIPRES Website. Available: http://www.phylo.org/sub_sections/portal. Accessed 6 January 2010.
- 86. Huson DH, Richter DC, Rausch C, Dezulian T, Franz M, et al. (2007) Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics 8: 460.