A Novel Mini-DNA Barcoding Assay to Identify Processed Fins from Internationally Protected Shark Species

There is a growing need to identify shark products in trade, in part due to the recent listing of five commercially important species on the Appendices of the Convention on International Trade in Endangered Species (CITES; porbeagle, Lamna nasus; oceanic whitetip, Carcharhinus longimanus; scalloped hammerhead, Sphyrna lewini; smooth hammerhead, S. zygaena; and great hammerhead, S. mokarran) in addition to three species listed in the early part of this century (whale, Rhincodon typus; basking, Cetorhinus maximus; and white, Carcharodon carcharias). Shark fins are traded internationally to supply the Asian dried seafood market, in which they are used to make the luxury dish shark fin soup. Shark fins usually enter international trade with their skin still intact and can be identified using morphological characters or standard DNA-barcoding approaches. Once they reach Asia and are traded in this region, the skin is removed and they are treated with chemicals that eliminate many key diagnostic characters and degrade their DNA ("processed fins"). Here, we present a validated mini-barcode assay based on partial sequences of the cytochrome oxidase I gene that can reliably identify the processed fins of seven of the eight CITES-listed shark species. We also demonstrate that the assay can frequently identify the species or genus of origin of shark fin soup (31 out of 50 samples).


Introduction
The collagenous protein fibers inside shark fins, also known as the ceratotrichia, are the primary constituent of the Asian delicacy shark fin soup [1]. The trade in shark fins to supply the market for this soup is arguably the most significant driver of shark fisheries worldwide, many of which have proven to be unsustainable [2][3][4][5]. In response to this issue, Parties to the Convention on International Trade in Endangered Species (CITES) have voted to list eight species of sharks on Appendix II of the Convention. These species are the whale (Rhincodon typus), basking (Cetorhinus maximus), white (Carcharodon carcharias), porbeagle (Lamna nasus), oceanic whitetip (Carcharhinus longimanus), scalloped hammerhead (Sphyrna lewini), smooth hammerhead (S. zygaena) and great hammerhead (S. mokarran) sharks. International trade of these species and their parts requires export permits certifying that the trade in each specimen is not detrimental to the survival of the species. Customs personnel of both exporting and importing nations will therefore have to recognize the traded products of these species in order to identify illicit trade (i.e., trade without permits) and effectively enforce CITES.
The initial processing of shark fins involves sun or oven drying them after excision from the carcass. International trade from source nations to Asian markets frequently involves fins of this kind (i.e., with the skin still on). Provisional identification of these types of fins from several CITES-listed species is possible using morphological characters (e.g., www.sharkfinid.org). Standard DNA barcoding (i.e., amplification of ~500-650 bp of the mitochondrial cytochrome c oxidase I gene [COI]) and polymerase chain reaction (PCR) techniques are available to then confirm species-of-origin or to identify fins that cannot be identified using morphology [6][7][8][9]. The next stage of fin processing is more intensive, however, and involves the removal of the outermost layer of skin (denticles), followed by chemical treatment to obtain the bleached, skinned retail product (hereafter referred to as a "processed fin", [1]). Most of this processing is performed in Asia and there is significant regional international trade of these processed fins, as well as some trade outside of the region to satisfy demand for shark fin soup elsewhere [1]. The genomic DNA of these fins is degraded during processing and it is frequently difficult to obtain large PCR amplicons, potentially complicating the use of standard genetic identification techniques [6][7][8][9]. The objectives of this study were two-fold. First, we present a mini-barcode cytochrome c oxidase I assay that yields a short (~110-130 bp) sequence from degraded shark DNA templates (e.g., processed fins and individual ceratotrichia extracted from shark fin soup). We then show that this sequence can be used to confidently identify products from seven of the eight CITES-listed shark species using freely available databases (Fish Barcode of Life Initiative [FISH-BOL] and the National Center for Biotechnology Information [NCBI] GenBank) and/or species diagnostic nucleotides.

Methods
The mitochondrial cytochrome c oxidase I (COI) gene is used as a barcode for most animal life [10]. Two COI sequences from 41 shark species, including many that are common in the global fin trade [2] and all 8 of the CITES-listed species, were downloaded from the National Center for Biotechnology Information (NCBI) GenBank (http://www.ncbi.nlm.nih.gov/genbank/). We included at least one species from five of the eight living Orders (Carcharhiniformes, Lamniformes, Orectolobiformes, Hexanchiformes and Squaliformes). These sequences were aligned and compared with one another to identify conserved and variable regions using the sequence-editing program GeneDoc [11]. A conserved region was found ~150 bp upstream of the beginning of these sequences. We designed a novel reverse primer (Shark COI-MINIR 5'-AAGATTACAAAAGCGTGGGC-3') to anneal to this conserved region. We anticipated that a multiplex polymerase chain reaction (PCR) including this primer and the two overlapping M13-labeled universal fish barcoding forward primers ([12]; FishF2_t1 5'-TGTAAAACGACGGCCAGTCGACTAATCATAAAGATATCGGCAC-3' and VF2_t1 5'-TGTAAAACGACGGCCAGTCAACCAACCACAAAGACATTGGCAC-3') would yield an amplicon of ~150 bp with an M13 tail from all shark genomic DNA templates. This amplicon could then be sequenced using an M13 primer and would be expected in practice to yield a sequence of ~110-130 bp.
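As a rough illustration of the primer design described above, the following Python sketch separates the M13 tail from the gene-specific part of each forward primer and locates the region a template would yield between a forward primer and the reverse primer. The primer sequences are those given in the text; the function names, the exact-match rule and any template passed in are assumptions made for illustration only. Real primer annealing tolerates mismatches, which this sketch ignores.

```python
# Primer sequences (5'->3') as given in the text.
M13_TAIL = "TGTAAAACGACGGCCAGT"  # M13 tail shared by both forward primers
FORWARD_PRIMERS = {
    "FishF2_t1": "TGTAAAACGACGGCCAGTCGACTAATCATAAAGATATCGGCAC",
    "VF2_t1": "TGTAAAACGACGGCCAGTCAACCAACCACAAAGACATTGGCAC",
}
REVERSE_PRIMER = "AAGATTACAAAAGCGTGGGC"  # Shark COI-MINIR

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gene_specific(primer):
    """Strip the M13 tail and return the part that anneals to COI."""
    if not primer.startswith(M13_TAIL):
        raise ValueError("primer does not carry the M13 tail")
    return primer[len(M13_TAIL):]


def find_amplicon(template):
    """Return (forward primer name, amplified template region) or None.

    Hypothetical helper: requires exact matches of the gene-specific
    forward primer and of the reverse primer's binding site. The M13
    tail that a real amplicon would also carry is not included.
    """
    template = template.upper()
    rev_site = reverse_complement(REVERSE_PRIMER)
    for name, primer in FORWARD_PRIMERS.items():
        core = gene_specific(primer)
        start = template.find(core)
        if start == -1:
            continue
        end = template.find(rev_site, start + len(core))
        if end != -1:
            return name, template[start:end + len(rev_site)]
    return None
```

Applied to a full-length COI reference sequence, the length of the returned region gives a quick check of the expected amplicon size before any laboratory work.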
To validate that the assay worked as intended in terms of amplification and sequencing performance, PCR was carried out using genomic DNA template isolated from one of three types of shark tissue sample: (1) ethanol-preserved tissues collected from live or dead sharks of known species identity (N = 4 individuals from each of 21 species, Table 1); (2) processed fins of unknown species origin (Table 2) sampled in retail markets (Hong Kong, N = 41); and (3) individual ceratotrichia of unknown species origin removed from shark fin soup purchased in restaurants in the U.S.A. (N = 50, Table 2). Genomic DNA was extracted from ~10-25 mg of tissue using the DNeasy commercial kit (Qiagen, Valencia, California). PCR was performed in a volume of 50 µL, which included 1 µL of the extracted genomic DNA, 10 pmol of each forward and reverse primer, 1X PCR buffer, 200 µM dNTPs, and 1 unit of HotStar© Taq Polymerase (Qiagen, Valencia, California). All reactions were run with a positive control (i.e., shark genomic DNA previously confirmed to amplify with other primers) and a negative control (i.e., no DNA). Thermal cycling conditions consisted of a 15 min activation of the polymerase at 95°C, followed by 35 cycles of 1 min at 94°C, 1 min at 52°C, and 2 min at 72°C, followed by a final extension step of 72°C for 5 min. Amplified fragments were resolved on a 2% agarose gel and visualized using ethidium bromide dye to verify successful amplification. PCR products were purified with Exo-SAP-IT (Affymetrix, Inc., Santa Clara, CA, USA) and sequenced using the Big Dye Terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) with a Bio-Rad DYAD thermal cycler (Bio-Rad Laboratories, Hercules, CA, USA). The M13 forward primer was used for sequencing. The resulting products were precipitated with 125 mM EDTA and 100% ethanol and run on an ABI 3730 DNA Analyzer (Applied Biosystems). Resulting sequences were validated by eye and trimmed for quality, and any primer sequence present was removed.
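The cycling conditions and primer amounts above lend themselves to a small worked calculation. The sketch below encodes the stated program and derives two quantities that are implied but not given in the text: the total programmed hold time (ramp times between temperatures are not reported and are therefore ignored) and the final concentration of each primer. The class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


# Values taken from the thermal cycling conditions stated in the text.
ACTIVATION = Step("polymerase activation", 95, 15)
CYCLE = (
    Step("denaturation", 94, 1),
    Step("annealing", 52, 1),
    Step("extension", 72, 2),
)
N_CYCLES = 35
FINAL_EXTENSION = Step("final extension", 72, 5)


def total_hold_minutes():
    """Sum of programmed hold times, excluding ramping between steps."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return ACTIVATION.minutes + N_CYCLES * per_cycle + FINAL_EXTENSION.minutes


def primer_concentration_um(pmol, volume_ul):
    """Final primer concentration in µM (1 pmol/µL equals 1 µM)."""
    return pmol / volume_ul
```

With these values, `total_hold_minutes()` gives 15 + 35 × 4 + 5 = 160 minutes, and `primer_concentration_um(10, 50)` gives 0.2 µM for each primer in the 50 µL reaction.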
The next objective was to verify that mini-barcode sequences obtained in this manner could be used to identify CITES-listed sharks. Unknown COI sequences can be identified using searchable databases, character-based approaches (i.e., diagnostic nucleotides) or phylogenetics (e.g., Lowenstein et al. 2009). The latter approach is limited if the obtained sequences are short and was therefore not pursued in this study. We downloaded all available COI sequences for the CITES-listed shark species on GenBank and the Fish Barcode of Life Initiative (FISH-BOL) online database (www.fishbol.org). Sequences were trimmed to include only the section of the expected fragment from which we typically obtained readable sequence. These shortened sequences were used in BOLD (FISH-BOL) and BLAST (GenBank) searches to identify them to the lowest taxon possible. We considered the species identifiable with this short COI fragment when BOLD returned a 100% species level match and when the sequence's closest match in BLAST was unambiguously and exclusively one of the CITES-listed species. We also constructed a character-based identification key for each of the CITES-listed species and their closest relatives determined in a previous phylogenetic study [13][14]. The downloaded and trimmed DNA sequences from the CITES species were aligned with the same sequences from their closest relatives in BioEdit v. 7.1.3.0 [15]. A consensus sequence was created for each species and then compared to its closest relatives, with the exception of Sphyrna tudes because only one DNA sequence was available. For each CITES species we isolated a combination of nucleotides at multiple positions that collectively distinguish the species from its closest relatives, which we hereafter refer to as the compound character attribute or cCA. All unknown processed fin and soup samples were identified to the lowest taxon possible by first using the searchable BOLD and BLAST databases. If these searches indicated that they were or could possibly be one of the CITES-listed species we then used the cCA we had previously constructed to confirm the identification.
Table 1. Samples of known species identity tested for amplification ("AMP") and sequencing of the amplicon ("SEQ") with the mini-barcode assay.
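The construction of a compound character attribute (cCA) from aligned sequences can be sketched in a few lines of Python. This is a simplified illustration, not the procedure carried out in BioEdit: it assumes a majority-rule consensus, takes every alignment position at which at least one relative differs from the target as part of the cCA, and uses hypothetical function names. The `offset` parameter is an assumed way of mapping alignment columns to position numbers within the COI gene.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Majority-rule consensus of aligned, equal-length sequences.

    Ties are resolved in favor of the base seen first in that column.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_seqs)
    )


def compound_character_attribute(target, relatives, offset=1):
    """Map position -> target base wherever any relative differs."""
    return {
        i + offset: base
        for i, base in enumerate(target)
        if any(rel[i] != base for rel in relatives)
    }


def is_diagnostic(cca, relatives, offset=1):
    """True if every relative differs from the cCA at one or more positions."""
    return all(
        any(rel[pos - offset] != base for pos, base in cca.items())
        for rel in relatives
    )
```

In use, `consensus` would be applied to the trimmed sequences of one species, and the resulting string passed as `target` together with the consensus sequences of its closest relatives. An empty cCA, or one that a relative matches at every position, is reported as not diagnostic.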

Results and Discussion
The mini-barcode assay we developed led to the successful amplification of the expected fragment from all of the known shark samples that we tested (Table 1). Given the taxonomic diversity of these tissue samples, we suggest that this assay is likely to amplify all or most shark species. At this stage we can confidently state that it consistently amplifies all of the tested species, which include all sharks currently listed on CITES. The assay also amplified all 41 processed fins of unknown species identity from Hong Kong. This indicates that the degraded DNA in these types of products will typically yield sufficient PCR product for sequencing. Thirty-one out of fifty shark fin soup samples amplified and were sequenced, while the remaining nineteen samples failed to amplify after multiple attempts. This demonstrates that even highly degraded, cooked shark products can in many cases be amplified and sequenced using this assay.
The partial COI sequence that would be produced by our assay was obtained from GenBank and FISH-BOL entries for white (N = 16), whale (N = 19), basking (N = 52), scalloped hammerhead (N = 215), smooth hammerhead (N = 38), great hammerhead (N = 38), oceanic whitetip (N = 34) and porbeagle sharks (N = 85). We identified several errors among the shark sequences in GenBank, as has been found for other taxa [16][17][18][19]. For example, two sequences on GenBank labeled as CITES species are most similar to sequences from non-CITES species. A white shark sequence (accession number JQ654702.1) had 99% sequence identity to blue shark (Prionace glauca) but only 83% sequence identity to the closest white shark sequence. Likewise, a basking shark sequence (accession number JQ654736.1) had 99% identity with oceanic whitetip and common thresher (Alopias vulpinus) sequences (one or both of which were also misidentified, given that they are from different orders) but only 86% sequence identity with the most similar basking shark sequence.
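The kind of check that exposes a mislabeled reference sequence can be illustrated with a simple percent-identity comparison. The sketch below is an assumption-laden stand-in for a BLAST search: it works only on pre-aligned sequences of equal length, skips gap and N positions, and uses hypothetical function names and data structures.

```python
def percent_identity(a, b):
    """Percent identical bases between two aligned, equal-length sequences.

    Positions where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def flag_suspect(query, labelled_species, references):
    """Compare a labelled sequence with references grouped by species.

    references maps species name -> list of aligned sequences.
    Returns (closest species, its identity, identity to the labelled
    species or None, suspect flag). The entry is flagged when another
    species is strictly closer than the species named on the label.
    """
    best = {
        species: max(percent_identity(query, seq) for seq in seqs)
        for species, seqs in references.items()
    }
    top = max(best, key=best.get)
    own = best.get(labelled_species)
    suspect = own is None or own < best[top]
    return top, best[top], own, suspect
```

A database entry labeled as one species but sharing far higher identity with another, as in the two GenBank cases described above, would be returned with the suspect flag set.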
After omitting these misidentified sequences, each distinct partial COI haplotype (S1 Table) found for each CITES species was used as a query in BOLD and BLAST searches. All of them except the oceanic whitetip sequences were correctly identified to the species level based on this sequence alone (i.e., they were considered a 100% correct species level match based on genetic distance in BOLD and their closest BLAST match was the correct species). The tissue samples that we amplified and sequenced from these species yielded the same results (Table 1). In all searches the oceanic whitetip sequences were matched to the genus Carcharhinus, but the mini-barcode sequences were identical or nearly identical to those of dusky (Carcharhinus obscurus) and Galapagos (C. galapagensis) sharks. GenBank and BOLD-obtained sequences from each of the CITES-listed species were aligned with those of its closest relatives. A cCA of 21 bases was constructed that was unique for each species (Table 3). Each of the CITES-listed species differed from its closest relatives by at least 2 bp, with the exception of the oceanic whitetip, which has only one base separating it from at least three other Carcharhinus spp. We would caution against using this base alone (at position 60, Table 3 and S1 Table) to identify this species at this stage for two reasons. First, there may be variation at this site in one or more of these species that has yet to be observed in sequenced individuals. Second, position 60 is in the early part of the sequence, which is not as well resolved as the rest; consequently, we do not always use this part of the sequence to identify unknown samples (S1 Table). Overall, the expected cCA was present in each of the known samples that we amplified and sequenced from these species (Table 1).
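Because the early part of a read is not always usable, it can help to record which cCA positions a trimmed read actually covers before scoring it. The following Python sketch does this under stated assumptions: the read is given together with the COI position of its first base, the cCA is a mapping from position to expected base, and the function name and return structure are hypothetical. Any positions and bases supplied in a call are placeholders, not the values in Table 3.

```python
def score_cca(read, read_start, cca):
    """Classify each cCA position as matching, mismatching or uncovered.

    read       -- trimmed sequence string
    read_start -- COI position number of the first base of the read
    cca        -- dict mapping COI position -> expected base
    """
    result = {"match": [], "mismatch": [], "uncovered": []}
    for pos in sorted(cca):
        idx = pos - read_start
        # Positions before or after the read, or called as N, cannot be scored.
        if idx < 0 or idx >= len(read) or read[idx].upper() == "N":
            result["uncovered"].append(pos)
        elif read[idx].upper() == cca[pos].upper():
            result["match"].append(pos)
        else:
            result["mismatch"].append(pos)
    return result
```

A read whose diagnostic positions are all in the "match" list supports the species call, whereas a read in which the only distinguishing position falls in "uncovered" leaves the identification at genus level.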
It is important to highlight that some individuals of the CITES species might have alternative polymorphisms at one or more of these proposed diagnostic sites and therefore have a slightly different cCA from that presented here. We therefore suggest an identification strategy that combines multiple approaches to determine whether or not a mini-barcode sequence is from a CITES-listed shark species. First, the sequence should be entered in BOLD and BLAST. All but one of the CITES-listed species (the oceanic whitetip) are identifiable in this manner alone, but at the very least this approach can identify them to genus with 100% certainty. The species diagnosis can then be independently checked by using the cCA presented for each of these species in Table 3 and the alignment in S1 Table, acknowledging that slight variation from this sequence may be seen in some individuals. Oceanic whitetip mini-barcode sequences are confidently identified to genus in BOLD, while BLAST and the cCA narrow the identification down to either an oceanic whitetip, dusky (Carcharhinus obscurus), Galapagos (C. galapagensis) or Caribbean reef shark (C. perezi, Table 3 and S1 Table). A fin yielding this result in a CITES enforcement and monitoring context would require additional sequencing to confirm the species identity. However, oceanic whitetip dorsal, pectoral and lower caudal fins have a very distinctive rounded apex, even after processing, that is quite unlike the more pointed fins of the dusky, Galapagos or Caribbean reef sharks [20]. This suggests that a combination of visual and mini-barcoding results enables provisional species-level identification of oceanic whitetip fins.
Table 3 note. Position numbers (bold) are from the beginning of the COI gene.
We used the mini-barcode assay and the proposed sample identification strategy (i.e., BOLD/BLAST search followed by visual inspection of the aligned sequence for the cCA) to identify processed shark products. All BOLD, BLAST and cCA (when needed for CITES species) results were concordant. Of 40 dried processed fins 32 were requiem sharks (Carcharhinus spp.), 5 were school sharks (Galeorhinus galeus), 2 were blue sharks and there was a single spottail (Carcharhinus sorrah) fin (Table 2). The cCA of one of the Carcharhinus fins was an exact match to an oceanic whitetip shark, but we again caution that further sequencing should be employed to verify this.
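The proposed identification strategy (database search first, cCA confirmation second) can be written out as a small decision function. The sketch below is only one possible encoding of that logic: the inputs are assumed to be summaries that an analyst has already extracted from BOLD and BLAST results, and the function name, argument names and returned messages are hypothetical. No database is queried.

```python
# The eight CITES-listed shark species named in this study.
CITES_SPECIES = {
    "Rhincodon typus", "Cetorhinus maximus", "Carcharodon carcharias",
    "Lamna nasus", "Carcharhinus longimanus", "Sphyrna lewini",
    "Sphyrna zygaena", "Sphyrna mokarran",
}


def classify(bold_species, blast_top_species, cca_matches):
    """Combine database and cCA evidence for one mini-barcode sequence.

    bold_species      -- species-level BOLD match, or None if genus only
    blast_top_species -- set of species tied as the closest BLAST match
    cca_matches       -- True/False/None: does the read carry the cCA
                         of the candidate CITES species?
    """
    blast_top_species = set(blast_top_species)
    candidates = set(blast_top_species)
    if bold_species:
        candidates.add(bold_species)
    if not candidates & CITES_SPECIES:
        return "no CITES-listed species indicated by database search"
    # Species-level call requires BOLD and an exclusive BLAST match to agree.
    unambiguous = (
        bold_species in CITES_SPECIES and blast_top_species == {bold_species}
    )
    if unambiguous and cca_matches:
        return f"{bold_species}: database match confirmed by cCA"
    if unambiguous:
        return f"{bold_species}: database match, cCA not confirmed"
    return "possible CITES-listed species: additional evidence needed"
```

Under this encoding, a hammerhead sequence with agreeing BOLD and BLAST results and a matching cCA is reported at species level, while an oceanic whitetip candidate, for which BOLD resolves only the genus and BLAST returns several Carcharhinus species, is routed to further sequencing or morphological inspection.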
Thirty-one out of 50 shark fin soup samples amplified and were identified to genus or species (Table 2). The species found in soup were predominantly blue sharks, school sharks and requiem sharks (Carcharhinus spp.), but also included shortfin mako (Isurus oxyrinchus), dogfish (Squalus spp.) and smoothhound (Mustelus schmitti; Table 2). The BLAST/BOLD/cCA approach showed that three soup samples originated from two CITES-listed species of hammerheads (scalloped, Sphyrna lewini, and smooth, S. zygaena; Table 2).
We envision three primary applications of this mini-DNA barcoding protocol. First, border control personnel have a legal obligation to inspect shark fin imports and exports for CITES-listed species. We suggest that our protocol could be employed to identify processed fins from these species, either by randomly sampling and genetically testing fins in trade or by testing fins that are suspected to originate from these species given their size, shape or other types of field evidence. All CITES-listed species except the oceanic whitetip are identifiable with this protocol alone. A second application stems from existing and proposed bans on trade of processed shark fins and shark fin soup in several countries and U.S. states. Our protocol may be used in law enforcement contexts to confirm that processed fin products or soup are derived from sharks as opposed to certain batoids (e.g., guitarfish), teleost fish or other substitute products that may be legally traded in these states or countries. Finally, we propose that this protocol may also be useful for basic retail fin market surveys to quantify the occurrence of different shark genera and species in this trade. The unknown samples used in this study were not collected in a systematic or random manner and thus do not provide any information on the overall species composition of the trade in our sampling regions. Nevertheless, our limited sampling efforts uncovered ongoing retail trade in processed fins or soup derived from 2 of the CITES-listed hammerhead sharks as well as several species (e.g., soupfin, shortfin mako) that are listed as "Vulnerable" (A2bd+3d+4bd) by the International Union for the Conservation of Nature (IUCN).
The mini-barcode protocol presented here could be used to monitor the contribution of certain species and/or genera to dried shark fin retail markets, which is clearly needed given the large volume of the catch [5] and the documented declines in several of the most heavily exploited or vulnerable species [3][4].
Supporting Information
S1 Table. An alignment of the consensus sequences for 31 species obtained on GenBank and BOLD together with sequences recovered from the processed fin and fin soup samples. The base numbers correspond to the base position within the COI gene with respect to the start codon. Because the S. tudes sequence is not a consensus sequence, the GenBank accession number is listed. The sample numbers correspond to those in Table 2.