Australian scorpion venoms have been poorly studied, probably because they do not pose an evident threat to humans. In addition, the continent has other medically important venomous animals capable of causing serious health problems. Urodacus yaschenkoi belongs to the most widely distributed family of Australian scorpions (Urodacidae) and it is found all over the continent, making it a useful model system for studying venom composition and evolution. This communication reports the whole set of mRNA transcripts produced by the venom gland. U. yaschenkoi venom is as complex as its overseas counterparts. These transcripts certainly code for several components similar to known scorpion venom components, such as: alpha-KTxs, beta-KTxs, calcins, protease inhibitors, antimicrobial peptides, sodium-channel toxins, toxin-like peptides, allergens, La1-like, hyaluronidases, ribosomal proteins, proteasome components and proteins related to cellular processes. A comparison with the venom gland transcriptome of Centruroides noxius (Buthidae) showed that these two scorpions have similar components related to biological processes, although important differences occur among the venom toxins. In contrast, a comparison with sequences reported for Urodacus manicatus revealed that these two Urodacidae species possess the same subfamily of scorpion toxins. A comparison with sequences of an U. yaschenkoi cDNA library previously reported by our group showed that both techniques are reliable for the description of the venom components, but the whole transcriptome generated with Next Generation Sequencing platform provides sequences of all transcripts expressed. Several of which were identified in the proteome, but many more transcripts were identified including uncommon transcripts. The information reported here constitutes a reference for non-Buthidae scorpion venoms, providing a comprehensive view of genes that are involved in venom production. Further, this work identifies new putative bioactive compounds that could be used to seed research into new pharmacological compounds and increase our understanding of the function of different ion channels.
Citation: Luna-Ramírez K, Quintero-Hernández V, Juárez-González VR, Possani LD (2015) Whole Transcriptome of the Venom Gland from Urodacus yaschenkoi Scorpion. PLoS ONE 10(5): e0127883. https://doi.org/10.1371/journal.pone.0127883
Academic Editor: Mandë Holford, The City University of New York-Graduate Center, UNITED STATES
Received: September 11, 2014; Accepted: April 20, 2015; Published: May 28, 2015
Copyright: © 2015 Luna-Ramírez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: RNA-seq data is available at SRA of NCBI with the following accesion number SRP045734. Additionally, the identified precursors are registred on GenBank (EST database: dbEST JZ818592 - JZ818692).
Funding: This work was supported by the Struan Sutherland Fund awarded to the Australian Venom Research Unit (KLR). Partial support came from the grant IN200113 of Direccion General de Asuntos del Personal Academico (DGAPA-UNAM) and Consejo Nacional de Ciencia y Tecnologia (CONACyT) given to LDP.
Competing interests: The authors have declared that no competing interests exist.
Scorpions are arthropods belonging to the group of arachnids that have been living on this planet for over 400 million years . Currently, around 1500 living scorpion species have been described . Millions of years of evolution have resulted in a high degree of specific and efficient scorpion venom components. These venoms are true arsenals, containing important biomolecules selected for the immobilization of prey and serving in defense against predators.
Scorpions are classified in 18 families , the Buthidae family being the most comprehensively studied thus far. The family contains 30 different genera of scorpions dangerous to humans. Scorpion venom possesses different classes of toxins that mainly modify the function of ion channels and receptors in excitable membranes [4–7]. In addition, scorpion venom possesses a great variety of components: salts, nucleotides, biogenic amines, enzymes such as: phospholipase, hyaluronidase, L-aminoacid oxidase , metalloproteinase , serine-protease, mucoproteins; toxic peptides, proteins and antimicrobial peptides active against bacteria, fungi, yeast and viruses. Examples of the latter are: mucroporin-M1, which inhibits the amplification of the hepatitis-B virus and peptide Kn2-7, which possesses anti-HIV-1 activity [10, 11].
To date, 24 transcriptomes and proteomes have been reported on Buthidae scorpions [12–37] whereas only 15 studies have been performed with non-Buthidae scorpions, which are not dangerous to humans [19, 21, 38–49]. Non-Buthidae venoms contain a low percentage of sodium channel specific toxins. Interestingly, and contrary to the Buthidae scorpions, non-Buthidae venoms have a high number of antimicrobial peptides , potassium channel toxins, calcins and peptides with anti-malarial activity. Recently, a peptide named scorpine, isolated from Pandinus imperatus, was used successfully as an anti-malarial agent in biological models [51, 52].
An estimated total of 300,000 different peptides are present in the venom of extant scorpion species. Approximately 1% of these scorpion components have been characterized,  the majority of these being toxins. This is understandable because of the medical interest. Important efforts have been focused at identifying components responsible for human envenomation, and to a lesser extent for structure-function studies required to recognize toxin targets. This focus on toxins has left many potentially bioactive venom compounds unexplored.
As early as 1967, Rochat et al reported the isolation and characterization of several toxic peptides from the venom of a Buthidae scorpion [54–56]. Nowadays, scorpion toxins are still being characterized biochemically and pharmacologically in order to determine the number of proteins in the venom and their bioactivity. In the 1980s, electrospray ionization mass spectrometry (ESI-MS/MS) increased the speed of the task of venom characterization. High-throughput protein identification techniques by mass spectrometry allowed the proteomic analysis of venoms and facilitated the identification of hundreds of unknown different molecular weights components.
Several studies have reported complete mass fingerprinting of venom using proteomic analysis of venom components . It is conceivable though, that not all components were identified using this technique given that some components are present in venom at very low concentrations . This reveals the power of a venom gland transcriptomic analysis: all protein content and toxin-like peptides are potentially identified. 
Studying venom gland components at the transcriptomic level was made possible by the advent of the polymerase chain reaction. Studies have been performed using cDNA libraries of scorpion venom glands, which allowed for the identification of many venom components. However, the cDNA libraries that have been constructed with milked venom glands [15, 17, 32, 39, 40, 43, 45, 46, 48] and the ones constructed with “replete” venom glands [18, 37, 41] have reported only a few complete sequences (tens of genes). These genes code mainly for toxic peptides, antimicrobial peptides and in rare cases, for genes involved in cell regulation and metabolism.
Whole venom gland transcriptomes can now be produced with high-throughput sequencing technologies such as Next Generation Sequencing (NGS) (also called RNA-seq). Several platforms using NGS (454 pyrosequencing, Illumina (SOLEXA) sequencing, SOLiD sequencing, ion semiconductor sequencing, DNA nanoball sequencing) have proven to be powerful tools for research in genome sequencing, miRNA expression profiling and especially de novo transcriptome sequencing of non-model organisms [59–65]. NGS is a low-cost sequencing alternative capable of producing thousands or millions of sequences at once. The resulting dataset reveals information about genes that code for toxins, peptides with pharmaceutical interest and other components among which are enzymes and housekeeping genes present in the venom gland.
In the work presented here, NGS Illumina sequencing was used to perform a de novo assembly of the transcriptome of the venom gland of the scorpion Urodacus yaschenkoi. The aim of this study was to characterize in depth the complete set of mRNA transcripts present in the venom gland of a non-Buthidae scorpion. A further aim was to correlate this data with the already reported venom proteome and compare it with the cDNA library shotgun approach previously constructed by our group .
The coverage of the transcriptome was found to be 8.4 Gb, revealing hundreds of genes involved in the process of venom making. Several subfamilies of scorpion toxins and hundreds of genes related to biological processes, molecular functions and cellular components were identified. Further, we report 210 venom transcripts with full-length coding sequences assumed to code for 111 unique venom compounds, among which there are sequences that code for venom toxins, peptides and venom-specific proteins.
Finally, a comparison with the transcriptome of Centruroides noxius  and with the reported genes of Urodacus manicatus  was made. The comparison with C. noxius transcriptome revealed that components involved in biological processes, molecular function and cellular components are conserved between these species. The toxins however, are very different in a Buthidae and in an Urodacidae scorpion (non-Buthidae). Conversely, the toxins reported for U. manicatus are of the same subfamilies of toxins found in U. yaschenkoi scorpion. This dataset will contribute to the public information platform to accelerate studies in venomics research.
Material and Methods
Sample collection and RNA extraction
The Urodacus yaschenkoi specimen was obtained from the Australian desert on New South Wales close to Nanya (GPS coordinates -33.22422, 141.306059) on May 2011. The captured organism was taxonomically identified according to Koch  and maintained in a plastic box with water ad libitum and was fed fortnightly with crickets.
The total RNA was extracted from a flash frozen (immediately frozen in dry ice) ‘replete’ venom gland with the Animal Tissue RNA Purification Kit from Norgen Biotek Corporation according to the manufacturer’s instructions. The quality of the RNA was verified using a 2100 Bioanalyzer (Agilent Technologies).
cDNA Library preparation and sequencing
The cDNA library for the high-throughput sequencing was made with Illumina TruSeq RNA Sample Preparation Kit.
The cDNA library preparation consisted on the following steps: i) mRNA enrichment and fragmentation, ii) cDNA synthesis, iii) paired- end and adaptors ligation (adenylate 3’Ends), iv) PCR amplification (DNA enrichment) and v) High-throughput sequencing (RNA-Seq method, Illumina Next-Gen sequencing technology).
The total RNA was purified to obtain the poly-A containing mRNA molecules using oligo-dT-attached magnetic beads applying two rounds of purification. During the second elution of the poly-A mRNA, the RNA was also fragmented using elevated temperatures (94°C) and primed for cDNA synthesis. Then, first strand cDNA was synthesized by reverse transcription (using SuperScript II reverse transcriptase) of the cleaved RNA fragments primed with random hexamers. Immediately, the RNA template was removed and the double-stranded (ds) cDNA is synthesized (using DNA polymerase I, dNTPs and RNA displacement with RNAse H). Ampure XP beads were used to separate the ds cDNA from the second strand reaction mix. The ds cDNA was subjected to end repair by converting the overhangs resulting from fragmentation into blunt ends, using an End Repair (ERP) mix. A single ‘A’ nucleotide was added to the 3’ ends of the blunt fragments to prevent them from ligating to one another during the adapter ligation reaction. A corresponding single ‘T’ nucleotide on the 3’ end of the adapter provided a complementary overhang for ligating the adapter to the fragment. Finally, DNA enrichment was done using PCR to selectively enrich those DNA fragments that have adapter molecules on both ends and to amplify the amount of DNA in the library. The PCR was performed with a PCR primer cocktail that anneals to the ends of the adapters. The purified cDNA library was used for cluster generation on Illumina’s Cluster Station and then sequenced using High-throughput RNA-sequencing (Illumina Next-Generation sequencing platform) on Illumina HiSeq 2000 following vendor’s instruction.
Assembly and analysis of transcriptome
The raw sequencing intensities were transformed by base calling into sequence data using Illlumina’s RTA software, followed by sequence quality filtering using GELRAD (Illumina). Paired-end reads were 100 nt in length. The extracted sequencing reads were saved as fastq files (SRA accession number SRP045734). Adaptor fragments were removed from the raw reads to yield the clean read required for the analysis. De novo transcriptome assembly of these short reads was performed using Trinity RNA seq software (http://trinityrnaseq.sourceforge.net/).
First, the RNA-seq was assembled into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform. Then the assembled sequences were clustered. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). These were designated as contigs. Then a further assembly step followed rendering unique gene sequences that were designated as unigenes.
Abundance estimation and quality control
To estimate transcripts abundance, raw reads were mapped back to the assembled contigs using the Tophat/Cufflinks suite (http://tophat.cbcb.umd.edu/ and http://cufflinks.cbcb.umd.edu/). TopHat is a fast splice junction mapper for RNA-Seq reads. It uses Bowtie (ultra high-throughput short read aligner) to align RNA-Seq red to mammalian-sized genomes then analyzes the mapping results to identify splice junctions between exons. Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. To generate potential novel transcripts, Cufflinks was run without known reference transcripts. The relative abundance of the transcripts is based on how many reads support each one. Expression level was estimated and presented in FPKM (fragments per kilobase of transcript per million fragments mapped).
The quality of raw sequence data was assessed with FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and with CLC bio software. Eleven parameters were measured to assure the library construction, sequencing and de novo assembly were well done. At the same time, basic statistics of reads and detection of sequencing errors were obtained.
The assembled contigs were first blasted against a database containing only toxins from the scorpions Hadrurus gertschi, Opisthacanthus cayaporum and Tityus discrepans created in situ and collected from the NCBI non-redundant (nr) database. These annotated proteins were aligned to the assembled contigs to identify the homologous genes in Urodacus yaschenkoi using TBLASTN (E-value < 0.1). The main purpose at this stage was to verify if the assembly was correct. Then, the whole set of unigenes were blasted with tBlastn and later with NCBI-nr BlastX (E-value <10-5) to search identity or similarity. The sequences presenting hits in these databases were analyzed with Blast2go software (www.blast2go.com); the Gene Onthology (Go terms) was obtained as well by this mean. In brief, the bioinformatic analysis used for the assembled transcriptome was as follows: first, the 243,870 assembled sequences (contigs) were searched against an in situ scorpion toxin database. Once confirmed that the assembly in fact rendered hits for scorpion toxins, then the whole set of unigenes (62,505) were searched against tBlastn and then with NCBI-nr BlastX (E-value <10-5). Later, all the unigenes were analyzed using Blast2go with the default parameters to find the Go terms. Finally, a sub-dataset was created with the unigenes that gave hits with sequences reported in GenBank related with venom and housekeeping genes. The annotation was made using Blast2go software.
In parallel, a second sub-dataset of sequences having hits only with toxins and venom components of any species were then used to extract the coding DNA sequence (CDS) and identify their mature sequence. Sequences of this second sub-dataset were deposited in GenBank (EST database: dbEST JZ818592—JZ818692) and analyzed as follows: nucleotide sequences were translated to obtain precursor peptides using ExPasy-Translate tool program (http://web.expasy.org/translate). The signal peptide was predicted with Signal P 4.0 program (http://www.cbs.dtu.dk/services/SignalP/) and the propeptide was determined by using Prop 1.0 software (http://www.cbs.dtu.dk/services/ProP/). The theoretical monoisotopic molecular mass of putative mature peptide was obtained using ProtParam (http://web.expasy.org/protparam).
Furthermore, the presence of post-translational modifications (amidation and disulfide bridges) was determined manually by comparison within other scorpion toxins or cytotoxic (antimicrobial) peptides. Multiple alignments were performed with CLUSTALX v2.0 and the percentage of identity was determined with DNA Strider 1.3.
Comparison with Centruroides noxius transcriptome and with Urodacus manicatus sequences
The sequences having hits in the NCBI database were compared with the transcriptome of C. noxius  using Geneious software and the tool ‘map to reference’ with strict parameters (high sensitivity/medium speed and 5 iterations with 35 bp of overlap). The same criteria was followed to compared the 19 genes reported for U. manicatus (GenBank accession numbers: GALI01000001-GALI01000019) . The similar sequences obtained were then manually analyzed to find the CDS and ORF and were used to build alignments with the already identified U. yaschenkoi sequences.
Results and Discussion
Urodacus yashenkoi venom gland transcriptome sequencing output
To comprehensively cover the U. yaschenkoi venom gland transcriptome, the total RNA of the venom gland was extracted, and the mRNA was isolated, enriched, fragmented and reverse-transcribed into cDNA. The cDNA was sequenced on Illumina HiSeq 2000 and the resulting sequencing data were subjected to bioinformatics analysis.
After sequencing, 83,812,864 raw reads were obtained. After removal of adaptors, ambiguous reads and low quality reads the sequenced data resulted in 83,808,178 mappable reads. The average length read was 101 nt. The transcriptome size was equal to 8,464,625,978 nucleotides (8.4 Gb).
The clean and high-quality reads were assembled de novo using Trinity software, and resulted in 67,026,681 mapped reads and 243,870 assembled contigs ranging from 109 to 15,222 bp length being the mean assembled length of 260 bp. The number of unigenes found was 62,505 with an N50 of 1,139 bp (i.e. 50% of the assembled bases were incorporated into contigs of 1,139 bp or longer). A summary of the Illumina sequencing results and assembly output is outlined in Table 1. The size distribution of the contigs from Urodacus yashenkoi venom gland is shown in Fig A in S1 file.
The Quality Control report (QC) showed that the length distribution of most sequences (more than 99%) were 100–101 bp, no ambiguous base-content in 98.93% of the sequences and the coverage (number of sequences that support the individual base position) was 100%. Twenty-four 5’-end of the sequence was found multiple times but their particular percentages were not more than 0.5%; in most cases, no identity was found. In general, these data means that the assembly was well performed.
A total of 3900 sequences (unigenes), that gave hits with sequences reported in GenBank related with venom and housekeeping genes, were obtained (see Materials and Methods). These sequences subsequently were analyzed manually to select only those sequences that codify toxins, antimicrobial peptides and venoms specific components. Further analysis encompassed the determination of the CDS and delimitation of sequence precursors, identification of signal peptides, propeptides, mature peptides and posttranslational modifications (Table 2). Additionally, the theoretical mass was calculated and conserved domains were found. By this mean, 210 sequences coding for 111 unique amino acid sequences including venom toxins and proteins involved in venom production were comprehensively identified with all the parameters above mentioned (see Table 2). The sequences identified belong to the following subfamilies of known scorpion toxins: α-KTx (alpha-type of K+-channel specific peptides), β-KTx (beta-type of K+-channel specific peptides), calcium-channel toxins (calcins), ascaris-type protease inhibitor peptides, venom proteins, several enzymes, antimicrobial peptides, sodium-channel toxins, toxin-like peptides, venom allergens and La1-like peptides. This shows that the identified sequences comprise a wide array of diversity in venom components.
Expression level of transcripts was assessed using Tophat/Cufflinks suite. The most abundant transcripts were those giving hits with venom toxins, hypothetical proteins, antimicrobial peptides, and α-KTxs. Then enzymes, such as: NADH-dehydrogenases, phospholipases, sulfotransferases, elastases and hyalorunidases, which were highly expressed in the venom. The less abundant transcripts were those having hits with sodium-channel specific peptides and cytochrome oxidase I (COI). In S1 Table the FPKM for venom related compounds and housekeeping genes is shown.
Unigene annotations provide functional information, including protein sequence similarities and gene ontology (GO) information. Only 31,807 unigenes had hits with the searched databases and therefore only those were annotated. The fact that only 51% of the total number of unigenes had significant hits suggests that many scorpion-specific genes still remains undiscovered. This data agrees with the annotated sequences of other scorpion transcriptomes for example: the C. noxius transcriptome .
The GO database comprised three ontology domains: molecular function (MF), cellular component (CC) and biological processes (BP). For the first Go term, 5,642 unigenes were matched with 392 GO terms. For the cellular component term, 7611 sequences were annotated with 257 Go-terms and for the biological processes component 18,554 sequences were annotated with 1690 Go-terms (Fig 1), making the BP the most abundant and diverse term. In addition, a graph showing the most abundant Go-term categories per domain is shown in Fig 2. Supplemental data show graphs for the sub-dataset containing only toxins and venom related components: most abundant Go-terms, pie charts with the most abundant Go term per domain and enzyme distribution (Fig B, Fig C (A-C) and Fig D in S1 file, respectively). Fig E in S1 file of supplemental data shows the most abundant family of enzymes found in the whole transcriptome.
The three Go terms domains are plotted with the number of annotated unigenes and also, the variety within each domain is showed (different categories of Go term per domain).
Scorpion toxins and venom components identified in the venom gland transcriptome of U. yaschenkoi
From the 62,505 unigenes only 51% had significant hits against the searched databases. From those sequences, only 3,900 had sequences similar to toxins, venom related components (such as hyaluronidases, phospholipase, and other enzymes) and housekeeping genes (for example, heat shock protein, β-actin, RNA binding protein). These sequences were further analyzed as described in Material and Methods. Eleven subfamilies of scorpion toxins were identified (Fig 3) and 210 delimited sequences code for 111 unique amino acid sequences are shown in Table 2.
A total of 62,505 unigenes were searched against the NCBI-nr database, only 51% had an identity against the databases; from those 3900 were related to venom components and housekeeping genes and 210 sequences codify toxins and enzymes in scorpion venom were identified. The diagram shows the relative proportion expressed as percentages, of each subfamily of scorpion toxins found in the analysis of the transcripts from U. yaschenkoi venom gland transcriptome.
The following lines describe each one of these subfamilies of putative proteins/peptides identified in this transcriptomic analysis, starting with the most abundant components found.
It is well known that scorpion venoms consist of a heterogeneous mixture of 100 to 700 different components, among which are: inorganic salts, lipids, nucleotides, free amino acids, mucopolysaccharides, peptides and proteins. Among the proteins, enzymes are the most abundant .
From the 210 identified and analyzed nucleotide sequences of U. yaschenkoi (Fig 3 and Table 2), 31% correspond to enzymes, including hyaluronidases, glucosaminidases, phospholipases, serine proteinases, phosphatases and kinases.
For example, component label comp120806_c0_seq1 in Table 2 corresponds to a partial sequence of a putative hyaluronidase-1 (isoform 1) showing 93.58% identity with hyaluronidase Uro-1 previously identified in the transcriptome of U. manicatus . In addition, the sequences comp7041_c0_seq2 and comp7071_c0_seq2 code for putative hyaluronidase-3 isoform x3 (Table 2).
The sequence comp374_c0_seq1 (Table 2) from U. yaschenkoi shows similarity with a chymotrypsin-like protease-1 from Mesobuthus eupeus.
However, the most abundant enzymes found correspond to proteins with phopholipase activity (Table 2), which is in agreement with literature data found in venoms of non-Buthidae scorpions, such as phospholipin and a heterodimeric phospholipase A2 isolated from the venom of Pandinus imperator  and sequences found in another transcriptomic study conducted with the scorpion P. cavimanus , which are assumed to be phospholipases.
Linear peptides containing no disulfide bridges (NDBPs) have been abundantly found in the venoms of scorpions of the families non-Buthidae, contrary to what is reported for scorpions of the family Buthidae. The latter contains a substantial amount of peptides tightly joined by disulfide bridges. Among the NDBPs found are peptides displaying antimicrobial activity (AMPs). Until now, at least 40 AMPs from scorpions are described in the literature . These were subdivided in three main categories: long chain peptides (over 35 amino acids long), medium length peptides (20–35) and short chain peptides (less than 20 residues) . However, venom from Buthidae and non-Buthidae scorpions shows the presence of peptides with antimicrobial activity that do contain disulfide bridges (DBPs), such as the defensins and scorpines.
In this work sequences that code for antimicrobial-like peptides were the second most abundant group having 43 sequences (21%). These sequences codify different subfamilies of putative antimicrobial peptides, such as short antimicrobial peptides (IsCT-like) from the subfamily of non-disulfide bridges peptides-5 (NDBP-5), defensins and long-chain scorpine-like peptides, being the first ones the most abundant within this category of AMPs. The short chain AMPs of the IsCT type were initially isolated from the venom of the scorpion Opisthacanthus madagascariensis . The IscT peptides are derived from large precursors, characterized for having a signal peptide of 23–24 amino acids, a mature amidated peptide with less than 20 amino acids long and a propeptide containing from 31 to 46 amino acid residues. In addition they show a conserved sequence of three amino acids of the type GRR and GKR, which marks the end of the mature peptide and the beginning of the propeptide (see Fig 4). The identified NDBP-5 sequences found in this work were mostly 13 amino acid long peptides, but seven sequences were 14 residues long and only one sequence was 18 residues long. All of them were manually analyzed and the signal peptide, the posttranslational modification motif (GKR) and the propeptide were identified and delimited. Fig 4 shows the alignment of NBDP-5 sequences found in this transcriptomic analysis. Interestingly, peptide GFWGKLWEGVKNAI was codified by 7 different genes showing different propeptides, suggesting a wide array of mechanisms for toxins production. Overall, five different antimicrobial peptides with unique sequence were also found. One of these transcripts is the peptide ILSAIWSGIKSLF which was previously found in a shotgun cDNA library previously reported by our group  and its theoretical molecular weight can be found in the venom proteome, at RT 13.88.
Multiple alignment of sequences obtained from the transcriptome of U. yaschenkoi that codify antimicrobial peptides from the subfamily NDBP-5. These sequences are compared with CYLIP-Uro-1 (GenBank: GALI01000003.1), CYLIP-Uro-2 (GenBank: GALI01000004.1), CYLIP-Uro-3 (GenBank: GALI01000005.1), CYLIP-Uro-4 (GenBank: GALI01000006.1) and CYLIP-Uro-5 (GenBank: GALI01000007.1) from U. manicatus . The predicted signal peptide is underlined; the mature peptide is in bold and highlighted in yellow, the conserved proteolytic site GKR is in italics and underlined and the propetide in italics. The hyphen (-) in the name of U. yaschenkoi sequences indicates that the amino acid sequence was found within different nucleotide sequences (transcripts), for example: comp17_c0_seq1-4 means that four different nucleotide sequences codify the same peptide. The percentage of identity of the mature peptide is indicated at the right (% Identity) with respect to the peptide encoded by comp17_c0_seq1-4; additionally, at the far right, the theoretical molecular weights of the antimicrobial peptides for U. yaschenkoi are shown. The theoretical molecular weight of the peptide encoded by the sequences comp17_c0_seq1-4 (in bold) has a perfect match with the U. yaschenkoi proteome previously reported by  at the retention time 13.88. Note: several sequences of transcripts found in the U. yaschenkoi transcriptome codify the same precursors, such as: comp17_c0_seq1-4 and comp18_c0_seq1-2 (not shown); comp17_c0_seq5 and comp18_c0_seq3-4 (not shown); comp192_c0_seq1-2, comp192_c0_seq4-5 (not shown), comp192_c0_seq7-9 (not shown) and comp196_c0_seq1-7 (not shown) codify the same precursor.
Here we compared the sequences of AMPs found in U. yaschenkoi with those reported (19 sequences) from the transcriptome of Urodacus manicatus . Data analysis shows that the subfamily of peptides having the most shared similarities between these two species is the short antimicrobial peptides, NDBP-5 (Fig 4). Sequences CYLIP-Uro-1, CYLIP-Uro-2, CYLIP-Uro-3, CYLIP-Uro-4 and CYLIP-Uro-5 from U. manicatus have structural similarities with those coding for UyCT1 and UyCT3 antimicrobial peptides previously reported from U. yaschenkoi . In this work, several new UyCT1-like and UyCT3-like peptides were identified within the U. yaschenkoi transcritpome (Fig 4). Interestingly, there is a mature peptide sequence having 100% sequence identity within both species of scorpions. The mature peptide sequence is ILSAIWSFIKSLF and can be found in comp17_c0_seq1-4 and comp18_c0_seq1-2 from U. yaschenkoi and in CYLIP-Uro-2 from U. manicatus. However, the precursors in those sequences are different (Fig 4). The percentage of identity of the mature antimicrobial peptides (NDBP-5) relative to the ILSAIWSFIKSLF sequence ranged from 23 to 100% identity showing the diversity of gene sequences.
Results of the comparative analysis also showed that the opistoporine-like peptides of both species of scorpions are similar. The sequence comp42_c0_seq1 (Table 2) of U. yaschenkoi compares well with Csab-Uro4 from U. manicatus and the sequence of comp336_c0_seq1 (Table 2) from U. yaschenkoi is equivalent to Csab-Uro3 from U. manicatus. These sequences show similarities with opiscorpine-3 from Opistophthalmus carinatus  and peptide SC11 previously reported for U. yaschenkoi .
The fact that all the antimicrobial peptides reported for U. manicatus had a match with the U. yaschenkoi transcriptome was expected because these two scorpions belong to the same genus and family of scorpion, both found in Australia.
As previously mentioned, the abundant presence of AMPs peptides in the venoms of non-Buthidae scorpions, is a particular characteristic of these species. In addition, these AMPs are example of leading components with potential application as antibiotics due to their demonstrated antimicrobial activity.
Scorpine, the first kind of this peptide described in the literature, is structurally a hybrid protein containing amino acid sequence similar to AMPs and K+-channel blocking peptides. It was purified from the venom of the scorpion Pandinus imperator and shown to be a potent anti-malarial agent against Plasmodium bergei .
The N-terminal domain of Scorpine is a linear segment capable of forming α-helix with cytolitic and antimicrobial activities, whereas the C-terminal domain is tightly linked by three disulfide bridges and was shown to have a β-KTx activity against potassium channels . The presence of two distinct domains in this type of scorpine-like peptides  makes difficult to classify them as either AMPs or K+-channel specific peptides. For this reason they were initially called orphan peptides .
In this transcriptomic analysis four sequences similar to the precursors of scorpine-like peptides were found (Fig 5). Two different sequences: comp42_c0_seq1 and comp47_c0_seq1 from U. yaschenkoi encode the same scorpine-like, called type 1, because it shows 57.33% identity with Hg-scorpine-like-1 from H. gertschi, and also has 94.74% identity with CSab-Uro-4 from U. manicatus (Fig 5A). The other two sequences from U. yaschenkoi: comp324_c0_seq1 and comp336_c0_seq1 code for the same scorpine-like type 2; which shows a longer stretch of amino acids. This sequence has 58.54% identity with Hg-scorpine-like2 of H. gertschi and 84.34% identity with CSab-Uro-3 from U. manicatus, both defined as scorpine-like toxins (Fig 5B).
A) The sequence obtained from the U. yaschenkoi transcriptome that codifies for a scorpine type 1 is shown and it is aligned with the reference scorpine Hg-scorpine-like1 from Hadrurus gertschi. Sequence CSab-Uro-4 from Urodacus manicatus  codifies as well for a scorpine type 1 and is included in the alignment. B) The sequence comp324_c0_seq1 found in the U. yaschenkoi transcriptome that codes for a scorpine type 2 is shown and aligned with Hg-scorpine-like2 from H. gertschi. Also, sequence CSab-Uro-3 from U. manicatus  is included. Both alignments show the percentage of identity of each sequence with respect to the reference sequence. The cysteine pattern (6 Cys) is highlighted in yellow. Note: comp42_c0_seq1 and comp47_c0_seq1 (not shown) code for the same precursor, comp324_c0_seq1 and comp336_c0_seq1 (not shown), code for the same precursor.
We had foreseen the existence of these scorpine-like sequences in the transcriptome of U. yaschenkoi. In fact, only the non-Buthidae scorpions have been reported, thus far, to contain scorpine-like peptides. However, due to the dual structural characteristics of these peptides, some erroneous classification of peptides from Buthidae families of scorpions were said to be scorpine-like components, when in reality they should have been classified simply as putative K+-channel toxins, because they show sequence similarities only with the C-terminal domain of scorpine.
Finally, the bi-functionality of the scorpine-like peptides found in venoms of scorpions is very interesting and promising for the possible development of drugs with antimicrobial and/or anti-malarial activity.
Another well represented group of sequences corresponds to toxin-like ones. They comprehend 15% of the analyzed transcripts (Table 2). This class of putative toxins has several cysteines that may form disulfide bridges and have been found in several scorpion transcriptomes. For many of them, their function has not been directly evaluated. The majority of toxin-like components were identified in transcriptomes of Buthidae scorpions (12–37) and to a lesser extend those from non-Buthidae scorpions (19, 21, 38–49).
Sequences similar to potassium channel specific toxins.
Potassium channel toxins (KTxs) are well known scorpion venom components. They are found in Buthidae and non-Buthidae species. A common feature of these peptides is the presence of one segment of α-helix and three β-sheet structures cross-linked and stabilized by disulfide bridges (Cs α/β structure), which forms 3 and/or 4 disulfide bridges, and were classified as α-, β-, γ- and κ-KTxs [75, 76]. The most abundant and best studied are the peptides of the family α-KTx, from which more than 140 different peptides are known and were sub-divided into 30 subfamilies (http://www.uniprot.org/docs/scorpktx and . Usually they are short peptides containing 23 to 42 amino acids, whereas the β-KTxs are longer, with more than 50 amino acid residues in their primary structures. Examples of β-KTxs are peptide TsTx-Kβ of the scorpion Tityus serrulatus and BmTXKβ of Buthus marthensii Karsch. The first is a blocker of Kv1.1 potassium channels, which shows an IC50 of 96 nM . A recombinant format of peptide BmTXKβ  was expressed heterologously and shown to be a bona fide blocker of potassium channel.
The transcriptomic analysis of U. yaschenkoi allowed the identification of 17 different sequences with structural similarities with other known blockers of voltage-gated potassium channels that belong to the short potassium channel blocker scorpion toxin family. Within these sequences, 13 are similar to α-KTx-6 subfamily that is characterized by having 4 disulfide bridges. In fact, a complete precursor of urotoxin, which was previously reported by our group , was among these 13 sequences described here. Additionally, four sequences similar to toxins of the α-KTx family containing 3 disulfide bridges were also found (Fig 6). These sequences showed 60% identity with the α-KTx8 toxin of the scorpion Lychas mucronatus .
(A) Alignment of sequences found in the transcriptome of U. yaschenkoi that code for putative α-KTXs with six cysteines. These sequences are compared against the alpha-toxin KTx8 (UniProtKB/Swiss-Prot: A9QLM3.1) from Lychas mucronatus. Note: The precursor encoded by comp2100_c0_seq1 is also encoded by sequences comp1991_c0_seq1 to seq5 (not shown). (B) Alignment of sequences from U. yaschenkoi that code for alpha-potassium toxins with eight cysteines compared against the α-KTx 6.10 toxin (UniProtKB/Swiss-Prot: Q6XLL5.1) from Opistophthalamus carinatus. For all sequences, the percentage of identity of U. yaschenkoi mature peptides (%I) is shown in relation to the toxin of reference. The theoretical molecular weight (MW) of each U. yaschenkoi peptide is also shown. The signal peptide is underlined; the mature peptide is in bold; the residues probably involved in amidation and the propeptide are indicated in italics. The conserved cysteines are highlighted in yellow. The symbol “&” indicates that the sequence was encoded by two different nucleotide sequences (transcripts). MW in bold indicates that this molecular weight was found in the venom mass fingerprint previously reported  and is indicated the retention time (RT) in which it was found. Note: comp849_c0_seq6 (not shown) codes for the same precursor as sequence comp1069_c0_seq1; comp1069_c0_seq5 (no t shown) codes for the same precursor as sequence comp849_c0_seq3&8; comp1069_c0_seq3 (not shown) and comp849_c0_seq5 (not shown) code for the same precursor as comp849_c0_seq1; comp2981_c0_seq1 (not shown) codes for the same precursor as comp2965_c0_seq1.
Concerning the identification of β-KTxs Class 2 subfamily of potassium channel blocker toxins, 3 sequences were found that code for two putatives β-toxins, showing similarities with CSab-Uro-2 from the scorpion U. manicatus and Hge-beta-KTx of Hadrurus gertschi (Fig 7). All these sequences show to contain 6 cysteines sharing 21% to 51% identity with that of Hge-beta-KTx.
Multiple alignments of U. yaschenkoi sequences that code for putative β-KTx. These sequences belong to the long chain scorpion toxin family, Class 2 subfamily. U. yaschenkoi β-KTx sequences are compared against CSab-Uro-2 from U. manicatus and with the Hge-β-KTx from Hadrurus gertschi. All these sequences have 6 cysteines (highlighted in yellow). The percentage of identity (%I) is shown in relation to Hge-β-KTx. The theoretical molecular weight (MW) of U. yaschenkoi toxins is shown. In bold, the MW found in the mass fingerprint previously reported . The retention time (RT) of this component is also indicated. Note: Sequence comp596_c0_seq1 (not shown) codes for the same mature peptide as the sequence comp588_c0_seq1 from this alignment.
The number of sequences that are assumed to code for putative KTxs toxins found in this work is greater than those reported for other non-Buthidae scorpions. Only one α-KTx transcript was identified in O. cayaporum , and eight in the scorpion S. jendeki . Concerning the putatives β-KTx only one was identified in the transcriptomic analysis of H. gertschi  and one in P. cavimanus . However, this was expected due to the methodology used in the present work compared with the techniques used in the other mentioned species (cDNA library and Sanger sequencing).
Additional sequences representing 9% of the transcripts of this work correspond to proteins containing more than 70 amino acid residues and show similarities annotated as “venom proteins” of other transcriptomic analysis, such as venom protein-5 and venom protein-2 of the scorpion Mesobuthus eupeus (GenBank: ABR21071.1 and ABR21036.1, respectively) and SCO-spondin-like from Bombyx mori (NCBI Reference Sequence: XP_004924398.1), as shown in Table 2. This type of sequences has been found both in Buthidae and non-Buthidae scorpions. They might be present in any scorpion family, contrary to what was described previously concerning AMPs or α- and β-KTxs.
Different types of proteins that control calcium ion permeability across biological membranes are known, and are defined as calcium channels. Among these are the voltage-gated, voltage-independent and ligand-activated channels. The last ones include the ryanodine sensitive calcium channels (RyRs) of the endoplasmic reticulum of heart and skeletal muscle, which are recognized by some peptides found in the venom of scorpions and are generically named calcins . The first calcins characterized were the imperatoxins IpTxi and IpTxa, isolated from the venom of the African scorpion Pandinus imperator . Imperatoxin A (IpTxa) is a 33 amino acid long peptide stabilized by three disulfide bridges, structurally organized in a special folding arrangement known as the “inhibitor cysteine knot”. This three-dimensional folding is commonly found in toxins from spiders and snails that affect voltage-dependent calcium channels  . IpTxa affects the RyRs receptors modifying the channel activity . Other calcins with similar structure and function were also isolated and characterized, such as: hemicalcin, opicalcin-1, opicalcin-2, hadrurin and maurocalcin (revised in ). The sequence Comp749_c0_seq1 identified in the transcriptome of U. yaschenkoi (Table 2) codifies for a putative imperatoxin-A-like calcin. It has 33 amino acids, six cysteines and shares 69% of identity with imperatoxin-A and 87% with maurocalcin (Fig 8A). Furthermore, the scorpion Liocheles australasiae has a peptide called toxin LaIT1, described to affect the function of the RyRs channels. This peptide has 36 amino acid residues, similar to the known calcins, but is rather toxic to insects than to mammalians . The peptide Phi-liotoxin-Lw1a, isolated from the Australian scorpion Liocheles waigiensis , also affect the activity of both ryanodine-sensitive calcium channels RyR1 and RyR2 with high potency and has sequence similarities with LaIT1. Its structure shows two-stranded beta-sheets or DDH for disulfide-directed beta-hairpin, stabilized by 2 disulfide bridges . In the present work with U. yaschenkoi, the sequence comp10032_c0_seq1 is a LaIT1-like peptide. It has 36 amino acid residues with four cysteines and shows sequence identity over 75% with the other LaIT1-like peptide (Fig 8B). The sequence comp10032_c0_seq 1 of U. yaschenkoi resembles the three DDH-like peptides reported from U. manicatus: DDH-Uro1, DDH-Uro2 and DDH-Uro3 (Fig 8B). The high degree of similarity of these peptides is certainly due to the fact that they belong to related Australian scorpions.
Three different multiple alignments of sequences are shown that code for: (A) calcins. Comp749_c0_seq1 codes for a calcin of 33 amino acid similar to other scorpion calcins as Hadrucalcin (UniProtKB/Swiss-Prot: B8QG00.1) from Hadrurus gertschi, Imperatoxin-A (UniProtKB/Swiss-Prot: P59868.1) from Pandinus imperator, Maurocalcin (UniProtKB/Swiss-Prot: P60254.1) from Scorpio maurus palmatus and Opicalcin-1 (UniProtKB/Swiss-Prot: P60252.1) from Opistophthalmus carinatus. All of them have 6 conserved cysteine; (B) LaIT1-like calcins. Sequence comp10032_c0_seq1 of U. yaschenkoi is compared with DDH-Uro-1 (GenBank: GALI01000015.1), DDH-Uro-2 (GenBank: GALI01000016.1) and DDH-Uro-3 (GenBank: GALI01000017.1) from U. manicatus, Insecticidal toxin LaIT1 from Liocheles australasiae (UniProtKB/Swiss-Prot: P0C5F2.1) and Phi-liotoxin-Lw1a (UniProtKB/Swiss-Prot: P0DJ08.1) from Liocheles waigiensis. This class of calcins has four cysteines. The sequence reported for DDH-Uro-3 contains undefined nucleotides and therefore the XX undefined amino acids. Finally, (C) Omega Agatoxin-like calcins. Putative calcium channel specific toxins encoded by sequences Comp27527_c0_seq1 and comp104104_c0_seq1 from U. yaschenkoi are shown and compared with DAPPUDRAFT_310236 of Daphnia pulex, LOC100163563 of insect Acyrthosiphon pisum and Omega Agatoxin IVB (Omega-Aga-IVB; GenBank: P37045) of spider Agelenopsis aperta. This class of calcin has eight cysteines. The percentage of identity (% Identity) is shown for all alignments with respect to the first sequence of each alignment. Conserved cysteines and amino acids are highlighted in yellow and bold, respectively.
Finally, four sequences were found in the transcriptome of U. yaschenkoi code for peptides similar to spider toxins such as Omega-agatoxin-IVB from Agelenopsis aperta. This peptide blocks P-type calcium channels of cerebellar Purkinje neurons . The same peptide paralyzes insects by blocking the neuromuscular transmission. The peptides of the transcripts found in the transcriptome of U. yaschenkoi here described, are thought to share structural similarities to the cystine knots of spider toxins, which are formed by a triple-stranded antiparallel beta-sheet, stabilized by 4 disulfide bridges. The sequences comp27527_c0_seq1 and comp10414_c0_seq1 code for a 42 and 47 amino acid agatoxin-like calcins, respectively (Fig 8C). In these sequences the 8 cysteines are conserved, including the double CC sequence and the triple amino acid GTN of these calcins (Fig 8C).
In conclusion, the data reported here indicate that U. yaschenkoi is the first scorpion whose transcriptomic analysis of the venom gland shows sequences that code for three distinct types of putative calcins. In addition, it also suggests that non-Buthidae species of scorpion seems to contain more calcin-like peptides than the Buthidae species.
The venom of the scorpion Liocheles australasiae apart from the insect toxin peptide described in the precedent section also contains a very abundant peptide, simply called La1. It is the most abundant component of the venom. It is composed by 73 amino acid residues with 8 cysteines forming 4 disulfide bridges. This peptide was assayed for possible insecticide activity on crickets and mammalian specific toxicity in mice, without any effect (74). Similar peptides were reported to exist in transcriptomic analysis of the scorpions of the species O. cayaporum, P. cavimanus, S. margarisonae, H. petersii and Scorpio maurus palmatus [19, 39, 40, 43, 46]. In the transcriptomic analysis of U. yaschenkoi, we have identified 8 sequences that code for precursors similar to La1. They all show to contain 8 cysteines and sequences identities varying from 36 to 64% of that of La1 of L. australasiae (Fig 9). Unfortunately their abundant presence and possible function still remains unknown.
The alignment compares the La-1-like peptides encoded by U. yaschenkoi transcripts with the model La1 peptide from Liocheles australasiae. La1 is amidated with a molecular weight of 7781.6 Da. Only the theoretical molecular weight amidated of comp12_c0_seq1 (in bold) was detected in the proteome of U. yaschenkoi , in retention time 30.59. La1 peptide has eight cysteines that are conserved in all the putative La-1-like peptides found herein (highlighted in yellow). The percentage of identity with respect to La1 is shown. Note: comp12_c0_seq1 and comp15_c0_seq1 (not shown) encode the same peptide; comp3687_c0_seq2 and comp4167_c0_seq2 (not shown) encode the same peptide; comp3687_c0_seq1 and comp4167_c0_seq1 (not shown) code for the same peptide and comp13_c0_seq1 and comp16_c0_seq1 (not shown) code for the same peptide.
Ascaris-Type Protease inhibitors.
Proteins with proteolytic activity are found in all living organisms, from bacteria to arthropods, plants and vertebrate animals, with a large number of activities and specificities, which are exquisitely controlled by inhibitors, in order to avoid unspecific proteolytic digestion of other existing proteins in their living cells. They are found in the body of organisms, but also in their secretions (revised in ). Peptides with protease inhibitor activity were described to be present in scorpion venoms, specially the Kunitz-type peptides. Some of which were reported as ion channel blockers, also known as Kunitz-type toxins. The first described is the peptide SdPI of the scorpion Lychas mucronatus . Other Kunitz-like peptides were described in transcriptomic analysis of scorpions, such as: LmKTT-1.a, LmKTT-1.b, and LmKTT-1.c from Lychas mucronatus ; BmKTT-2, BmKTT-3, BmKTT-1 from Mesobuthus martensii [37, 90] and Hg1 from Hadrurus gertschi [48, 91].
However, the Kunitz-type protease inhibitor peptides are not the only ones present in the venom gland of scorpions. The peptide SjAPI identified in Scorpiops jendeki is capable of inhibiting serine-proteases. It shows a structural folding similar to the “Ascaris-type inhibitor” . In U. yaschenkoi transcriptome we have found 7 distinct sequences that are assumed to code for putative Ascaris-type protease inhibitors (Fig 10). The seven sequences reported here show similarities to peptide SjAPI with identities from 27 to 41%; all of them having 10 cysteine residues like the Ascaris-type inhibitors (Fig 10).
Multiple alignment of U. yaschenkoi sequences that code for ascaris-type protease inhibitor compared against the ascaris-type protease inhibitor precursor from Scorpiops jendeki scorpion (SjAPI; GenBank: P0DM55). The sequence of comp75842_c0_seq1 is partial. The predicted signal peptide is underlined; putative propeptides is in italics and the common trypsin inhibitor like cysteine rich domain in bold with its ten cysteines highlighted in yellow. The percentage of identity (%I) with respect to SjAPI is shown. Note. Sequence comp4363_c0_seq1 encodes the same precursor as sequence comp4053_c0_seq1 (not shown); comp5534_c0_seq1 encodes the same precursor as comp4356_c0_seq1 (not shown) and comp135491_c0_seq1 (not shown) codes for the same partial sequence as comp75842_c0_seq1.
The venom gland of both Buthidae and non-Buthidae scorpions contain protease inhibitors. It is assumed that their principal function is to protect the other venom components from being degraded and play an important role for survival of the venomous animals [92–94].
Sequences that are assumed to code for venom allergens were found at the level of 2% of the transcripts. Allergens are known to occur in the venom of: bee, wasp, ant, spider and centipede [95–98]. The first described was the bee venom allergen-5, which was reported to cause allergy in humans . In the transcriptome described here we encountered 4 nucleotide sequences thought to code for different allergens (Table 2). The sequences found in U. yaschenkoi have close to 30% identity with allergen-5 of Tityus serrulatus, which is composed of 212 amino acid residues and contains disulfide bridges (UniProtKB/Swiss-Prot: P85840.1). For two sequences, comp4029_c0_seq1 and comp4170_c0_seq1 (see Table 2); it was possible to identify the signal peptide and the mature peptide of these allergen-like sequences, which have 237 and 236 amino acid residues, respectively. These sequences have similarities with CAP-Uro-1 y CAP-Uro-2 from the scorpion U. manicatus . From this analysis it is clear that allergens are present in Buthidae and non-Buthidae scorpion venoms.
Sodium-channel specific toxins.
Scorpion toxins specific for Na+-channels (NaTxs) recognize and modulate the function of sodium channels of excitable and non excitable cells and are the most important venom components medically speaking, because are the ones responsible for the intoxication symptoms of humans (7). They contain usually 58–76 amino acid residues tightly cross-linked by 4 disulfide bridges. Two main physiological functions are described for these toxic peptides: alpha-toxins (α-NaScTxs) and beta-toxins (β-NaScTxs). They both are modulators of the gating mechanism of Na+-channels. The first one prolong the action potential making the closing mechanism of the channel longer in time; the β-NaScTxs modify the open mechanism by producing an activation of the channel at less negative potentials (reviewed in ). At this moment, there is more than 300 known NaTxs, either directly isolated from scorpion venoms or identified based on gene cloning (see UniProt in www.uniprot.org). It is well known that these peptides occurs mainly on Buthidae species and are poorly represented in venoms from non-Buthidae species of scorpions (reviewed in ). Only three sequences assumed to code for putative NaTxs were found in U. yaschenkoi transcriptome (Table 2). This finding is in agreement with proteomic and transcriptomic studies conducted comparatively between Buthidae and non-Buthidae scorpions (revised in ).
Comparison of transcriptome and proteome components found in U. yaschenkoi
In our previous proteomic work , the molecular masses of several components of U. yaschenkoi venom were obtained. Table 3 compares the values of 16 theoretical molecular weight expected of putative toxins and antimicrobial peptides deducted from the sequences obtained by the high-throughput transcriptome with 16 experimentally (LC-MS/MS) determined molecular weights of components identified by proteome analysis. From these correlations, eight putative potassium channel specific toxins from the subfamily α-KTx-3 were found and matched. We also found that the antimicrobial peptide UyCT3 and the putative antimicrobial peptide encoded by sequence comp1267_c0_seq1 are the same. A putative β-KTx toxin encoded by sequences comp588_c0_seq1 and that of comp596_c0_seq1 were coincident with the proteome found components. Similarly, the putative calcin encoded by sequences comp10032_c0_seq1 and comp11072_c0_seq1 are the same. Finally, peptide La1-like coded by sequence comp12_c0_seq1 matches with a peptide found in the proteome analysis (see Table 3).
A comparison between results of the cDNA library shotgun approach  and the results found with the NGS RNA-seq transcriptome showed that both techniques are reliable for characterization of venom components found in scorpion venom glands. The same type of family components was identified by both methodologies. As it can be seen in Fig 3 of  and in Fig 3 of this communication, the subfamilies of toxins and peptides are almost the same and furthermore, they have the same proportions. Both studies reports antimicrobial peptides (UyCT3), calcin-like (Contig 20), scorpine-like peptides, La-1–like peptides, alpha and beta potassium channel specific toxins. The NGS approach reported here allowed the identification of a greater number of mass sequences than the shotgun methodology reported earlier by our group, when the results are compared with the mass values obtained in the proteome analysis.
One might argue that the correlation found is somehow limited. However there are several plausible reasons that explain these findings: a) the crude venom used for the experiments reported previously  and the venom glands used for RNAseq studies reported here were not the same, b) the extracted vemom, during handling and isolation procedures might suffer small modifications, which does not allow a perfect match of the molecular masses determined in proteomic analysis; c) an important number of sequences of putative peptides and proteins obtained by transcriptome analysis are not fully characterized, thus it is difficult to predict the exact molecular mass expected; d) a few sequences could be subjected to post-translational modifications, which will again make difficult to predict the exact molecular mass expected.
Comparison of this transcriptome with that of Centruroides noxius scorpion
A comparison of the data between C. noxius  and U. yaschenkoi transcriptomes resulted in 273 similar sequences. The percent pairwise identity of most sequences is above 75%. Similar sequences are mainly represented by enzymes or components involved in biological processes. For example: kinases, enolases, helicases, phosphatases, actins, zing finger proteins and RNA related proteins. It supports the conclusion that both species of scorpions share the same machinery to produce the venom, although the venom components are certainly different.
Both scorpions, C. noxius and U. yaschenkoi, have in their venom hyaluronidase enzymes, venom allergens, venom insulin-like growth factors and of course, venom toxins but for C. noxius the most represented are the sodium channel toxins while for U. yaschenkoi few putatives sodium channel toxins were identified.
One of the most abundant components identified from U. yaschenkoi transcriptome were the antimicrobial peptides (21%) whereas for C. noxius transcriptome only one isogroup containing 6 reads was similar to porine, an antimicrobial peptide. Once again, this finding was expected because non-Buthidae scorpions are a well known source of antimicrobial peptides [100–103] while Buthidae scorpions lack these compounds.
Despite the fact that studies related to scorpion venom components have been steadily increasing over the past four decades, the first whole transcriptome  and the first genome  of singular species have only recently been obtained. Both studies were made with Buthidae scorpions (Centruroides noxius and Mesobuthus martensii) and the results gave a general view of the cellular and molecular processes in the assembly of the scorpion venom components. Additionally, metabolic pathways and the dynamics of expansion of scorpion gene families were elucidated. On the contrary, for scorpions of non-Buthidae species, there is lack of similar information, especially at the transcriptomic level. The results reported here should be considered as an extended analysis of the genes expressed in the venom gland of an Urodacidae scorpion, filling in the missing information. This communication reports the whole transcriptomic analysis, in which hundreds of venom components are fully characterized, contributing to the large-scale discovery of scorpion toxin sequences and should serve as a reference for comparative studies and subjects related to evolution of venoms and venomous animals.
A total of 210 different nucleotidic sequences that code for 111 unique and specific toxins, peptides and proteins in the Urodacus yaschenkoi venom were identified; some of them were previously found in the U. yaschenkoi cDNA library shotgun approach reported by our group. The correlation between the proteome and this data set permitted the identification of 16 theoretical molecular weights deducted from the whole transcriptome. An extended cross-reference to other scorpion known venom components is included. This work analyzed in detail the whole array of transcripts expressed in the venom gland of a non-Buthidae scorpion of the family Urodacidae. It is expected that the identification of these new toxins and peptides will contribute to the production of new putative bioactive compounds or pharmacological tools. Meanwhile, this dataset will serve as a public information platform to accelerate studies in venomics research and will serve as a reference for non-Buthidae scorpions.
S1 file. Distribution of contigs and sequences from Urodacus yashenkoi venom gland.
Size distribution of the Urodacus yashenkoi venom gland contigs obtained from the de novo assembly of high-quality clean reads (Fig A). Most abundant Go-terms for the sub-dataset containing only toxins and venom related components (Fig B). Pie charts with the most abundant Go term per domain for the sub-dataset containing only toxins and venom related components. Fig C-A: cellular component, Fig C-B: biological process and Fig C-C: molecular function (Fig C). Enzyme distribution for the sub-dataset containing only toxins and venom related components: Oxireductases, transferases, hydrolases, lyases, isomerases and ligases (Fig D). Most abundant families of enzymes found in the whole transcriptome (Fig E).
Technical assistance on computer programming performed by M. Sc. Juan Manuel Hurtado Ramírez and advice on the use of Geneious given by PhD Sandra Pineda Gonzalez is thoroughly acknowledged.
Conceived and designed the experiments: KLR VQH LDP. Performed the experiments: KLR VQH. Analyzed the data: KLR VQH VRJG LDP. Contributed reagents/materials/analysis tools: KLR LDP. Wrote the paper: KLR VQH LDP. Data submission: VRJG KLR.
- 1. Jeyaprakash A, Hoy MA. First divergence time estimate of spiders, scorpions, mites and ticks (subphylum: Chelicerata) inferred from mitochondrial phylogeny. Experimental and Applied Acarology. 2009;47(1):1–18. pmid:18931924
- 2. Fet V, Sissom WD, Lowe G, Braunwalder ME. Catalog of the Scorpions of the World (1758–1998). New York: New York Entomological Society; 2000. 690 p.
- 3. Prendini L, Wheeler WC. Scorpion higher phylogeny and classification, taxonomic anarchy, and standards for peer review in online publishing. Cladistics. 2005;21(5):446–94.
- 4. Catterall WA. Neurotoxins that act on voltage-sensitive sodium channels in excitable membranes. Annual Review of Pharmacology and Toxicology. 1980;20:15–43. pmid:6247957
- 5. Valdivia HH, Kirby MS, Lederer WJ, Coronado R. Scorpion toxins targeted against the sarcoplasmic reticulum Ca2+- release channel of skeletal and cardiac muscle. Proc Natl Acad Sci U S A. 1992;89(24):12185–9. pmid:1334561
- 6. Gordon D, Savarin P, Gurevitz M, Zinn-Justin S. Functional anatomy of scorpion toxins affecting sodium channels. Journal of Toxicology—Toxin Reviews. 1998;17(2):131–59.
- 7. Possani LD, Becerril B, Delepierre M, Tytgat J. Scorpion toxins specific for Na+-channels. European Journal of Biochemistry. 1999;264(2):287–300. pmid:10491073.
- 8. Andreotti N, Jouirou B, Mouhat S, Mouhat L, Sabatier JM. Therapeutic Value of Peptides from Animal Venoms. In: Liu H-W, Mander L, editors. Comprehensive Natural Products II. Oxford: Elsevier; 2010. p. 287–303.
- 9. Ortiz E, Rendon-Anaya M, Rego SC, Schwartz EF, Possani LD. Antarease-like Zn-metalloproteases are ubiquitous in the venom of different scorpion genera. Biochim Biophys Acta. 2014;6(46):19.
- 10. Chen Y, Cao L, Zhong M, Zhang Y, Han C, Li Q, et al. Anti-HIV-1 activity of a new scorpion venom peptide derivative Kn2-7. PLoS ONE. 2012;7(4).
- 11. Zhao Z, Hong W, Zeng Z, Wu Y, Hu K, Tian X, et al. Mucroporin-M1 inhibits hepatitis B virus replication by activating the mitogen-activated protein kinase (MAPK) pathway and down-regulating HNF4α in vitro and in vivo. J Biol Chem. 2012;287(36):30181–90. pmid:22791717
- 12. Caliskan F, García BI, Coronas FIV, Batista CVF, Zamudio FZ, Possani LD. Characterization of venom components from the scorpion Androctonus crassicauda of Turkey: Peptides and genes. Toxicon. 2006;48(1):12–22. pmid:16762386
- 13. Oukkache N, Rosso JP, Alami M, Ghalim N, Saïle R, Hassar M, et al. New analysis of the toxic compounds from the Androctonus mauretanicus mauretanicus scorpion venom. Toxicon. 2008;51(5):835–52. pmid:18243273
- 14. Caliskan F, Quintero-Hernandez V, Restano-Cassulini R, Batista CV, Zamudio FZ, Coronas FI, et al. Turkish scorpion Buthacus macrocentrus: general characterization of the venom and description of Bu1, a potent mammalian Na(+)-channel alpha-toxin. Toxicon. 2012;59(3):408–15. pmid:22245624
- 15. Kozminsky-Atias A, Bar-Shalom A, Mishmar D, Zilberberg N. Assembling an arsenal, the scorpion way. BMC Evol Biol. 2008;8:333–45. pmid:19087317
- 16. Rendon-Anaya M, Delaye L, Possani LD, Herrera-Estrella A. Global transcriptome analysis of the scorpion Centruroides noxius: new toxin families and evolutionary insights from an ancestral scorpion species. PLoS ONE. 2012;7(8):17.
- 17. Valdez-Velázquez LL, Quintero-Hernández V, Romero-Gutiérrez MT, Coronas FIV, Possani LD. Mass Fingerprinting of the Venom and Transcriptome of Venom Gland of Scorpion Centruroides tecomanus. PLoS ONE. 2013;8(6).
- 18. Morgenstern D, Rohde BH, King GF, Tal T, Sher D, Zlotkin E. The tale of a resting gland: Transcriptome of a replete venom gland from the scorpion Hottentotta judaicus. Toxicon. 2011;57(5):695–703. pmid:21329713
- 19. Ma Y, He Y, Zhao R, Wu Y, Li W, Cao Z. Extreme diversity of scorpion venom peptides and proteins revealed by transcriptomic analysis: Implication for proteome evolution of scorpion venom arsenal. Journal of Proteomics. 2012;75(5):1563–76. pmid:22155128
- 20. Nascimento DG, Rates B, Santos DM, Verano-Braga T, Barbosa-Silva A, Dutra AAA, et al. Moving pieces in a taxonomic puzzle: Venom 2D-LC/MS and data clustering analyses to infer phylogenetic relationships in some scorpions from the Buthidae family (Scorpiones). Toxicon. 2006;47(6):628–39. pmid:16551474
- 21. Smith JJ, Jones A, Alewood PF. Mass landscapes of seven scorpion species: The first analyses of Australian species with 1,5-DAN matrix. J Venom Res. 2012;3:7–14. pmid:23236582
- 22. Cao Z, Yu Y, Wu Y, Hao P, Di Z, He Y, et al. The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods. Nature Communications. 2013;4.
- 23. Xu X, Duan Z, Di Z, He Y, Li J, Li Z, et al. Proteomic analysis of the venom from the scorpion Mesobuthus martensii. J Proteomics. 2014;106:162–80. pmid:24780724
- 24. Newton KA, Clench MR, Deshmukh R, Jeyaseelan K, Strong PN. Mass fingerprinting of toxic fractions from the venom of the Indian red scorpion, Mesobuthus tamulus: biotope-specific variation in the expression of venom peptides. Rapid Commun Mass Spectrom. 2007;21(21):3467–76. pmid:17918210
- 25. Diego-García E, Caliskan F, Tytgat J. The Mediterranean scorpion Mesobuthus gibbosus (Scorpiones, Buthidae): Transcriptome analysis and organization of the genome encoding chlorotoxin-like peptides. BMC Genomics. 2014;15(1). pmid:25542255
- 26. Inceoglu B, Lango J, Jing J, Chen L, Doymaz F, Pessah IN, et al. One scorpion, two venoms: Prevenom of Parabuthus transvaalicus acts as an alternative type of venom with distinct mechanism of action. Proc Natl Acad Sci U S A. 2003;100(3):922–7. pmid:12552107
- 27. Mille BG, Peigneur S, Diego-García E, Predel R, Tytgat J. Partial transcriptomic profiling of toxins from the venom gland of the scorpion Parabuthus stridulus. Toxicon. 2014;83:75–83. pmid:24631597
- 28. Pimenta AMC, Stcklin R, Favreau P, Bougis PE, Martin-Eauclaire MF. Moving pieces in a proteomic puzzle: Mass fingerprinting of toxic fractions from the venom of Tityus serrulatus (Scorpiones, Buthidae). Rapid Communications in Mass Spectrometry. 2001;15(17):1562–72. pmid:11713783
- 29. Batista CVF, Del Pozo L, Zamudio FZ, Contreras S, Becerril B, Wanke E, et al. Proteomics of the venom from the Amazonian scorpion Tityus cambridgei and the role of prolines on mass spectrometry analysis of toxins. J Chromatogr, B: Anal Technol Biomed Life Sci. 2004;803(1):55–66. pmid:15025998
- 30. Diego-García E, Batista CVF, García-Gómez BI, Lucas S, Candido DM, Gómez-Lagunas F, et al. The Brazilian scorpion Tityus costatus Karsch: Genes, peptides and function. Toxicon. 2005;45(3):273–83. pmid:15683865
- 31. Batista CVF, Román-González SA, Salas-Castillo SP, Zamudio FZ, Gómez-Lagunas F, Possani LD. Proteomic analysis of the venom from the scorpion Tityus stigmurus: Biochemical and physiological comparison with other Tityus species. Comparative Biochemistry and Physiology—C Toxicology and Pharmacology. 2007;146(1–2 SPEC. ISS.):147–57.
- 32. D'Suze G, Schwartz EF, García-Gómez BI, Sevcik C, Possani LD. Molecular cloning and nucleotide sequence analysis of genes from a cDNA library of the scorpion Tityus discrepans. Biochimie. 2009;91(8):1010–9. pmid:19470401
- 33. Batista CVF, D'Suze G, Gómez-Lagunas F, Zamudio FZ, Encarnación S, Sevcik C, et al. Proteomic analysis of Tityus discrepans scorpion venom and amino acid sequence of novel toxins. Proteomics. 2006;6(12):3718–27. pmid:16705749
- 34. Barona J, Batista CVF, Zamudio FZ, Gomez-Lagunas F, Wanke E, Otero R, et al. Proteomic analysis of the venom and characterization of toxins specific for Na+- and K+-channels from the Colombian scorpion Tityus pachyurus. Biochim Biophys Acta Proteins Proteomics. 2006;1764(1):76–84. pmid:16309982
- 35. Almeida DD, Scortecci KC, Kobashi LS, Agnez-Lima LF, Medeiros SR, Silva-Junior AA, et al. Profiling the resting venom gland of the scorpion Tityus stigmurus through a transcriptomic survey. BMC Genomics. 2012;13:362. pmid:22853446
- 36. Alvarenga E, Mendes T, Magalhaes B, Siqueira F, Dantas A, Barroca T, Horta C, Kalapothakis E. Transcriptome analysis of the Tityus serrulatus scorpion venom gland. Open Journal of Genetics. 2012;2:210–20.
- 37. Ruiming Z, Yibao M, Yawen H, Zhiyong D, Yingliang W, Zhijian C, et al. Comparative venom gland transcriptome analysis of the scorpion Lychas mucronatus reveals intraspecific toxic gene diversity and new venomous components. BMC Genomics. 2010;11(1):452–66.
- 38. He Y, Zhao R, Di Z, Li Z, Xu X, Hong W, et al. Molecular diversity of Chaerilidae venom peptides reveals the dynamic evolution of scorpion venom components from Buthidae to non-Buthidae. Journal of Proteomics. 2013;89:1–14. pmid:23774330
- 39. Abdel-Rahman MA, Quintero-Hernandez V, Possani LD. Venom proteomic and venomous glands transcriptomic analysis of the Egyptian scorpion Scorpio maurus palmatus (Arachnida: Scorpionidae). Toxicon. 2013;74:193–207. pmid:23998939
- 40. Diego-García E, Peigneur S, Clynen E, Marien T, Czech L, Schoofs L, et al. Molecular diversity of the telson and venom components from Pandinus cavimanus (Scorpionidae Latreille 1802): Transcriptome, venomics and function. Proteomics. 2012;12(2):313–28. pmid:22121013
- 41. Luna-Ramírez K, Quintero-Hernández V, Vargas-Jaimes L, Batista CVF, Winkel KD, Possani LD. Characterization of the venom from the Australian scorpion Urodacus yaschenkoi: Molecular mass analysis of components, cDNA sequences and peptides with antimicrobial activity. Toxicon. 2013;63(1):44–54. pmid:21837551
- 42. Bringans S, Eriksen S, Kendrick T, Gopalakrishnakone P, Livk A, Lock R, et al. Proteomic analysis of the venom of Heterometrus longimanus (Asian black scorpion). Proteomics. 2008;8(5):1081–96. pmid:18246572
- 43. Ma Y, Zhao Y, Zhao R, Zhang W, He Y, Wu Y, et al. Molecular diversity of toxic components from the scorpion Heterometrus petersii venom revealed by proteomic and transcriptome analysis. Proteomics. 2010;10(13):2471–85. pmid:20443192
- 44. Roeding F, Borner J, Kube M, Klages S, Reinhardt R, Burmester T. A 454 sequencing approach for large scale phylogenomic analysis of the common emperor scorpion (Pandinus imperator). Molecular Phylogenetics and Evolution. 2009;53(3):826–34. pmid:19695333
- 45. Ma Y, Zhao R, He Y, Li S, Liu J, Wu Y, et al. Transcriptome analysis of the venom gland of the scorpion Scorpiops jendeki: Implication for the evolution of the scorpion venom arsenal. BMC Genomics. 2009;10:290–304. pmid:19570192
- 46. Silva ECN, Camargos TS, Maranhão AQ, Silva-Pereira I, Silva LP, Possani LD, et al. Cloning and characterization of cDNA sequences encoding for new venom peptides of the Brazilian scorpion Opisthacanthus cayaporum. Toxicon. 2009;54(3):252–61. pmid:19379768
- 47. Schwartz EF, Camargos TS, Zamudio FZ, Silva LP, Bloch C Jr, Caixeta F, et al. Mass spectrometry analysis, amino acid sequence and biological activity of venom components from the Brazilian scorpion Opisthacanthus cayaporum. Toxicon. 2008;51(8):1499–508. pmid:18502464
- 48. Schwartz EF, Diego-Garcia E, Rodríguez de la Vega RC, Possani LD. Transcriptome analysis of the venom gland of the Mexican scorpion Hadrurus gertschi (Arachnida: Scorpiones). BMC Genomics. 2007;8:119. pmid:17506894
- 49. Sunagar K, Undheim EA, Chan AH, Koludarov I, Munoz-Gomez SA, Antunes A, et al. Evolution stings: the origin and diversification of scorpion toxin peptide scaffolds. Toxins. 2013;5(12):2456–87. pmid:24351712
- 50. Almaaytah A, Zhou M, Wang L, Chen T, Walker B, Shaw C. Antimicrobial/cytolytic peptides from the venom of the North African scorpion, Androctonus amoreuxi: Biochemical and functional characterization of natural peptides and a single site-substituted analog. Peptides. 2012;35(2):291–9. pmid:22484288
- 51. Conde R, Zamudio FZ, Rodriíguez MH, Possani LD. Scorpine, an anti-malaria and anti-bacterial agent purified from scorpion venom. FEBS Letters. 2000;471(2–3):165–8.
- 52. Carballar-Lejarazú R, Rodríguez MH, De La Cruz Hernández-Hernández F, Ramos-Castañeda J, Possani LD, Zurita-Ortega M, et al. Recombinant scorpine: A multifunctional antimicrobial peptide with activity against different pathogens. Cellular and Molecular Life Sciences. 2008;65(19):3081–92. pmid:18726072
- 53. Rodriguez de la Vega RC, Vidal N, Possani LD. Chapter 59—Scorpion Peptides. In: Kastin AJ, editor. Handbook of Biologically Active Peptides (Second Edition). Boston: Academic Press; 2013. p. 423–9.
- 54. Rochat C, Rochat H, Miranda F, Lissitzky S. Purification and some properties of the neurotoxins of Androctonus australis Hector. Biochemistry. 1967;6(2):578–85. pmid:6047642
- 55. Rochat H, Rochat C, Kupeyan C, Miranda F, Lissitzky S, Edman P. Scorpion neurotoxins: A family of homologous proteins. FEBS Lett. 1970;10(5):349–51. pmid:11945430
- 56. Rochat H, Rochat C, Miranda F, Lissitzky S, Edman P. The amino acid sequence of neurotoxin I of Androctonus australis Hector. European journal of biochemistry / FEBS. 1970;17(2):262–6. pmid:5500394
- 57. Rodríguez de la Vega RC, Schwartz EF, Possani LD. Mining on scorpion venom biodiversity. Toxicon. 2010;56(7):1155–61. pmid:19931296
- 58. Terrat Y, Biass D, Dutertre S, Favreau P, Remm M, Stöcklin R, et al. High-resolution picture of a venom gland transcriptome: Case study with the marine snail Conus consors. Toxicon. 2012;59(1):34–46. pmid:22079299
- 59. Angeloni F, Wagemaker CAM, Jetten MSM, Op den Camp HJM, Janssen-Megens EM, Francoijs KJ, et al. De novo transcriptome characterization and development of genomic tools for Scabiosa columbaria L. using next-generation sequencing techniques. Molecular Ecology Resources. 2011;11(4):662–74. pmid:21676196
- 60. Cahais V, Gayral P, Tsagkogeorga G, Melo-Ferreira J, Ballenghien M, Weinert L, et al. Reference-free transcriptome assembly in non-model animals from next-generation sequencing data. Molecular Ecology Resources. 2012.
- 61. Costa V, Angelini C, De Feis I, Ciccodicola A. Uncovering the complexity of transcriptomes with RNA-Seq. Journal of Biomedicine and Biotechnology. 2010;2010.
- 62. Durban J, Juarez P, Angulo Y, Lomonte B, Flores-Diaz M, Alape-Giron A, et al. Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing. BMC Genomics. 2011;12(1):259.
- 63. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology. 2011;29(7):644–52. pmid:21572440
- 64. Li C, Weng S, Chen Y, Yu X, Lü L, Zhang H, et al. Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique. PLoS ONE. 2012;7(10).
- 65. Lulin H, Xiao Y, Pei S, Wen T, Shangqin H. The first Illumina-based De Novo transcriptome sequencing and analysis of Safflower flowers. PLoS ONE. 2012;7(6).
- 66. Koch L. The taxonomy, geographic distribution and evolutionary radiation of Australo-Papuan scorpions. In: Lovell AF, editor. Records of the Museum. Perth, Australia: Western Australian Museum; 1977. p. 83–367.
- 67. Rodríguez de la Vega RC, Possani LD. Overview of scorpion toxins specific for Na+ channels and related peptides: biodiversity, structure-function relationships and evolution. Toxicon. 2005;46(8):831–44. pmid:16274721
- 68. Conde R, Zamudio FZ, Becerril B, Possani LD. Phospholipin, a novel heterodimeric phospholipase A2 from Pandinus imperator scorpion venom. FEBS Letters. 1999;460(3):447–50. pmid:10556514
- 69. Almaaytah A, Albalas Q. Scorpion venom peptides with no disulfide bridges: A review. Peptides. 2014;51:35–45. pmid:24184590
- 70. Harrison PL, Abdel-Rahman MA, Miller K, Strong PN. Antimicrobial peptides from scorpion venoms. Toxicon. 2014;88:115–37. pmid:24951876
- 71. Lee K, Shin SY, Kim K, Lim SS, Hahm KS, Kim Y. Antibiotic activity and structural analysis of the scorpion-derived antimicrobial peptide IsCT and its analogs. Biochem Bioph Res Co. 2004;323(2):712–9. pmid:15369808
- 72. Zhu S, Tytgat J. The scorpine family of defensins: Gene structure, alternative polyadenylation and fold recognition. Cellular and Molecular Life Sciences. 2004;61(14):1751–63. pmid:15241551
- 73. Diego-García E, Schwartz EF, D'Suze G, González SAR, Batista CVF, García BI, et al. Wide phylogenetic distribution of Scorpine and long-chain β-KTx-like peptides in scorpion venoms: Identification of "orphan" components. Peptides. 2007;28(1):31–7. pmid:17141373
- 74. Zhu S, Gao B, Aumelas A, del Carmen Rodríguez M, Lanz-Mendoza H, Peigneur S, et al. MeuTXKβ1, a scorpion venom-derived two-domain potassium channel toxin-like peptide with cytolytic activity. Biochim Biophys Acta Proteins Proteomics. 2010;1804(4):872–83. pmid:20045493
- 75. Tytgat J, Chandy KG, Garcia ML, Gutman GA, Martin-Eauclaire MF, Van Der Walt JJ, et al. A unified nomenclature for short-chain peptides isolated from scorpion venoms: α-KTx molecular subfamilies. Trends in Pharmacological Sciences. 1999;20(11):444–7. pmid:10542442
- 76. Rodriíguez de la Vega RC, Possani LD. Current views on scorpion toxins specific for K+-channels. Toxicon. 2004;43(8):865–75. pmid:15208019
- 77. Diego-García E, Abdel-Mottaleb Y, Schwartz EF, De La Vega RCR, Tytgat J, Possani LD. Cytolytic and K+ channel blocking activities of β-KTx and scorpine-like peptides purified from scorpion venoms. Cellular and Molecular Life Sciences. 2008;65(1):187–200. pmid:18030427
- 78. Cao Z, Xiao F, Peng F, Jiang D, Mao X, Liu H, et al. Expression, purification and functional characterization of a recombinant scorpion venom peptide BmTXKβ. Peptides. 2003;24(2):187–92. pmid:12668201
- 79. Luna-Ramirez K, Bartok A, Restano-Cassulini R, Quintero-Hernandez V, Coronas FI, Christensen J, et al. Structure, molecular modeling, and function of the novel potassium channel blocker urotoxin isolated from the venom of the Australian scorpion Urodacus yaschenkoi. Mol Pharmacol. 2014;86(1):28–41. pmid:24723491
- 80. Wu W, Yin S, Ma Y, Wu YL, Zhao R, Gan G, et al. Molecular cloning and electrophysiological studies on the first K+ channel toxin (LmKTx8) derived from scorpion Lychas mucronatus. Peptides. 2007;28(12):2306–12. pmid:18006119
- 81. Narasimhan L, Singh J, Humblet C, Guruprasad K, Blundell T. Snail and spider toxins share a similar tertiary structure and 'cystine motif'. Nat Struct Biol. 1994;1(12):850–2. pmid:7773771
- 82. Pallaghy PK, Nielsen KJ, Craik DJ, Norton RS. A common structural motif incorporating a cystine knot and a triple- stranded β-sheet in toxic and inhibitory polypeptides. Protein Science. 1994;3(10):1833–9. pmid:7849598
- 83. Quintero-Hernández V, Jiménez-Vargas JM, Gurrola GB, Valdivia HH, Possani LD. Scorpion venom components that affect ion-channels function. Toxicon. 2013;76:328–42. pmid:23891887
- 84. Matsushita N, Miyashita M, Sakai A, Nakagawa Y, Miyagawa H. Purification and characterization of a novel short-chain insecticidal toxin with two disulfide bridges from the venom of the scorpion Liocheles australasiae. Toxicon. 2007;50(6):861–7. pmid:17681581
- 85. Smith JJ, Hill JM, Little MJ, Nicholson GM, King GF, Alewood PF. Unique scorpion toxin with a putative ancestral fold provides insight into evolution of the inhibitor cystine knot motif. Proc Natl Acad Sci U S A. 2011;108(26):10478–83. pmid:21670253
- 86. Smith JJ, Vetter I, Lewis RJ, Peigneur S, Tytgat J, Lam A, et al. Multiple actions of f-LITX-Lw1a on ryanodine receptors reveal a functional link between scorpion DDH and ICK toxins. Proc Natl Acad Sci U S A. 2013;110(22):8906–11. pmid:23671114
- 87. Adams ME, Mintz IM, Reily MD, Thanabal V, Bean BP. Structure and properties of ω-agatoxin IVB, a new antagonist of P-type calcium channels. Molecular Pharmacology. 1993;44(4):681–8. pmid:8232218
- 88. Chen Z, Wang B, Hu J, Yang W, Cao Z, Zhuo R, et al. SjAPI, the First Functionally Characterized Ascaris-Type Protease Inhibitor from Animal Venoms. PLoS ONE. 2013;8(3).
- 89. Zhao R, Dai H, Qiu S, Li T, He Y, Ma Y, et al. SdPI, the first functionally characterized Kunitz-type trypsin inhibitor from scorpion venom. PLoS ONE. 2011;6(11).
- 90. Chen Z, Luo F, Feng J, Yang W, Zeng D, Zhao R, et al. Genomic and structural characterization of Kunitz-type peptide LmKTT-1a highlights diversity and evolution of scorpion potassium channel toxins. PLoS ONE. 2013;8(4):3.
- 91. Chen ZY, Hu YT, Yang WS, He YW, Feng J, Wang B, et al. Hg1, novel peptide inhibitor specific for Kv1.3 channels from first scorpion Kunitz-type potassium channel toxin family. J Biol Chem. 2012;287(17):13813–21. pmid:22354971
- 92. Minagawa S, Ishida M, Shimakura K, Nagashima Y, Shiomi K. Isolation and amino acid sequences of two Kunitz-type protease inhibitors from the sea anemone Anthopleura aff. xanthogrammica. Comp Biochem Physiol B Biochem Mol Biol. 1997;118(2):381–6. pmid:9440231
- 93. Cheng YC, Yan FJ, Chang LS. Taiwan cobra chymotrypsin inhibitor: Cloning, functional expression and gene organization. Biochim Biophys Acta Proteins Proteomics. 2005;1747(2):213–20. pmid:15698956
- 94. Lu J, Yang H, Yu H, Gao W, Lai R, Liu J, et al. A novel serine protease inhibitor from Bungarus fasciatus venom. Peptides. 2008;29(3):369–74. pmid:18164783
- 95. Baek JH, Oh JH, Kim YH, Lee SH. Comparative transcriptome analysis of the venom sac and gland of social wasp Vespa tropica and solitary wasp Rhynchium brunneum. Journal of Asia-Pacific Entomology. 2013;16(4):497–502.
- 96. Bouzid W, Klopp C, Verdenaud M, Ducancel F, Vétillard A. Profiling the venom gland transcriptome of Tetramorium bicarinatum (Hymenoptera: Formicidae): The first transcriptome analysis of an ant species. Toxicon. 2013;70:70–81. pmid:23584016
- 97. Heavner ME, Gueguen G, Rajwani R, Pagan PE, Small C, Govind S. Partial venom gland transcriptome of a Drosophila parasitoid wasp, Leptopilina heterotoma, reveals novel and shared bioactive profiles with stinging Hymenoptera. Gene. 2013;526(2):195–204. pmid:23688557
- 98. Liu ZC, Zhang R, Zhao F, Chen ZM, Liu HW, Wang YJ, et al. Venomic and transcriptomic analysis of centipede scolopendra subspinipes dehaani. Journal of Proteome Research. 2012;11(12):6197–212. pmid:23148443
- 99. King TP. Protein allergens of white-faced hornet, yellow hornet, and yellow jacket venoms. Biochemistry. 1978;17(24):5165–74. pmid:83154
- 100. Corzo G, Escoubas P, Villegas E, Barnham KJ, He W, Norton RS, et al. Characterization of unique amphipathic antimicrobial peptides from venom of the scorpion Pandinus imperator. Biochemical Journal. 2001;359(1):35–45.
- 101. Luna-Ramirez K, Sani MA, Silva-Sanchez J, Jimenez-Vargas JM, Reyna-Flores F, Winkel KD, et al. Membrane interactions and biological activity of antimicrobial peptides from Australian scorpion. Biochim Biophys Acta. 2014;9(8):4.
- 102. Zeng X-C, Zhou L, Shi W, Luo X, Zhang L, Nie Y, et al. Three new antimicrobial peptides from the scorpion Pandinus imperator. Peptides. 2013;45(0):28–34.
- 103. Torres-Larios A, Gurrola GB, Zamudio FZ, Possani LD. Hadrurin, A new antimicrobial peptide from the venom of the scorpion Hadrurus aztecus. European Journal of Biochemistry. 2000;267(16):5023–31. pmid:10931184