Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

Background: Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods.
Results: We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, whereas the remaining unigenes had no significant matches, owing to the limited sequence information available and to species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, our screen identified many transcripts encoding proteins similar to several well-known toxin families, including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish.
Conclusion: We obtained a well-categorized and annotated transcriptome of the C. capillata tentacle that will be an important and valuable resource for further understanding jellyfish at the molecular level and the molecular mechanisms underlying jellyfish stinging. The findings of this study may also be used in comparative studies of gene expression profiling among different jellyfish species.


Introduction
In recent decades, frequent outbreaks of jellyfish have occurred in oceans, potentially due to overfishing by humans, nutrient pollution and global warming. Jellyfish outbreaks have a strong adverse impact on marine ecological balance. However, the large amount of jellyfish biomass could be considered a valuable source of bioactive compounds. Thus, the overall development and comprehensive utilization of jellyfish have also triggered interest among many scientists.
Jellyfish bodies contain a great variety of natural bioactive components, among which the most studied are jellyfish nematocyst toxins. Nematocysts are densely located on the tentacles, and each contains a tiny dose of venom. People stung by toxic jellyfish may develop severe pain, dyspnea or even cardiorespiratory failure [1]. Many studies have explored the physicochemical properties of nematocyst toxins, which are now believed to be novel proteins or peptides. Jellyfish nematocyst toxins exhibit various bioactivities, such as hemolytic, enzymatic, neurotoxic, myotoxic and cardiovascular activities [2][3][4]. In addition to nematocyst toxins, the jellyfish body contains a wide range of novel proteins or peptides that exhibit antioxidant, antibiotic and immune-enhancing activities. Antioxidant activity of the giant jellyfish Nemopilema nomurai was observed by Kazuki [5]. We previously reported the first peroxiredoxin (Prx) and thioredoxin (Trx) genes from the jellyfish Cyanea capillata; both of these genes exhibit general intracellular antioxidant activity [6][7]. The jellyfish body also contains abundant collagens, which have immunostimulatory effects without inducing allergic complications [8].
Recent research has investigated and identified the bioactive components of jellyfish, particularly their toxins. However, the low efficiency of traditional isolation and purification methods has hampered the large-scale identification of novel toxins and bioactive components from jellyfish. No complete genome sequencing of jellyfish has been reported. In the absence of genome sequencing, the transcriptome represents a valuable searchable database. However, of the nearly 250 jellyfish species, only three species, Stomolophus meleagris, Chironex fleckeri and Aurelia aurita, have been sequenced using the transcriptome approach [9][10][11]. Only a small number of jellyfish sequences (54,247 ESTs and 3,795 nucleotides, as of Mar 10, 2015) have been deposited in the National Center for Biotechnology Information (NCBI) database, seriously limiting our understanding of the diverse bioactivities of this abundant marine zooplankton. Therefore, more sequence data and comprehensive analysis of jellyfish species transcriptomes are desired to explore more toxins and other bioactive components.
High-throughput next-generation sequencing technologies provide platforms to perform deep sequence analysis, which has greatly boosted comprehensive genetic research on some relatively uncharacterized species. C. capillata is one of the most common venomous jellyfish in the East China Sea. We previously demonstrated that a tentacle extract from C. capillata exhibits diverse bioactivities, including hemolytic, proteolytic, cardiovascular, cytolytic and antioxidant activities [12][13][14]. However, the underlying mechanisms of these bioactivities at the molecular level remain unclear. In the present study, we performed de novo transcriptome sequencing of the tentacle tissue of C. capillata using the Illumina HiSeq™ 2000 platform. A systematic bioinformatics strategy was used to conduct an in-depth and integrated analysis of this transcriptome, explore the venom composition in detail, and identify other important molecules in C. capillata.

Jellyfish sample collection and RNA isolation
Samples of the jellyfish C. capillata were collected in July 2013 in Sanmen Bay, East China Sea. No specific permit was required to catch C. capillata. The tentacle tissues were excised manually immediately after capture and frozen in liquid nitrogen. Total RNA was isolated using TRIzol reagent (Invitrogen, CA, USA) and treated with RNase-free DNase I (Takara Biotechnology, China). RNA integrity was validated with a 2100 Bioanalyzer (Agilent Technologies, CA, USA).

Illumina sequencing
Illumina sequencing was performed according to methods described previously [15][16]. Briefly, poly(A) mRNA was isolated using oligo(dT) beads and fragmented into short pieces. These fragments were then reverse-transcribed into first-strand cDNA, followed by synthesis of the second strand. The synthesized cDNA products were purified using a QiaQuick PCR extraction kit (Qiagen, Valencia, CA, USA) and dissolved in EB buffer for end repair and poly(A) addition. Subsequently, the cDNA fragments were ligated to the sequencing adapters and subjected to size selection using agarose gel electrophoresis. Suitable fragments were amplified by PCR, and the cDNA library was sequenced using an Illumina HiSeq™ 2000 sequencer at the Beijing Genomics Institute (BGI; Shenzhen, China).

De novo assembly and functional annotation
The image data output from the sequencer was transformed into sequence data called raw reads. Clean reads were obtained by removing sequencing adaptors, low-quality reads and reads containing more than 5% unknown nucleotides from the raw reads. The clean reads were then assembled de novo into contigs and unigenes with the Trinity program [17]. Finally, unigenes were aligned by BLASTx (e-value < 1e-5) to protein databases, including the NCBI non-redundant protein (Nr) database (http://www.ncbi.nlm.nih.gov), Swiss-Prot protein database (http://www.expasy.ch/sprot), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (http://www.genome.jp/kegg) and Cluster of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/COG). The protein with the highest sequence similarity to a given unigene was used to determine its sequence direction, functional annotation and protein coding region. A preferential order of Nr, Swiss-Prot, KEGG and COG was followed if the results from these databases were inconsistent. If no hits were obtained for a unigene in these databases, ESTScan software [18] was used to decide the sequence direction and protein coding region. Based on Nr annotations, the Blast2GO program [19] was then used to obtain the gene ontology (GO) annotations of the unigenes, followed by GO classification using WEGO software [20]. COG and KEGG were also used to obtain functional annotations for the unigenes and analyze gene products involved in metabolism.
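The read filter and the database priority rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the dictionary layout of the hits are hypothetical, and the actual pipeline used the tools cited in the text.

```python
from typing import Dict, Optional

# Database priority stated in the text: Nr, then Swiss-Prot, then KEGG, then COG.
DB_PRIORITY = ("Nr", "Swiss-Prot", "KEGG", "COG")
MAX_UNKNOWN_FRACTION = 0.05  # reads with more than 5% unknown bases are discarded


def passes_unknown_base_filter(read: str, max_fraction: float = MAX_UNKNOWN_FRACTION) -> bool:
    """Return True if the fraction of 'N' bases does not exceed max_fraction."""
    if not read:
        return False
    n_count = read.upper().count("N")
    return n_count / len(read) <= max_fraction


def choose_annotation(hits: Dict[str, str]) -> Optional[str]:
    """Pick the annotation from the highest-priority database that has a hit.

    `hits` maps a database name to its best-hit description (hypothetical layout).
    Returns None when no database produced a hit, which is the case the text
    hands over to ESTScan.
    """
    for db in DB_PRIORITY:
        if db in hits:
            return f"{db}: {hits[db]}"
    return None
```

For a 90 bp read, this threshold tolerates at most four unknown bases, since five would already be 5.6% of the read.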

Identification of toxin-like transcripts
According to our previous studies of C. capillata and other reports on various jellyfish, the toxic effects of jellyfish venom primarily include vasoconstriction, hemorrhage, and hemolytic and cardiovascular toxicities. To explore the underlying molecular mechanisms of these toxic actions and identify as many putative toxin transcripts in C. capillata as possible, three strategies were used. First, we compared the unigene sequences to a toxin database in Swiss-Prot, Tox-Prot (http://www.uniprot.org/program/Toxins), based on sequence homology. Second, to make the screening more complete, we also manually searched the annotations of the unigenes under the term 'toxin' or 'venom'. Third, according to the symptoms after jellyfish envenomation, we referred to many previous reports on venomous components in different types of venomous animals, such as snakes, scorpions, spiders, wasps and sea anemones, to construct a reference guide of estimated toxin-like transcripts.
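The second strategy, a keyword search of the unigene annotations, could be approximated as follows. This is a minimal sketch: the annotation dictionary and the unigene identifiers in the usage note are hypothetical, and the case-insensitive substring match is an assumption about how the manual search treated terms such as "plancitoxin".

```python
import re
from typing import Dict, List

# Case-insensitive substring match, so "Plancitoxin-1" and "Venom protein 2" both qualify.
TOXIN_KEYWORDS = re.compile(r"toxin|venom", re.IGNORECASE)


def keyword_screen(annotations: Dict[str, str]) -> List[str]:
    """Return the sorted IDs of unigenes whose annotation mentions 'toxin' or 'venom'."""
    return sorted(
        unigene_id
        for unigene_id, description in annotations.items()
        if TOXIN_KEYWORDS.search(description)
    )
```

Called on `{"U1": "Plancitoxin-1", "U2": "actin", "U3": "Venom protein 2"}`, the function returns `["U1", "U3"]`. A substring match of this kind also picks up terms such as "antitoxin", so the candidates would still need manual review.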

Analysis of transcripts related to degenerative diseases
Sequences encoding proteins associated with degenerative diseases, including Huntington's disease (HD), Alzheimer's disease (AD) and Parkinson's disease (PD), were identified from the BLAST results against the Nr database, with an e-value cut-off of 1e-5.
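An e-value cut-off of this kind can be applied to BLAST results as sketched below. The sketch assumes the standard 12-column tabular output of BLAST+ (`-outfmt 6`), in which the e-value is the eleventh column; the text does not state which output format was used, so the parsing details are an assumption.

```python
from typing import Dict, Iterable, Tuple

EVALUE_CUTOFF = 1e-5  # cut-off quoted in the text


def best_hits(lines: Iterable[str], cutoff: float = EVALUE_CUTOFF) -> Dict[str, Tuple[str, float]]:
    """Keep, for each query, the hit with the smallest e-value below the cut-off.

    Each line is expected to follow the BLAST+ tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue >= cutoff:
            continue  # not significant under the chosen cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

Queries with no hit below the cut-off are absent from the returned dictionary, which corresponds to the unigenes reported as having no significant match.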

Results and Discussion
Illumina sequencing and reads assembly
A total of 54,109,750 raw reads were obtained using the Illumina HiSeq™ 2000 platform. After cleaning and removing dirty reads containing adapters, unknown bases or low-quality bases, a total of 51,304,108 clean reads corresponding to more than 4.61 billion clean nucleotides were generated (Table 1). The average length of the clean reads was 90 bp, consistent with the sequencing capacity of the Illumina device. The Q20 percentage, N percentage and GC percentage were 99.15%, 0.02% and 40.26%, respectively. The original sequencing data for the clean reads have been deposited in the NCBI Sequence Read Archive (SRA) database (accession number SRP056566).
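The throughput figures above follow from simple arithmetic, and the quality statistics have straightforward definitions. The sketch below recomputes the nucleotide total from the reported read count and read length. The two helper functions show one common way to define GC and Q20 percentages: the choice to ignore unknown bases when computing GC content and to count Phred scores of at least 20 are assumptions, not definitions taken from the text.

```python
from typing import Iterable

RAW_READS = 54_109_750
CLEAN_READS = 51_304_108
MEAN_READ_LENGTH_BP = 90


def gc_percent(seq: str) -> float:
    """Percentage of G or C among the called (A, C, G, T) bases of a sequence."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return 100.0 * sum(base in "GC" for base in called) / len(called)


def q20_percent(phred_scores: Iterable[int]) -> float:
    """Percentage of bases whose Phred quality score is at least 20."""
    scores = list(phred_scores)
    if not scores:
        return 0.0
    return 100.0 * sum(q >= 20 for q in scores) / len(scores)


clean_nucleotides = CLEAN_READS * MEAN_READ_LENGTH_BP
retained_percent = 100.0 * CLEAN_READS / RAW_READS
print(f"clean nucleotides: {clean_nucleotides:,}")  # 4,617,369,720
print(f"reads retained after cleaning: {retained_percent:.2f}%")
```

The product of 51,304,108 reads and 90 bp is 4,617,369,720 nucleotides, which agrees with the "more than 4.61 billion" stated above.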
Using the Trinity program, a total of 125,058 contigs corresponding to more than 32 million nucleotides were assembled from the short reads. Among these assembled contigs, 90.50% (113,176) were between 100 and 500 bp in length, 5.96% (7,451) were between 500 and 1000 bp, 2.99% (3,740) were between 1000 and 2000 bp, and 0.55% (691) were more than 2000 bp. Finally, the contigs were connected, and 50,536 unigenes were generated, with a mean length of 503 bp. Although most unigenes (36,224, 71.68%) were between 100 and 500 bp, we obtained 14,312 unigenes that were greater than 500 bp in length. The length distributions of these assembled contigs and unigenes are shown in S1 Fig. Protein coding sequences (CDSs) of all assembled unigenes were predicted. A total of 20,892 potential CDSs were identified by BLAST searches, and 5,817 CDSs were predicted by ESTScan.
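The length classes used for the contigs can be reproduced with a small binning helper, and the reported percentages can be recomputed from the reported counts. In this sketch the bin boundaries are treated as half-open intervals (for example, 500 bp falls in the 500-1000 class); the text does not specify how boundary lengths were assigned, so that choice is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Half-open length classes [low, high) in bp; boundary handling is an assumption.
BINS = ((100, 500), (500, 1000), (1000, 2000))


def length_bin(length: int) -> str:
    """Return the label of the length class that a sequence length falls into."""
    for low, high in BINS:
        if low <= length < high:
            return f"{low}-{high}"
    return ">=2000" if length >= 2000 else "<100"


def length_distribution(lengths: Iterable[int]) -> Dict[str, Tuple[int, float]]:
    """Map each length class to (count, percentage of all sequences)."""
    counts = Counter(length_bin(n) for n in lengths)
    total = sum(counts.values())
    return {label: (count, round(100.0 * count / total, 2)) for label, count in counts.items()}


# Contig counts per class as reported in the text.
REPORTED_CONTIG_COUNTS = {"100-500": 113_176, "500-1000": 7_451, "1000-2000": 3_740, ">=2000": 691}
total_contigs = sum(REPORTED_CONTIG_COUNTS.values())
for label, count in REPORTED_CONTIG_COUNTS.items():
    print(f"{label}: {count:,} contigs, {100.0 * count / total_contigs:.2f}%")
```

The four reported classes sum to 125,058 contigs, matching the stated total.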
Transcriptome analysis using Illumina sequencing technology is one of the most popular tools for gene discovery, and it has recently been applied to several species that lack genomic sequence information [21][22][23]. Therefore, the transcriptome data for C. capillata obtained here will enrich the sequence information previously available for jellyfish in public databases. In addition, this transcriptome could provide more detailed and general genetic data to facilitate large-scale discovery and rapid characterization of novel important genes from jellyfish.

Functional annotation and classification
To obtain functional annotations of the predicted proteins, the assembled unigenes were used as queries for BLASTx alignments against several public protein databases, including the Nr, Swiss-Prot, KEGG and COG databases. An e-value < 1e-5 was used as the cut-off for confident homologue detection. Of the 50,536 unigenes, 20,629 (40.8%) had hits in the Nr database. This large percentage of hits was anticipated. In addition, hits were obtained for 17,006 (33.7%), 13,663 (27.0%) and 6,202 (12.3%) unigenes in the Swiss-Prot, KEGG and COG protein databases, respectively. We also aligned all of the unigenes by BLASTn to the Nt nucleotide database, and 4,878 (9.7%) had significant hits. The functional annotations by BLAST searches of the 50,536 unigenes are presented in S1 Table. The e-value distributions were also calculated. Among the 20,629 unigenes that had homologous proteins in the Nr protein database, more than half of the matched sequences (11,793, 57.2%) had an e-value between 1e-10 and 1e-50, and 6.6% (1,368) had an e-value smaller than 1e-100, indicating strong reliability of the alignments. In total, 21,357 (42.3%) unigenes had a significant match in at least one of the above databases. However, more than half of the unigenes had no significant matches, owing either to the limited sequence information for jellyfish and closely related species or to a high number of species-specific transcripts and novel sequences. The species distributions of the top BLAST hits against the Nr database were also analyzed. The assembled unigenes had the greatest number of matches (3,900, 18.91%) with Hydra magnipapillata. Because jellyfish and hydra both belong to the phylum Cnidaria, this result is consistent with the expectation that sequence similarity is highest among closely related species.
The next species were Saccoglossus kowalevskii (2,218, 10.75%), Strongylocentrotus purpuratus (1,368, 6.63%), Danio rerio (1,008, 4.89%), Xenopus tropicalis (778, 3.77%) and another typical cnidarian species, Nematostella vectensis (713, 3.46%). Other species with proportions greater than 1% are also shown in Fig 1. However, few matches corresponded to jellyfish species, which might also be due to the limited number of jellyfish protein sequences available in the database.
GO annotation based on sequence homology was used to determine the GO terms of the unigenes. A total of 6,657 (13.2%) unigenes were categorized into at least one of 49 subcategories within three independent ontology categories (S2 Fig). Among the 26 subcategories of "biological process", "cellular process" (3,412 unigenes) was the most dominant group, followed by "metabolic process" (2,578 unigenes) and "biological regulation" (1,467 unigenes), suggesting that extensive metabolic activity and rapid growth may occur in the tentacle of C. capillata. For the "cellular component" category, the most represented assignments were "cell" (4,373 unigenes), "cell part" (4,041 unigenes) and "organelle" (2,701 unigenes). Within the 12 groups corresponding to "molecular function", the dominant assignments were "binding" (2,895 unigenes) and "catalytic activity" (2,718 unigenes). These GO annotations represented the general gene expression profile characteristics for the tentacle of C. capillata.
To further predict and classify possible functions of the unigenes, COG assignments were used (S3 Fig). The category of "general function prediction only", which contained 2,079 unigenes (33.52%), was the largest group, followed by "replication, recombination and repair" (898, 14.48%), "translation, ribosomal structure and biogenesis" (841, 13.56%) and "posttranslational modification, protein turnover, chaperones" (769, 12.40%).
The categories of "extracellular structures" (3, 0.05%) and "nuclear structure" (8, 0.13%) were the smallest groups. The GO and COG functional classifications thus provided valuable and detailed information for investigating specific processes and functions in jellyfish tentacle.
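The COG percentages can be recomputed from the category counts, taking the 6,202 unigenes with COG hits as the denominator. The sketch below does this for the categories named in the text; the assumption that every percentage uses this same denominator is ours, although it reproduces the values reported for the largest and smallest groups.

```python
COG_ANNOTATED_UNIGENES = 6_202  # unigenes with hits in the COG database

# Category counts as given in the text.
COG_COUNTS = {
    "general function prediction only": 2_079,
    "replication, recombination and repair": 898,
    "translation, ribosomal structure and biogenesis": 841,
    "posttranslational modification, protein turnover, chaperones": 769,
    "nuclear structure": 8,
    "extracellular structures": 3,
}


def percent_of_total(count: int, total: int = COG_ANNOTATED_UNIGENES) -> float:
    """Share of COG-annotated unigenes in a category, rounded to two decimals."""
    return round(100.0 * count / total, 2)


for category, count in sorted(COG_COUNTS.items(), key=lambda item: item[1], reverse=True):
    print(f"{category}: {count:,} ({percent_of_total(count):.2f}%)")
```

Recomputing the shares in this way is a quick consistency check on a flattened table, because each percentage must equal its count divided by the same total.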
To identify the active metabolic pathways in the tentacle of C. capillata, the annotated unigenes were mapped to KEGG pathways. A total of 13,663 unigenes were assigned to 241 KEGG pathways in the categories of "metabolism" (86 pathways), "genetic information processing" (21 pathways), "environmental information processing" (17 pathways), "cellular processes" (14 pathways), "organismal systems" (52 pathways) and "human diseases" (51 pathways) (Fig 2A). Among the mapped pathways, "metabolic pathways" contained 1,912 unigenes (13.99%) and was clearly larger than the other groups, such as "pathways in cancer" (513, 3.75%), "focal adhesion" (477, 3.49%), "Huntington's disease" (476, 3.48%) and "regulation of actin cytoskeleton" (420, 3.07%). The top 10 pathways are shown in Fig 2B, and all pathways are summarized in S2 Table. In summary, based on de novo sequencing and in-depth analysis, we obtained a well-annotated transcriptome that could provide valuable information for identifying novel genes and investigating specific metabolic pathways in the tentacle of the jellyfish C. capillata.

Analysis of putative toxin transcripts in C. capillata
Interest in the biological activities of jellyfish toxins has increased considerably in recent years. However, due to the technical difficulties of obtaining toxins and their labile nature, the molecular mechanisms of jellyfish toxins remain largely unknown. Few molecules in jellyfish venom have been described, hampering the exploration of the pathophysiology of jellyfish envenomation and proper patient care. Many studies have demonstrated that jellyfish venom consists of a complex mixture of proteins and peptides with toxic or enzymatic actions [13][24][25][26]. Therefore, one of the main goals of the present study is to identify as many putative toxin transcripts in C. capillata as possible and build a reference database of toxin sequences to facilitate the future analysis of other jellyfish species.
1. Venom constituents similar to known toxin families. Based on the symptoms of jellyfish stings, we identified a large number of transcripts encoding proteins similar to several well-known toxin families that have been widely described in various venomous animals. These toxin transcripts are further classified and summarized in Table 2.
Phospholipases: We identified many members and isoforms of phospholipase A2s (PLA2s) and phospholipase D (PLD) in the C. capillata tentacle transcriptome. PLA2s are the most common type of phospholipases identified in the venom of various toxic animals, such as snakes, scorpions and ants [27][28][29][30]. High levels of PLA2 activity have also been described in the tentacles of scyphozoan and cubozoan species [31]. PLA2s were also recently reported to be one of the most abundant toxins in the venom of the jellyfish S. meleagris [9]. In this study, we identified 10 unique transcripts encoding PLA2s in the tentacle of C. capillata (Table 2), and six of the 10 transcripts had clear open reading frames. Generally, hemolysis is considered to be the direct result of PLA2s and hemolysin, which can interact with ion channels, membrane proteins and membrane ion pumps. We also previously observed both in vitro and in vivo hemolysis of the tentacle extract from C. capillata [12]. Hemolytic activity is considered one of the most common biological activities of jellyfish venom [32][33]. In addition to hemolytic effects, venom PLA2s can also mediate several other toxic responses, such as cytotoxicity, cardiotoxicity, neurotoxicity, myotoxicity, edema and blood coagulation disturbance [34][35].
In addition to PLA2s, we also identified five transcripts for PLDs in the transcriptome. This is the first report of PLDs in jellyfish species. PLD has been reported to act as a dermonecrotic factor in the venom of brown spiders and plays a role in the necrotic effect and severe inflammatory response [36]. Interestingly, jellyfish venom exhibits an obvious dermonecrotic effect [37][38]. C. capillata venom can induce dermonecrotic lesions in the skin of rats and guinea pigs [39]. Thus, the presence of PLDs in the C. capillata tentacle transcriptome is a significant finding that may help to elucidate the dermonecrotic mechanism of jellyfish venom.
Metalloproteases: Two types of metalloproteases were identified: matrix metalloproteinases and astacin-like metalloproteases (Table 2). The identification of these diverse metalloprotease transcripts is not surprising, because metalloproteases have been described as the central toxic component in various venomous animals [40][41][42]. In snake venoms, metalloprotease toxins are predominantly responsible for local pathological effects, such as tissue damage, necrosis and hemorrhage [43]. In this study, 11 unique transcripts encoding members of the matrix metalloproteinase family, including matrix metalloproteinase-14, -1, -9, -24 and -25, were identified, and three of the 11 transcripts aligned best with matrix metalloproteinase-14. This identification is consistent with a previous report describing metalloproteinase-14 as the main venom-derived protein in the jellyfish Nemopilema nomurai [44].
In addition, six transcripts for the astacin-like metalloproteases were also identified. Astacin-like metalloproteases have been reported to be the main components of Loxosceles toxins [45]. Multiple alignment analysis revealed that all of these deduced amino acid sequences contain the astacin family signatures HEXXHXXGXXHE (enzymatic catalytic domain) and MXY (Met-turn), similar to the astacin-like toxins in the Loxosceles genus and other astacin family-related members (Fig 3). Loxosceles astacin toxins can degrade extracellular matrix proteins and assist the spread of other venom components. However, the study of astacin-like metalloproteases in jellyfish venom remains in its infancy. Scyphozoan jellyfish venom has significant gelatinolytic, caseinolytic, and fibrinolytic activities [46]. Astacin-like metalloproteases were also recently identified as an important component of the venoms of S. meleagris and N. nomurai jellyfish [9,44]. Based on these findings, the astacin-like metalloproteases identified from C. capillata likely also play an important role in the pathological processes of C. capillata envenomation.
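The two astacin signatures quoted above translate directly into regular expressions, with each X standing for any residue. The sketch below is only illustrative: it additionally requires the Met-turn to lie downstream of the zinc-binding motif, which is an assumption on our part rather than a rule stated in the text, and a short pattern such as MXY is expected to match by chance in many proteins.

```python
import re
from typing import Optional, Tuple

# X in the published signatures means "any residue", written as "." in a regex.
ZINC_BINDING = re.compile(r"HE..H..G..HE")  # HEXXHXXGXXHE
MET_TURN = re.compile(r"M.Y")               # MXY


def astacin_signatures(protein: str) -> Optional[Tuple[int, int]]:
    """Return 0-based start positions of (zinc-binding motif, Met-turn), or None.

    The Met-turn is searched only downstream of the zinc-binding motif
    (an assumption made for this sketch).
    """
    seq = protein.upper()
    zinc = ZINC_BINDING.search(seq)
    if zinc is None:
        return None
    met = MET_TURN.search(seq, zinc.end())
    if met is None:
        return None
    return zinc.start(), met.start()
```

A made-up test sequence such as `"AAA" + "HEIGHALGFFHE" + "QSRAAAAS" + "MHY" + "AA"` yields `(3, 23)`. Motif hits of this kind only flag candidates, and alignment against known astacins remains the stronger evidence.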
Serine proteases: Serine proteases are the best-characterized venom component. In this study, we identified 15 unique sequences of this family (Table 2), but, owing to the high molecular weights of these proteins, only six were confirmed to have a complete ORF. Among the 15 transcripts, six sequences were significantly homologous to serine proteases from the jellyfish Aurelia aurita. Serine proteases have been described in the venoms of snakes, spiders and scorpions [28,36,47]. Generally, this toxin family affects a wide array of physiological functions, including platelet aggregation and fibrinolytic pathways [48]. They can also play a role in post-translational modification and in spreading other toxins [49]. However, the exact role of serine proteases in jellyfish envenomation remains to be clarified.
Table 2 notes: (a) Full-length and not full-length open reading frames (ORFs) are indicated by "F" and "N", respectively. (b) Transcripts with or without a signal peptide are indicated by "Y" and "N", respectively, whereas transcripts in which the presence of a signal peptide was unclear are indicated by "U" (Unknown). doi:10.1371/journal.pone.0142680.t002
Serine protease inhibitors: Several serine protease inhibitors were identified in this study, including Kazal-type (4 transcripts) and Kunitz-type (KUNs) (4 transcripts). The Serpin family (5 transcripts) was also identified (Table 2). Serine protease inhibitors have been widely found in the venoms of many well-known toxic animals [50][51][52]. However, few toxins of this type have been reported in jellyfish venom. Among this toxin family, Kunitz-type inhibitors have been commonly observed in snake venoms and inhibit both serine proteases and calcium ion channels. They are characterized by three disulfide bonds belonging to a highly conserved motif of C-8X-C-15X-C-4X-YGGC-12X-C-3X-C [50]. In this study, multiple alignment analysis demonstrated that, compared with other typical Kunitz-type inhibitors, most of the identified Kunitz-type inhibitor transcripts possess a native Kunitz architecture (Fig 4). The function of serine protease inhibitors in the various venoms has been suggested to be primarily related to the protection of toxin integrity. Additionally, this toxin family may play a role in various physiological processes, such as blood coagulation, fibrinolysis and host defense. However, whether serine protease inhibitors in jellyfish venoms have a similar function remains to be explored.
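The Kunitz motif C-8X-C-15X-C-4X-YGGC-12X-C-3X-C can be written as a regular expression in which each run of X residues becomes a fixed-length wildcard. The sketch below locates the motif and reports the positions of its six cysteines, which form the three disulfide bonds mentioned in the text. The test sequence in the usage note is synthetic and serves only to make the spacing visible, and real Kunitz domains may deviate slightly from the strict spacing encoded here.

```python
import re
from typing import List, Optional

# C-8X-C-15X-C-4X-YGGC-12X-C-3X-C, with X as any residue.
KUNITZ_MOTIF = re.compile(r"C.{8}C.{15}C.{4}YGGC.{12}C.{3}C")

# Offsets of the six conserved cysteines from the start of a match.
CYS_OFFSETS = (0, 9, 25, 33, 46, 50)


def find_kunitz_cysteines(protein: str) -> Optional[List[int]]:
    """Return 0-based positions of the six motif cysteines, or None if no match."""
    match = KUNITZ_MOTIF.search(protein.upper())
    if match is None:
        return None
    return [match.start() + offset for offset in CYS_OFFSETS]
```

For the synthetic sequence `"MK" + "C" + "A"*8 + "C" + "A"*15 + "C" + "A"*4 + "YGGC" + "A"*12 + "C" + "A"*3 + "C"`, the function returns `[2, 11, 27, 35, 48, 52]`. Replacing the YGGC core with YGGA removes the match and the function returns `None`.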
2. Other possible venom components. In addition to the well-known types of venom toxin families described above, some transcripts resembled putative molecules with potential toxic activities reported in other venomous animals. Moreover, most of these sequences have not yet been described in jellyfish species. Thus, these transcripts were classified as 'other possible venom components', and the main features of these molecules are presented in Table 3.
Table 3 notes: (a) Full-length and not full-length open reading frames (ORFs) are indicated by "F" and "N", respectively. (b) Transcripts with or without a signal peptide are indicated by "Y" and "N", respectively; transcripts in which the presence of a signal peptide was unclear are indicated by "U" (Unknown). doi:10.1371/journal.pone.0142680.t003
In this study, we identified a transcript (Unigene6213) with significant similarity to the N-terminal sequences of a family of known jellyfish toxins, including TX2 isolated from Aurelia aurita, CfTX-1,2 and CfTX-A,B from Chironex fleckeri, CqTX-A from C. quadrigatus, CrTX-A from C. rastoni and CaTX-A from C. alata [53][54]. The multiple sequence alignment and phylogenetic analysis of these similar jellyfish toxins are presented in Fig 5. This toxin family is primarily associated with potent hemolytic activity and pore-forming action [53,55]. As shown in Fig 5, a putative transmembrane spanning region (TSR1) that is highly conserved in the N-terminal sequences of this toxin family is also present in the predicted amino acid range (6-50) of Unigene6213. The presence of the transmembrane spanning region in these jellyfish toxins is very important because it may play a role in the pore-forming process [50]. Therefore, Unigene6213 is likely a new member of this family of jellyfish toxins, even though it does not contain a full-length ORF. Further research is needed to determine the full-length sequence, structure and biological role of this putative toxin in C. capillata.
In addition to the homologues of known jellyfish toxins, we identified a number of toxin-like transcripts similar to some unique venom components isolated from other venomous animals. Two putative toxin transcripts with high similarity to plancitoxin-1-like, a gene closely related to plancitoxin-1 from the crown-of-thorns starfish Acanthaster planci, were also identified (Fig 6). Plancitoxin-1 is one of the few known toxic DNase II proteins and exhibits hepatotoxic properties in Acanthaster planci, the species from which this toxin was first described [56][57]. Additionally, plancitoxin-1 was also previously identified in several nemertean species and the jellyfish S. meleagris [9,58]. A single transcript with a full-length ORF was identified that encoded a peptide exhibiting high identity with venom toxin-like peptide-6 (GenBank accession number ABR21046.1) and venom protein 2 (ABR21036.1), which were both isolated from the venom of the scorpion Mesobuthus eupeus (Fig 7). The sequence of the transcript comprises 71 amino acids and displays 45% and 44% identity, respectively, with these two scorpion toxins. However, the functions of these venom toxins in envenomation remain unknown. Two transcripts similar to Ci-120 and Ci-80a were also characterized (S4 Fig). Ci-120 and Ci-80a are both venom proteins identified from the parasitic wasp Chelonus inanitus [40]. The Ci-120 protein, which may affect proteoglycan metabolism, exhibits high sequence similarity to alpha-N-acetylglucosaminidase. Ci-80a is a member of the papain family, and its role in envenomation remains to be determined. In addition, we identified a transcript encoding a protein similar to venom dipeptidyl peptidase 4 (DPP4) isoform X2 (S5 Fig). DPP4 has been widely identified in snake venoms [59], suggesting an important role in envenomation. Venom DPP4 is involved in the processing of venom peptides and may also play a role in cardiovascular disorders caused by the venom [60][61].
In addition to these toxin-like candidates, some other putative venom components were also identified in the transcriptome of C. capillata tentacle by BLAST searches. Translationally controlled tumor protein (TCTP) has been identified in the gland secretions of ticks and mites [62][63]. TCTP is also called histamine-releasing factor (HRF) and has been reported as a venom toxin in several spider species [36]. In our transcriptome of C. capillata, a transcript exhibiting significant similarity with TCTP (or HRF) was identified (Fig 8). TCTP induces histamine release in basophil leukocytes and is one of the molecules responsible for histamine-associated symptoms. Interestingly, local tissue edema, the main effect of histamine release, always occurs after jellyfish stinging. Therefore, we suggest that TCTP in C. capillata tentacle may play a role in histamine release during jellyfish envenomation.
We also identified a transcript exhibiting significant similarity to various angiotensin-converting enzyme-like (ACE-like) proteins and a transcript similar to endothelin-converting enzyme 1-like (ECE 1-like) proteins (S6 Fig). ACE and ECE are both metalloproteases. ACE plays a role in converting angiotensin I to angiotensin II, which can induce vasoconstriction and elevate blood pressure. ECE is involved in the processing of ET-1, a potent vasoconstrictor, from its inactive precursor. Expression of ACE (or ACE-like) and ECE proteins in the venoms of several species of cone snails and wasps has been reported [40,64]. However, this is the first report of the presence of these molecules in jellyfish. Cardiovascular toxicity is the major bioactivity of jellyfish venoms. We previously demonstrated that C. capillata venom induces marked vasoconstriction [65]. Therefore, the identification of ACE and ECE-1 in C. capillata tentacle strongly suggests that these molecules might contribute to the disruption of cardiovascular function caused by jellyfish venoms.
Additionally, a transcript that exhibited significant sequence similarity to neprilysin (or neprilysin-like protein) was identified (Fig 9). Neprilysin is a zinc-dependent metalloprotease with affinity for a broad range of physiological targets, including natriuretic peptides, vasodilatory peptides and neuropeptides [66]. Notably, neprilysin has been identified in the venoms of several species of spiders and snakes and is likely correlated with neurotoxicity, potentially due to its activity in the inactivation of peptide transmitters and their modulators [66]. Interestingly, jellyfish venom comprises neurotoxins that immediately paralyze prey. We have also observed this neurotoxic activity in C. capillata venom. Therefore, neprilysin (or neprilysin-like protein) may also play a role in C. capillata envenomation, but its precise role requires further study.
We also identified four transcripts exhibiting similarity to the ectonucleotide pyrophosphatase/phosphodiesterase family (Table 3). Phosphodiesterase activity has been described in many snake venoms [67][68]. A phosphodiesterase family member has been identified as a toxin in the jellyfish S. meleagris [9]. However, the contributions of these proteins to the poisoning mechanism are poorly understood. Phosphodiesterases might inhibit ADP-induced platelet aggregation and contribute to hemostatic disturbances. Therefore, the discovery of ectonucleotide pyrophosphatase/phosphodiesterase family transcripts in C. capillata tentacle implies that members of this family might be constituents of jellyfish venom, and their roles in envenomation deserve further investigation.
Sequences with similarity to vascular endothelial growth factors (VEGFs) were also identified (Table 3). VEGFs, which comprise several subclasses, have been described as minor venom constituents in the venom glands of several snake species [28][29]. Among these transcripts, two sequences exhibited similarity to VEGF-A, which has been reported to induce vasodilation and potently increase vascular permeability. It can also promote tachycardia and hypotension and diminish cardiac output [69]. We observed these symptoms when C. capillata venom was injected into rats. Therefore, this family might also play a role in C. capillata envenomation.
In addition to the molecules mentioned above, we identified transcripts for several other possible venom components, including lysosomal acid lipases (LALs), alkaline phosphatase, dipeptidyl peptidase 3 and ectonucleoside triphosphate diphosphohydrolase (Table 3). These proteins have been reported as atypical venom components in some venomous animals [26,67], but none has previously been described in jellyfish. Their contributions to the bioactivities of jellyfish venom require experimental verification.

Transcripts relevant to degenerative diseases
In addition to putative toxin transcripts, disease-related transcripts were also identified (data not shown). Among them, the most surprising were transcripts relevant to three nervous system diseases: Huntington's disease (HD), Alzheimer's disease (AD) and Parkinson's disease (PD). We characterized 476 unigenes involved in the pathway of Huntington's disease, the fourth largest group in the KEGG annotation. HD is an inherited neurodegenerative disease [70]. In this study, transcripts homologous to genes closely related to HD, including Huntingtin, Huntingtin-interacting protein 1 and Huntingtin-interacting protein K, were all identified (S3 Table). Sequence analysis revealed that the Huntingtin genes are highly conserved across species (Fig 10). We also identified two transcripts encoding proteins similar to presenilin-2 (S7 Fig). Mutations in the genes encoding presenilin-1 and presenilin-2 are responsible for early-onset autosomal dominant Alzheimer's disease, the most frequent degenerative dementia among the elderly [71]. There are no effective treatments for HD or AD. Transcripts encoding proteins homologous to Parkinson's disease-related proteins, including Parkinson protein 7 (protein DJ-1) and Parkinson disease 7 domain-containing protein 1, were also identified in the transcriptome of C. capillata tentacle (S3 Table). This is the first description of the expression of these nervous system disease-associated genes in a jellyfish species. Interestingly, in a previous study of the genome of Hydra magnipapillata, genes associated with nervous system diseases, including HD and AD, were also identified [72]. Therefore, these results strongly suggest that degenerative disease-related genes are highly conserved from invertebrates to vertebrates. These findings also suggest the great potential of these marine invertebrates as models in the study of degenerative diseases.

Conclusions
This study contributes to a more comprehensive view of the origin and functional diversity of venom proteins in jellyfish, provides a foundation and valuable resource for further investigations of bioactive components, and promotes the general development of jellyfish resources.