The authors have declared that no competing interests exist.
Conceived and designed the experiments: SDR HSH AWP RSN. Performed the experiments: SDR HSH LDM ATP. Analyzed the data: SDR. Contributed reagents/materials/analysis tools: HSH AWP RSN ATP. Wrote the paper: SDR.
Current address: Department of Biology, University of Utah, Salt Lake City, Utah, United States of America
Current address: The Department of Biochemistry and Molecular Biology, School of Biomedical Sciences, Monash University, Clayton, VIC, Australia
Animal venoms represent a vast library of bioactive peptides and proteins with proven potential, not only as research tools but also as drug leads and therapeutics. This is illustrated clearly by marine cone snails (genus
Animal venoms represent a vast library of bioactive peptides and proteins. This is illustrated elegantly in cone snails (genus
Molecular targets of individual conotoxins are diverse and include a range of voltage-gated ion channels, ligand-gated ion channels, G-protein coupled receptors and neurotransmitter transporters
The epithelial cells lining the duct of a cone snail’s venom gland are rich in messenger RNAs (mRNAs) encoding conotoxins
The primary objective of this study was the large-scale discovery of novel conotoxin sequences from the venom gland of
RNA was extracted from the venom gland of
Assembly with MIRA produced 40,513 contigs (from 463,701 reads longer than 30 nt) with an average length of 588 nt (median: 528 nt), a maximum of 7,406 nt and minimum of 30 nt (user-defined). A general annotation of the transcriptome using BLASTX
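The summary statistics quoted for the assembly (contig count, mean, median, minimum and maximum length) can be recomputed from any FASTA file of contigs. The following is a minimal sketch, not the authors' pipeline; the function names and the toy input are invented for illustration.

```python
from statistics import mean, median


def read_fasta_lengths(lines):
    """Yield the length of each sequence found in an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def contig_summary(lengths):
    """Return count, mean, median, min and max for a collection of lengths."""
    lengths = list(lengths)
    return {
        "count": len(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


# Toy input (hypothetical contigs, not data from this study).
toy_fasta = """>contig_1
ACGTACGTAC
>contig_2
ACGT
ACGTAA
>contig_3
ACG
""".splitlines()

summary = contig_summary(read_fasta_lengths(toy_fasta))
print(summary)
```

For a real assembly, the same functions could be fed an open file handle in place of the toy list of lines.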
While BLASTX was used for a general annotation of the transcriptome, profile hidden Markov models (pHMMs) were used (independently of BLAST) to annotate conotoxins. pHMMs were built based on known conotoxin superfamilies (as described in methods) and used to search the
The
A pHMM was built based on the sequences of known A-superfamily conotoxins and used to search the
*, Vc1.2 precursor
Other A-superfamily peptide precursor sequences identified in the venom gland transcriptome of
Six unique I1-superfamily conotoxins were identified in the venom gland transcriptome of
*, M11.2 mature peptide
Four unique I2-superfamily conotoxins were identified (
Four unique J-superfamily conotoxins were identified in the venom gland transcriptome of
*, pl14a
Several conotoxin sequences from each of the M1, M2 and conomarphin subgroups of the M-superfamily were identified (
Almost all of the M-superfamily sequences identified in
The M_conomarpin_Vc1 and M_conomarpin_Vc2 sequences clearly belong to the cysteine-free conomarphin class of conotoxins, although the predicted mature peptides of each differ substantially from previously identified conomarphins. M_Vc3, along with a sequence recently identified in
The O1-superfamily of conopeptides consists of δ- (which block inactivation of voltage-gated Na+ channels), μ- (voltage-gated Na+ channel blockers), κ- (voltage-gated K+ channel blockers) and ω-conopeptides (voltage-gated Ca2+ channel blockers), all of which share a type VI/VII cysteine framework (C-C-CC-C-C).
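A cysteine framework such as C-C-CC-C-C describes only the arrangement of cysteines in the mature peptide: adjacent cysteines are written together and cysteines separated by other residues are joined with a hyphen. The sketch below derives that pattern from a peptide string. The example sequence is invented for illustration and is not a conotoxin from this study; the lookup table contains only the single framework named in the text.

```python
import re

# Only the framework mentioned in the surrounding text is listed here;
# a complete table would have to come from a curated resource.
KNOWN_FRAMEWORKS = {"C-C-CC-C-C": "VI/VII"}


def cysteine_pattern(mature_peptide: str) -> str:
    """Collapse a peptide sequence to its cysteine pattern.

    Runs of adjacent cysteines are kept together ("CC"); runs separated by
    one or more other residues are joined with "-". A cysteine-free peptide
    gives an empty string.
    """
    runs = re.findall(r"C+", mature_peptide.upper())
    return "-".join(runs)


# Invented toy peptide with six cysteines (hypothetical, for illustration).
toy_peptide = "GCKSAGCSPTDCCSGLCVNGKCY"
pattern = cysteine_pattern(toy_peptide)
framework = KNOWN_FRAMEWORKS.get(pattern, "not listed in this sketch")
print(pattern, framework)
```

Note that the pattern alone ignores loop lengths and disulfide connectivity, so two peptides with the same pattern may still differ substantially.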
Several O1-superfamily sequences have been identified previously in
*, Vc6.1, Vc6.3, Vc6.4, Vc6.6
The remaining O1-superfamily sequences identified were completely novel, although some showed similarity to known ω-, δ-, and μ-conotoxins. Notably, the predicted mature peptide sequence of O1_Vc6.31 was 90% identical to μ-MrVIB, an O1-superfamily conotoxin from
A single cysteine-free sequence (O1_Vc1) from the O1-superfamily may constitute a new class of conotoxin. Close inspection of the sequencing reads encoding this transcript (taking into account contig coverage and read quality) indicated that this unusual sequence was not simply the result of a frameshift due to sequencing error.
Eleven O2 conotoxin precursors were identified previously by cDNA sequencing of the
A pHMM was built based on the sequences of all known O2/contryphan-superfamily conotoxins and used to search the
*, TxVIIA
Contryphans are short single disulfide-containing conotoxins that display a diversity of function but could generally be described as Ca2+ channel modulators
All contryphans identified so far have either Pro/Hyp followed by D-Trp or Val followed by D-Leu at positions one and two of the intercysteine loop. Hyp (or Pro) at position 1 of the disulfide loop appears to be necessary for slow conformational interconversion observed in these peptides
Other than its propeptide sequence and single pair of cysteines, contryphan_Vc1 shares no obvious sequence similarity to contryphan_Vc2, or indeed any other contryphans.
One O3 superfamily precursor was identified in
*, Bromosleeper peptide (GenBank: GQ981406.1)
Three P-superfamily precursor sequences, P_Vc9.1, P_Vc9.2 and P_Vc14.5, were identified in the venom gland transcriptome of
The predicted mature peptide sequence of P_Vc9.2 is 96% identical to GmIXA, a conotoxin from the venom of
The two S-superfamily conotoxins to have undergone pharmacological characterization displayed different activity: GVIIIA competitively inhibited the 5-HT3 serotonin receptor
The precursor sequences of 27 unique T-superfamily conotoxins were identified (
*, Vc5.4
Three of the 27 sequences had been identified previously in
Despite evidence that the T-superfamily is abundant, not only in
A pHMM was constructed based on the sequences of known conantokin precursors and was used to search the
*, Con-Gm
The original con-ikot-ikot was identified and characterized from the venom of the
A recently discovered conotoxin isolated from the venom of
Here we show that con-ikot-ikots are not limited to the fish-hunting species described above. A con-ikot-ikot precursor sequence was identified in
Secretory phospholipase-A2s (sPLA2s) have been reported in a wide variety of animal venoms, as well as mammalian tissues and bacteria. They catalyze the hydrolysis of the ester bond at the
Conodipine-M, a 13.6 kDa component of the venom of
Here we show that conodipines, like other sPLA2s, are encoded by a single precursor consisting of a signal peptide sequence followed by the α-chain, a propeptide linker and finally the β-chain (
Two of the precursors identified display remarkable similarity in their predicted mature peptide region to conodipine-M, including their cysteine framework and catalytic His-Asp dyad. The remaining sequence retains the general precursor structure of conodipine_Vc1 and 2 and the predicted catalytic dyad, but displays not only a unique signal peptide sequence but also a unique cysteine framework. Given its unique signal peptide sequence, this conotoxin could be considered the first member of a new superfamily.
In a previous study, several linear peptides identified in the venom proteome of
Based on alignment of two known B2-superfamily precursor sequences from
The E- and F-superfamilies of conotoxins were recently described from the venom gland transcriptome of
pHMMs were constructed based on each of the known precursor sequences and used to search the
*, Mr104
The precursor sequences of several novel conotoxins clearly belonged to the recently discovered H-superfamily of conotoxins from
A single H-superfamily sequence encoding a cysteine-free predicted mature peptide region was also encountered (H_Vc1), indicating that, like other superfamilies, the H-superfamily is not limited to a single cysteine framework. This unusual sequence probably constitutes a new class of conotoxin. As described above for O1_Vc1, a close inspection of the sequencing reads was performed to confirm that this unusual sequence was not simply the result of a frameshift due to sequencing error.
A recently described third I-superfamily (I3)
Construction of a pHMM based on these sequences enabled the identification of a single I4-superfamily member in the venom gland transcriptome of
Annotation of the
Although the pre- and propeptide sequences clearly differ from known conotoxin superfamilies, the U-superfamily peptides share the cysteine framework (VI/VII) of most members of the O1-, O2- and O3-superfamilies, as well as the H-superfamily. However, on comparison with these superfamilies it is apparent that there is little similarity either in the intercysteine loop composition or length
Discovery of the signal peptide sequence for this superfamily should allow the rapid identification of U-superfamily conopeptides in other
Given the sequence similarity in the mature peptide sequences of U_Vc7.3 and 7.4 to the textile convulsant peptide, it is likely that they share similar biological activity. Despite its potent biological activity, the molecular target of the textile convulsant peptide has not been identified.
While the venoms of
Annotation of the venom gland transcriptome of
Possible initiator codon in frame 2 is underlined in purple and the sequence encoding the predicted mature peptide in frame 1 is underlined in black.
To give a general indication of the relative expression levels of each conotoxin superfamily in the venom gland of
High abundance reads may be under-represented as a result of cDNA library normalization.
Known superfamilies searched for, but not identified in the venom gland transcriptome of
The traditional approach for venom peptide identification has been assay-directed fractionation, followed by isolation and peptide sequencing. This approach is labour-intensive and requires a large amount of venom, which is not always available. The use of targeted PCR amplification of venom duct cDNA increased the speed at which venom peptides could be identified and also reduced the amount of starting material required. Similarly, large-scale cloning of cDNA libraries and Sanger sequencing has also been performed and has successfully generated a large number of novel peptide sequences
One trade-off, however, with this technology is the higher error rate in homopolymer runs (compared with other sequencing platforms). Such errors can result in insertions or deletions, which can introduce frameshifts or amino acid changes in the resulting sequences. For this reason reporting of 454 reads prior to assembly is risky. Higher sequence coverage provided by the assembly process works to reduce sequencing errors, producing more reliable sequences and reducing the likelihood of reporting minor variants and unusual sequences that are simply the result of sequencing error.
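The way coverage suppresses an isolated homopolymer error can be illustrated with a simple column-wise majority vote over reads that are already aligned. This is a deliberately simplified sketch: real assemblers such as MIRA use more elaborate models, and the reads below are invented.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Majority-vote consensus over equal-length, gap-padded reads.

    Columns where the most common symbol is a gap ("-") are dropped, so an
    insertion seen in a minority of reads does not reach the consensus.
    """
    if len({len(read) for read in aligned_reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":
            consensus.append(base)
    return "".join(consensus)


# Hypothetical reads covering ATG AAA CCT TGA; the third read carries an
# extra A in the homopolymer run, which on its own would shift the frame.
reads = [
    "ATGAAA-CCTTGA",
    "ATGAAA-CCTTGA",
    "ATGAAAACCTTGA",
    "ATGAAA-CCTTGA",
]
consensus = column_consensus(reads)
print(consensus, len(consensus) % 3 == 0)
```

With only the erroneous read available, the reported sequence would be 13 nt long and out of frame after the homopolymer; with four-fold coverage the extra base is outvoted.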
Recently, it was demonstrated that pHMMs can be used to classify conotoxins, and it was proposed that pHMMs are highly suitable for identifying conotoxin sequences in large datasets (e.g. transcriptomes)
Most of the conotoxins identified here display little amino acid sequence similarity to conotoxins with a defined molecular target. Moreover, several sequences define new classes of conotoxins and seem likely to display novel activity profiles. While each of the conotoxin precursor sequences described here is unique, several appear to encode mature peptides that are similar, if not identical, to known conotoxins (
Superfamily | Cysteine framework | # identified in | Associated activities | Reference |
A | I | 2 | nAChR inhibitors, GABAB receptor agonists, α1-adrenoceptor inhibitor | |
A | XXII | 1 | N.D. | |
Conantokin (B) | Cysteine-free | 1 | NMDA receptor inhibitors | |
B2 | Cysteine-free | 1 | N.D. | |
E | N.D. | 1 | N.D. | |
F | N.D. | 1 | N.D. | |
H | VI/VII | 2 | N.D. | |
H | Cysteine-free | 1 | N.D. | |
I1 | XI | 6 | Voltage-gated Na+ channel agonists | |
I2 | XI | 4 | K+ channel modulators | |
I4 | XII | 1 | N.D. | |
J | XIV | 4 | Neuronal and neuromuscular nAChR inhibitor and voltage-gated K+ channel inhibitor | |
M1 (M) | III | 4 | Excitatory symptoms in mice (IC), voltage-gated Na+ channel agonist | |
M2 (M) | III | 6 | Excitatory symptoms in mice (IC) | |
Conomarphin (M) | Cysteine-free | 2 | N.D. | |
(M) | Single disulfide | 1 | N.D. | |
O1 | VI/VII | 20 | Voltage-gated Na+ channel agonists, voltage-gated K+ channel blockers, voltage-gated Na+ channel blockers or voltage-gated Ca2+ channel blockers | |
O1 | Cysteine-free | 1 | N.D. | |
O2 | VI/VII | 18 | Neuronal pacemaker modulators | |
Contryphan (O2) | Single disulfide | 2 | Ca2+ channel modulators | |
O3 | Cysteine-free | 1 | N.D. | |
P | IX | 2 | Hyperactivity and spasticity in mice (IC) | |
P | XIV | 1 | N.D. | |
S | VIII | 1 | 5-HT3 receptor inhibitor, nAChR inhibitor | |
T | V | 24 | Voltage-gated Na+ channel inhibitor, presynaptic Ca2+ channel inhibitor (or GPCR modulator), sst3 GPCR antagonist | |
T | XIII | 1 | N.D. | |
T | X | 1 | Noradrenaline transporter inhibitors | |
U | VI/VII | 2 | Convulsions, stretching of limbs and jerking behavior in mice (IC) | |
Con-ikot-ikot | XXI | 1 | AMPA receptor modulator | |
Conodipine | | 3 | Phospholipase-A2 | |
Each conotoxin superfamily is divided into groups according to cysteine framework, with the number identified in
AMPA, α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid; GABA, γ-aminobutyric acid; GPCR, G protein-coupled receptor; IC, intracranial injection; nAChR, nicotinic acetylcholine receptor; N.D., not determined; NMDA, N-Methyl-D-aspartate; sst, somatostatin.
The naming of conotoxin precursors identified in this study was undertaken according to the conventional conotoxin nomenclature (where species is represented by one or two letters, cysteine framework by an Arabic numeral and, following a decimal, order of discovery by a second numeral)
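The naming convention can be made concrete with a small parser. The sketch below assumes the form seen in names used in this text, such as A_Vc22.1 or Vc6.1: an optional superfamily prefix followed by an underscore, a one- or two-letter species code, the framework number, a decimal point and the order of discovery. Treating the prefix as optional, and returning None for names without a framework number (for example O1_Vc1), are choices made for this illustration.

```python
import re

# Optional "<superfamily>_" prefix, species code (capital letter plus an
# optional lowercase letter), framework number, ".", order of discovery.
NAME_PATTERN = re.compile(
    r"^(?:(?P<superfamily>[A-Za-z0-9]+)_)?"
    r"(?P<species>[A-Z][a-z]?)"
    r"(?P<framework>\d+)\.(?P<order>\d+)$"
)


def parse_conotoxin_name(name: str):
    """Split a conventional conotoxin name into its parts.

    Returns a dict, or None when the name does not follow the
    species/framework/order convention.
    """
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return {
        "superfamily": match.group("superfamily"),
        "species": match.group("species"),
        "framework": int(match.group("framework")),
        "order": int(match.group("order")),
    }


for example in ("A_Vc22.1", "P_Vc14.5", "Vc6.1", "O1_Vc1"):
    print(example, parse_conotoxin_name(example))
```

The framework is returned as an Arabic numeral, matching the naming convention; converting it to the Roman numerals used for cysteine frameworks elsewhere is left out of this sketch.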
Two of the conotoxins identified here (A_Vc22.1 and P_Vc14.5) displayed cysteine frameworks not previously associated with their particular superfamily. In the case of P_Vc14.5, comparison with the primary structures of framework IX P-superfamily conotoxins suggests that this change may only be subtle. However, A_Vc22.1 is not at all similar to other A-superfamily conotoxins and could therefore be expected to display a unique activity profile. Cysteine-poor conotoxins were identified in several of the traditionally cysteine-rich superfamilies (M, O1, O2, O3, and H). Other than the conomarphins and contryphans, these sequences probably represent new conotoxin classes. A con-ikot-ikot conotoxin, previously limited to piscivorous species of
Several of the relatively uncharacterized conotoxin superfamilies were observed at high abundance in the venom gland transcriptome of
The goal of future studies utilizing the information presented here will be the functional characterization of the peptide products of new conotoxin sequences. The first step will be to determine the mature peptide(s) corresponding to each precursor sequence. While many mature peptide sequences and post-translational modifications can be predicted directly from a precursor sequence, some will require a more thorough examination of the venom of
Given the history of the small number of conotoxins so far characterized, we predict that components discovered in this work have the potential to become valuable research tools, if not drug leads or therapeutics. This study illustrates the arsenal of molecular weapons present in the venom gland of a single species of cone snail. Furthermore, it highlights the wonderful molecular resource that is animal venom.
Specimens of
cDNA library preparation, normalization and sequencing were performed by Eurofins, MWG Operon (Budendorf, GER). From the total RNA sample, poly(A)+ RNA was isolated and used for cDNA synthesis. An N6 randomized primer was used for first strand cDNA synthesis. 454 adapters A and B were then ligated to the 5′ and 3′ ends of the cDNA, respectively. The cDNA was finally amplified by PCR (11 cycles).
Normalization was carried out by one cycle of denaturation and re-association of the cDNA. Re-associated double-stranded cDNA was separated from the remaining single-stranded cDNA (normalized cDNA) by passing the mixture over a hydroxylapatite column. After hydroxylapatite chromatography, the single-stranded cDNA was PCR amplified (8 cycles). cDNA in the size range of 500–1100 nt was eluted from a preparative agarose gel for sequencing. 454 sequencing was performed using GS FLX+ chemistry.
During the assembly process, single reads are aligned with each other to form contigs (contiguous consensus sequences). All reads were initially trimmed to remove primer and barcode sequences. Reads were then cleaned using prinseq-lite-0.17.1
For a general annotation of the transcriptome we utilized BLAST+ (version 2.2.27+)
All conotoxin sequences available from ConoServer were downloaded and grouped according to superfamily (classification provided by ConoServer). Any identical sequences were removed. Full-length precursor sequences were used where available, but for superfamilies with less sequence information all available sequences were used.
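The grouping and de-duplication step described above could be sketched as follows. The record layout (superfamily, name, sequence) and the toy entries are assumptions made for this illustration; they do not reflect the ConoServer download format or real conotoxin sequences.

```python
from collections import defaultdict


def group_unique_by_superfamily(records):
    """Group (superfamily, name, sequence) records and drop identical sequences.

    Within each superfamily the first name seen for a given sequence is kept;
    sequences are compared after stripping whitespace and upper-casing.
    """
    grouped = defaultdict(dict)
    for superfamily, name, sequence in records:
        normalized = sequence.strip().upper()
        grouped[superfamily].setdefault(normalized, name)
    return {
        superfamily: [(name, seq) for seq, name in sequences.items()]
        for superfamily, sequences in grouped.items()
    }


# Invented placeholder records (not real conotoxin precursors).
toy_records = [
    ("A", "toy_a1", "MKLTC"),
    ("A", "toy_a2", "mkltc "),  # same sequence as toy_a1 after normalization
    ("A", "toy_a3", "MKLSC"),
    ("T", "toy_t1", "MRCLP"),
]
unique_sets = group_unique_by_superfamily(toy_records)
print(unique_sets)
```

Each per-superfamily list could then be written out as a FASTA file and aligned before a profile is built from it.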
Using the hmmbuild tool from the HMMER 3.0 package a single pHMM was built for each superfamily. The hmmsearch tool was then applied to the
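As a rough illustration of the build-then-search workflow, the sketch below assembles the two HMMER command lines for one superfamily without running them. The directory layout, file extensions and the helper name are hypothetical, and the sequence database is assumed to be a protein FASTA file; the exact options used in the study are not stated here. The `--tblout` option, which writes a tabular per-target summary, is the only option included.

```python
from pathlib import Path


def hmmer_commands(superfamily, alignment_dir, profile_dir, seqdb, hits_dir):
    """Return (hmmbuild_cmd, hmmsearch_cmd) as argument lists.

    hmmbuild takes the output profile followed by the input alignment;
    hmmsearch takes the profile followed by the sequence database.
    All paths are hypothetical placeholders.
    """
    alignment = Path(alignment_dir) / f"{superfamily}.afa"
    profile = Path(profile_dir) / f"{superfamily}.hmm"
    table = Path(hits_dir) / f"{superfamily}.tbl"
    build = ["hmmbuild", str(profile), str(alignment)]
    search = ["hmmsearch", "--tblout", str(table), str(profile), str(seqdb)]
    return build, search


build_cmd, search_cmd = hmmer_commands(
    "A", "alignments", "profiles", "translated_contigs.fasta", "hits"
)
print(" ".join(build_cmd))
print(" ".join(search_cmd))
```

With HMMER installed, each list could be passed to `subprocess.run(cmd, check=True)`, once per superfamily.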
All sequence alignments were performed with MAFFT version 7 using the L-INS-i method
Conotoxin prepropeptide sequences from this Transcriptome Shotgun Assembly project have been deposited at DDBJ/EMBL/GenBank [accession: GAIH00000000]. The version described in this paper is the first version, GAIH01000000. Raw sequencing data has been deposited in the NCBI sequence read archive [SRA accession: SRR833564].
Specimens of
We thank Johan Pas for specimen collection and Dr Shayne Bellingham for technical assistance with RNA quantification.