Despite the economic and environmental impacts that sea lice infestations have on salmon farming worldwide, genomic data generated by high-throughput transcriptome sequencing for different developmental stages, sexes, and strains of sea lice are still limited or unavailable. In this study, RNA-seq analysis was performed using a de novo transcriptome assembly as a reference to characterize transcriptional changes across six developmental stages of the salmon louse Caligus rogercresseyi. EST datasets were generated from the nauplius I, nauplius II, copepodid and chalimus stages and from female and male adults using MiSeq Illumina sequencing. A total of 151,788,682 reads were obtained, which were assembled into 83,444 high-quality contigs and subsequently annotated into roughly 24,000 genes based on known proteins. To identify differential transcription patterns among salmon louse stages, cluster analyses were performed using normalized gene expression values. Herein, four clusters were differentially expressed between nauplius I–II and copepodid stages (604 transcripts), five clusters between copepodid and chalimus stages (2,426 transcripts), and six clusters between female and male adults (2,478 transcripts). Gene ontology analysis revealed that the nauplius I–II, copepodid and chalimus stages are mainly annotated to amino acid transfer/repair/breakdown, metabolism, molting cycle, and nervous system development. Additionally, genes showing differential transcription in female and male adults were highly related to cytoskeletal and contractile elements, reproduction, cell development, morphogenesis, and transcription-translation processes. The data presented in this study provide the most comprehensive transcriptome resource available for C. rogercresseyi, which can be used for future genomic studies linked to host-parasite interactions.
Citation: Gallardo-Escárate C, Valenzuela-Muñoz V, Nuñez-Acuña G (2014) RNA-Seq Analysis Using De Novo Transcriptome Assembly as a Reference for the Salmon Louse Caligus rogercresseyi. PLoS ONE 9(4): e92239. https://doi.org/10.1371/journal.pone.0092239
Editor: Cynthia Gibas, University of North Carolina at Charlotte, United States of America
Received: October 31, 2013; Accepted: February 19, 2014; Published: April 1, 2014
Copyright: © 2014 Gallardo-Escárate et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was supported by the FONDAP (15110027) project granted by CONICYT-Chile. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Major challenges facing transcriptomic research in non-model organisms are increasing the speed and accuracy of discovering new genes and metabolic pathways, as well as determining how gene transcription variations are regulated by specific DNA polymorphisms. Understanding the transcriptome is essential for interpreting the functional elements of the genome, for revealing the molecular constituents of cells and tissues, and for understanding complex biological processes such as growth, reproduction, and immune response. Next-generation sequencing (NGS) technologies offer the opportunity to generate genome-wide sequence data sets for a reasonable cost and time. Although these powerful and rapidly evolving technologies have only been available for a few years, they are already making substantial contributions to the understanding of genome expression and regulation under different conditions. A popular application of NGS is species transcriptome generation, which affords direct access to the coding sequences of many genes and information on their relative expression levels. NGS transcriptome data analysis is therefore a useful source for mining molecular markers such as SNPs and EST-SSRs.
Salmon lice are naturally occurring parasites of seawater salmon, and, compared to natural conditions, parasite infection and transmission are exacerbated under intensive fish farming. The salmon louse Caligus rogercresseyi is the main copepod ectoparasite responsible for significant economic losses in the farmed salmon industry in Chile. This parasite is known to cause surface damage to fish, which results in mucus breakdown and in turn leads to open sores and lesions. A further problem may arise if fish become stressed due to the presence of sea lice. It has been observed that chronic stress in fish may result in immunosuppression and a subsequent increased susceptibility to secondary infections. Moreover, salmon lice infestations have been managed with antiparasitic agents including organophosphates, pyrethroids, hydrogen peroxide, and avermectins. However, overexposure to these chemical agents tends to promote drug resistance in wild populations of parasites.
These concerns, added to the scarce genomic knowledge of the molecular pathways affected by salmon lice treatments in C. rogercresseyi, provide incentive for the scientific community to increase sequencing efforts in order to identify novel candidate genes that could be related to drug resistance and susceptibility. So far, EST datasets have been reported for a few copepod ectoparasites . As of October 2013, the NCBI EST-database retrieved 191,020 EST entries for parasitic copepods; a result that was comprised of 129,250 ESTs for Lepeophtheirus salmonis, 32,037 ESTs for Caligus rogercresseyi, 14,927 ESTs for Lernaeocera branchialis, and 14,806 ESTs for Caligus clemensi. These entries have provided a substantial number of sequences that are similar to already reported genes, but a large proportion do not show EST hits with known proteins. Whole transcriptome shotgun sequencing, or RNA sequencing (RNA-seq), tools allow for expression analysis in organisms without previously sequenced genomes, such as marine invertebrates where the majority of species do not have reference genomes available.
In this study, RNA-seq analysis was performed using de novo transcriptome assembly as a reference for the salmon louse C. rogercresseyi. The goal of this study was to produce whole transcriptome sequences, which would provide pivotal genomic knowledge on the processes involved in the life cycle of the salmon louse. In total, 83,444 transcripts were identified in association with all major signaling pathways and developmental processes of C. rogercresseyi. RNA-seq cluster analysis using the MiSeq Illumina platform between larval stages (nauplius I, nauplius II, copepodid and chalimus) and adult individuals (female and male) evidenced a wide diversity of candidate genes related to ontogenetic development, immune response, stress, drug resistance, the nervous system, and reproduction.
Materials and Methods
Salmon lice culturing
Female specimens of C. rogercresseyi were collected from recently harvested fish at a salmon farm in Puerto Montt, in the south of Chile. Individuals were transported back to the laboratory on ice, and their egg strings were then removed and placed in culture buckets supplied with seawater flow at 12°C and with gentle aeration. Eggs were allowed to hatch and develop until the infectious copepodid stage. These were then used to inoculate a tank containing host fish according to Bravo. Prior to the collection of salmon lice, fish were anaesthetized. Salmon lice were then harvested for RNA extraction and cDNA library construction. All laboratory infections and culture procedures were carried out under guidelines approved by the ethics committee of the University of Concepción and under appropriate veterinary supervision.
The life cycle of C. rogercresseyi comprises eight developmental stages: nauplius 1–2, copepodid, chalimus 1–4 and adult. Herein, twenty individuals from each instar of C. rogercresseyi were separately collected. In the case of the chalimus stage, samples from instars 3–4 were collected. Immediately after sampling, the individuals of each stage were pooled into two biological replicates in 1 mL of RNAlater stabilization solution (Ambion®, USA) and stored at −80°C. Total RNA was extracted from pools using the Ribopure™ kit (Ambion®, Life Technologies™, USA) following the manufacturer's instructions. The concentration and purity were measured with a spectrophotometer (ND-1000, Nanodrop Technologies), and the integrity was visualized by electrophoresis in 1.2% MOPS/formaldehyde agarose gels stained with 0.001% ethidium bromide. RNA was also checked for quality on the Bioanalyzer TapeStation 2200 (Agilent Technologies Inc., USA) using the R6K reagent kit according to the manufacturer's instructions. RNA extracts that presented 260/280 and 260/230 purity indices equal to or greater than 2.0 and intact RNA in electrophoresis and Bioanalyzer measurements (RIN>8) were selected. Subsequently, mRNA pools were precipitated overnight with 2× volume of absolute ethanol and 0.1× volume of 0.3 M sodium acetate at −80°C for cDNA library construction. Following this, double-stranded cDNA libraries were constructed using the TruSeq RNA Sample Preparation kit v2 (Illumina®, USA). Two biological replicates for each developmental stage were separately sequenced on the MiSeq (Illumina®) platform using sequencing runs of 2×250 paired-end reads at the Laboratory of Biotechnology and Aquatic Genomics, Interdisciplinary Center for Aquaculture Research (INCAR), University of Concepción, Chile.
The cleaned short read sequences were deposited in the Sequence Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra) under the accession number SRR1106551. The de novo assembly sequence data are available from the corresponding author on request.
De novo transcriptome assembly
The raw data for each pool of samples were separately trimmed and then de novo assembled into a single file using the CLC Genomics Workbench software (Version 6.0.1, CLC Bio, Denmark). The overlap settings for this assembly were a mismatch cost of 2, an insert cost of 3, a minimum contig length of 200 base pairs (bp), a similarity of 0.8, and a trimming quality score of 0.05. This assembly yielded 83,444 contigs that were annotated according to Gene Ontology terms with the Blast2GO program, which was executed as a CLC plugin by mapping against the UniprotKB/Swiss-Prot database (http://uniprot.org) with a cutoff E-value of 1E-05. Furthermore, to determine putative gene descriptions, homology searches were carried out by querying the NCBI EST-database using the tBLASTx algorithm. Finally, the assembled sequences were compared to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. KEGG pathways were assigned to the assembled sequences using the KEGG Automatic Annotation Server (KAAS). The bidirectional best hit (BBH) method was used to obtain KEGG Orthology assignments for each developmental stage of the salmon louse.
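As an illustration of the E-value cutoff described above, the following minimal sketch keeps the best hit per contig from a BLAST result table. It assumes the standard 12-column tabular BLAST output (query ID in the first column, subject ID in the second, E-value in the eleventh); the contig and protein identifiers are hypothetical, and the sketch is not the Blast2GO/CLC workflow used in this study.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {contig_id: (subject_id, evalue)} with the lowest E-value hit per contig."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue  # hit is not significant at the chosen cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hits: contig_1 has two significant hits, contig_2 has none.
example = (
    "contig_1\tsp|P1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "contig_1\tsp|P2\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-08\t60\n"
    "contig_2\tsp|P3\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25\n"
)
hits = best_hits(example)
annotated_fraction = len(hits) / 2  # two contigs were queried in this toy example
```

In this toy input only contig_1 is retained, so half of the queried contigs would count as annotated.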
Differential gene expression analysis and clustering
The consensus contigs generated by de novo assembly in the previous step were used as a reference for RNA-seq expression analysis. Using the CLC Genomics Workbench software, the reads for each biological replicate were separately mapped against the 83,444 contigs. The RNA-seq settings were a minimum length fraction of 0.6 and a minimum similarity fraction (long reads) of 0.5. Then the number of reads per kilobase per million mapped reads (RPKM) was obtained with the same software. This normalized the number of reads to the size of the assembled contigs and allowed for assessing the transcripts that were overexpressed among different groups. In order to identify differences between developmental stages, RNA-seq analyses were performed for the nauplius I, nauplius II, copepodid and chalimus stages, and for female and male adults. Following this, the transcripts that were differentially expressed according to normalized expression values were visualized in a clustering heat map and selected according to the identified cluster. For an optimal comparison of the results, k-means clustering was performed to identify candidate genes involved in specific gene expression patterns. The distance metric was calculated with the Manhattan method, where the mean expression level was subtracted, in 5–6 rounds of k-means clustering. Finally, a Volcano plot and Kal's statistical test were used to compare gene expression levels for larval stages and adults in terms of the log2 fold change (P<0.0005, FDR corrected).
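The RPKM normalization and the log2 fold change mentioned above can be sketched as follows. The read counts, contig length, and pseudocount are illustrative assumptions, not values from this study, and the functions are not the CLC implementation.

```python
import math


def rpkm(read_count, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads."""
    return read_count * 1e9 / (contig_length_bp * total_mapped_reads)


def log2_fold_change(value_a, value_b, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((value_a + pseudocount) / (value_b + pseudocount))


# Hypothetical contig: 500 reads mapped to a 1,000 bp contig in a library of 10 M mapped reads.
example_rpkm = rpkm(500, 1000, 10_000_000)    # 50.0
example_lfc = log2_fold_change(3.0, 1.0)      # log2(4 / 2) = 1.0
```

Dividing by contig length and library size makes values comparable between long and short contigs and between libraries of different depth, which is what allows expression to be compared across stages.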
Validation by qRT-PCR
Nine genes were chosen for the confirmation of differentially expressed genes by qRT-PCR in the six studied developmental stages. Herein, specific primers were designed for acetoacetyl-CoA synthetase, flotillin, allatostatin precursor protein, tropomyosin, putative cuticle protein, vitellogenin 1, vitellogenin 2, argonaute 1 isoform C and the vasa gene (Table S1). The qPCR runs were performed with StepOnePlus™ (Applied Biosystems, Life Technologies, USA) using the comparative ΔCt method. Each reaction was conducted in a volume of 10 µL using the Maxima® SYBR Green/ROX qPCR Master Mix (Thermo Scientific, USA). The amplification conditions were as follows: 95°C for 10 min, followed by 40 cycles of 95°C for 30 s, 60°C for 30 s, and 72°C for 30 s. Three putative housekeeping genes (HKG), elongation factor 1-alpha, β-actin and β-tubulin, were statistically analyzed with the NormFinder algorithm to assess their expression stability. Here, β-tubulin was selected as the HKG for gene normalization.
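For reference, the comparative ΔCt calculation can be sketched as below. The sketch assumes 100% amplification efficiency (a doubling per cycle), and the Ct values are hypothetical rather than measurements from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to the reference gene: 2^-(Ct_target - Ct_reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


# Hypothetical Ct values: target gene at cycle 24, reference gene (e.g. beta-tubulin) at cycle 20.
example_expression = relative_expression(24.0, 20.0)  # 2^-4 = 0.0625
```

A target that crosses the threshold four cycles later than the reference is therefore estimated at one sixteenth of the reference level.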
Sequencing analysis and assembly from C. rogercresseyi transcriptome
Six types of cDNA samples, which represented different developmental stages and adult tissues of C. rogercresseyi, were prepared and sequenced using the MiSeq Illumina platform. The sequencing runs yielded a total of 154.84 M reads with an average length of 171 bp. The CLC Genomics Workbench software was used with default parameters to screen for adapter sequences and eliminate poor quality reads. After quality trimming and removal of adapter sequences, 151.78 M reads, representing about 98% of the raw reads, remained in the dataset. Of these, 132.5 M reads (about 87%) wholly or partially assembled into contigs, and 19.26 M reads remained singletons. The remaining reads were excluded from further analyses. The high-throughput sequencing performed for each developmental stage showed similar numbers of reads and similar average lengths. Interestingly, the number of singletons did not show major differences between larval and adult stages. The number of nucleotides generated from the C. rogercresseyi transcriptome using Illumina technology was up to 25.9 Gigabases (Table 1). De novo assembly yielded 83,444 contigs with an average length of 819 bp, of which 58,320 contigs had a length between 300 and 2,000 bp and 25,124 contigs were longer than 2,000 bp. The average coverage among the contigs was 351.1 reads/bp, suggesting that every base pair in the salmon louse transcriptome was sequenced about 350 times on average. The contigs yielded from the de novo assembly performed for each developmental stage ranged from 29,887 in nauplius I to 50,174 in copepodids, with an average length of 823 bp (Table 1). The sequencing results showed low variation between the biological replicates for each stage. For instance, the average coverage did not show significant differences among replicates (data not shown).
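The sequencing yield reported above can be cross-checked with simple arithmetic. The sketch below uses only the rounded figures given in the text (reads in millions, average read length in bp), so the results are approximate.

```python
# Rounded figures taken from the text.
trimmed_reads_m = 151.78    # reads remaining after trimming, in millions
assembled_reads_m = 132.5   # reads assembled into contigs, in millions
singleton_reads_m = 19.26   # reads remaining as singletons, in millions
mean_read_length_bp = 171   # average read length

# Total sequence yield in gigabases: millions of reads x bp per read / 1,000.
total_gigabases = trimmed_reads_m * mean_read_length_bp / 1000

# Assembled reads plus singletons should account for nearly all trimmed reads.
accounted_reads_m = assembled_reads_m + singleton_reads_m
```

With these inputs the yield is roughly 25.95 Gigabases, in line with the reported 25.9 Gigabases, and the assembled reads and singletons together account for about 151.76 M of the 151.78 M trimmed reads.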
Transcriptome annotation from C. rogercresseyi
Using the BLASTx program, sequence similarity searches of the SwissProt and NR Protein databases showed that 23,841 contigs (28.6% of total contigs) had significant blast matches with E-values≤1e−5, making them an annotatable gene set (Table S2). The most abundant BLAST hits were associated with arthropod species such as Daphnia pulex (13.4%), Lepeophtheirus salmonis (7.9%), Caligus rogercresseyi (1.9%), Caligus clemensi (1.6%), Litopenaeus vannamei (0.7%), and other crustaceans (10%) like Calanus finmarchicus, Artemia franciscana, and Penaeus monodon, among others. However, the largest share of hits in the species distribution was associated with unknown species (64.4%).
Gene Ontology analysis was carried out to explore and summarize the functional categories of the genes sequenced in this study. Among the 83,444 assembled contigs, 15,314 were assigned to biological processes (27%), molecular functions (34%), and cellular components (39%). Within these three main categories, genes annotated for translation, the nuclear-transcribed mRNA process, viral transcription, egg hatching, larval development, the response to drugs, protein binding, metal ion binding, and cytoplasm were the most abundant (Fig. 1). Important cell processes related to early development were also represented, such as genes involved in cell motion, cell proliferation, cuticle formation, myogenesis, and locomotion.
Selected GO categories are shown with the top-level division of biological process, molecular function, and cellular component.
Final functional classification and pathway assignment were performed using bi-directional BLAST with an E-value of 1e−3 against the KEGG database. Of these sequences, 16,213 had significant matches in the database. Among the matched sequences, metabolic pathways, such as carbon metabolism, the biosynthesis of amino acids, oxidative phosphorylation, glycolysis, the citrate cycle, and lipid metabolism, were well represented in C. rogercresseyi sequences. Given the important roles of lipids in the copepod life cycle, especially during ecdysis, greater attention was placed on lipid metabolism. Genes were found in several pathways involved in fatty acid biosynthesis, such as fatty acid elongation, steroid biosynthesis, and ether lipid metabolism. Furthermore, genes related to nervous system development were highly annotated to signaling pathways such as the neuroactive ligand-receptor interaction, the GABAergic and glutamatergic synapse, axon guidance, and the cholinergic synapse. Interestingly, immune response genes were found associated with the NF-kappa B signaling pathway, the TNF signaling pathway, and the Toll-like receptor pathway, among others.
Differentially expressed genes among developmental stages of C. rogercresseyi
In addition to obtaining gene annotations for the salmon louse, another major aim of the present transcriptomic study was to analyze the overall gene expression profile in order to identify genes participating in pivotal biological process and molecular functions related to the developmental stages, especially for larval stages and adult individuals. After de novo assembly, the contigs that showed matching reads for all samples were sorted to generate a gene reference dataset. Then, gene expression data was normalized for six RNA-seq experiments so as to separately compare the expression levels between larval stages, and female and male adult individuals. This approach was applied as the most critical physiological changes in the ontogeny of parasite copepods occur during the free-swimming (nauplius, copepodid), larval settlement (chalimus), and mature female and male adult phases –.
Cluster analysis was conducted for 83,444 genes and showed differential transcription expression values among the analyzed developmental stages. The overall expression profiles are displayed in Figure 2. Clustering of the profiles from the six stages showed an increasing expression ratio (log2) from the nauplius I to adult stages at about a 5-fold change (Fig. 2A). However, high-resolution analysis of transcription patterns among the salmon lice stages revealed specific upregulated or downregulated gene clusters from the nauplius to adult stages. Herein, transcription activity was found associated with gene clusters showing up-regulation from the nauplius stage to the last developmental stages, as well as down-regulation from early larval stages to male adults (Fig. 2B and C, respectively). Furthermore, k-means clustering with the Manhattan distance was used to identify clusters of candidate genes involved in specific gene expression patterns (Fig. 3). Through this, four clusters were found to be differentially expressed between nauplius I–II and copepodid, where 604 transcripts (Fig. 4) were mainly upregulated in the copepodid stage (Cluster 4) and the nauplius I–II stages (Cluster 1) (Table 2). It is important to note that no significant expression differences between nauplius I and nauplius II were observed. Therefore, for further analysis the two larval instars were treated as a single nauplius I–II stage. In addition, five clusters were differentially expressed between the copepodid and chalimus stages, where 2,426 transcripts (Fig. 5) were mainly associated with the chalimus stage (Clusters 3, 4, and 5) as compared to copepodids (Clusters 1 and 2). The greatest differences in transcription expression were found in 271 putative genes that comprised Cluster 4 (Table 2). In addition, genes from female and male adults that showed differential transcription were grouped into six clusters containing 2,478 transcripts (Fig. 6).
Interestingly, half of the clusters that showed differential transcription activity were upregulated in females (Clusters 2, 3, and 5) and the other half in males (Clusters 1, 4, and 6). Two clusters (3 and 5) linked to female gene expression displayed the highest RPKM values of the analyzed clusters (Table 2).
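The clustering idea can be illustrated with a minimal sketch: each expression profile is mean-centered, assigned to the nearest centroid by Manhattan distance, and centroids are updated as member means over a fixed number of rounds. The deterministic initialization (first k profiles), the toy profiles, and the function names are assumptions for illustration; this is not the CLC Genomics Workbench implementation.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two expression profiles."""
    return sum(abs(p - q) for p, q in zip(a, b))


def center(profile):
    """Subtract the mean expression level from a profile."""
    mean = sum(profile) / len(profile)
    return [value - mean for value in profile]


def kmeans_manhattan(profiles, k, n_rounds=6):
    """Assign each profile to one of k clusters; returns a list of cluster labels."""
    data = [center(p) for p in profiles]
    centroids = [list(p) for p in data[:k]]  # assumption: start from the first k profiles
    labels = [0] * len(data)
    for _ in range(n_rounds):
        # Assignment step: nearest centroid by Manhattan distance.
        labels = [min(range(k), key=lambda j: manhattan(p, centroids[j])) for p in data]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy profiles across three stages: two rise toward the last stage, two fall.
toy_profiles = [[1, 2, 8], [8, 2, 1], [2, 3, 9], [9, 3, 2]]
toy_labels = kmeans_manhattan(toy_profiles, k=2)
```

Because the mean is subtracted first, profiles are grouped by the shape of their expression pattern across stages rather than by absolute expression level.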
(A) Overview of log2 expression ratios of all transcripts differentially expressed from nauplius I, nauplius II, copepodid, chalimus, adult females and adult males. (B and C) Two patterns of expression were detected by K-means algorithm using transformed expression values.
Dendrograms of the transcription patterns were estimated for 83,444 contigs generated by de novo assembling. The bar color reflects the gene expression levels.
Dendrograms of the transcription patterns were estimated for 83,444 contigs generated by de novo assembling. The bar color reflects the gene expression level from black (low), red (medium) to yellow (high). Contig annotations of these 4 clusters are listed in Table S2.
Dendrograms of the transcription patterns were estimated for 83,444 contigs generated by de novo assembling. The bar color reflects the gene expression level from black (low), red (medium) to yellow (high). Contig annotations of these 5 clusters are listed in Table S3.
Dendrograms of the transcription patterns were estimated for 83,444 contigs generated by de novo assembling. The bar color reflects the gene expression level from black (low), red (medium) to yellow (high). Contig annotations of these 6 clusters are listed in Table S4.
With regard to gene annotation, relevant genes were identified through transcriptome cluster analysis between the nauplius I–II and copepodid stages. For instance, genes related to mitochondrial metabolism and to the molting cycle were mainly associated with clusters 1, 2 and 3. In contrast, cluster 4 showed a wide diversity of proteins, including an important number of hypothetical proteins annotated for Lepeophtheirus salmonis and Daphnia pulex. Moreover, for the copepodid and chalimus stages of C. rogercresseyi, clusters 1, 3, and 5 were comprised of genes related to nervous system development, such as the neuronal acetylcholine receptor subunit alpha-3, Cerebellin-3, High-affinity choline transporter, and GABA-alpha subunit. Clusters 2 and 4 were mainly annotated to genes associated with cuticle and contractile elements, such as the cuticle protein, ferritin, myosin and actin, and with some genes related to the immune response, such as akirin, agglutinin isolectin, E-selectin, peroxinectin, and cathepsin. In addition, clustering analysis between female and male adults revealed a major diversity of identified genes. Some genes were linked to the morphogenesis process and cellular proliferation, such as the cuticle protein 6, gamma-crystallin A, hemicentin protein, calreticulin, and vasa gene (Clusters 1–3). Genes involved in the processes of gametogenesis and reproduction, such as the proliferation-associated protein, insulin-like growth factor-binding protein, nuclear sperm protein, vitellogenin, and estradiol 17-beta-dehydrogenase, were also annotated (Clusters 4–6). A detailed list of relevant, identified genes is shown in the supplementary material (Table S3–S5 in File S2).
In order to identify the genes highly expressed between copepodid and chalimus stages, and between female and male adults, statistical analyses visualized on a Volcano plot were performed to evaluate fold change values (Fig. S1 in File S1). From this, genes associated with Gene Ontology terms such as amino acid transfer, repair and breakdown, metabolism, and nervous system development were identified for the nauplius I–II, copepodid and chalimus stages. An upregulation specific to the copepodid stage was observed for the genes metalloproteinase, arginine kinase, E-selectin, L-selectin, tropomyosin, cuticle proteins, flotillin, allatostatin, and opsin, among others. For the chalimus stage, the genes trypsin, alpha amylase, carboxypeptidase, bleomycin, gamma-crystallin A, nanos homolog, and vitellogenin were upregulated (Table 3 and 4). With regard to female and male adults, most candidate genes were related to cytoskeletal and contractile elements, reproduction, cell development, morphogenesis, and the transcription and translation process. For female adults, hemicentin, TSP1-containing protein, vitellogenin, homeobox, vasa, argonaute, and several transcription factors were upregulated. For salmon louse males, actin, troponin, myosin, cuticle protein, brain-specific angiogenesis inhibitor and sperm proteins such as nuclear autoantigenic sperm protein, motile sperm domain-containing protein 1 and peroxisomal N1-acetyl-spermine/spermidine oxidase were mainly upregulated (Table 5). Finally, statistical analysis showed 63, 166 and 114 hypothetical proteins and unannotated contigs up/down-regulated from nauplius I–II/copepodid, copepodid/chalimus and female/male adults, respectively (Fig. S2 in File S1). A detailed list of hypothetical proteins and unannotated contigs is shown in the supplementary material (Table S6–S8 in File S3).
Overall, the gene expression patterns revealed through the developmental stages of C. rogercresseyi suggest smaller changes in transcription activity between the nauplius I, nauplius II and copepodid stages. In contrast, larger gene expression differences were found during the infective chalimus stage and in adults of the salmon louse. The principal component analysis showed correlation values that grouped the nauplius I, nauplius II and copepodid stages separately from the chalimus and adult instars (Fig. S3 in File S1). Moreover, to confirm the usefulness of the C. rogercresseyi cDNA database established by the Illumina paired-end sequencing method, we investigated by qRT-PCR the expression of 9 genes selected from catalytic activity, nervous system development, molting, contractile elements, reproduction and cellular process (Fig. S4–S8 in File S1). The correlation between expression levels quantified by qPCR and the in silico analysis confirmed the robustness of the Illumina sequencing results (Fig. S9 in File S1).
Contigs with significant expression values (P≤1e−16; |fold-change|>5) for the nauplius I–II/copepodid, copepodid/chalimus and female/male comparisons in C. rogercresseyi were summarized as percentages in a radar plot in order to show the proportions of up/down-regulated genes that are associated with key biological processes and molecular functions (Fig. 7). Interestingly, the analysis revealed that the nauplius I–II, copepodid and chalimus stages are mainly annotated to amino acid transfer/repair/breakdown, metabolism, molting cycle, and nervous system development. Additionally, genes showing differential transcription in female and male adults were highly related to cytoskeletal and contractile elements, reproduction, cell development, morphogenesis, and transcription-translation processes.
Annotated contigs were associated with (i) amino acid transfer, repair and breakdown, (ii) cellular process, (iii) cytoskeletal and contractile elements, (iv) molting cycle, (v) metabolism, homeostasis, mitochondrial genes, (vi) neuronal involvement and nervous system, (vii) reproduction, (viii) cell development and morphogenesis, (ix) transcription and translation and (x) others.
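The selection of significant contigs and their summary as percentages per functional category can be sketched as follows. The sketch reads the significance threshold as P ≤ 1e-16 (an assumption about the intended value), and the records, field names, and category labels are hypothetical.

```python
from collections import Counter


def significant(records, p_max=1e-16, min_abs_fold=5.0):
    """Keep contigs with P <= p_max and |fold-change| > min_abs_fold."""
    return [r for r in records if r["p"] <= p_max and abs(r["fold"]) > min_abs_fold]


def category_percentages(records):
    """Percentage of contigs falling in each functional category."""
    counts = Counter(r["category"] for r in records)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical contigs for one stage comparison.
records = [
    {"contig": "c1", "p": 1e-20, "fold": 8.0, "category": "molting cycle"},
    {"contig": "c2", "p": 1e-18, "fold": -6.5, "category": "metabolism"},
    {"contig": "c3", "p": 1e-3, "fold": 12.0, "category": "metabolism"},
    {"contig": "c4", "p": 1e-30, "fold": 2.0, "category": "reproduction"},
    {"contig": "c5", "p": 1e-17, "fold": 5.5, "category": "metabolism"},
]
kept = significant(records)
shares = category_percentages(kept)
```

In this toy input, c3 fails the P threshold and c4 fails the fold-change threshold, so the percentages are computed over the three remaining contigs.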
Sea lice are the most prevalent ectoparasites found in the farmed salmon industry worldwide, and two species, Lepeophtheirus salmonis (Krøyer, 1838) and Caligus rogercresseyi (Boxshall and Bravo, 2000), are responsible for major economic losses in countries such as Norway, Scotland, Canada, and Chile. According to Hamre et al., the complete life cycle is now known for 17 species of Caligidae, as represented by just three genera, Caligus (12 species), Lepeophtheirus (four species), and Pseudocaligus (one species). However, the number of developmental stages appears to vary among species, with the free-living phase being comprised of two nauplii stages and the infective copepodid stage, while there are several chalimus and adult stages. In Caligus species, four chalimus instars have been found, the last of which molts into the definitive adult. In contrast, the life cycle of Lepeophtheirus species has been reported to have four chalimus and two pre-adult stages that allow the louse to detach from a temporary frontal filament shortly after molting and move over the surface of the skin. However, recent findings from observing chalimus larvae molting and through morphometric cluster analysis from L. salmonis reported only two chalimus stages, and, consequently, a life cycle comprised of six post-nauplius instars.
Understanding salmon louse biology is critical for establishing strategies that allow for the control and management of this ectoparasite. However, evidence supporting morphological and physiological changes in correlation with transcriptome profiles during the life cycle of salmon lice is still limited. For instance, EST collections for the developmental stages of L. salmonis have only been reported in female and male adults, which so far represent the most comprehensive and publicly available transcriptome for L. salmonis at 129,250 transcripts. In this context, the present study provides 83,444 high quality contigs and, subsequently, roughly 24,000 significantly annotated proteins from different developmental stages of the salmon louse C. rogercresseyi. This sequencing effort represents the most comprehensive transcriptome resource available for this caligid species.
Salmon lice included in the present RNA-seq study were evaluated between the nauplius I–II and copepodid, copepodid and chalimus stages, and also between female and male adults. This approach was applied in order to identify relevant transcriptome profiles across larval instars and with sexual differentiation. In fact, it could be hypothesized that these developmental phases are representative of the major physiological changes during the life cycle of copepods. For instance, a study related to peptidergic signaling in C. finmarchicus reported that the highest expression levels from six stages (embryo, early nauplius, late nauplius, early copepodid, late copepodid, and adult) are seen in the naupliar and copepodid stages, while the lowest levels are present in embryos and adult females. Specifically in the copepodid stage, host-seeking behavior has been displayed by L. salmonis during its infectious stage, including moving towards river mouths and maintaining location in haloclines during salmon migrations. Consequently, the copepod must be able to cope with abiotic stress conditions and respond to the inflammatory defense mechanisms at the site of salmon parasite attachment. Furthermore, relatively little is known concerning sex differentiation and its endocrine control in crustaceans, and most available data have been obtained in decapods. However, transcriptome sequencing studies have facilitated the discovery of novel sex-related genes, which thus far have suggested pivotal transcriptional differences between female and male adults.
The present transcriptome analysis of C. rogercresseyi revealed 3,030 transcripts in nine clusters that were differentially expressed across the nauplius I–II, copepodid and chalimus stages (604 transcripts in four clusters between nauplius I–II and copepodid, and 2,426 transcripts in five clusters between copepodid and chalimus). Interestingly, some upregulated genes were mainly associated with metalloproteinase, arginine kinase, and cuticle protein, which suggests participation in the digestion of ingested proteins, tissue development, cuticle remodeling, and in specific cleavage events that activate or inactivate proenzymes and bioactive peptides. With respect to nervous system development in copepodids, some relevant genes such as nicotinic acetylcholine receptor, flotillin, synaptotagmin, allatostatin, frequenin, and opsin were highly overexpressed. These results are congruent with previous studies of transcriptome profiles reported for copepodid stages. Furthermore, a study by Wilson and Hartline demonstrated high peripheral and central nervous system development in individuals transitioning from the nauplius to the copepodid stage. The upregulation of allatostatin could be associated with the regulation of juvenile hormone production, or, more interestingly, with recent findings where the activation of neurons, or neuroendocrine cells, that expressed the neuropeptide allatostatin modulated feeding behavior in Drosophila, including increased food intake and enhanced behavioral responsiveness to nutrients or molecular cues. It is important to note that these effects on feeding behavior could be related to changes induced by the start of the parasitic phase in the salmon louse. Furthermore, investigations of opsin function outside of vertebrate systems have long been focused on arthropod visual pigments, indicating that copepods possess a sensory apparatus sensitive to different wavelengths that could have implications during the host-finding process, especially in the copepodid stage.
For the chalimus stage, trypsin, alpha amylase, and carboxypeptidase genes were upregulated. Peptidases from the different families may be involved in a wide range of cellular and biological processes, thus making it difficult to infer specific functions across salmon louse development. Host blood has been reported as a major food component for the salmon louse L. salmonis. Blood degradation in several hematophagous organisms has been shown to require the catalysis of several peptidases. Of the regulated peptidases in the present study, the most upregulated was trypsin, a secretory endopeptidase within the serine protease superfamily. This superfamily includes important digestive enzymes that constitute a major part of digestive fluids and act as activators of other digestive enzymes. These results are congruent with previous reports on the interaction between parasitic copepods and salmon hosts.
In addition, genes showing differential transcription between female and male adults were grouped into six clusters comprising 2,478 transcripts. For female adults, sex-related genes such as vitellogenins and estradiol 17-beta-dehydrogenase were identified. However, the effects of vertebrate-like steroid hormones on reproductive processes in crustaceans, such as oocyte maturation, remain unresolved. Vitellogenins are the major yolk proteins in most invertebrates, and several different vitellogenins typically give rise to vitelline granules in mature eggs. The role of multiple vitellogenin genes in some organisms, such as insects, is unknown, even though proteins with domain structures similar to vitellogenins are also involved in other processes, such as the regulation of osmolarity, immunity, and clotting. The present data showed a wide diversity of vitellogenins, including LsVit1 and LsVit2 as reported in L. salmonis, and several vitellogenin-like proteins. Moreover, genes associated with cell development, including homologues of vasa, homeobox, argonaute, cell division protein kinase, and centromere-associated protein, were also specifically expressed in female adults. For salmon louse males, transcripts related to the cytoskeleton and the molting cycle, as well as to sperm proteins, were the main ones upregulated. Based on the data of the present study, it is therefore likely that actin, troponin, myosin, and cuticle proteins are an important part of cuticle formation during the final molt of C. rogercresseyi. The sex-related genes reported in the present study represent novel molecular information regarding salmon louse reproduction.
The initial analysis of the C. rogercresseyi transcriptome revealed that approximately 71.4% of the contigs had no significant hits in GenBank using the nr-database. Even the re-annotation of the contigs revealed a total of 13% novel proteins homologous to L. salmonis. Similarly high proportions of novel genes have been reported in non-model crustacean species. Furthermore, a total of 230 hypothetical proteins showed significant gene expression differences among the developmental stages, demonstrating the potential for discovery of unknown genes and novel biological processes involved in the life cycle of salmon lice.
The present study represents a step forward in identifying a number of possibly conserved genes that are likely to be involved in various important biological activities. Using de novo assembly, 83,444 high quality contigs and roughly 24,000 genes, based on known proteins, were identified from the C. rogercresseyi transcriptome. Future studies will validate the discovered gene profiles, thus avoiding misinterpretations of the functional genomics information. The present data provide the most comprehensive transcriptome resource available for C. rogercresseyi, which should be used for future genomic studies linked to host-parasite interactions.
Primer list for qPCR validated in C. rogercresseyi genes.
BLASTx results from the C. rogercresseyi transcriptome assembly.
Figure S1. Volcano plot displaying the −log10 of the P values from Kal's statistical test against the log2 fold change for nauplius I–II/copepodid, copepodid/chalimus and female/male of C. rogercresseyi. The selected genes have significantly different expression values (P ≤ 10⁻¹⁶ to P ≤ 10⁻⁵). Dots, triangles and squares represent individual ESTs from larval stages and adult salmon lice, respectively. Annotated and unannotated sequences, according to BLAST analysis, are denoted by filled and empty symbols, respectively. Figure S2. Number of annotated and unannotated contigs showing up/down regulation for nauplius I–II/copepodid, copepodid/chalimus and female/male of C. rogercresseyi. Figure S3. Principal component analysis of six Caligus rogercresseyi developmental stages: nauplius I, nauplius II, copepodid, chalimus, and female and male adults. Figure S4. Relative expression levels of the acetoacetyl-CoA synthetase gene in six developmental stages of Caligus rogercresseyi. Each bar represents the mean of expression levels (± SD). Figure S5. Relative expression levels of flotillin and allatostatin precursor protein in six developmental stages of Caligus rogercresseyi. Each bar represents the mean of expression levels (± SD). Figure S6. Relative expression levels of tropomyosin and putative cuticle protein in six developmental stages of Caligus rogercresseyi. Each bar represents the mean of expression levels (± SD). Figure S7. Relative expression levels of the vitellogenin 1 and 2 genes in six developmental stages of Caligus rogercresseyi. Each bar represents the mean of expression levels (± SD). Figure S8. Relative expression levels of the argonaute 1 isoform C and Vasa genes in six developmental stages of Caligus rogercresseyi. Each bar represents the mean of expression levels (± SD). Figure S9. Correlation analysis between transformed expression values obtained by qPCR and in silico analysis for six developmental stages of Caligus rogercresseyi.
Table S3. Relevant annotated genes identified by clustering analysis between Nauplius I–II and Copepodid stages of C. rogercresseyi transcriptome. Table S4. Relevant annotated genes identified by clustering analysis between Copepodid and Chalimus stages of C. rogercresseyi transcriptome. Table S5. Relevant annotated genes identified by clustering analysis between Female and Male stages of C. rogercresseyi transcriptome.
Table S6. Hypothetical proteins and unannotated contigs up/down-regulated in Copepodid/Nauplius I–II groups. Table S7. Hypothetical proteins and unannotated contigs up/down-regulated in Chalimus/Copepodid groups. Table S8. Hypothetical proteins and unannotated contigs up/down-regulated in Male/Female groups.
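The two axes of the volcano plot described for Figure S1 can be illustrated with a minimal Python sketch. It assumes that Kal's test takes the form of a two-sided Z-test on the proportions of reads assigned to a transcript in two libraries, which may differ in detail from the implementation used for this study. The function names, the pseudocount, and the read counts in the usage lines are hypothetical and are not data from the present work.

```python
import math


def kal_z_test(count_a, total_a, count_b, total_b):
    """Two-sided Z-test on two proportions (assumed form of Kal's test).

    count_*: reads mapped to one transcript in each library.
    total_*: total mapped reads in each library.
    Returns (z, p_value).
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    # Pooled proportion under the null hypothesis of equal expression.
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    if se == 0.0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def volcano_point(count_a, total_a, count_b, total_b, pseudo=0.5):
    """Return (log2 fold change of B over A, -log10 P) for one transcript.

    The pseudocount is an assumption added here to avoid division by zero.
    """
    prop_a = (count_a + pseudo) / total_a
    prop_b = (count_b + pseudo) / total_b
    log2_fc = math.log2(prop_b / prop_a)
    _, p_value = kal_z_test(count_a, total_a, count_b, total_b)
    # Floor the P value so that log10 is always defined.
    neg_log10_p = -math.log10(max(p_value, 1e-300))
    return log2_fc, neg_log10_p


# Hypothetical counts: 100 vs 400 reads in two libraries of one million reads.
x, y = volcano_point(100, 1_000_000, 400_000 // 1000, 1_000_000)
print(f"log2 fold change = {x:.2f}, -log10(P) = {y:.1f}")
```

Under these assumptions, a transcript would fall within the selected range of Figure S1 when its −log10 P value is at least 5, that is, P ≤ 10⁻⁵.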
Conceived and designed the experiments: CGE. Performed the experiments: CGE VVM GNA. Analyzed the data: CGE. Contributed reagents/materials/analysis tools: CGE. Wrote the paper: CGE VVM GNA.
- 1. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 10: 57–63.
- 2. Wilstermann A, Vidal S (2013) Western corn rootworm egg hatch and larval development under constant and varying temperatures. Journal of Pest Science 86: 419–428.
- 3. Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL (2008) Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Molecular Ecology 17: 1636–1647.
- 4. Marguerat S, Bahler J (2010) RNA-seq: from technology to biology. Cellular and Molecular Life Sciences 67: 569–579.
- 5. Neira-Oviedo M, Tsyganov-Bodounov A, Lycett GJ, Kokoza V, Raikhel AS, et al. (2011) The RNA-Seq approach to studying the expression of mosquito mitochondrial genes. Insect Molecular Biology 20: 141–152.
- 6. Bellin D, Ferrarini A, Chimento A, Kaiser O, Levenkova N, et al. (2009) Combining next-generation pyrosequencing with microarray for large scale expression analysis in non-model species. BMC Genomics 10: 555.
- 7. Wilhelm BT, Landry JR (2009) RNA-Seq-quantitative measurement of expression through massively parallel RNA-sequencing. Methods 48: 249–257.
- 8. Everett MV, Grau ED, Seeb JE (2011) Short reads and nonmodel species: exploring the complexities of next-generation sequence assembly and SNP discovery in the absence of a reference genome. Molecular Ecology Resources 11: 93–108.
- 9. Seeb JE, Carvalho G, Hauser L, Naish K, Roberts S, et al. (2011) Single-nucleotide polymorphism (SNP) discovery and applications of SNP genotyping in nonmodel organisms. Molecular Ecology Resources 11: 1–8.
- 10. Sahasrabudhe PV, Tejero R, Kitao S, Furuichi Y, Montelione GT (1998) Homology modeling of an RNP domain from a human RNA-binding protein: Homology-constrained energy optimization provides a criterion for distinguishing potential sequence alignments. Proteins-Structure Function and Genetics 33: 558–566.
- 11. Ritdachyeng E, Manaboon M, Tobe SS, Singtripop T (2013) Possible roles of Juvenile Hormone and Juvenile Hormone binding protein on changes in the integument during termination of larval diapause in the bamboo borer Omphisa fuscidentalis. Physiological Entomology 38: 183–191.
- 12. Kristoffersen AB, Rees EE, Stryhn H, Ibarra R, Campisto JL, et al. (2013) Understanding sources of sea lice for salmon farms in Chile. Preventive Veterinary Medicine 111: 165–175.
- 13. Skilbrei OT, Finstad B, Urdal K, Bakke G, Kroglund F, et al. (2013) Impact of early salmon louse, Lepeophtheirus salmonis, infestation and differences in survival and marine growth of sea-ranched Atlantic salmon, Salmo salar L., smolts 1997–2009. Journal of Fish Diseases 36: 249–260.
- 14. Bowers JM, Mustafa A, Speare DJ, Conboy GA, Brimacombe M, et al. (2000) The physiological response of Atlantic salmon, Salmo salar L., to a single experimental challenge with sea lice, Lepeophtheirus salmonis. Journal of Fish Diseases 23: 165–172.
- 15. Saksida SM, Morrison D, McKenzie P, Milligan B, Downey E, et al. (2013) Use of Atlantic salmon, Salmo salar L., farm treatment data and bioassays to assess for resistance of sea lice, Lepeophtheirus salmonis, to emamectin benzoate (SLICE (R)) in British Columbia, Canada. Journal of Fish Diseases 36: 515–520.
- 16. Roth M, Richards RH, Dobson DP, Rae GH (1996) Field trials on the efficacy of the organophosphorus compound azamethiphos for the control of sea lice (Copepoda: Caligidae) infestations of farmed Atlantic salmon (Salmo salar). Aquaculture 140: 217–239.
- 17. Jones MW, Sommerville C, Wootten R (1992) Reduced Sensitivity of the Salmon Louse, Lepeophtheirus salmonis, to the Organophosphate Dichlorvos. Journal of Fish Diseases 15: 197–202.
- 18. Sevatdal S, Horsberg TE (2003) Determination of reduced sensitivity in sea lice (Lepeophtheirus salmonis Kroyer) against the pyrethroid deltamethrin using bioassays and probit modelling. Aquaculture 218: 21–31.
- 19. Bravo S, Treasurer J, Sepulveda M, Lagos C (2010) Effectiveness of hydrogen peroxide in the control of Caligus rogercresseyi in Chile and implications for sea louse management. Aquaculture 303: 22–27.
- 20. Bravo S, Sevatdal S, Horsberg TE (2008) Sensitivity assessment of Caligus rogercresseyi to emamectin benzoate in Chile. Aquaculture 282: 7–12.
- 21. Duston J, Cusack RR (2002) Emamectin benzoate: an effective in-feed treatment against the gill parasite Salmincola edwardsii on brook trout. Aquaculture 207: 1–9.
- 22. ffrench-Constant RH, Daborn PJ, Le Goff G (2004) The genetics and genomics of insecticide resistance. Trends in Genetics 20: 163–170.
- 23. Yasuike M, Leong J, Jantzen SG, von Schalburg KR, Nilsen F, et al. (2012) Genomic Resources for Sea Lice: Analysis of ESTs and Mitochondrial Genomes. Marine Biotechnology 14: 155–166.
- 24. Bravo S (2010) The reproductive output of sea lice Caligus rogercresseyi under controlled conditions. Experimental Parasitology 125: 51–54.
- 25. Gonzalez L, Carvajal J (2003) Life cycle of Caligus rogercresseyi, (Copepoda: Caligidae) parasite of Chilean reared salmonids. Aquaculture 220: 101–117.
- 26. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, et al. (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
- 27. Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M (2010) KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Research 38: D355–D360.
- 28. Mortazavi A, Williams BA, Mccue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5: 621–628.
- 29. Christie AE, Roncalli V, Wu LS, Ganote CL, Doak T, et al. (2013) Peptidergic signaling in Calanus finmarchicus (Crustacea, Copepoda): In silico identification of putative peptide hormones and their receptors using a de novo assembled transcriptome. General and Comparative Endocrinology 187: 117–135.
- 30. Christie AE, Roncalli V, Lona PB, McCoole MD, King BL, et al. (2013) In silico characterization of the insect diapause-associated protein couch potato (CPO) in Calanus finmarchicus (Crustacea: Copepoda). Comparative Biochemistry and Physiology D-Genomics & Proteomics 8: 45–57.
- 31. Eichner C, Frost P, Dysvik B, Jonassen I, Kristiansen B, et al. (2008) Salmon louse (Lepeophtheirus salmonis) transcriptomes during post molting maturation and egg production, revealed using EST-sequencing and microarray analysis. BMC Genomics 9: 126.
- 32. Sutherland BJG, Jantzen SG, Yasuike M, Sanderson DS, Koop BF, et al. (2012) Transcriptomics of coping strategies in free-swimming Lepeophtheirus salmonis (Copepoda) larvae responding to abiotic stress. Molecular Ecology 21: 6000–6014.
- 33. Torrissen O, Jones S, Asche F, Guttormsen A, Skilbrei OT, et al. (2013) Salmon lice - impact on wild salmonids and salmon aquaculture. Journal of Fish Diseases 36: 171–194.
- 34. Hamre L, Eichner C, Marlowe C, Dalvin S, Bron JE, et al. (2013) The Salmon Louse Lepeophtheirus salmonis (Copepoda: Caligidae) Life Cycle Has Only Two Chalimus Stages. PLoS ONE 8: e73539.
- 35. Jones SRM, Prosperi-Porta G, Kim E (2012) The Diversity of Microsporidia in Parasitic Copepods (Caligidae: Siphonostomatoida) in the Northeast Pacific Ocean with Description of Facilispora margolisi n. g., n. sp and a new Family Facilisporidae n. fam. Journal of Eukaryotic Microbiology 59: 206–217.
- 36. Bravo S, Pozo V, Silva MT, Abarca D (2013) Comparison of the fecundity rate of Caligus rogercresseyi infesting Atlantic salmon (Salmo salar L.) on farms in two regions of Chile. Aquaculture 404: 55–58.
- 37. Gonzalez MT, Molinet C, Arenas B, Asencio G, Carvajal J (2012) Fecundity of the sea louse Caligus rogercresseyi on its native host Eleginops maclovinus captured near salmon farms in southern Chile. Aquaculture Research 43: 853–860.
- 38. Mordue AJ, Birkett MA (2009) A review of host finding behaviour in the parasitic sea louse, Lepeophtheirus salmonis (Caligidae: Copepoda). Journal of Fish Diseases 32: 3–13.
- 39. Carmichael SN, Bron JE, Taggart JB, Ireland JH, Bekaert M, et al. (2013) Salmon lice (Lepeophtheirus salmonis) showing varying emamectin benzoate susceptibilities differ in neuronal acetylcholine receptor and GABA-gated chloride channel mRNA expression. BMC Genomics 14.
- 40. Brooks KM (2005) The effects of water temperature, salinity, and currents on the survival and distribution of the infective copepodid stage of sea lice (Lepeophtheirus salmonis) originating on Atlantic salmon farms in the Broughton Archipelago of British Columbia, Canada. Reviews in Fisheries Science 13: 177–204.
- 41. Tadiso TM, Krasnov A, Skugor S, Afanasyev S, Hordvik I, et al. (2011) Gene expression analyses of immune responses in Atlantic salmon during early stages of infection by salmon louse (Lepeophtheirus salmonis) revealed bi-phasic responses coinciding with the copepod-chalimus transition. BMC Genomics 12.
- 42. Rodriguez EM, Medesani DA, Fingerman M (2007) Endocrine disruption in crustaceans due to pollutants: A review. Comparative Biochemistry and Physiology a-Molecular & Integrative Physiology 146: 661–671.
- 43. He L, Wang Q, Jin XK, Wang Y, Chen LL, et al. (2012) Transcriptome Profiling of Testis during Sexual Maturation Stages in Eriocheir sinensis Using Illumina Sequencing. PLoS ONE 7.
- 44. Singh B (2010) Matrix metalloproteinases – an overview. Research and Reports in Biology 1: 1–20.
- 45. Christie AE, Fontanilla TM, Nesbit KT, Lenz PH (2013) Prediction of the protein components of a putative Calanus finmarchicus (Crustacea, Copepoda) circadian signaling system using a de novo assembled transcriptome. Comparative Biochemistry and Physiology D-Genomics & Proteomics 8: 165–193.
- 46. Christie AE, Sousa GL, Rus S, Smith CM, Towle DW, et al. (2008) Identification of A-type allatostatins possessing -YXFGI/Vamide carboxy-termini from the nervous system of the copepod crustacean Calanus finmarchicus. General and Comparative Endocrinology 155: 526–533.
- 47. Wilson CH, Hartline DK (2011) Novel Organization and Development of Copepod Myelin. I. Ontogeny. Journal of Comparative Neurology 519: 3259–3280.
- 48. Hergarden AC, Tayler TD, Anderson DJ (2012) Allatostatin-A neurons inhibit feeding behavior in adult Drosophila. Proceedings of the National Academy of Sciences of the United States of America 109: 3967–3972.
- 49. Porter ML, Cronin TW, McClellan DA, Crandall KA (2007) Molecular characterization of crustacean visual pigments and the evolution of pancrustacean opsins. Molecular Biology and Evolution 24: 253–268.
- 50. Aarseth KA, Schram TA (1999) Wavelength-specific behaviour in Lepeophtheirus salmonis and Calanus finmarchicus to ultraviolet and visible light in laboratory experiments (Crustacea: Copepoda). Marine Ecology Progress Series 186: 211–217.
- 51. Brandal P, Egidius E, Romslo I (1976) Host blood: a major food component for the parasitic copepod Lepeophtheirus salmonis Kroyeri, 1838 (Crustacea: Caligidae). Norwegian Journal of Zoology 24: 341–343.
- 52. Kvamme BO, Kongshaug H, Nilsen F (2005) Organisation of trypsin genes in the salmon louse (Lepeophtheirus salmonis, Crustacea, copepoda) genome. Gene 352: 63–74.
- 53. Kvamme BO, Skern R, Frost P, Nilsen F (2004) Molecular characterisation of five trypsin-like peptidase transcripts from the salmon louse (Lepeophtheirus salmonis) intestine. International Journal for Parasitology 34: 823–832.
- 54. Fast MD, Burka JF, Johnson SC, Ross NW (2003) Enzymes released from Lepeophtheirus salmonis in response to mucus from different salmonids. Journal of Parasitology 89: 7–13.
- 55. Firth KJ, Johnson SC, Ross NW (2000) Characterization of proteases in the skin mucus of Atlantic salmon (Salmo salar) infected with the salmon louse (Lepeophtheirus salmonis) and in whole-body louse homogenate. Journal of Parasitology 86: 1199–1205.
- 56. Belles X, Maestro JL (2005) Endocrine peptides and insect reproduction. Invertebrate Reproduction & Development 47: 23–37.
- 57. Tufail M, Takeda M (2008) Molecular characteristics of insect vitellogenins. Journal of Insect Physiology 54: 1447–1458.
- 58. Sappington TW, Raikhel AS (1998) Molecular characteristics of insect vitellogenins and vitellogenin receptors. Insect Biochemistry and Molecular Biology 28: 277–300.
- 59. Dalvin S, Frost P, Loeffen P, Skern-Mauritzen R, Baban J, et al. (2011) Characterisation of two vitellogenins in the salmon louse Lepeophtheirus salmonis: molecular, functional and evolutional analysis. Diseases of Aquatic Organisms 94: 211–224.