Transcriptome Analysis of the Innate Immunity-Related Complement System in Spleen Tissue of Ctenopharyngodon idella Infected with Aeromonas hydrophila

The grass carp (Ctenopharyngodon idella) is an important commercial farmed herbivorous fish species in China, but is susceptible to Aeromonas hydrophila infections. In the present study, we performed de novo RNA-Seq sequencing of spleen tissue from specimens of a disease-resistant family, which were given intra-peritoneal injections containing PBS with or without a dose of A. hydrophila. The fish were sampled from the control group at 0 h, and from the experimental group at 4, 8, 12, 24, 48 and 72 h. 122.18 million clean reads were obtained from the normalized cDNA libraries; these were assembled into 425,260 contigs and then 191,795 transcripts. Of those, 52,668 transcripts were annotated with the NCBI Nr database, and 41,347 of the annotated transcripts were assigned into 90 functional groups. 20,569 unigenes were classified into six main categories, including 38 secondary KEGG pathways. 2,992 unigenes were used in the analysis of differentially expressed genes (DEGs). 89 of the putative DEGs were related to the immune system and 41 of them were involved in the complement and coagulation cascades pathway. This study provides insights into the complement and complement-related pathways involved in innate immunity, through expression profile analysis of the genomic resources in C. idella. We conclude that complement and complement-related genes play important roles during defense against A. hydrophila infection. The immune response is activated at 4 h after the bacterial injections, indicating that the complement pathways are activated at the early stage of bacterial infection. The study has improved our understanding of the immune response mechanisms in C. idella to bacterial pathogens.


Introduction
The grass carp (Ctenopharyngodon idella), with a long breeding history, is one of the most economical herbivorous freshwater fish species and, is widely distributed in Asia [1].Due to its great growth performance, rich nutrient content, low cost of breeding and delicious meat, grass carp has become a dominant commercially farmed species, accounting for over 20% of freshwater aquaculture production in China [2].Despite their excellent growth traits and large-scale breeding mode, grass carp are susceptible to various diseases.Outbreaks of disease associated with bacterial pathogens, such as Aeromonas hydrophila, have caused high mortality, resulting in a decline in production and tremendous economic losses; these factors severely restrict the development prospects of the grass carp farming industry [3][4][5][6].
To date, no excellent culture varieties of the grass carp have been obtained by traditional cultivation techniques.Because the filial generation has a long sexual maturity period, the hybrid breeding strategy is not viable.Moreover, because of a failure to understand the genetic background of grass carp, little molecular breeding technology has been applied [7].Hence, research is required to establish a useful genetic breeding program; for disease control, further research into the grass carp's immune system, to identify immune-related genes and pathways, would be beneficial [8].
A. hydrophila is a pathogenic organism that causes a broad spectrum of diseases in humans and animals [9].While it was thought to be an opportunistic pathogen in humans, a growing number of intestinal and extra-intestinal cases of disease indicate that A. hydrophila is an emergent human pathogen, irrespective of the immunologic aspects of the host [10].A. hydrophila is a Gram-negative motile bacillus that causes motile aeromonad septicemia [11].Therefore, effective measures are required to protect against infections with this pathogenic micro-organism in aquatic animals.The identification of differentially expressed genes (DEGs) following infection with A. hydrophila is an important process towards understanding more about fish motile aeromonad septicemia.
Next-generation sequencing (NGS) technology for transcriptome processing based on high throughput RNA sequencing (RNA-Seq), including Illumina, can be used to generate a large amount of sequence data for organisms [12].Using this sequencing method, the relative abundances of immune-related genes within several fish and mammalian species have been studied; species studied include the Tasmanian devil (Sarcophilus harrisii) [13], Chinese giant salamander (Andrias davidianus) [14], naked carp (Gymnocypris przewalskii) [15], and turbot (Scophthalmus maximus) [16].Exploring the genome resources of an organism provides an opportunity to analyze the structural and functional genes related to the immunity responses in that organism [17][18][19][20].However, expressed sequence tags (ESTs) have not yet been used to identify the gene expression profiles related to the bacterial immunity response in C. idella.
In present study, we used Illumina NextSeq™ 500 [21][22][23][24][25][26] sequencing to describe the transcriptomes of seven sets of spleen samples taken from fish exposed to either A. hydrophila, or not.The aims of the study were to (1) provide the most complete gene sequence resources data for the grass carp to date; and (2) identify the immune-related differential expression of genes in the complement and coagulation cascades and complement-related pathways.We believe that the knowledge of these genes and the transcriptome data produced will greatly facilitate breeding programs and genome-wide association studies for the grass carp.

Ethics statement
All experiments with fish in this study were conducted in accordance with the guidelines on the care and use of animals for scientific purposes, set up by the Institutional Animal Care and Use

Experimental treatments and sample collection
According to the result of medial lethal concentration of the pre-experimental, Disease-resistant and common species of healthy grass carp (70 ± 2.3g, n = 160; about 12 months old), were obtained from the Wujiang National Farm of Chinese Four Family Carps, Jiangsu Province, China.Animals were cultured at 28 ± 0.5°C in a circulating water system sterilized with potassium permanganate, in which the water was aerated for two weeks before the experiment.The fish were fed three times per day (8:00 am, 12:00 pm and 5:00 pm) with 5% of their total body weight.The C. idella specimens were then randomly separated into four tanks (40 individuals in each tank).Fish in the first two tanks were given 100μL intra-peritoneal injections of A. hydrophila (AH10; Aquatic Pathogen Collection Center of Ministry of Agriculture, China) that was suspended in 1 × PBS at a dose of 2.7 × 10 7 CFU/mL.Fish in the other two tanks were injected with the same volume of 1 × PBS, as a control.After injection, all the animals were raised under the same felicity conditions.Three fish were randomly selected from each control tank at 0 h and from the experimental tank at 4, 8, 12, 24, 48 and 72 h.The fish were euthanized with 100 mg/L of MS-222 (tricaine methanesulfonate; Sigma, St. Louis, MO, USA) and maintained on ice before tissue collection.The spleen samples were taken from the specimens and were immediately frozen in liquid nitrogen.The samples were then stored at −80°C until isolation of the total RNA could be carried out.The total RNA was used to sequence the transcriptome of the specimens.

RNA isolation
Total RNA was extracted using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA), following the manufacturer's instructions.Then, DNA I (Thermo Scientific, USA) was used to remove the genomic DNA.All the described steps were carried out in a Vertical Superclean Bench.The quantity and quality of the RNA from each sample was determined using a Nanodrop 2000C spectrophotometer (Nanodrop Technologies, USA) and agarose gel electrophoresis, respectively.High quality RNA, with absorbance OD 260 /OD 280 ratios of 1.8-2.0 and OD 260 /OD 230 ratios 2.0 was obtained; it was clear and the brightness of the 28S:18S was close to 2.0.This RNA was used to construct the cDNA library.
In preparation for the construction of the cDNA library, the total RNA of each sample was standardized to 200 ng/μL.Then, the standardized volume of total RNA extracted from the spleen samples from the same time point (n = 6) were combined into one pool.This produced seven RNA pools (one for each time point).Each RNA pool was processed with Turbo DNAfree (Ambion, Austin, TX, USA) and was purified using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA), according to the manufacturer's instructions.The RNA quantity and quality were determined once again.

mRNA purification, cDNA library construction and sequencing
The seven cDNA libraries of the spleen RNA samples were constructed using Illumina TruSeq™ RNA Sample Prep Kit (Illumina, San Diego, CA, USA), according to the manufacturer's instructions.Poly(A) mRNA from each time point was isolated and purified from the total RNA with magnetic oligo-dT beads.The mRNA was first composited with divalent cations, and was then heated to obtain short fragments of about 200 bp.First-strand cDNA was synthesized from the short fragments using reverse transcriptase and random primers.Then, it was transformed into double-stranded cDNA using DNA polymerase I and RNase H.
The paired-end libraries were constructed with the double-stranded cDNA using a Genomic Sample Prep Kit (Illumina, San Diego, CA, USA).Each library was purified using the QIAquick PCR Extraction Kit (Qiagen, Valencia, CA, USA) and linked with sequencing adaptors after end repair of the double-stranded cDNA.Unsuitable fragments and the adaptors were removed using AMPure XP beads (Beckman Coulter, Shanghai, China).Then, each sequencing library was amplified by polymerase chain reaction (PCR).PicoGreen (Quantifluor™-ST fluorometer E6090, Promega, CA, USA), fluorospectrophotometry (Quant-iT PicoGreen dsDNA Assay Kit; Invitrogen, P7589), and an Agilent 2100 (Agilent 2100 Bioanalyzer, Agilent, 2100; Agilent High Sensitivity DNA Kit, Agilent, 5067-4626) were used to check the quantity, length and the distribution of the fragments in each cDNA library.The seven multiplexed cDNA libraries for each treatment were normalized to 10 nM.Then, the libraries were diluted to 4-5 pM, quantified, and sequenced using the Illumina NextSeq™ 500 platform (Shanghai Personal Biotechnology, Shanghai, China).

Data filtering and de novo sequencing assembly
The mean length of the raw data across the seven cDNA libraries was about 150 bp.To obtain high-quality clean data, the adapter sequences were removed from the original reads, and a 5-bp window was used to check the raw data, from the 3' to 5', trimming the low-quality sequences with a Q score <20.Any sequences for which the final length was less than 50 bp were removed.In order to get accurate and comprehensive unigenes, the clean data of the seven transcriptomes for each treatment were assembled, because this process tends to eliminate genome-wide reference from the same species [27].After filtering the raw data, the quality of the sequences was controlled and analyzed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
Trinity software (http://trinityrnaseq.sf.net) was used to perform the de novo transcriptome assembly [28].Using Inchworm, a K-mer library was built with the filtered reads to form contigs; a component was constructed with the contigs using Chrysalis and then a De Bruijn graph was generated.Butterfly was used to optimize the De Bruijn graph and establish transcripts through the paths [29].All the transcripts were searched against in the NCBI non-redundant (NR) database (http://ftp.ncbi.nlm.nih.gov/blast/db/) using the BLASX algorithm (E-value1E-05).The unigenes were obtained after clustering the top-hit results.In order to ascertain the sequence directions of the unigenes that could not be aligned from the databases, ORF Finder [30]software was used to predict their open reading frames, with default settings, except that the parameter "-find," was set as "1".

Comparative expression analysis
Reads per kilobase per million mapped reads (RPKM) are widely used as expression values for DEGs in RNA-Seq [35].However, we used a more precise method to predict those expression levels in this study [36].The DEGs were identified using DESeq (http://www.huber.embl.de/users/anders/DESeq), a fold change of <0.5 or >2.0 (that is, 2-fold up-or down-regulated), with a p-value < 0.05, was considered as significant differential expression [36][37][38][39].The RPKM values of those DEGs were then considered as the expression levels, based on the number of reads aligned to each gene.Comparative expression of the genes was visually displayed using a Volcano Plot.Volcano plots are usually used to represent the microarray of mRNA expression levels.
TreeView (http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm/,http://jtreeview.sourceforge.net)[40], Cluster 3.0, MeV [41], and Java TreeView [42] software packages were used to perform cluster analysis of the gene expression patterns.We screened the DEGs of the spleen tissue and used Blast2 GO to perform GO enrichment analysis.The p-value denoted the magnitude of the difference in expression.The GO classification information of the functional unigenes from the spleen tissue was obtained through comparison with the entire genome database of other teleost fish.KO analysis was performed using the expression levels of the DEGs (up-or down-regulated).The KEGG pathways (http://www.genome.jp/kegg/tool/map_pathway2.html)were used to indicate the location of the DEGs in the different pathways.Pathway classification enrichment analysis of the DEGs showed the differences between the genes based on the p-values.

Quantitative real-time PCR verification
To validate the transcriptional level of the key DEGs and the quality of the RNA-Seq, samples from separate specimens at each time-point of the experiment were collected.30 unigenes were selected for quantitative real-time PCR (qRT-PCR).qRT-PCR was performed using a CFX96™ Real-time PCR Detection System (Bio-Rad, USA), with SYBR Green Master Mix (Takara, Shanghai, China) as a fluorescent dye, according to the manufacturer's instructions.Primers were designed according to the RNA-Seq results, using Primer Premier 5 (Premier, Canada) (S1 Table) [43].The arithmetic mean value of the 18S rRNA [44][45][46] and β-actin [47][48][49] genes of grass carp served as internal control value, to normalize the expression levels.After total RNA from each sample at each time point was extracted, reverse transcription was used to synthesize cDNA using the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara, Shanghai, China), according to the manufacturer's instructions.The reaction was performed in a total volume of 20 μL, including 10 μL of SYBR Green Master Mix (Takara, Shanghai, China), 1 μL of diluted cDNA mix, 0.6 mL of each primer (10 mM), and 7.8 μL of RNase-free water.The thermal profile for the SYBR Green RT-PCR was 95°C for 20 s, followed by 40 cycles of 95°C for 5 s, 55°C for 30 s, and 72°C for 30 s. First, a standard curve was used to assess the accuracy of primers, of which the amplification efficiency between 95% and 105% were chosen for candidate primer.Then, amplification and detection of only one PCR product was confirmed using a melting curve analysis of the amplification products at the end of each PCR.All experiments were performed in triplicate, providing technical repeats.Expression levels of the different genes were analyzed using the comparative CT method (2 −ΔΔCT ) [50].

Results and Discussion
Identification the immune-related genes involved during the resistance of disease, such as those caused by A. hydrophila, provides a theoretical basis for disease prevention in teleost fish.Moreover, understanding the expression patterns of the genes involved in the response to motile aeromonad septicemia is the first step towards understanding the molecular mechanisms in grass carp.Qualitative and quantitative analysis of the immune regulation signal pathways, mediated by the candidate genes, also provide the possibility of enhancing non-specific immune-related enzyme activities in C. idella.

De novo assemblies and data analysis
The raw reads of the seven transcriptomes have been deposited in the NCBI Sequence Read Archive (SRA) database (http://www.ncbi.nlm.nih.gov/Traces/sra/;accession number SRP060308).From the RNA sequencing, a total of 149.51 million 150 bp paired-end raw reads were generated, with an average of 21.36 million reads for each of the samples (S2 Table ).After trimming, 122.18 million clean reads remained, with 78.11%-85.25% of the reads per sample being useful, equivalent to 32.85 GB clean data (S3 Table ).The clean reads were assembled into 425,260 contigs, with a mean size of 310 bp and N50 length of 412 bp for 46.98% of them.They generated 191,795 transcripts, with an average length of 694 bp.52,668 unigenes were generated, with an average length of 1,072 bp, ranging from 201 to 17,133 bp (Table 1).Approximately 91.25% of the contigs were <599 bp and 60.06% were 100-199 bp; 70.52% of the transcripts were 100-599 bp; and 98.69% of the unigenes were 200-4,999 bp (S1 Fig) .
The E-value results showed that 85% of the sequences had a strong homology with their expected distribution (<1.0 e-50) (Fig 1A ); 18% of the sequences were between 60% and 80%, while 65% were over 80%, similar with their expected distribution patterns.Thus, 83% of the transcripts had a distribution higher than 60% of the E-value, which supported the credibility of the de novo assembly (Fig 1B).
To evaluate the evolutionary conservation of the annotated unigenes, the sequences were compared with the databases of zebra fish (Danio rerio), fugu (Takifugu rubripes), stickleback (Gasterosteus aculeatus), medaka (Oryzias latipes) and a pufferfish (Tetraodon sp.), respectively (Table 2).A total of 38,910 (73.88%) putative known unigenes were found in all five species;  45,458 (86.31%) were found in the zebra fish database (Fig 2 ), indicating a high level of conservation of the genome resourses between C. idella and other teleost fish species, especially the zebra fish [51].

Expression and classification of the differentially expressed genes
A total of 2,992 DEGs (those that were at least 2-fold up-or down-regulated, with a p-value <0.05) were found by comparing each of the six experimental groups with the control group.Specific and shared DEGs were classified into 39 pathways, of which 89 putative DEGs were involved in the immune system (Table 5).Of those involved in the immune system, 41 (46.07%) were matched to the "complement and coagulation cascades" pathway, and 8 (8.99%) each were classified into the "hematopoietic cell lineage" and "platelet activation" pathways (Fig 5).The results indicate that complement components play an important role in fighting against bacterial infection.

Gene expression in the complement and coagulation cascades pathway
In the present study, 16 candidate genes were selected from complement-and lysosome-pathway to validate the expression levels by qRT-PCR, and then compared with the RPKM values of the corresponding genes that were sequenced using the Illumina NextSeq™ 500 platform.The expression levels of the candidate genes followed the same trends as those observed in the RNA-Seq data (Fig 6).The complement C1q subcomponent subunit A (C1QA) gene was significantly up-regulated compared to the control between 8 h and 12 h (Fig 6G ), and the complement C1q subcomponent subunit C (C1QC) gene was significantly up-regulated at 12 h (Fig 6H).Both the complement component 8 subunit beta (C8B) and the complement component factor B (CFB) (Fig 6K and 6L) genes were significantly up-regulated between 4 h and 72 h in the samples from the infected fish.The coagulation factor II (thrombin) receptor (F2R) gene was significantly up-regulated at 4 h and 12 h, compared to the control (Fig 6I).However, the fibrinogen gamma chain (FGG) gene was significantly down-regulated between 4 h and 72 h (Fig 6J).The regulation of the complement component 8 subunit alpha (C8A) gene was more irregular than the genes of the complement and coagulation cascades pathway.C8A was significantly up-regulated at 4 h and 8 h, but down-regulated at 12 h, and then, increased between 24 h and 72 h ( The complement and coagulation cascades pathway has been studied at depth in mammals.Involved in the innate immune system, the complement components play important roles in anti-bacterial defenses [59][60][61].Previous studies illustrated that teleost fish, like other vertebrates, contain three complement pathways, including a classical pathway, alternative pathway and lectin pathway [62].The complement component C3, as a crucial factor in the complement pathways, is cleaved into C3a and C3b, and then to C5, to form the membrane attack complex (MAC).Some studies have indicated that the classical and alternative pathways in Immune system Chemokine signaling pathway 149 Immune system Platelet activation 131 Immune system Leukocyte transendothelial migration 119 Immune system T cell receptor signaling pathway 111 Immune system Fc gamma R-mediated phagocytosis 107 Immune system Toll-like receptor signaling pathway 83 Immune system Natural killer cell mediated cytotoxicity 78 Immune system B cell receptor signaling pathway 75 Immune system Complement and coagulation cascades 65 Immune system Antigen processing and presentation 58 Immune system RIG-I-like receptor signaling pathway 55 Immune system Fc epsilon RI signaling pathway 55 Immune system Hematopoietic cell lineage 50 Immune system Cytosolic DNA-sensing pathway 47 Immune system NOD-like receptor signaling pathway 46 doi:10.1371/journal.pone.0157413.t004 teleost fish, as the first line of defense, have a significant effect on pathogenic microorganism invasions [63,64].However, there is very little information about the lectin pathway in fish [65].Complement component C4 plays an important role in activating downstream paths in the classical pathway [62].In the present study, the expression level of C4 was affected in a time-dependent manner, by A. hydrophila infection.However, the expression results were opposite to those found in grouper (Epinephelus coioides) [66].These differences may have been a result of the different fish species, or because the experiments were conducted at different developmental stages of the fish.In C. idella, most of the genes involved in the complement pathway were significantly up-regulated between 0 h and 4 h, illustrating that the pathway was activated during the early stages of the infection.

Gene expression in the complement-related pathways
In order to intuitively understand the effects of infection on the innate immune system, we considered the phagosome (   As described above, the expression levels of the TUBB gene increased between 0 h and 4 h; we hypothesize that bacterial infection of C. idella initially resulted in the activation of the complement components, causing cells to up-regulate their expression of the tubulin gene.This would have allowed transport of the suddenly increased amount of complement pathwayrelated proteins out of the cells.The MARCO gene is involved in phagocytosis; some studies have indicated that lipopolysaccharides (LPS) can induce autophagy and bacteria have been found in macrophages [67].In the present study, the MARCO gene expression was up-regulated at 12 h.However, most complement pathway-related genes were down-regulated at 12 h, such as C2, C3, C4, C5 and C8A.This was probably because of the activation of phagocytosis, which would have followed the identification of the bacteria by the complement system.The results illustrate that a putative regulatory network may exist in the innate immune system.

Conclusions
In this study, we performed high-throughput sequencing of spleen tissue from C. idella specimens infected by A. hydrophila.52,668 unigenes and 2,992 DEGs were obtained from the transcriptome data.89 DEGs were annotated as part of the immune system.The results indicate that the response of grass carp to A. hydrophila infection depends on the complement and coagulation cascades pathways.Additionally, complement-related pathways are involved in the defense against bacterial infection.The present study also provides evidence of the immune mechanisms in teleost fish.Improvements in activation of the complement components in C. idella would likely benefit their anti-bacterial defenses.Our study not only provides useful data for further studies on the innate immune system, but also facilitates improving disease resistance breeding for C. idella.

Fig 1 .
Fig 1. Unigene distribution in the C. idella.(A) E-value distribution of unigenes annotated into public databases with an E-value cut-off of 1E-50.(B) Identity distribution of unigenes annotated into public databases with an E-value cut-off of 1E-50.doi:10.1371/journal.pone.0157413.g001

Table 4 .
Top 15 list of pathways related to immune system.

Fig 7 )
, endocytosis (Fig 8), antigen processing (Fig 9) pathway, because these are the complement-related pathways.The expression levels of 18 DEGs were measured by qRT-PCR, which showed corresponding results with the sequencing data.The tubulin beta (TUBB) gene was significantly up-regulated between 4 h and 12 h (Fig 7A).The macrophage receptor with collagenous structure (MARCO) gene was significantly up-

Table 1 .
Summaries of cDNA libraries sequencing of grass carp by Illumina NextSeq 500 platform.

Table 2 .
Blast X search analysis of all transcripts of C. idella.

Table 3 .
Statistics of the annotation results for the C. idella unigenes.

Table 5 .
DE genes pathways classification of C. idella.