Transcriptome Analysis of Pacific White Shrimp (Litopenaeus vannamei) Hepatopancreas in Response to Taura Syndrome Virus (TSV) Experimental Infection

Background The Pacific white shrimp, Litopenaeus vannamei, is a crustacean species cultured worldwide and of considerable commercial value. Over the last two decades, Taura syndrome virus (TSV) has seriously threatened the shrimp aquaculture industry in the Western Hemisphere. To better understand the interaction between shrimp immunity and TSV, we performed a transcriptome analysis of the hepatopancreas of L. vannamei challenged with TSV, using 454 pyrosequencing (Roche) technology. Methodology/Principal Findings We obtained 126919 and 102181 high-quality reads from the TSV-infected and non-infected (control) L. vannamei cDNA libraries, respectively. The overall de novo assembly of the cDNA sequence data generated 15004 unigenes, with an average length of 507 bp. Based on BLASTX searches (E-value <10−5) against the NR, Swiss-Prot, GO, COG and KEGG databases, 10425 unigenes (69.50% of all unigenes) were annotated with gene descriptions, gene ontology terms, or metabolic pathways. In addition, we identified 770 microsatellites and designed 497 sets of primers. Comparative genomic analysis revealed that 1311 genes were differentially expressed in the infected shrimp compared to the controls, including 559 up-regulated and 752 down-regulated genes. Among the differentially expressed genes, several are involved in various animal immune functions, such as antiviral and antimicrobial activity, proteases, protease inhibitors, signal transduction, transcriptional control, cell death and cell adhesion. Conclusions/Significance This study provides valuable information on shrimp gene activities against TSV infection. The results can contribute to the in-depth study of candidate genes in shrimp immunity and improve our current understanding of this host-virus interaction. In addition, the large number of transcripts reported in this study provides a rich source for the identification of novel genes in shrimp.


Introduction
Taura syndrome, caused by Taura syndrome virus (TSV), is a contagious viral disease of penaeid shrimp [1]. Over the last two decades, TSV has seriously threatened the shrimp aquaculture industry and caused serious economic losses [2,3]. In cultured Pacific white shrimp (Litopenaeus vannamei), which has become the major aquacultured crustacean species in the Western Hemisphere [4], TSV can cause cumulative mortality ranging from 40% to >90% [5]. Shrimp that survive TSV infection may carry the virus for life [6,7]. TSV was first discovered in South America, but later spread to North America, Hawaii and Asia [8][9][10][11]. TSV is a small, simple RNA virus. Its genome is a positive-sense single-stranded RNA of 10.2 kb containing two open reading frames [12]. Although considerable progress has been made in the molecular characterization of TSV [13], no effective cure for this disease has yet been found [14]. Shrimp lack an acquired immune system. Their defense is considered to depend entirely on innate, non-adaptive mechanisms to defend against invasion by pathogens [15]. An understanding of the host-pathogen interaction will be helpful in controlling infectious diseases in shrimp [15]. Therefore, the molecular response of shrimp to viral infection is becoming an increasingly important subject for study [16]. Recently, notable progress has been made in understanding the molecular mechanism of the shrimp-virus interaction [17], and many genes involved in viral infection in shrimp have been found, such as lectins, antimicrobial peptides, blue blood protein and superoxide dismutase [18][19][20][21]. Most of these genes were identified using suppression subtractive hybridization (SSH). SSH is an effective approach for identifying differentially expressed genes among different biological processes [22]. However, subtractive hybridization does not provide a quantitative measure of expression differences, and its experimental results often contain a large number of false positives [23].
Recently, high-throughput sequencing technologies, such as the Illumina Genome Analyzer, the Applied Biosystems SOLiD platform, and the 454 Life Sciences (Roche) pyrosequencing platform, have provided a rapid method to identify differentially expressed genes and their expression profiles [24][25][26].
Identifying host genetic factors that respond to pathogens is of great significance for shrimp breeding and production. However, information on the host genes involved in TSV pathogenesis is still limited. To our knowledge, there is no previous report of isolating genes that are involved in TSV infection. In this study, we performed a transcriptome analysis of the hepatopancreas of L. vannamei challenged with TSV, using a high-throughput sequencing method (Roche 454 pyrosequencing). The aim of this study was to discover new genes involved in TSV infection and to better understand the virus-host interaction. Furthermore, the large number of transcripts produced by high-throughput sequencing in this study provides a strong basis for future genomic research on shrimp.

Shrimp, Virus and Challenge
L. vannamei from a specific pathogen-free (SPF) line (High Health Aquaculture, Kona, Hawaii, USA) were used in this study. The shrimp (11-12 g body weight) were provided by the National and Guangxi Vannamei Genetic Breeding Center, Guangxi Province, China, held in environmentally controlled 1000-liter glass saltwater tanks (32-ppt salinity, 25 to 26°C) and fed an artificial pellet feed. The shrimp were randomly sampled and tested by PCR to confirm that they were free of TSV [5]. The challenge experiment used two shrimp groups: one TSV challenge group and one negative control group (20 shrimp per group). The challenge group was fed once a day for 3 consecutive days with minced virus-infected tail tissue at 10% of their body weight. In parallel, the negative control group was fed once a day throughout the 3-day test period with minced PCR-confirmed [5] healthy tail tissue at 10% of their body weight. At 72 hours after infection, the hepatopancreas tissues of the shrimp were collected in cryotubes and stored in liquid nitrogen for later RNA extraction.

RNA Extraction, cDNA Library Construction, and Deep Sequencing
Total RNAs were extracted from TSV-infected and non-infected shrimp hepatopancreas using TriReagent (Qiagen), and the mRNAs were purified from the total RNAs using the PolyATtract mRNA isolation system (Promega) following the manufacturer's instructions. Integrity and size distribution were checked with a Bioanalyzer 2100 (Agilent Technologies, USA). Equal amounts of the high-quality mRNA samples from each group were then pooled for cDNA synthesis and sequencing. The normalized cDNA library was prepared following the 454 mRNA pyrosequencing sample preparation procedure (Roche, IN, USA). Library construction and pyrosequencing were carried out by Beijing Autolab Biotechnology Co., Ltd. on a 454 GS FLX system (Roche).

De novo Assembly and Functional Annotation
Raw sequencing reads were quality trimmed, and adaptor sequences were removed before the assembly. After removal of low-quality reads, processed reads were assembled using CAP3 software with default parameters [27]. The overall assembly was performed using the combined sequence data for both the TSV-infected sample and the non-infected sample. The contigs and singletons were generally referred to as unigenes. Subsequently, the unigenes were subjected to BLASTX similarity searches against the NCBI non-redundant protein database and the Swiss-Prot database using BLASTALL programs with an E-value threshold of 10−5 [28]. All annotated unigenes were used to determine the COG term, GO term and KEGG pathway with a cut-off E-value of 10−5 using BLASTX [29,30].
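The E-value cut-off described above can be illustrated with a minimal sketch. It assumes the BLASTX hits were exported in the standard 12-column tabular format (E-value in the 11th column); the output format actually used in this study is not stated, and the function name, identifiers and example rows below are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)} with the lowest-E-value hit per query.

    Assumes 12-column tab-separated BLAST output; hits above the cutoff are dropped.
    """
    hits = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue
        if query not in hits or evalue < hits[query][1]:
            hits[query] = (subject, evalue)
    return hits


# Hypothetical example rows (not data from this study).
EXAMPLE = "\n".join([
    "unigene1\tprotA\t85.0\t120\t18\t0\t1\t360\t5\t124\t2e-30\t130",
    "unigene1\tprotB\t60.0\t100\t40\t0\t1\t300\t9\t108\t1e-08\t58",
    "unigene2\tprotC\t30.0\t50\t35\t0\t1\t150\t2\t51\t0.5\t25",
])

annotated = best_hits(EXAMPLE)
print(annotated)
```

In this toy input, only unigene1 keeps an annotation, because the single hit for unigene2 has an E-value above the cut-off.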

Identification of Differentially Expressed Genes
For differential gene expression analyses, RPKM (reads per kilobase per million reads) values were used as the normalized gene expression levels [31]. Statistical comparison of RPKM values between the TSV-infected sample and the non-infected sample was conducted using the web tool IDEG6 (http://telethon.bio.unipd.it/bioinfo/IDEG6_form/) [32]. A false discovery rate (FDR) <0.001 was used as the threshold for the P-value in multiple testing to judge the significance of gene expression differences [33]. Genes were considered differentially expressed in a given library when the P-value was ≤0.001 and a greater than two-fold change (absolute value of log2 ratio >1) in expression across libraries was observed.
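As a rough illustration of the normalization and thresholds described above, the following sketch computes RPKM and the log2 ratio for a single unigene. It assumes the cleaned read totals of each library (126919 and 102181) serve as the library sizes and that the P-value is supplied by an external test such as IDEG6; the function names, read counts and unigene length are hypothetical.

```python
import math


def rpkm(read_count, length_bp, total_reads):
    """Reads per kilobase of transcript per million reads in the library."""
    return read_count * 1e9 / (length_bp * total_reads)


def log2_ratio(rpkm_infected, rpkm_control):
    """log2 fold change (infected over control); both values must be positive."""
    return math.log2(rpkm_infected / rpkm_control)


def is_differential(log2_fc, p_value, max_p=0.001, min_abs_log2=1.0):
    """Apply the thresholds from the text: P-value <= 0.001 and |log2 ratio| > 1."""
    return p_value <= max_p and abs(log2_fc) > min_abs_log2


# Library sizes taken from the cleaned read counts reported in the text
# (assumed here to be the normalization denominators).
INFECTED_READS = 126919
CONTROL_READS = 102181

# Hypothetical 1000-bp unigene with 300 reads (infected) and 100 reads (control).
infected = rpkm(300, 1000, INFECTED_READS)
control = rpkm(100, 1000, CONTROL_READS)
fc = log2_ratio(infected, control)
print(round(fc, 2), is_differential(fc, p_value=0.0005))
```

A unigene with zero reads in one library would need separate handling (for example a pseudocount), which this sketch leaves out.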

Quantitative RT-PCR Analysis
To validate our 454 sequencing data, six differentially expressed L. vannamei genes (cathepsin-L, arginine kinase, fatty acids binding protein, alternative splicing factor, sorbitol dehydrogenase and hemocyanin) were selected for quantitative RT-PCR analysis, using the same RNA samples as for the transcriptome profiling. Primers were designed using the Primer5 software (Premier Biosoft International) (Table S1). First strand cDNA was synthesized from 1 μg of RNA using M-MuLV reverse transcriptase (Qiagen). The qPCR reaction mixture (20 μL) consisted of 2× Power SYBR Green PCR Master mix, 0.9 μM each of the forward and reverse primers, and 1 μL of template cDNA. PCR amplification was performed under the following conditions: 50°C for 2 min and 95°C for 30 s, followed by 40 cycles of 95°C for 15 s and 62°C for 1 min, and a final extension at 72°C for 5 min.

Identification of Microsatellites
All the assembled cDNA contigs from both the infected library and the control library were used for identification of microsatellites. All types of microsatellites from dinucleotides to hexanucleotides were detected using the MISA software [34] with default parameters (for all repeat types, minimum total length = 15 bp and minimum repeats = 3). Primers were designed using the primer3 software [35].
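The search criteria above (di- to hexanucleotide motifs, at least 3 repeats, at least 15 bp in total) can be sketched with a regular expression. This is a simplified stand-in written for illustration, not the MISA implementation; the function names and the test sequence are hypothetical.

```python
import re

MIN_TOTAL_LENGTH = 15  # bp, as stated in the text
MIN_REPEATS = 3        # as stated in the text


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = sequence.upper()
    found = []
    for unit in range(2, 7):  # dinucleotide to hexanucleotide motifs
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, MIN_REPEATS - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            length = match.end() - match.start()
            if is_primitive(motif) and length >= MIN_TOTAL_LENGTH:
                found.append((match.start(), motif, length // unit))
    return sorted(found)


# Hypothetical sequence containing an (AT)8 and a (CAG)5 repeat.
EXAMPLE = "GGC" + "AT" * 8 + "CCT" + "CAG" * 5 + "TT"
print(find_ssrs(EXAMPLE))
```

The `is_primitive` check keeps a run such as (AT)8 from also being reported as a tetranucleotide (ATAT)4 repeat. Compound and imperfect microsatellites are not handled by this sketch.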

Pyrosequencing and Assembly
To identify the genes involved in L. vannamei response to TSV infection, we created two cDNA libraries from pooled mRNAs extracted from the hepatopancreas of TSV-infected and noninfected (control) groups, respectively. The two libraries were subjected to a pyrosequencing run on the 454 GS FLX system, resulting in 131745 (TSV-infected sample) and 110721 (noninfected sample) raw reads, respectively. Files containing these data were deposited in the Short Read Archive of the National Center for Biotechnology Information (NCBI) with accession numbers of SRR554365 (TSV-infected) and SRR556131 (noninfected). After filtering for adaptors and low-quality sequences, the TSV-infected library generated 126919 cleaned reads, ranging from 41 bp to 620 bp, with the average length of 367 bp and N50 length of 454 bp (Table S2). In the non-infected library, a total of 102181 cleaned reads were obtained, ranging from 45 bp to 619 bp, with the average length of 364 bp and N50 length of 454 bp (Table S3). The overall assembly was performed using the combined cleaned reads from the two libraries. De novo assembly using the CAP3 software produced 15004 unigenes (including contigs and singletons) with an average length of 507 bp, ranging from 42 to 8750 bp (Table 1).
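For readers unfamiliar with the N50 statistic reported above, a minimal sketch of the usual definition follows: the length L such that sequences of length L or longer contain at least half of all bases. The exact tool used to compute N50 in this study is not stated, and the lengths below are hypothetical.

```python
def n50(lengths):
    """Length at which the cumulative sum of descending lengths reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one length")


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Hypothetical read lengths in bp (not data from this study).
example_lengths = [200, 300, 400, 500, 600]
print(n50(example_lengths), mean_length(example_lengths))
```

Because N50 weights longer sequences more heavily, it is typically at least as large as the mean length, which matches the pattern reported for both libraries (N50 of 454 bp against averages of 367 bp and 364 bp).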

Functional Annotation
All unigenes were compared with the Swiss-Prot and the NCBI non-redundant (NR) protein databases for functional annotation by using BLASTX with an E-value threshold of 10−5. Among the 15004 unigenes from both the TSV-infected library and the non-infected library, 4400 (29.33%) showed significant matches (E-value ≤10−5) in the Swiss-Prot database, and 10412 (69.39%) showed significant matches (E-value ≤10−5) in the NR database. In total, 10425 (69.50%) unigenes were annotated in the Swiss-Prot or NR database.
Gene ontology (GO) analysis was performed with the unigenes from both the infected library and the control library. A total of 6567 and 6604 unigenes mapped to biological processes, 2977 and 2828 unigenes mapped to molecular functions, and 5206 and 5222 unigenes mapped to cellular components in the TSV-infected library and the non-infected control library, respectively (Table S4). The functional distribution of the genes in the two libraries was similar. In both libraries, most of the biological process genes were involved in metabolic processes and cellular processes. Most of the molecular function genes were associated with catalytic activity, binding, and molecular transducer activity; most of the cellular component genes encode proteins associated with the cell, cell parts and organelles (Figure 1).
We also searched the annotated sequences for the genes involved in COG classification. Among the 25 COG categories, the cluster for 'General function prediction only' (15.388%) represented the largest group, followed by the 'Posttranslational modification, protein turnover, chaperones' (12.83%) and 'Translation, ribosomal structure and biogenesis' (7.227%) clusters (Figure 2).
In summary, for functional annotation, all the unigenes were searched against the NR, Swiss-Prot, GO, COG, and KEGG databases by BLASTX with a cut-off E-value of 10−5. By this method, 10425 unigenes (69.50% of all unigenes) returned an above cut-off BLAST result (

Identification of Differentially Expressed Genes
To identify differentially expressed genes potentially involved in TSV infection, we constructed two normalized cDNA libraries from pooled mRNAs extracted from the hepatopancreas of the TSV-infected and non-infected groups, respectively. Subsequently, these libraries were sequenced by 454 GS FLX technology. Comparison of gene expression revealed 1311 genes differentially expressed in TSV-infected shrimp compared to the control, including 559 up-regulated genes and 752 down-regulated genes. The number of down-regulated genes is much larger than that of up-regulated genes, which might be consistent with the observation that virus-infected shrimp are less active. Among the 1311 differentially expressed genes, 1061 (80.93%) were well annotated, whereas the remaining 250 (19.07%) had low sequence homology to known sequences in public databases, suggesting that they might be putative novel genes in L. vannamei involved in the response to TSV infection. All differentially expressed unigenes with their Nr, Nt, Swiss-Prot, GO, COG, KEGG and ORF analyses are presented in Table S6.
To validate our RNA-seq results, six differentially regulated genes with different total transcript read counts (range 44-1076) were selected for quantitative real-time PCR (qRT-PCR) analysis. The results indicate that the qRT-PCR analysis of the relatively highly abundant genes (>500 reads) agrees well with the 454 sequencing analysis. For example, based on the 454 sequencing analysis, cathepsin-L (CATL), arginine kinase (AK) and fatty acids binding protein (FABP) were differentially regulated 1.48, −1.26 and −2.53 log2-fold, respectively, and showed 1.25, −1.62 and −2.31 log2-fold changes, respectively, in the qRT-PCR analyses (Figure 3). However, the qRT-PCR results for the relatively low-abundance genes (<500 reads), including alternative splicing factor (ASF), sorbitol dehydrogenase (SDH) and hemocyanin (HCS), do not match the 454 sequencing analysis perfectly, although they show similar trends in the up- or down-regulation of the genes analyzed by 454 sequencing (Figure 3). Nevertheless, the qRT-PCR analysis confirmed the direction of change detected by the 454 sequencing analysis.

Candidate Genes Involved in L. vannamei Immune Response
Among the genes that were found to be differentially expressed in the TSV-infected shrimp compared to non-infected controls, several are involved in various processes of the animal immune response (Table 3). These were classified under 8 functions: antiviral proteins, antimicrobial proteins, proteases, protease inhibitors, signal transducers, transcription factors, cell death and cell adhesion. Antiviral proteins are proteins that are induced by interferon in virus-infected human or animal cells and mediate interferon inhibition of virus replication [36]. Among the differentially expressed genes homologous to antiviral proteins, we found that a homolog of Zinc finger CCCH-type antiviral protein was significantly up-regulated in TSV-infected shrimp compared to non-infected controls. The up-regulation of this gene after viral infection suggests that it may be involved in the shrimp immune response. Antimicrobial proteins are an important component of the natural defenses of most living organisms against invading pathogens [15]. They interfere with microbial integrity or metabolism by targeting structures or nutrients specific to microbes [37]. Of the antimicrobial proteins identified in this study, Lysozyme and Histone H2A were significantly up-regulated after TSV infection, indicating that they may play important roles in shrimp defense against the virus. Also of interest for the study of virus-host interactions is the identification of genes involved in signal transduction, as signal transduction molecules have been suggested to play important roles in viral recognition and replication [38]. We identified 8 differentially expressed genes involved in signal transduction, including p38 MAPK, Map kinase-interacting serine/threonine, Serine/threonine-protein phosphatase alpha-1 isoform, Senescence-associated protein, Transmembrane BAX inhibitor motif-containing protein 4, C-type lectin, Innexin, and Fatty-acid amide hydrolase 2. All of these genes were up-regulated.
Genes in the remaining categories, which are involved in transcriptional control, cell adhesion and cell death, may also play important roles during TSV infection, as the processes regulated by these genes have been suggested to modulate phagocytic events, cellular remodeling, recruitment of immune cells to sites of insult, and extracellular immune cascades such as the melanization response [39].

Identification of Microsatellites
Microsatellites (or simple sequence repeats, SSRs) are repetitive sequence motifs of 1-6 bp [40,41]. Although they are widely used as molecular markers due to their variability and abundance in the genome, codominant expression and Mendelian inheritance [42], only a limited number of microsatellite sequences have been reported for L. vannamei [43]. In this study, we obtained 770 microsatellites: 184 di-nucleotide repeats (23.90%), 284 tri-nucleotide repeats (36.88%), 279 tetra-nucleotide repeats (36.23%) and 23 penta-nucleotide repeats (2.99%) (Table S7). We also designed 497 primer sets using the Primer3 program (Table S8). These microsatellites have potential utility for genetic mapping, population structure and gene flow studies of L. vannamei.

Discussion
Taura syndrome (TS) is a major cause of shrimp mortality in cultured L. vannamei in the Americas and Asia [1]. Although there are many published reports on the characterization and detection of TSV, little is known about the interaction between this virus and shrimp. Understanding the interaction between a host and its pathogen is useful not only for studies on molecular immune mechanisms, but also for agricultural practice, where it provides a theoretical basis for developing effective strategies to prevent viral disease. Roche 454 RNA sequencing (RNA-Seq) is a powerful new method for discovering novel genes and investigating gene expression patterns, especially in non-model organisms that do not have sequenced genomes [44]. Like many other crustacean species with significant economic value, L. vannamei lacks a complete genome sequence and most other genetic tools and resources. In this study, we used 454 RNA-Seq to investigate the gene expression changes associated with TSV infection. We identified a total of 15004 unigenes in L. vannamei, 4579 (30.52%) of which were new transcripts compared to known genes in public databases. Comparative analysis of transcriptome changes between TSV-infected and non-infected shrimp revealed 1311 differentially expressed genes, of which 559 were up-regulated and 752 down-regulated. Our sequencing data analyses indicate that TSV infection has a significant impact on the transcriptome profile of the L. vannamei hepatopancreas.
Among the differentially expressed genes found in this study, several had been previously reported to be involved in the shrimp response against white spot syndrome virus (WSSV), such as C-type lectin and hemocyanin [15,17,45,46,47,48]. Animal C-type lectins play important roles in innate immunity by recognizing and eliminating pathogens efficiently [49]. In invertebrates, C-type lectins are involved in non-self immune recognition and pathogen phagocytosis through opsonization [50]. Several studies reported that the expression of C-type lectins in shrimp hepatopancreas was greatly affected after challenge with WSSV [20,51,52,53]. In this study, we found 15 unigenes homologous to C-type lectins, and their expression changed significantly after TSV infection. Hemocyanin is another well-known immune-related gene previously reported to be involved in viral infection [54]. Hemocyanins are the oxygen-transporting proteins in arthropods and molluscs [55]. Hemocyanins also have defense-related functions that are mediated through phenoloxidase activity. Several previous studies reported that hemocyanins in shrimp were greatly over-expressed during WSSV infection [56][57][58]. Similarly, several other previously reported differentially expressed genes, such as heat shock protein, lysozyme and fatty acid-binding protein, were strongly up-regulated in shrimp challenged with WSSV [59][60][61]. In the present study, up-regulation of these genes has also been observed in shrimp challenged with TSV. This suggests that these genes might have a similar expression pattern in response to virus infection, regardless of the virus species.
Although some of the differentially expressed genes found in this study had not been previously reported to be involved in virus-host interaction, they were annotated to pathways known to be involved in various processes of animal defense against pathogens, such as cell death/apoptosis and mitogen-activated protein kinase (MAPK). The cell death/apoptosis pathway is known to be related to the cell hypersensitivity response, blocking pathogen progression and systemic resistance [62,63]. Among the genes involved in cell death/apoptosis, we found lysosomal aspartic protease, ATP binding cassette transmembrane transporter, caspase, beclin protein and apoptosis-inducing factor 1. Such genes may respond to the viral infection by controlling the extent of cell death in the defense response. MAPK is another noteworthy pathway that is activated during virus infection and contributes to virus replication in animal or plant cells [64]. Among the differentially expressed genes, we found that map kinase-interacting serine/threonine-protein kinase 2, heat shock protein 70, p38 MAPK and polyubiquitin-C shared homology to signaling molecules of the MAPK pathway. These genes are likely to be involved in the response to TSV infection.
In summary, we employed the 454 pyrosequencing technique to investigate the transcriptome profile of L. vannamei challenged with TSV. Comparative transcriptome analysis between the TSV-infected and control groups revealed significant differences in gene expression. Although the molecular functions of some genes and their associated pathways remain largely unknown, this study provides valuable information on the antiviral mechanism in shrimp and the role of the differentially expressed genes in the response to TSV infection. Furthermore, the large number of transcripts and molecular markers obtained in this study provides a strong basis for future genomic research on shrimp.

Supporting Information
Table S1 Primers used in qRT-PCR for validation of differentially expressed genes. (XLS)