Differential STAT gene expressions of Penaeus monodon and Macrobrachium rosenbergii in response to white spot syndrome virus (WSSV) and bacterial infections: Additional insight into genetic variations and transcriptomic highlights

Diseases have remained the major issue for shrimp aquaculture industry for decades by which different shrimp species demonstrated alternative disease resistance or tolerance. However, there had been insufficient studies on the underlying host mechanisms of such phenomenon. Hence, in this study, the main objective involves gaining a deeper understanding into the functional importance of shrimp STAT gene from the aspects of expression, sequence, structure, and associated genes. STAT gene was selected primarily because of its vital signalling roles in stress, endocrine, and immune response. The differential gene expressions of Macrobrachium rosenbergii STAT (MrST) and Penaeus monodon STAT (PmST) under White Spot Syndrome Virus (WSSV) and Vibrio parahaemolyticus/VpAHPND infections were identified through qPCR analysis. Notably, during both pathogenic infections, MrST demonstrated significant gene expression down-regulations (during either early or later post-infection time points) whereas PmST showed only significant gene expression up-regulations. Important sequence conservation or divergence was highlighted through STAT sequence comparison especially amino acid alterations at 614 aa [K (Lysine) to E (Glutamic Acid)] and 629 aa [F (Phenylalanine) to V (Valine)] from PmST (AY327491.1) to PmST (disease tolerant strain). There were significant differences observed between in silico characterized structures of MrST and PmST proteins. Important functional differentially expressed genes (DEGs) in the aspects of stress, endocrine, immune, signalling, and structural were uncovered through comparative transcriptomic analysis. The DEGs associated with STAT functioning were identified including inositol 1,4,5-trisphosphate receptor, hsp90, caspase, ATP binding cassette transmembrane transporter, C-type Lectin, HMGB, ALF1, ALF3, superoxide dismutase, glutathione peroxidase, catalase, and TBK1. The main findings of this study are STAT differential gene expression patterns, sequence divergence, structural differences, and associated functional DEGs. These findings can be further utilized for shrimp health or host response diagnostic studies. STAT gene can also be proposed as a suitable candidate for future studies of shrimp innate immune enhancement.

STAT, Suppressors of Cytokine Signalling (SOCS), and Protein Inhibitor of Activated STAT have also been identified through various research efforts in the past years [27][28][29].
Furthermore, there had been some studies on the STAT gene expression changes in different shrimp species after pathogenic infections. STAT gene expression was up-regulated in Fenneropenaeus chinensis challenged with WSSV and Vibrio anguillarum [30]. The significant upregulation of Marsupenaeus japonicus [31] and L. vannamei [32] STAT gene expressions were identified under WSSV infection. Nevertheless, under some diseased conditions, Macrobrachium spp. also demonstrated non-differential STAT gene expression. For example, M. nipponense had no significant STAT gene expression changes after Aeromonas hydrophila [33] and non-O1 Vibrio cholerae [34] bacterial infections. Interestingly, due to the immune signalling importance of STAT gene, there exists risk of shrimp STAT manipulation by the invading pathogens as demonstrated by the shrimp STAT hijacking by WSSV virus [35].
The study of gene expression is an efficient strategy for the fast and accurate determination of gene activation or repression during pathogenic infections. Real time quantitative PCR (qPCR) and RNA-Seq analyses are more commonly utilized for gene expression studies in recent decades [36,37]. qPCR involves the usage of intercalating dyes or probes and qPCR machine for the determination of gene expression fold change between different treatment groups [38]. RNA-Seq utilizes high throughput next-generation sequencing (NGS) technology and has advantages in terms of price, efficiency, difficulty, and application range compared to more traditional methods such as microarray [39].
Despite the increasing numbers of gene expression studies involving pathogen-challenged shrimps, there had been a lack of research focusing on the comparison of immune gene expressions across different pathogenic conditions especially STAT gene. Therefore, this study involved identification and comparison of differential gene expressions of M. rosenbergii STAT (MrST) and P. monodon STAT (PmST) upon WSSV and V. parahaemolyticus/Vp AHPND infections. STAT gene was selected mainly because of its diverse functional importance through JAK-STAT pathway. This was followed by sequence and structure divergence identification between MrST and PmST. This is because genetic variations can lead to significant gene expression changes and functional alterations during pathogenic infections. A comparative transcriptomic analysis was also conducted to elucidate the underlying stress, endocrine, immune, signalling, and structural DEGs associated with STAT gene functioning during pathogenic infections. Overall, this study had the aim of obtaining more information on the functional importance of shrimp STAT genes involved in the aspects of expression, sequence, structure, and associated genes. The aim was successfully achieved.

Pathogen preparations
For WSSV virus propagation, the feeding of local P. monodon shrimps (15-20 g body weight) with WSSV-infected shrimp muscle tissues was conducted. The moribund shrimps were confirmed to be WSSV positive through PCR [40] and stored at -80˚C. WSSV virus stock solution was then prepared [41] which involved the homogenization and lysis of the WSSV-infected shrimp muscle tissues in TN Buffer followed by centrifugation, filtration, and storage at -80˚C. The WSSV stock solution viral copy number was quantified using primer pairs VP28-140Fw and VP28-140Rv [42].
On the other hand, P. monodon suspected with AHPND outbreak were collected and validated through both clinical sign observation and AP3 PCR detection method [43]. The Vp AHPND bacteria [44] were selectively propagated by incubating the digestive organs of Vp AHPND -infected shrimps in the order of tryptic soy broth (TSB+), thiosulfate citrate bile salt (TCBS) agar, and tryptic soy agar (TSA+). The bacteria preservation was done through cryovials (CRYOBANK™) at -80˚C and utilized for downstream experiments.

Pre-challenge works
For the WSSV and V. parahaemolyticus challenge with Macrobrachium rosenbergii, M. rosenbergii juvenile prawns (5-8 g body weight) were purchased from a hatchery at Kuala Kangsar, Perak, Malaysia. The acclimatization of the prawns was conducted for seven days under aseptic experimental setup. Each tank contained 10 prawns with aerated freshwater at 28 ± 1.0˚C.
Whereas for WSSV challenge with P. monodon, locally obtained juvenile 4 th generation P. monodon shrimps (15-20 g body weight) of Mozambique, Africa strain (10 shrimps per tank) were acclimatized for seven days under aseptic experimental setup with aerated artificial seawater (30 ppt) at 28 ± 1.0˚C. For the Vp AHPND experimental challenge, disease tolerant crossbred (13 th generation Madagascar strain with 5 th generation local strain) juvenile P. monodon shrimps (15-20 cm body length) were involved. The acclimatization of the shrimps (27 shrimps each tank) was done for seven days under aseptic experimental setup with aerated artificial seawater (30 ppt) at 28 ± 1.0˚C.
The negative screening of the prepared M. rosenbergii and P. monodon shrimps was conducted before experimental challenge using PCR methods and confirmed to be WSSV-free [40] and V. parahaemolyticus/Vp AHPND -free [43] respectively.
Besides that, for the WSSV experimental challenge, P. monodon shrimps were injected with 100 μl filtered WSSV stock solution (4.11 x 10 5 copies/μl). Sterile PBS was injected for the negative control group shrimps. The shrimp hepatopancreas collection was done at 0, 3, 6, 12, 24, and 48 hpi and also 12 days post-infection (dpi) (survivors) and stored at -80˚C. The challenge details were described in previous publication [47].
All tanks were equipped with aerators and water filters. The experimental challenges were conducted with three biological replicates for each treatment and control groups. The positive screening of the challenged shrimps was done through PCR methods for WSSV [

Total RNA extraction and first strand cDNA synthesis
Total RNA samples were extracted from shrimp hepatopancreas at each post-infection time interval of both treatment and control groups using NucleoSpin RNA II Extraction Kit (Macherey's-Nagel, Germany), RNA Isolation Kit (Macherey's-Nagel, Germany), and TransZol Up Plus RNA Kit (TransGen Biotech, Beijing, China) respectively. The extracted RNA samples were also treated with TransScript 1 One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China) to achieve DNA contaminant removal and first strand cDNA synthesis for subsequent downstream applications. The manufacturer's protocols were followed for all kits utilized.

Expression profile comparison through qPCR analysis
The STAT gene expression profiles of M. rosenbergii (MrST) and P. monodon (PmST) during WSSV and V. parahaemolyticus/Vp AHPND infections were determined and compared through quantitative real-time PCR (qPCR) analysis. Three biological replicates with three technical replicates each were applied for every treatment group. The qPCR primers were designed through PrimerQuest Tool software (https://sg.idtdna.com/Primerquest/home/Index) and listed in S1 Table. The MrST qPCR experiments were conducted using TaqMan 1 Universal PCR Master Mix kit and Step One Plus Real-Time PCR System 1 instrument (Applied Biosystems, Foster City, CA, USA). The qPCR reaction (20 μl) consisted of 10 μl TaqMan Universal RT-PCR Master Mix, 1 μl primers/probe set containing 900 nM of forward reverse primers, 300 nM probe, 2 μl template cDNA, and nuclease-free water. The qPCR cycling program involved 50˚C for 2 mins, 40 cycles of 95˚C for 10 mins, 95˚C for 15 secs, and 60˚C for 1 min. Elongation factor 1-alpha (EF1a) gene was chosen as the internal control reference gene [49]. The experimental protocol details were mentioned previously [50].
The PmST qPCR experiments were carried out using GoTaq 1 qPCR Master Mix kit (Promega, Madison, Wisconsin, USA) and Agilent Technologies Stratagene Mx3005P instrument. The qPCR reaction (20 μl) included 10 μl GoTaq 1 qPCR 2X Mix, 500 nM forward primer, 500 nM reverse primer, 2 μl template cDNA, and nuclease-free water. The qPCR cycling program of 95˚C for 2 mins, 40 cycles of 95˚C for 15 s, and 56˚C for 35 secs was utilized. EF1a gene was selected as the internal control reference gene as well.
The analysis of the Ct values obtained was conducted through Livak's 2^ddCt relative quantification method [51]. The differential gene expression values determined were then statistically validated through One-Way ANOVA analysis with post hoc Duncan test using SPSS software Version 22 (Significance value: P<0.05). The post hoc Duncan test was carried out to identify the exact differences of different treatment groups (classified under alphabetical subsets) when One-Way ANOVA analysis was significant. The raw data for the qPCR experiments was shown in S1 Data.

Sequence and structural comparison of STAT genes
The MrST, PmST, and LvST sequences were retrieved from NCBI nucleotide database (NCBI Accession Numbers: KT380661.1; AY327491.1; HQ228176.1) [46]. The MrST sequence validation was done through PCR method and subsequent Sanger Sequencing analysis. In addition, the PmST sequence of disease tolerant P. monodon shrimps used in this study was determined through PCR technique using conserved site targeting strategy. The PCR primers involved were designed using PrimerQuest Tool software (https://sg.idtdna.com/Primerquest/ home/Index) and listed in S2

Identification and comparison of annotated differentially expressed genes (DEGs)
For additional identification and validation of STAT-related functional genes, the determination and comparison of Differentially Expressed Genes (DEGs) from M. rosenbergii and P. monodon under WSSV and V. parahaemolyticus/Vp AHPND infection conditions were conducted. The DEGs involved were identified and retrieved from RNA-Seq results of associated previous publications [44][45][46][47]. The data is available at the NCBI SRA database: SRR1424572, SRR1424574, SRR1424575, and SRP153251.
Generally, the extracted M. rosenbergii and P. monodon hepatopancreas RNA samples were treated with DNase and subsequently sent for cDNA library preparation and NGS sequencing using Illumina HiSeq 2000/BGI-SEQ 500 Sequencer platform by the Beijing Genome Institute (Hong Kong). The raw sequencing reads were filtered by which the clean reads were used for DEGs determination. This was followed by the functional annotation of the identified DEGs through mapping to different databases. The details of RNA-Seq data analysis were described in previous publications [44][45][46][47].
The stress, immune, and endocrine DEGs were mainly identified and compared between different treated samples. The patterns of interaction including co-activation or co-repression of DEGs under different pathogenic conditions especially those involved in STAT functioning were elucidated. The qPCR validation details of these RNA-Seq results were also described in previous publications [44-47].

Sequence comparison between MrST, PmST, and L. vannamei STAT (LvST)
Several selected shrimp STAT complete cds sequences, MrST (Accession Number: KT380661), PmST (disease tolerant strain), PmST (Accession Number: AY327491.1), and LvST (Accession Number: HQ228176.1) were obtained and compared using Clustal Omega software at translated amino acid (Fig 2) and nucleotide (S5 Fig) levels. The important conserved and diverged sites between the aligned sequences were marked by which a significant number of diverged sites were located at the 5' UTR and 3' UTR regions.
At nucleotide level, the major conserved area was found at the middle region whereas long conserved overlaps were more frequently identified at the start and end regions. Important diverged sites determined between PmST (disease tolerant strain) and PmST (AY327491.1)   Besides that, at translated amino acid level, both major conserved area and long conserved overlaps between compared STAT sequences were located at the middle region. Despite the multiple important diverged sites identified between the two compared PmST nucleotide sequences, at amino acid level, only two amino acid alterations were discovered at positions of  Both MrST and PmST protein sequences were predicted to be intracellular and non-transmembrane through Protter analysis. MrST and PmST proteins had high probabilities to be located within cytoplasmic region (estimated probability of 0.61) and nuclear region (estimated probability of 0.94) respectively based on the MultiLoc 2 prediction analysis. The secondary structures of MrST and PmST protein sequences were predicted as shown in S10A and S10B Fig. For MrST protein sequence, 23 high probability common motifs and 1 Src homology 2 (SH2) domain profile (probability score, 13.148) were found (S11A Fig). Whereas 25 high probability common motifs and 1 SH2 domain profile (probability score, 13.519) were matched to PmST protein sequence (S11B Fig). 3D protein structures of MrST and PmST protein sequences were predicted which contained α-helix, β-sheet, and coil structures (S12A and S12B

Comparative transcriptomic differentially expressed genes (DEGs) analysis
The important stress, endocrine, immune, signalling, and structural DEGs of M. rosenbergii and P. monodon during WSSV and V. parahaemolyticus/Vp AHPND infections were identified and compared in the Fig 3 below. More details including gene identities, differential expression values, and annotation sources were given in the S7 Table. Based  compared to P. monodon treatment groups. WP group possessed higher number of down-regulated DEGs compared to other treatment groups. Some DEGs were only down-regulated in WP group involving inositol 1,4,5-trisphosphate receptor, apoptosis-stimulating of p53 protein 1, mitochondrial coenzyme A transporter, polysaccharide lyase, trypsin, C-type Lectin, proPO, ceramide synthase, STAT, and TBK1. Intriguingly, Hsp90, transglutaminase, ALF1, ALF3, and ankyrin demonstrated sole up-regulation pattern across all treatment groups. Peroxisomal acyl-coenzyme A oxidase and catalase were up-regulated in M. rosenbergii treatment groups while down-regulated in P. monodon treatment groups. On the other hand, trehalose transporter was down-regulated in M. rosenbergii treatment groups while up-regulated in P. monodon treatment groups. Moreover, dopamine N-acetyltransferase and HMGB only showed up-regulation pattern in P. monodon treatment groups. Apoptosis-inducing factor (AIF) and IMD were only differentially expressed in Vp AHPND -infected treatment group. Caspase was down-regulated in all treatment groups except being up-regulated in VM group.

Differential gene expression pattern of MrST and PmST during WSSV and V. parahaemolyticus/Vp AHPND infections
STAT gene was chosen for gene expression analyses involving different pathogenic-challenged treatment groups because of its highly diverse gene functioning in the JAK-STAT signalling pathway which possesses stress, endocrine, and immune importance [20,21,23].
Unlike the potential up-regulation of STAT gene expressions in the early WSSV post-infection time points associated with previously described hijacking mechanism adapted by WSSV for viral replication [35], the MrST gene expressions were down-regulated in the early WSSV post-infection time points (Fig 1)  Intriguingly, elevated STAT gene expressions caused by VP28 vaccination also aided in lowering viral gene expression and thus slowed down WSSV establishment in WSSV-infected P. monodon juvenile [63]. This infers a competitive relationship between host STAT gene expression and WSSV viral gene expression. Hence, the up-regulation of PmST gene expressions from 3 hpi to 48 hpi in response to WSSV infection (Fig 1) in this study is suggested to be the collective effect of stronger immune response from disease-resistant P. monodon and WSSV viral hijacking.
On the other hand, for V. parahaemolyticus infection, the down-regulation pattern of MrST gene expressions observed (Fig 1) is supported by a similar scenario of down-regulated STAT gene expressions at later hpi of V. parahaemolyticus-infected S. paramamosain [64]. The upregulated PmST gene expressions identified in response to AHPND infection (Fig 1)

Important conservation and divergence between STAT sequences
The vital conserved areas determined between MrST, PmST (disease tolerant strain), PmST (AY327491.1), and LvST sequences (nucleotide and amino acid) can be applied in cross-species conserved primer development. This is exemplified by the development of conserved primers for decapod crustaceans (including shrimps) [66] and different bear species [67] for mitochondrial genome sequencing purpose. A probable common ancestry is inferred between MrST and LvST based on their overlapping stop codon positions at nucleotide level. This is supported by another inference of a single common ancestor origination for ALF genes of all crustaceans [68]. MrST sequence was most diverged from other compared STAT sequences at nucleotide (S5 Fig) and amino acid (Fig 2) levels. The differential gene expressions between MrST and PmST (Fig 1) might be significantly influenced by these genetic sequence (nucleotide and amino acid) variations involving divergence, additions or deletions. The important effect of genetic variations on the gene expressions had been highlighted by previous works [69,70].
Moreover, the two amino acid changes (614 aa and 629 aa) detected between PmST sequences compared could be essential in the enhanced functioning of PmST in disease tolerant P. monodon. These amino acid changes are caused by nonsynonymous mutations. The key outcome of such amino acid changes can be inferred to be the improvement or alteration of PmST protein recognition ability or binding affinity which can be related to some previous research findings [71][72][73]. The divergence at the 5' UTR and 3' UTR regions of the aligned STAT sequences suggests the potential involvement of pre-or post-transcriptional regulatory elements found in these regions in causing differential gene expressions and alternative disease tolerance. This is validated by previously determined significant correlation between increased gene expression and adaptive evolution in the 3' UTR and amino acid sequence [74].
By referring to the NCBI BLAST search results, although being closely related to shrimp species (F. chinensis) (89%), MrST amino acid sequence was also strongly conserved with crab species (S. paramamosain and E. sinensis) (88%; 87%). This is supported by the close evolutionary relationship between MrST and crab species (S. paramamosain and E. sinensis) in the phylogenetic analyses conducted (S6A and S6B Fig). Intriguingly, PmST amino acid sequence had high homology to P. trituberculatus (87%) in the NCBI BLAST search. This is supported by a previously determined mitochondrial DNA similarity between P. monodon and P. trituberculatus [75]. Furthermore, there was a close clustering between Macrobrachium prawns and Penaeidae shrimps (P. monodon, F. chinensis, and L. vananmei) in the phylogenetic analyses (S6A and S6B Fig), which suggests a similar ancestry between them.

In silico predicted MrST and PmST structural variations
The MrST protein sequence possessed slightly lower theoretical isoelectric point (pI) (6.04) compared to PmST protein sequence (6.11). The subcellular localization prediction of MrST and PmST proteins with intracellular and non-transmembrane properties was determined to be within the cytoplasmic (P = 0.61) and nuclear (P = 0.94) regions respectively. This is validated by a previously identified correlation of protein pI with subcellular localization by which cytoplasmic region contains higher number of acidic proteins compared to nuclear region [76]. Besides that, the four functional conserved domains (STAT_int, STAT5_CCD, STAT_ bind, and SH2) identified for both MrST and PmST protein sequences successfully contributed to the STAT gene identity validation of these sequences. The two amino acid changes described in Section 3.2 were found within the SH2 conserved domain which has functional importance in STAT dimerization and signalling specificity [77].

Additional transcriptomic insight into functional DEGs
Based on the transcriptomic DEGs displayed in Fig 3 and S7 Table, further understanding was obtained for the DEGs' functionalities particularly those related to STAT. This is important due to the insufficient number of previous studies on the direct comparison between the transcriptomic DEGs of M. rosenbergii and P. monodon under viral and bacterial infection conditions. Overall, the significantly higher number of up-regulated DEGs compared to downregulated DEGs is postulated to be the effect of host immune response activation during pathogenic infection. In addition, the higher number of up-regulated DEGs identified in M. rosenbergii compared to P. monodon can be correlated to its stronger immune response. Such strong immune response was previously demonstrated by the ability of adult M. rosenbergii to achieve clearance of WSSV virus compared to susceptible P. monodon [78]. Intriguingly, some down-regulated DEGs were uniquely found in the WP group which resulted in its relatively higher number of down-regulated DEGs among the compared treatment groups. This is postulated to be caused by the decreased host response of survived P. monodon after successful WSSV clearance.
The functioning of stress DEGs in the early stress-induced immunoendocrine response is inferred based on both up-regulation of these DEGs in this study and previous publications [79][80][81][82]. The up-regulation of endocrine DEGs across different treatment groups is postulated to be the effect of higher energy needs for host immune response activation and post-infection cell repair. This is validated by the significance of energy balance for stress adaptation and aquatic animal tolerance highlighted in previous publications [83,84]. The activated shrimp immune response led to the up-regulation of immune and signalling DEGs. The up-regulation of structural DEGs suggests the high probability of cell and tissue structural repair across different post-infection time points. This is supported by the identification of structural DEGs (including actin-associated genes) with cytoskeleton functions in V. parahaemolyticus-infected L. vannamei [85]. All these DEGs with stress, endocrine, immune, signalling, and structural functionalities are vital in the overall host response against invading pathogens.
Interestingly, the uniquely up-regulated DEGs, including peroxisomal acyl-coenzyme A oxidase, catalase, and TBK1 in M. rosenbergii infers their greater importance in M. rosenbergii host response. On the other hand, dopamine N-acetyltransferase, trehalose transporter, and HMGB showed unique up-regulation in P. monodon during pathogenic infections, which suggests their stronger importance in P. monodon host response as well. The special functional importance of AIF and IMD signalling pathway in AP group may be further investigated to gain deeper understanding into their sole differential expressions in AP group.
Overall, a synergistic functioning of shrimp stress, endocrine, immune, signalling, and structural genes during pathogenic infections can be postulated which is vital for host survival and elimination of invading pathogens. Moreover, these DEGs can be jointly proposed as Survival Adaptation Molecular Patterns (SAMPs) with STAT as one of the crucial signalling components.

Conclusions
In conclusion, during WSSV and V. parahaemolyticus/Vp AHPND infections, MrST gene expressions were significantly down-regulated (during either early or later post-infection time points) whereas PmST gene expressions were only significantly up-regulated. In addition, the sequence and structural comparison of MrST and PmST provided significant insight into the important similarities or differences between the compared shrimp STAT sequences. These differences were inferred to be one of the deciding factors resulting in the differential gene expression patterns observed. STAT gene plays vital diverse roles in JAK-STAT signalling pathway especially during pathogenic infections. Hence, the systematic comparison of selected omics data was done to identify the important DEGs (stress, endocrine, immune, signalling, and structural) in M. rosenbergii and P. monodon when exposed to WSSV and V. parahaemolyticus/Vp AHPND infections focusing on those involved in STAT functioning or potentially associated with JAK-STAT signalling. The functional grouping of these DEGs validated the diverse signalling roles of STAT.
Overall, the findings of this study will be able to provide valuable insight for future research towards better understanding of the shrimp immune response especially STAT gene functioning. This study also possesses novelty in the emphasis of the stress and endocrine DEGs because these DEGs are easily neglected normally with more research focus given to immune DEGs. However, these DEGs are important as well because stress DEGs function as alert and trigger factors whereas endocrine DEGs function as regulatory and survival factors. The qPCR primers designed, sequence and structural divergence identified, and important DEGs obtained can be applied for shrimp health or immune response activation diagnostic purpose. STAT gene can also be proposed as a suitable candidate for the study of immune response enhancement or regulation due to its diverse signalling importance in shrimp immunity and survival.