First Insights into the Subterranean Crustacean Bathynellacea Transcriptome: Transcriptionally Reduced Opsin Repertoire and Evidence of Conserved Homeostasis Regulatory Mechanisms

Bathynellacea (Crustacea, Syncarida, Parabathynellidae) are subterranean aquatic crustaceans that typically inhabit freshwater interstitial spaces (e.g., groundwater) and are occasionally found in caves and even hot springs. In this study, we sequenced the whole transcriptome of Allobathynella bangokensis using RNA-seq. De novo sequence assembly produced 74,866 contigs including 28,934 BLAST hits. Overall, the gene sequences were most similar to those of the waterflea Daphnia pulex. In the A. bangokensis transcriptome, no opsin or related sequences were identified, and no contig aligned to the crustacean visual opsins and non-visual opsins (i.e. arthropsins, peropsins, and melaopsins), suggesting potential regressive adaptation to the dark environment. However, A. bangokensis expressed conserved gene family sets, such as heat shock proteins and those related to key innate immunity pathways and antioxidant defense systems, at the transcriptional level, suggesting that this species has evolved adaptations involving molecular mechanisms of homeostasis. The transcriptomic information of A. bangokensis will be useful for investigating molecular adaptations and response mechanisms to subterranean environmental conditions.


Introduction
Subterranean fauna form below the surface of the earth. Hyporheic/groundwater environments are harsh for animals due to limited space, permanent darkness, low dissolved oxygen concentrations, and limited energy/food inputs. The environment includes two major ecosystems, namely stygofauna (aquatic and living in groundwater) and troglofauna (air-breathing

RNA extraction and library construction
Total RNA was extracted using the the RNeasy 1 Micro Kit (Qiagen) and the RNase-Free DNase I Kit (Qiagen, Valencia, CA, USA) according to manufacturers' instructions. Whole bodies of 15 adult A. bangokensis were pooled and homogenized in RLT buffer (Qiagen). Extracted RNA was stored in RNA stable 1 (Biometrica, San Diego, CA, USA) to prevent RNA degradation during long-term storage. Extracted RNA quality and concentration were determined using the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). A next-generation sequencing (NGS) library was constructed from 2 μg total RNA using NuGEN Encore 1 Complete RNA-Seq Library Systems (NuGEN, San Carlos, CA, USA). Final transcriptomic library lengths and concentrations were determined using the 2100 Bioanalyzer. Transcriptomic libraries were sequenced by the MiSeq 1 System (Illumina) platform using sequenced runs of 300×2 paired-end reads. Index and adaptor sequences were trimmed using Trimmomatic [13] and low quality reads were removed using the FASTX tool kit [14].

De novo assembly and annotation
We performed de novo assembly using software packages designed for short read sequence assembly, including Abyss [15], Velvet [16], CLC Genomics Workbench 7.5 environment (CLC Bio Aarhus, Denmark), and Oases (D.R. Zerbino, European Bioinformatics Institute). We assembled each sample using the same assembly parameters (K-mer length = 27, coverage cutoff = 10, minimum contig length = 200 bp). Consideration of the assembly statistics (N50, longest contig, number of contigs, proportion of reads assembled) led us to finally choose Oases, which generated the longest assembled ESTs.

Data deposition
The raw sequencing reads of A. bangokensis were deposited in the Sequence Read Archive in GenBank (SAMN05712658).

Annotation and gene ontology (GO) analysis
GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses of the contigs were performed using the Blast2GO sequence annotation tool [17]. The three main categories biological processes, cellular components, and molecular functions were obtained after aligning contigs using default parameters. The assembled data were arranged including read length, gene annotation, GenBank number, E-value, species, and the species accession number. The assembled data, including GO terms, were deposited as supplementary material (S3-S5 Tables). In each section, the specific GO term composition was calculated and presented as a percentage.

Gene expression analysis
The gene expression level of the A. bangokensis transcriptome was calculated using the reads per kilobase transcriptome per million mapped reads (RPKM) method [18]. Heat map analysis was conducted to represent the transcriptomic profiles using MeV software (ver. 7.4; Dana-Farber Cancer Institute, Boston, MA, USA).

Results and Discussion
Transcriptome assembly and gene annotation Gene annotation of the transcripts was performed by BLASTx analysis using the NCBI non-redundant (NR) protein database. The results showed that 28,934 contigs (38.6%) had at least one positive BLAST hit (E-value < 1e-04) representing 11,751 annotated genes (S1 Table). Distribution analysis showed that 18 species had more than 500 BLAST transcript hits, and the waterflea Daphnia pulex (Crustacea, Branchiopoda) showed the highest similarity with 3,250 reads (S2 Table; Fig 1A). Among the top hit species, 75% of contigs matched sequences of the phylum Arthropoda, while the other 25% was comprised of Chordata, Mollusca, Hemichordata, Annelida, and Echinodermata ( Fig 1B). In addition, 53% and 15% of contigs showed homologies to insects and crustaceans (Branchiopoda), respectively, at the class level ( Fig 1C). In the NCBI NR database, a relatively a huge amount of gene information of insects was appended compared to those of crustaceans. Thus, although Bathynellacean is an order of crustaceans, the overall gene annotation results represent high quality of assembled transcripts of A. bangokensis [19].

Functional annotation
InterProScan protein sequence analysis and classification can be used to effectively classify protein functions by predicting domains and important sites [20]. The most abundant Inter-Pro domains are presented in Table 2 (S1 Fig). The InterPro domains with the highest numbers of hits were immunoglobulin (IG)-like domain (IPR007110; 2433 hits), followed by Ploop containing nucleoside triphosphate hydrolase (IPR027417; 2342 hits) and fibronectin type III (IPR003961; 1761 hits). The immunoglobulin superfamily (IgSF) is a large group of soluble cell surface proteins that are mostly involved in adaptive immune defenses (e.g., recognition, binding, or adhesion processes), which are believed to be restricted to vertebrates [21]. All members of IgSF possess at least one IG-like domain or IG fold. The basic molecular   Table). doi:10.1371/journal.pone.0170424.g001 Subterranean Crustacean Bathynellacea Transcriptome mechanisms of the IG-like domain have rarely been investigated in invertebrates, but several genes possessing IG domains such as the C-type lectin domain, Down syndrome cell adhesion molecules, fibrinogen-related proteins, and hemolin have been suggested to be involved in host defense mechanisms of arthropods including crustaceans and mollusks [22][23][24][25][26][27][28]. Based on the relationship between IgSF and invertebrate immunity, it is possible that A. bangokensis has a robust innate immune defense system involving IG-like domain-containing proteins. GO terms related to the top domains were described at the second level (S2 Fig). Detailed GO distributions in three GO categories (biological process, cellular component, and molecular function) are shown in the supplementary material (S3-S5 Tables). In the biological process category, many genes are classified as metabolic processes (28%), cellular processes (25%), and single-organism processes (19%). In the molecular function category, the vast majority of genes are involved in catalytic activities (43%) and binding functions (40%). In the cellular component class, most genes are related to the cell (36%) and cell membrane (24%). The overall proportions of the major GO categories in A. bangokensis sequences were very similar to those of the transcriptomes of several other crustaceans [29][30][31][32], and we were not able to identify significant differences in the proportions of the three GO term categories. KEGG analysis of the A. bangokensis transcriptome revealed that the vast majority of KEGG pathways are involved in signal transduction (13%), followed by the endocrine system (5%), translation (5%), and carbohydrate metabolism (5%) (Fig 2A; S6 Table). The composition rate and percentage rankings of the top 10 A. bangokensis KEGG pathways were similar to those of amphipods [30,33] (Fig 2B; Hyalella azteca and Fig 2C; Melita plumulosa) and an isopod [34] (Fig  2D; Asellus aquaticus). Thus, these results suggest the intactness of the A. bangokensis transcriptome, since the information does not lack major functional GO categories or KEGG pathways compared with the transcriptomes of arthropods.   Table). Opsin repertoire Opsins have been classified into four major monophyletic groups: 1) ciliary photoreceptors ('c-opsins'), 2) rhabdomeric photoreceptors ('r-opsins'), 3) cnidarian opsins ('Cnidops'), and 4) a mixed group consisting of 'retinal G-protein coupled receptors', peropsins and neuropsins [35]. Interestingly, in the A. bangokensis transcriptome, we were not able to identify an opsin contig. Of the 28,934 Blast hit results, several candidates identified as putative opsin or relevant genes were searched, but our additional analysis using an in-depth annotation platform (e.g., domain analysis, phylogenetic analysis) revealed that the sequences were named incorrectly, because the contigs were matched to previous incorrectly registered opsin genes of other species in the NR database. In addition, short partial contigs of A. bangokensis that showed low similarity in various regions of the amino acid sequence could not be annotated to an opsin gene. To investigate putative opsin transcripts in A. bangokensis, a total of 50 opsin amino acid sequences from arthropods and onychophorans (numbered by 1-50 in S7 Table; i.e., long wavelength sensitive; middle wavelength sensitive; short wavelength sensitive groups, Blue and UV; onychopsin) annotated in a previous report [36], 7 amphipod opsins (numbered by 51-57 in S7 Table), and 21 arthropsins (numbered by 58-78 in S7 Table) and 25 peropsins (numbered by 79-103 in S7 Table) of arthropods, and additional 8 invertebrate melanopsins (numbered by 104-111 in S7 Table) registered in GenBank were mapped directly to 74,866 contigs of A. bangokensis using an internal local BLASTx platform. However, no contig of A. bangokensis was aligned to the gene pool comprised of opsin, arthropsin, peropsin, and eye pigmentation genes. In addition, we attempted to identify several eye-related genes previously annotated in the crustacean lineage [37]. Although these genes are involved in diverse functions beyond eye development or vision, many were transcriptionally not detected in the A. bangokensis (Table 3). Expression of the Dachshund (Dac) gene is important for controlling cell fate determination in eye, limb, brain, and muscle development [38]. Drosophila eyes absent (Eya) plays an essential role in retinal cell survival and differentiation [39], while multiple functions of Eya homologues (Eya1, Eya2, Eya3, and Eya4) have been identified continuously in mammals [40]. These two genes, Dac and Eya, act synergistically to induce ectopic retinal development and positively regulate each other's expression through conserved domains in Drosophila [41]. Eyegone, which is a member of the Pax transcription factor family, was discovered for its essential requirement in retinal primordium growth in Drosophila [42,43]. We were unable to determine the transcriptional roles of these genes in A. bangokensis, but overall transcriptional patterns that are not detected in the transcriptome would affect phenotypical or morphological adaptations in A. bangokensis. Further identification in embryonic and/or early developmental stage should be expanded in future study, as we analyzed the transcriptome of adult A. bangokensis. Since stygofauna have reduced or absent eyes and have enhanced non-optic sense organs without pigmentation [44,45], the absence of pigments and eyes is observed in A. bangokensis. A total of 25 and 26 eye pigmentation genes that were retrieved from Drosophila melanogaster and Tribolium castaneum respectively [46], were also aligned to the entire transcripts of A. bangokensis, but there was no matched transcripts in A. bangokensis transcriptome. Although mRNA expression of putative opsin genes was not observed in the A. bangokensis transcriptome, it will be interesting to investigate the presence or absence of the opsin repertoire in the genome and the evolutionary development of alternative sensory organs in future studies. Several previous examples showed the potential correlation between reduced or absent eyes and transcriptional expression of the opsin repertoire. No loss of gene function in opsin gene paralogs with a reduced level of gene expression was reported in the cave-adapted amphipod Gammarus minus [36]. The authors suggested that loss of expression of opsin genes without loss of gene function is explained by the pleiotropic roles of opsin genes [36]. Extensive transcriptomic analysis revealed that three independently evolved subterranean diving beetles lack transcripts of nearly all opsin photoreceptor genes, whereas the two surface beetles showed evidence of transcriptional expression of a full suite of insect visual and non-visual opsin genes [47]. Thus, research on the absence or presence of putative opsin genes from the A. bangokensis genome will be useful to understand the regressive evolution of eye reduction in the Bathynellacea lineage. Table 3. List of eye-related genes identified in the Allobathynella bangokensis transcriptome, presented with RPKM values and matching species information. The genes and reference protein IDs of Daphnia pulex were adopted from a previous study [37].

Visual system specification gene family
Decapentaplegic  ., Hsp10, Hsp20, Hsp40, Hsp60, Hsp70, and Hsp90) were annotated (Fig 3A; S8 Table). The diversity of the A. bangokensis Hsp family was unpredictable, since a drastic reduction in the ranges of daily and annual temperatures is a distinguishing characteristic of the subterranean zone [50]. Overall, the RPKM values for most Hsp genes were relatively low; 31 Hsp genes (65%) showed RPKM values < 1 (Fig 3B; S8 Table). Some Hsp genes are expressed constitutively at minimal or basal levels, while other Hsp genes can be induced rapidly in response to environmental stimuli [51]. Thus, we expect that A. bangokensis employs an effective holding and folding defense system using a combination of differentially expressed Hsp genes at low levels. This defense system diminished cellular stress, which can be triggered by even small changes in subterranean environmental conditions. Regardless of their chaperonin function, a previous report highlighted a distinct gain of function of the hsp90α gene in the cavefish Astyanax mexicanus [52]. hsp90α is expressed specifically in the cavefish lens starting just prior to apoptosis, while the expression was not observed in the lens of eyed surface-dwelling A. mexicanus (surface fish), suggesting that activation of the hsp90α gene may be required for eye degeneration and apoptosis in the lens in cavefish [52]. Therefore, further study of Hsp function in A. bangokensis is needed to understand its phenotypic adaptation. Also, it has been widely suggested that Hsp genes are good biomarkers for numerous environmental changes [48,49]. Sequence information of the A. bangokensis Hsp family can be applied to subterranean environmental monitoring by analyzing mRNA and protein expression profiles after induction by environmental stressors.

Innate immune system
In general, metazoans have an innate immune system, which consists of cellular and humoral metabolism, as the first line of defense against pathogenic bacteria, fungi, viruses, and metazoan parasites using recognition, regulation, and response [53,54]. Many key pathways employing genes involved in innate immunity are conserved between mammals and arthropods including crustaceans [53,55]. In the A. bangokensis transcriptome, we found conservation of most immunity-related gene families previously analyzed in arthropods [55][56][57][58]. As shown in Table 4, a variety of innate immunity-related sequences corresponding to adhesive proteins, antimicrobial proteins, apoptosis-and cell cycle-related proteins, cellular defense effectors, immune regulators, pattern recognition proteins, proteases, protease inhibitors, reduction/oxidation-related proteins, signal transduction-related proteins, and stress proteins were observed. These results suggest that complex immune-relevant gene sets are actively expressed to maintain cellular homeostasis even in subterranean animals. Immune-relevant gene sets will help extend our knowledge on the immune systems of syncarids in comparative aspects and ecological genetics within Arthropoda.
In arthropods, pathogens are selectively recognized by major signaling pathways, such as Toll, immune deficiency (IMD), and Janus kinase (JAK)-signal transducer of activators of transcription (STAT) for activation of immune effectors [57]. Since there is no information on the intactness of the major pathways in subterranean crustaceans, we evaluated the absence or  Table).   [56][57][58][59][60]. The identified genes and information pertaining to the absence or presence of expression, sequence IDs, matching species, E-values, and GenBank accession numbers are presented in Supplementary material (S9 Table).
The JAK-STAT signaling pathway is evolutionarily conserved and mediates the response to chemical messenger molecules such as diverse cytokines, IFNs, growth factors, and related molecules [68]. In the A. bangokensis transcriptome, putative homologs of the transmembrane cytokine receptor Domeless (Allo_26652), JAK (Allo_51891), STAT (Allo_07305), signal transducing adaptor molecule (Allo_52924), suppressor of cytokine signaling (Allo_16562),   Table) [60]. The transcriptional presence of key modulators of the JAK-STAT pathway, as well as downstream components of the pathway, suggests that the JAK-STAT pathway is strongly involved in the regulation of A. bangokensis immunity. Similar to the JAK-STAT pathway, many downstream members of the IMD signaling pathway are also conserved in the A. bangokensis transcriptome. The IMD pathway is activated mainly by Gram-negative bacteria and is separated into the NF-κB/Relish and Jun N-terminal kinase (JNK) branches [69]. Recently, Rosa et al [60] summarized 23 key component proteins of the insect IMD pathway, 17 of which are commonly identified in crustaceans. Since the KEGG database does not include the IMD pathway, manual identification was employed in the A. bangokensis transcriptome, which showed that most of the components of the IMD pathway have a homologue in the A. bangokensis transcriptome. Among the 17 crucial proteins involved in the crustacean IMD pathway, 16 were annotated in the A. bangokensis transcriptome (S9 Table), and only defense repressor 1 was not identified at the transcriptional level. These 16 proteins are IMD (Allo_52952), enzymes involved in ubiquitination (UEV1a, Allo_70733; Effete/Ubc13, Allo_20821; Bendless/Ubc5, Allo_28290), the negative regulators Caspar (Fasassociated factor 1, Allo_72340) and POSH (E3 ligase Plenty of SH3, Allo_27226), inhibitor of apoptosis IAP2 (Allo_73002), transforming growth factor-β-activated kinase 1 (TAK1, Allo_01961), TAK1-binding protein 2 (TAB2, Allo_72573), IκB kinase IKK-α (Allo_18196), a Relish-like Rel/NF-κB transcription factor (Allo_40693), Caudal homeobox protein (Allo_52669), mitogen-activated protein kinase kinase Hemipterous (Allo_31075), the JNK homolog Basket (Allo_17845), the negative regulator Puckered (Allo_58869), Activator protein 1, and the transcription factor Jun-related antigen (Allo_32187). Overall, the TAK1/TAB2 complex-activated JNK pathway was apparently conserved in A. bangokensis, but several members involved in the NF-kB/Relish branch were missing from the transcriptome database.
In crustaceans, the absence or presence of several components of the IMD pathway (i.e., peptidoglycan recognition proteins, Fas associated protein with death domain (FADD), death related ced-3/Nedd2-like (DREDD), poor IMD response upon knock-in (Pirk), IκB kinase Kenny/NEMO, Fos-related antigen/Kayak (Fra)) is controversial [60], and A. bangokensis also did not express FADD, DREDD, or Pirk at the transcriptional level. Thus, this result supports that A. bangokensis lacks several crucial members of a well-conserved IMD pathway as shown by the missing sequences in Insecta (e.g., Hemiptera), Crustacea, and Chelicerata [60]. Further experiments are needed to clarify the absence/presence of these genes at the genome level. However, Kenny/NEMO (Allo_64535) and Fra (Allo_55867) contigs were observed in the database. Taken together, we have confirmed that major immune responsive pathways, such as Toll, JAK-STAT, and the JNK branch of IMD pathway, involved in the innate immune system of A. bangokensis are evolutionarily conserved across crustaceans at the transcriptional level.

Antioxidant defense system
Reactive oxygen species (ROS) such as superoxide anion, hydrogen peroxide, and hydroxyl radical are generated during mitochondrial oxidative phosphorylation and during the cellular response to xenobiotics, cytokines, and bacterial invasion [70]. Oxidative stress results in direct or indirect ROS-mediated damage to nucleic acids, proteins, and lipids. To maintain cellular homeostasis, antioxidant defense systems composed of catalase, glutathione depletion, glutathione reductase, glutathione peroxidase, and superoxide dismutase (SOD) can be induced to eliminate excess ROS levels. Based on the annotation results, we found that A. bangokensis has an antioxidant defense system similar to those of other arthropods (Fig 5A). In addition, the basal RPKM values of all genes indicated relatively high transcription levels, and thus it remains to be determined how A. bangokensis protects its cells and bodies from exogenous oxidative stress.
Glutathione S-transferases (GSTs) and their isoforms play important roles in the cellular antioxidant defense system. In addition to their activities under oxidative stress conditions, GST family members play crucial roles in drug metabolism and the detoxification pathway. Phase I, phase II, and phase III detoxification systems are an efficient means of protecting against the potential impacts via both metabolic homeostasis and elimination of exogenous molecules (e.g., xenobiotics) in animals [71]. Of these detoxification systems, phase II is characterized by reductive or conjugative modification reactions of phase I metabolites to endogenous compounds through GST enzymatic activity. Thus, GSTs and their isoforms have been commonly used as strong biomarkers of both oxidative stress and cellular toxicity. In the A. bangokensis transcriptome, 22 GST genes were annotated and grouped into eight well-characterized GST subfamilies (Delta, Kappa, Mu, Omega, Sigma, Theta, Zeta, and microsomal GST) of arthropods (Fig 5B). Taken together, we provide evidence for a robust antioxidant defense system in A. bangokensis and maintenance of cellular homeostasis in subterranean crustaceans. Elucidating the mechanisms underlying activation of the antioxidant defense system by environmental changes is crucial for the development of effective monitoring strategies for subterranean ecosystems.

Conclusions
The ability of A. bangokensis to survive in such extreme subterranean environments suggests that they have evolved adaptations by employing molecular homeostatic mechanisms. The relationships between animals and their hyporheic/groundwater environments are being investigated continuously, and transcriptomic sequencing of A. bangokensis, as a sentinel species, renders it an optimal model for studying molecular adaptation and response mechanisms to harsh environmental conditions. Although the transcriptional responses induced by environmental changes or the unique adaptive metabolism of A. bangokensis are not discussed in this study due to the limited samples from subterranean regions, the transcriptomic database and gene composition provide the basis for clarifying the adaptive and responsive metabolic pathways. Thus, further identification and confirmation of the functions of conserved genes or pathways as well as the dissection of the genetic architecture of response genes will be useful for the study of adaptive mechanisms in subterranean ecosystems.

Author Contributions
Conceptualization: HP GSM.