The complete mitogenome of Lysmata vittata (Crustacea: Decapoda: Hippolytidae) with implication of phylogenomics and population genetics

In this study, the complete mitogenome of Lysmata vittata (Crustacea: Decapoda: Hippolytidae) was determined. The genome sequence was 22003 base pairs (bp) long and included thirteen protein-coding genes (PCGs), twenty-two transfer RNA genes (tRNAs), two ribosomal RNA genes (rRNAs) and three putative control regions (CRs). The AT content was 71.50%, with a slightly negative AT-skew (-0.04). The standard start codon of the PCGs was ATN, except that cox1, nad4L and cox3 began with TTG, TTG and GTG, respectively. The canonical termination codon was TAA, while nad5 and nad4 ended with the incomplete stop codon T, and cox1 ended with TAG. The mitochondrial gene arrangements of eight species of Hippolytidae were compared with the ancestral gene order of Decapoda: the gene order of Lebbeus groenlandicus was unchanged, whereas that of the other species had changed to varying degrees. Two tRNA genes (trnA and trnR) of L. vittata were translocated, which also showed that Hippolytidae species were relatively unconserved in evolution. Phylogenetic analysis of 50 shrimp showed that L. vittata formed a monophyletic clade with Lysmata/Exhippolysmata species. This study should help to better understand the evolutionary status and population genetic diversity of L. vittata and related species.


Introduction
The genus Lysmata is an important group in the family Hippolytidae and contains more than 48 described species, most of which are small shrimp living in shallow waters [1,2]. For a long time, Hippolytidae has been the most controversial family in Decapoda in terms of classification, especially regarding the monophyly of Hippolytidae and the position of the genus Lysmata [3,4]. In the past few decades, studies of Lysmata have mainly focused on morphology, with relatively few studies on population genetic structure. Moreover, most of the marker genes selected have been partial sequences of rrnL, rrnS and cox1, and these gene fragments often fail to provide enough information for studies of population genetics and species evolution. The mitogenome is an important tool for species identification and for studying phylogenetic relationships among species [5]. In shrimps, the mitochondrial genome is maternally inherited, usually circular and approximately 15 to 20 kb in length, and includes thirteen PCGs, two rRNAs, twenty-two tRNAs and one CR. The mitogenome contains abundant genetic information, and phylogenetic trees based on mitogenome sequences have the advantage of a stable and reliable structure. Analyzing the genetic relationships of species using the sequences of the 13 PCGs of the mitogenome can help to resolve problems encountered in species classification.
Lysmata vittata (Crustacea: Decapoda: Hippolytidae) is a small marine ornamental shrimp, commonly known as peppermint shrimp, which is popular in the marine aquarium trade. The species has a special sexual system, i.e., protandric simultaneous hermaphroditism (PSH) [3]. It is a member of the cleaner shrimp family and a common marine ornamental species that originated in the Indo-Pacific region, including coastal areas of China, Japan, the Philippines and Australia [6][7][8]. L. vittata is usually found 2~50 m below the sea surface, hiding in the reef during the day and becoming active at night [9]. In recent years, with continuing advances in genomics technology, phylogenetic research on Lysmata species has gradually moved from the morphological level to the genome level. As L. vittata is a relatively important marine ornamental species, determining its mitogenome is of great significance for studying the genetic diversity and evolutionary history of Lysmata.
In this study, the mitogenome of L. vittata was determined, and its structure and phylogenetic status were analyzed. This work should help to further understand the evolutionary relationships and population genetic diversity of L. vittata and related species.

Mitochondrial DNA sequencing and genome assembly
Specimens of L. vittata were collected in Xiamen, Fujian province, China. The morphological characteristics of the species follow the previous description of Abdelsalam [1]. Approximately 5 g of muscle tissue was harvested for mtDNA isolation using an improved extraction method [10]. The isolated DNA was purified according to the manufacturer's instructions (Illumina), and 1 μg was then used to create short-insert libraries with an insert size of 430 bp, followed by sequencing on the Illumina HiSeq 4000 [11] (Shanghai BIOZERON Co., Ltd). The high molecular weight DNA was purified and used for PacBio library preparation and BluePippin size selection, and was then sequenced on the Sequel sequencer.
The raw sequencing data were processed, and the duplicated sequences were then assembled. The mitogenome was reconstructed using a combination of the PacBio Sequel and Illumina HiSeq data. The genome framework was assembled from both the Illumina and PacBio data using SOAPdenovo2.04 [12]. The assembly was then verified, the circular or linear character of the mitogenome was established, and any gaps were filled. Finally, the clean data were mapped to the assembled draft mitogenome to correct erroneous bases, and most of the remaining gaps were filled through local assembly.
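One common way to check whether an assembled contig represents a circular molecule is to look for a terminal repeat, where the end of the contig repeats its start. The sketch below illustrates that idea only; it is not the procedure used in this study, and the function names, the `min_len` threshold and the toy sequence are all assumptions made for illustration.

```python
def terminal_overlap(contig, min_len=10):
    """Length of the longest suffix that equals a prefix (at least min_len), else 0."""
    max_len = len(contig) // 2
    for k in range(max_len, min_len - 1, -1):
        if contig[:k] == contig[-k:]:
            return k
    return 0


def circularize(contig, min_len=10):
    """Trim the repeated end so the circular sequence is represented once."""
    k = terminal_overlap(contig, min_len)
    return contig[:-k] if k else contig


# Toy contig (hypothetical): a 10 bp prefix is repeated at the end.
prefix = "ACGTTGCAAC"
core = "GGATCCTTAGCAAT"
toy_contig = prefix + core + prefix

overlap = terminal_overlap(toy_contig)
trimmed = circularize(toy_contig)
print(overlap, len(toy_contig), len(trimmed))
```

Exact string matching is used here for brevity; real reads contain errors, so a practical check would tolerate mismatches, for example by aligning the contig ends.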

Validation of mitogenome data
In order to ensure the accuracy of the L. vittata mitogenome data, we resequenced the samples on the Illumina HiSeq X10 platform (Nanjing Genepioneer Biotechnologies Co. Ltd).

Phylogenetic analysis
To reconstruct the phylogenetic relationships among shrimp, the PCG sequences of 49 Decapoda species were downloaded from the GenBank database (S1 Table). The PCG sequences of Harpiosquilla harpax (NC_006916) were used as the outgroup. The nucleotide and amino acid sequences of the 13 PCGs were aligned using MEGA 5.0 [17]. Gblocks was used to identify and select the conserved regions [20]. Subsequently, Bayesian inference (BI) and maximum likelihood (ML) analyses were used to reconstruct phylogenetic trees with MrBayes v3.2.6 [21] and PhyML 3.1 [22], respectively. According to the Akaike Information Criterion (AIC) [23], TVM + I + G was the best-fit model for the nucleotide alignment using jModelTest [24], and MtArt + I + G + F was the optimal model for the amino acid dataset using ProtTest 3.4.2 [25]. In the BI analysis, two simultaneous runs of 10000000 generations were conducted for the matrix. Trees were sampled every 1000 generations and diagnostics were calculated every 5000 generations, with three heated chains and one cold chain to encourage swapping among the Markov chain Monte Carlo (MCMC) chains. The standard deviation of split frequencies was below 0.01 after 10000000 generations, and the potential scale reduction factor (PSRF) was close to 1.0 for all parameters. Nodes with posterior probabilities over 0.9 or bootstrap percentages over 75% were regarded as credible [26,27]. The resulting phylogenetic trees were visualized in FigTree v1.4.0.
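Model selection under the AIC can be illustrated with a short sketch. The criterion is AIC = 2k - 2 lnL, where k is the number of free parameters and lnL is the maximized log-likelihood; the model with the lowest AIC is preferred. The model names, log-likelihoods and parameter counts below are invented placeholders, not values from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """fits maps model name -> (lnL, k); returns (name, AIC, delta AIC), best first."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda item: item[1])


# Hypothetical fits for three candidate models (illustrative numbers only).
toy_fits = {
    "modelA": (-1000.0, 5),
    "modelB": (-990.0, 9),
    "modelC": (-989.5, 12),
}

ranking = rank_models(toy_fits)
for name, score, delta in ranking:
    print(name, score, delta)
```

In this toy case the model with the highest likelihood (modelC) is not selected, because its extra parameters are penalized more than the small gain in lnL is worth.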

Genome structure, organization and composition
The mitogenome of L. vittata was a typical circular molecule of 22003 bp in size. It contained 37 mitochondrial genes (thirteen PCGs, twenty-two tRNAs and two rRNAs) and three CRs (Fig 1 and S2 Table). Among the 37 genes, twenty-three were encoded in the clockwise direction (F-strand), and the remaining fourteen were encoded in the counterclockwise direction (R-strand) (Fig 1 and S2 Table).
The nucleotide composition of the mitogenome was biased toward A and T (T = 37.15%, A = 34.35%, C = 16.69%, G = 11.80%) (Table 1). The AT contents of the complete mitogenome and its components were calculated [mitogenome (71.50%), PCGs (69.79%), tRNAs (69.58%) and rRNAs (69.29%)] (Tables 1 and 2). With the exception of Thor amboinensis (73.10%), the AT content of the L. vittata mitogenome was higher than that of other species in Hippolytidae (Table 1). Among the nine species of Hippolytidae, the AT-skew value of L. vittata (-0.039) was similar to that of L. boggessi (-0.040), and the AT-skew values of Lebbeus groenlandicus (0.062), Exhippolysmata ensirostris (0.009) and Saron marmoratus (0.110) were positive. In addition, with the exception of Thor amboinensis (-0.081), the GC-skew value of L. vittata (Guangdong) was the largest negative value compared with those of the other mitogenomes (Table 1). Comparison of the mitogenome sequence of L. vittata (Fujian) with that of L. vittata (Guangdong) showed that the two sequences overlapped completely, except that the Fujian sequence was 1146 bp longer. The distribution of the bases missing from L. vittata (Guangdong) is shown in S1 Fig. The deletion may be related to the sequencing method and sequence assembly. All original sequence data in this study were submitted to the NCBI database under accession number MT478132.
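The skew statistics quoted above follow the usual definitions AT-skew = (A - T) / (A + T) and GC-skew = (G - C) / (G + C). The sketch below applies them to the base percentages reported for L. vittata; the function names are illustrative, and the `composition` helper is an assumed convenience for working from a raw sequence.

```python
def at_skew(a, t):
    """AT-skew = (A - T) / (A + T); negative means more T than A."""
    return (a - t) / (a + t)


def gc_skew(g, c):
    """GC-skew = (G - C) / (G + C); negative means more C than G."""
    return (g - c) / (g + c)


def composition(seq):
    """Percentage of each unambiguous base in a nucleotide sequence."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    return {base: 100 * seq.count(base) / total for base in "ACGT"}


# Base percentages reported in the text for the L. vittata mitogenome.
a, t, c, g = 34.35, 37.15, 16.69, 11.80

print(round(a + t, 2))            # AT content in percent
print(round(at_skew(a, t), 3))    # AT-skew
print(round(gc_skew(g, c), 3))    # GC-skew
```

With these inputs the AT content is 71.50%, the AT-skew rounds to -0.039 and the GC-skew rounds to -0.17, in agreement with the values given in the text.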

PCGs and codon usage
The PCG region was 11144 bp long and accounted for 50.6% of the L. vittata mitogenome. Furthermore, a contrast of nucleotide composition, AT-skew value, and GC-skew value of PCGs
The RSCU values of the L. vittata mitogenome were analyzed, and the results are shown in Table 3. The thirteen PCGs contained 3714 codons in total, excluding eleven canonical stop codons and two incomplete stop codons. The most common amino acids were Ile (AUY) (499), Phe (UUY) (357) and Leu2 (UUR) (315), whereas codons encoding Cys (UGY) (41) and Met (AUR) (24) were rare (Fig 2). The overall A + T content of the thirteen PCGs was 69.79%, and both the AT-skew and the GC-skew were negative, implying a higher occurrence of Ts and Cs than As and Gs (Table 1).
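Relative synonymous codon usage (RSCU) for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally, i.e. count × (number of synonymous codons) / (total count for the amino acid). The sketch below shows the calculation for two codon families named in the text, Leu2 (UUR) and Met (AUR). The toy coding sequence and function names are assumptions for illustration and do not reproduce Table 3.

```python
from collections import Counter


def count_codons(cds):
    """Count RNA codons in a coding sequence read in frame from position 0."""
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return Counter(rna[i:i + 3] for i in range(0, usable, 3))


def rscu(codon_counts, families):
    """RSCU = observed count * family size / total count of the family."""
    values = {}
    for codons in families.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        for c in codons:
            values[c] = codon_counts.get(c, 0) * len(codons) / total if total else 0.0
    return values


# Two synonymous families taken from the text: Leu2 (UUR) and Met (AUR).
families = {"Leu2": ["UUA", "UUG"], "Met": ["AUA", "AUG"]}

# Toy coding sequence (hypothetical): TTA x3, TTG, ATA, ATG.
toy_cds = "TTATTATTATTGATAATG"
toy_counts = count_codons(toy_cds)
toy_rscu = rscu(toy_counts, families)
print(toy_rscu)
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates one used less often.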

Overlapping and intergenic regions
The mitogenome of L. vittata contained four overlapping regions, located between the gene pairs atp8/atp6, trnE/trnF, nad4/nad4L and trnL1/rrnL, with the longest overlap (23 bp) between trnL1 and rrnL (S2 Table). A total of 27 intergenic regions were found, ranging in length from 2 to 3821 bp (S2 Table). Three putative CRs were identified in the L. vittata mitogenome. CR1 was located between rrnS and trnI, with a length of 650 bp and an A+T content of 80.46%. CR2 was located between cox1 and trnL2, with a length of 3821 bp and an A+T content of 72.23%. CR3 was located between trnL2 and cox2, with a length of 888 bp and an A+T content of 77.25% (Tables 2 and S2).
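Overlaps and intergenic spacers of this kind can be derived from an annotation table by comparing the end of each feature with the start of the next. The following is a minimal sketch under stated assumptions: coordinates are 1-based and inclusive, features are sorted by start position, and the feature names and coordinates are invented placeholders rather than entries from S2 Table.

```python
def junctions(features):
    """Gap between consecutive features.

    features: list of (name, start, end), 1-based inclusive, sorted by start.
    Returns (left, right, gap): gap > 0 is an intergenic spacer of that many bp,
    gap < 0 is an overlap of abs(gap) bp, gap == 0 means directly adjacent.
    """
    result = []
    for (name1, _, end1), (name2, start2, _) in zip(features, features[1:]):
        result.append((name1, name2, start2 - end1 - 1))
    return result


# Hypothetical annotation: geneA and geneB overlap, geneB and geneC are spaced.
toy_features = [
    ("geneA", 1, 100),
    ("geneB", 94, 300),
    ("geneC", 303, 400),
]

toy_junctions = junctions(toy_features)
overlaps = [j for j in toy_junctions if j[2] < 0]
spacers = [j for j in toy_junctions if j[2] > 0]
print(overlaps)
print(spacers)
```

For a circular genome, the junction between the last and the first feature would need to be handled separately using the total genome length; that step is omitted here.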
To our knowledge, the complete mitogenome sequence of L. vittata is the longest reported so far among shrimp. How multiple CRs were generated and evolved in the mitogenome of Lysmata is a novel problem that has not yet been solved, and more mitogenomes of Lysmata are still needed to clarify the mechanism behind this phenomenon.

PLOS ONE

Gene rearrangement
In terms of gene rearrangement, compared with the ancestral gene order of Decapoda [20,29], the gene order of L. groenlandicus remained unchanged, and all Lysmata species had multiple CRs. Among them, L. amboinensis, L. debelius and L. boggessi had two CRs, whereas L. vittata had three CRs and the positions of two tRNA genes (trnA and trnR) had been translocated. In addition, the mitochondrial gene orders of E. ensirostris, S. marmoratus and T. amboinensis all showed varying degrees of translocation compared with the gene order of Decapoda (Fig 3). The positions of cox2 and trnL2 of E. ensirostris were translocated, and trnC and a gene block (trnM-nad2-trnW) of S. marmoratus were translocated. T. amboinensis showed more translocations, including two gene blocks (nad6-cob-trnS2 and nad5-trnH-nad4-nad4l) and single tRNAs (trnQ, trnT, trnE, trnH, trnY, trnP, trnC and trnM) (Fig 3). In fact, gene rearrangement is a very common phenomenon in the mitogenome, and rearrangements mainly occur in tRNA genes. Gene arrangement is relatively stable and can be used as an important phylogenetic marker in evolutionary analyses of shrimp. Comparing the order of the
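A simple way to quantify how far one gene order departs from another is to count broken adjacencies: pairs of genes that are neighbours in the reference order but not in the derived order. The sketch below illustrates this on placeholder gene labels; it ignores strand, it is not the comparison method used for Fig 3, and the orders shown are not real mitochondrial gene orders.

```python
def adjacencies(order, circular=True):
    """Set of unordered neighbouring gene pairs in a gene order."""
    pairs = list(zip(order, order[1:]))
    if circular and len(order) > 1:
        pairs.append((order[-1], order[0]))  # close the circle
    return {frozenset(pair) for pair in pairs}


def broken_adjacencies(reference, derived, circular=True):
    """Adjacencies present in the reference order but absent from the derived one."""
    return adjacencies(reference, circular) - adjacencies(derived, circular)


# Placeholder orders: g2 has moved from between g1 and g3 to between g4 and g5.
reference_order = ["g1", "g2", "g3", "g4", "g5", "g6"]
derived_order = ["g1", "g3", "g4", "g2", "g5", "g6"]

broken = broken_adjacencies(reference_order, derived_order)
print(sorted(sorted(pair) for pair in broken))
```

A single translocated gene breaks three adjacencies in this toy case (its two old neighbours and the pair it was inserted between), while identical orders give an empty set.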

Phylogenetic analysis
The taxonomic status of the genus Lysmata within Hippolytidae has long been a highly contentious issue. In this study, phylogenetic analyses using ML and BI methods were performed based on the nucleotide and amino acid sequences of the thirteen PCGs of the species in S1 Table (Figs 4 and 5). The phylogenetic tree based on the nucleotide sequences of the thirteen PCGs showed that Lysmata and Exhippolysmata formed a monophyletic group, while S. marmoratus, L. groenlandicus and T. amboinensis were clustered into a monophyletic group with species of Alpheidae and Palaemonidae (Fig 4). This analysis supported Christoffersen's [30,31] proposal to raise Lysmata to family level as Lysmatidae. The phylogenetic tree based on the amino acid sequences of the 13 PCGs revealed that the species of Hippolytidae clustered into one large branch, within which Lysmata-Exhippolysmata formed a monophyletic branch that was sister to S. marmoratus-L. groenlandicus/T. amboinensis (Fig 5). The topologies of the phylogenetic trees constructed from the nucleotide and amino acid sequences differed slightly within Hippolytidae, but the monophyly of Lysmata-Exhippolysmata has been fully verified in previous studies [32][33][34][35]. Furthermore, the phylogenetic analyses confirmed that L. vittata (Fujian) and L. vittata (Guangdong) were closely related. The two shrimps clustered together with a branch length of zero, and their node was strongly supported (ML BP = 100%; BI PP = 1), indicating that there was almost no difference between L. vittata (Fujian) and L. vittata (Guangdong). The phylogenetic relationships among the other suborders/superfamilies of Decapoda were similar to those reported by Ma et al. [36].

Conclusion
In this study, we successfully obtained the mitogenome sequence of L. vittata, which was also the first species of Hippolytidae whose mitogenome sequence was published in the GenBank database. The genome sequence was 22003 base pairs (bp) long and included 37 genes and three CRs. Each PCG was initiated by a canonical ATN codon, except for cox1, nad4L and cox3, which were initiated by TTG, TTG and GTG, respectively. Two of the thirteen PCGs (nad5 and nad4) terminated with the incomplete stop codon T, and one (cox1) terminated with the stop codon TAG. The AT-skew (-0.04) and the GC-skew (-0.17) of the L. vittata mitogenome were both negative. Compared with the ancestral gene order of Decapoda, the gene order of L. vittata has changed. Furthermore, phylogenetic analyses showed that L. vittata formed a monophyletic branch with other species of the genera Lysmata and Exhippolysmata.