The Complete Mitochondrial Genome of the Booklouse, Liposcelis decolor: Insights into Gene Arrangement and Genome Organization within the Genus Liposcelis

Booklice in the genus Liposcelis are pests of stored grain products. They pose a considerable economic threat to global food security and safety. To date, the complete mitochondrial genome has only been determined for a single booklouse species Liposcelis bostrychophila. Unlike most bilateral animals, which have their 37 mt genes on one circular chromosome, ≈15 kb in size, the mt genome of L. bostrychophila has two circular chromosomes, 8 and 8.5 kb in size. Here, we report the mt genome of another booklouse, Liposcelis decolor. The mt genome of L. decolor has the typical mt chromosome of bilateral animals, 14,405 bp long with 37 genes (13 PCGs, 22 tRNAs and 2 rRNAs). However, the arrangement of these genes in L. decolor differs substantially from that observed in L. bostrychophila and other insects. With the exception of atp8-atp6, L. decolor differs from L. bostrychophila in the arrangement of all of the other 35 genes. The variation in the mt genome organization and mt gene arrangement between the two Liposcelis species is unprecedented for closely related animals in the same genus. Furthermore, our results indicate that the two-chromosome mt genome organization observed in L. bostrychophila likely evolved recently after L. bostrychophila and L. decolor split from their most recent common ancestor.


Introduction
Insect mitochondrial (mt) genomes usually consist of a single circular chromosome (13-20 kb) containing 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs) and two ribosomal RNA genes (rRNAs), which is typical of bilateral animals [1]. The mt genome usually contains one large non-coding element called the A+T-rich or control region, which contains the sites for genome replication and the initiation of gene transcription [1]. Due to several unique features, including conserved gene content, maternal inheritance, and rapid nucleotide evolution, mt genome sequences have been used to facilitate the understanding of animal evolution [2]. Currently, more than 300 insect mt genomes have been sequenced [3]. Various types of gene rearrangements have been observed in insect mt genomes. Rearrangements of tRNA genes are the most common whereas rearrangements of proteincoding and rRNA genes are less common [4][5][6]. Studies in recent years also revealed variations of mt genome organization in bilateral animals [1]. For instance, mt genomes that consist of multiple chromosomes have been reported in parasitic lice [7][8][9], booklice [3], rotifera [10] and nematodes [11][12][13]. Variations in mt genome organization may provide a novel perspective for understanding animal evolution [14][15][16], in addition to genome sequences [3,17], RNA secondary structures [7,[18][19][20], and gene rearrangements [5,18,21,22].
Several types of atypical mt genome organization have been reported in psocodean insects (superorder Psocodea) in recent years. Psocodea contains two orders of insects: Psocoptera (booklice and barklice) and Phthiraptera (chewing and sucking lice). The mt genomes of human lice, Pediculus humanus, P. capitis and Pthirus pubis, consist of 14 to 20 mini-chromosomes, each one is 1.8 to 4 kb in size and contains one to five genes [7,9]. The chewing louse, Coloceras sp., has a typical mt chromosome with 37 genes and a circular mt DNA molecule that is approximately half the size of the typical mt chromosome [8]. The booklouse, Liposcelis bostrychophila, has a bipartite mt genome with two chromosomes: one chromosome is <8 kb in size and has 16 genes and the other is <8.5 kb in size and has 22 genes [3]. Extensive gene rearrangement has been found in the mt genomes of most of the 12 Psocodea species that have been completely or nearly completely sequenced to date, including the booklouse, L. bostrychophila [3].
During the last two decades, the booklice of the genus Liposcelis have emerged as serious pests of stored commodities worldwide [23,24]. The genus Liposcelis has 126 known species worldwide and includes four groups (A, B, C, and D) [25,26]. The booklouse. L. decolor, investigated in the current study, belongs to group C, while L. bostrychophila belongs to group D. Many previous studies have indicated that there is great variation among the four Liposcelis groups with respect to morphology, physiology, biochemistry and molecular biology [25,27,28]. In particular, analyses of ITS (internal transcribed spacers) sequences indicated that Liposcelis species of groups C and D have the highest nucleotide divergence among the four groups [25].
To understand whether the bipartite mt genome organization observed in L. bostrychophila occurred in other booklice of the genus Liposcelis, we sequenced the mt genome of L. decolor. We found that, unlike L. bostrychophila, L. decolor has the typical mt chromosome of bilateral animals. However, the arrangement of mt genes in L. decolor differs substantially from that in L. bostrychophila and other known insects. Our results showed, for the first time, a high level of variation in both mt genome organization and mt gene arrangement between closely related animal species in the same genus.

Ethics statement
No specific permits were required for the insects collected in this study. The sampling locations were not privately owned or protected in any way and the collection did not involve endangered or protected species.
Sample collection, DNA extraction, PCR and sequencing L. decolor individuals were collected at grain storage facilities in Binzhou, Shandong Province, China in 2010, and identified to species according to their morphological characteristics [28][29][30]. Subsequently, the ITS sequence [25] and sequences of partial rrnL and cox1 genes [31] were used to confirm the species identification. The ITS sequence obtained has been deposited in GenBank under accession number KF874610. An L. decolor colony was maintained in the lab on a diet of whole wheat flour, skim milk, and yeast powder (10:1:1) in an incubator at 27 6 0.5uC, 75-80% relative humidity and a scotoperiod of 24 hours. Voucher specimens (#Ps-01-01-03) were deposited at the Insect Collection, Southwest University, Chongqing, China. Total genomic DNA was extracted using a Tissue/Cell gDNA Mini Kit (Watson Biotechnologies, Shanghai, China) and stored at 220uC. Parts of cox1, cox3, cob, rrnS, rrnL and nad5 genes were amplified by PCR with conserved insect primers (Table S1) [32]. Species-specific primers were then designed for long PCR (Table S1). Transcriptional orientation is indicated with arrows. Protein-coding genes, ribosomal RNA genes and transfer RNA genes are shown in orange, blue and green respectively. tRNA genes for the two serine and two leucine tRNAs: S 1 = AGN, S 2 = UCN, L 1 = CUN, and L 2 = UUR. The non-coding regions larger than 60 bp are indicated in black. CR = putative control region. Arrows and purple curves indicate primers and PCR fragment, respectively. See Table S1 for sequence of PCR primers. doi:10.1371/journal.pone.0091902.g001  Table S1. doi:10.1371/journal.pone.0091902.g002 Each long PCR reaction was performed in a 25 mL volume, containing 1 mL each of forward primer (10 mM) and reverse primer (10 mM), 4 mL of dNTPs (each 2.5 mM), 1 mL of template DNA (,300 ng/mL), 2.5 mL MgCl 2 (25 mM), 2.5 mL of 106 LA PCR reaction buffer, 12.75 mL ddH 2 O and 0.25 mL LA Taq DNA polymerase (5 U/mL, Takara). All reactions were carried out using C1000 TM thermal cyclers (Bio-RAD, Hercules, CA, USA) with the follow conditions: initial 2 min denaturation at 94uC, 37 cycles of 94uC for 20 sec, 58uC for 50 sec, 68uC for 5-10 min (depending on target size, 1 min/kb), followed by a final extension at 68uC for 15 min. Gel-purified amplification products , 6 kb in size were ligated into pGEM-T Easy vectors (Promega, Madison, WI, USA), and introduced into Escherichia coli (Trans5a, TransGen Biotech, Beijing, China), following by ampicillin selection, then sequenced with M13 primers. Longer PCR products (. 6 kb) were directly sequenced with both forward and reverse PCR primers and internal primers by primer walking. All amplification products were sequenced with an ABI 3730 automated DNA sequencer (Applied Biosystems, Foster city, CA, USA) at the Beijing Genomics Institute (BGI) in Beijing, China.

Sequence assembly, annotation and analysis
SeqMan (DNAStar) was used to assemble the two overlapping nucleotide sequences, which were further confirmed by manually inspection. The protein-coding and rRNA genes were identified using the program ORF Finder (http://www.ncbi.nlm.nih.gov/ gorf/gorf.html) and BLAST searches against the GenBank database, respectively. Subsequently, all of these genes were further confirmed by alignment with homologous genes from those of other louse and booklouse species. The transfer RNA genes were identified by their cloverleaf secondary structure using ARWEN [33] with default parameters and tRNAscan-SE 1.21 [34] with Search Mode = ''EufindtRNA-Cove'', Genetic Code = ''Invertebrate Mito'' and Cove score cutoff = 0.1. The stem-loop secondary structure of the putative control regions was folded using the Mfold Server [35] under the RNA folding option with default parameters. The base composition and codon usage were analyzed with BioEdit (http://bioedit.software.informer.com/) and DAMBE 5.3.9 [36]. Sequences of mt genomes of other lice were retrieved from GenBank and MitoZoa (Table S2) [37].

Phylogenetic analyses
Phylogenetic analyses were conducted with the 11 Psocodea mt genome sequences currently available in GenBank including the new booklouse sequence obtained in this study. The mt genome sequence of the fruit fly, Drosophila melanogaster, served as an outgroup. Sequences of atp8, nad4L, and tRNA genes were too short and too variable to be correctly aligned among the psocodean species; these genes were thus excluded from the phylogenetic analyses. The nad4 was also excluded as this gene has not been identified in the human pubic louse, P. pubis [7]. The amino acid sequences from each protein-coding gene and the nucleotide sequence of each rRNA gene were aligned with MAFFT v7 [38]. The nucleotide sequences of each protein-coding gene were aligned based on the corresponding amino acid alignments using PAL2NAL [39] to ensure the correct reading frame; the poorly aligned sites were removed with GUID-ANCE [40] using the default setting. Then, positions with gap in more than half of the species were removed. Substitution saturations of the nucleotide sequences were examined using DAMBE 5.3.9 following Xia et al. [41]. Whole PCG sequences were chosen to enter the next step if Iss (index of substitution saturation) is significantly lower than Iss.c (critical value for symmetrical tree topology) (P , 0.05). All of the protein-coding and rRNA genes, except nad3 and nad6, passed this test. Consequently, the third codon positions of nad3 and nad6 were excluded from phylogenetic analyses. The best fit models for the alignment of nucleotide sequence and amino acid sequence were determined using the Akaike Information Criterion in jModelTest 0.1.1 [42] and ProtTest 3 [43], respectively. Specifically, the GTR+I+G model and MtREV+I+G model were chosen for the nucleotide sequence dataset and the amino acid sequence dataset, respectively. Phylogenetic trees were estimated via Bayesian inference (BI) method using MrBayes v3.12 [44]. Four independent Markov chains were simultaneously run for 2,000,000 generations with a heating scheme (temp = 0.2). Trees were  sampled every 100 generations (sample-freq = 100) and the first 25% of the generations were discarded as burn-in and the remaining samples were used to compute the consensus tree. Stationarity was considered to be reached when the average standard deviation of split frequencies was below 0.01 [45].

Mitochondrial genome of Liposcelis decolor
The mt genome of L. decolor has one typical circular chromosome, unlike the booklouse, L. bostrychophila (Figure 1). The size and the circular organization of the mt chromosome of L. decolor was confirmed by two overlapping PCR amplicons, 9.1 kb (d1-d2 from cox3 to rrnS) and 5.5 kb in size (d3-d4 from rrnS to cox3) respectively (Figure 1, Figure 2 and Table S1). The two amplicons overlapped by 33 bp in cox3 and 92 bp in rrnS.
Sequencing and assembly of these two PCR amplicons revealed that the mt genome of L. decolor is 14,405 bp in length and encodes 37 genes that are typically found in metazoan mt genomes ( Figure 1 and Table S3) (GenBank accession number: JX870621). All of the protein-coding genes (PCGs) initiate translation at an ATN codon, except for the TTG codon used in cox1. TAA and TAG serve as stop codons for all of the PCGs (Table S4). Eleven of the 13 PCGs had average length for Psocodean species, but nad4 is shorter and the nad4L is longer than in other Psocodean species ( Figure S1). All 22 tRNA coding genes usually found in the mt genomes of metazoans are present in L. decolor ( Figure 3); all have the conventional cloverleaf shaped secondary structure except trnS 1 , which lacks the D-arm, as in other insects. There are three non-coding regions longer than 60 bp in the mt genome of L. decolor. The longest non-coding region (118 bp) lies between cob and trnY, has an A+T content of 82.20%; three stem-loop secondary structures can be found in this region ( Figure 4A). An 80 bp non-coding region lies between nad2 and nad4 with an 88.75% A+T content but has no stem-loop secondary structure. A 69 bp non-coding region is between trnG and trnL 1 with a 92.75% A+T content and has two stem-loops ( Figure 4B). The 118-bp region is considered to be most likely the putative control region (Figure 1) due to it is longer sequence in the mt genome and the presence of typical loop secondary structure.
The A+T content of the L. decolor mt genome is 75.23%. This is typical of psocodean insects but is higher than that of L. bostrychophila, 68.63% ( Figure S2 and Table S5). The higher A+T content of L. decolor is present in all regions, both genes and noncoding regions (Table S5). The difference of A+T content between the two booklice is reflected further in the codon usage: the relative synonymous codon usages (RSCU) of the two booklice showed that L. decolor used more NNA and NNT codon than L. bostrychophila ( Figure 5). The nucleotide composition of mt genome is usually conserved within a genus; however, it varies between L. decolor and L. bostrychophila. This variation may be related to mt genome fragmentation, because all of the psocodean fragmented mt genomes have a lower A+T content ( Figure S2).

Mitochondrial gene rearrangement in L. decolor
The mt gene arrangement in L. decolor differs substantially from that of the hypothetical ancestor of insects and from that of the booklouse, L. bostrychophila ( Figure 6). With the only exception of Figure 6. Arrangement of mitochondrial genes in Liposcelis and the hypothetical ancestor of the arthropods. Circular genomes have been arbitrarily linearized for ease of comparison. Gene names are the standard abbreviations used in the present study. tRNA genes are designated by the single letter according to the IPUC-IUB one-letter amino acid codes. Genes which are underlined are encoded on the opposite strand to the majority of genes in that particular genome. Black, gray, and white boxes represent putative control regions, pseudogenes, and transfer RNA genes, respectively. The boxes in 15 colors represent 13 protein coding genes and 2 ribosomal RNA genes. Shared gene-boundaries are labeled with square ''brackets'' above each genome. doi:10.1371/journal.pone.0091902.g006 Figure 7. Phylogeny of the Psocodea inferred with mitochondrial genome sequences. Numbers above the left branches show Bayesian posterior probability for the phylogenies from nucleotide sequences, the right from amino acid sequences. Only support above 50% is shown. The insects belong to Phthiraptera, Psocoptera, and Diptera are shown in gray, yellow and pink background respectively. doi:10.1371/journal.pone.0091902.g007 atp8-atp6, there is no gene boundary or gene block shared between L. decolor and L. bostrychophila, even though these two booklice belong to the same genus. Atp8-atp6, is a highly conserved ancestral gene boundary for animals and is assumed to be constrained by the function of a bicistronic atp8-atp6 transcript [46]. The mt gene arrangement is usually conserved within the same genus. For example, the two Pediculus species of human lice, both have extensively fragmented mt genomes but have the same mt gene arrangement [7]. Furthermore, Coloceras sp. and Campanulotes bidentatus, also have the same gene arrangement, except for a difference in the location of their trnQ. The extent of the variation in the mt gene arrangement between the two species of Liposcelis booklice is unprecedented for animals within the same genus, indicating that gene rearrangement occurred frequently after these two booklice diverged from each other. The genus Liposcelis is divided into four groups (A, B, C, D) phylogenetically [25,26]; L. decolor and L. bostrychophila are in different groups, C and D respectively, although they often co-occur in a wide range of stored products in the same ecosystems. Previous studies have revealed that substantial variation exists among the Liposcelis groups at both morphological and molecular levels [25,30,31]. Whether or not the multipartite mt genome observed in L. bostrychophila occurred only in species of group D remains to be investigated.

Phylogenetic relationship of L. decolor to other species in the Psocodea
Bayesian inference was used to determine phylogenetic relationships among 11 species of Psocodea from the orders Psocoptera and Phthiraptera, with nucleotide and deducted amino acid sequences of mt genomes (Figure 7 and Figure S3). Although L. decolor and L. bostrychophila differ in both mt genome organization and mt gene arrangement, these two booklice are more closely related to each other than to other species in the Psocodea. The two Liposcelis species formed a sister clade to the parasitic lice (order Phthiraptera). This was also indicated by previous studies [3,30,47]. The close relationships between the two booklice, and between the booklice and the parasitic lice are strongly supported by our analyses. Our results indicate that the bipartite mt genome organization observed in L. bostrychophila likely evolved recently L. decolor and L. bostrychophila split from their most recent common ancestor. Furthermore, from the phylogenies of the Psocodea constructed in the present study and a number of previous studies [7,9], it can be inferred that the bipartite mt genome organization in L. bostrychophila evolved independently from the much more fragmented mt genomes observed in the blood-sucking lice (suborder Anoplura). Figure S1 Size of mitochondrial protein-coding genes and rRNA genes of the Psocodea.  Figure S3 Alignments of mitochondrial gene sequences used for phylogenetic analyses. A. Nucleotide sequence alignment (gene order of this alignment: cox1-3, atp6, cob, nad1-3, nad5-6, rrnS and rrnL; sequences of nad3 and nad6 just with codon positions 1 and 2); B. Amino acid sequence alignment (gene order of this alignment: cox1-3, atp6, cob, nad1-3 and nad5-6