The Multipartite Mitochondrial Genome of Liposcelis bostrychophila: Insights into the Evolution of Mitochondrial Genomes in Bilateral Animals

Booklice (order Psocoptera) in the genus Liposcelis are major pests to stored grains worldwide and are closely related to parasitic lice (order Phthiraptera). We sequenced the mitochondrial (mt) genome of Liposcelis bostrychophila and found that the typical single mt chromosome of bilateral animals has fragmented into and been replaced by two medium-sized chromosomes in this booklouse; each of these chromosomes has about half of the genes of the typical mt chromosome of bilateral animals. These mt chromosomes are 8,530 bp (mt chromosome I) and 7,933 bp (mt chromosome II) in size. Intriguingly, mt chromosome I is twice as abundant as chromosome II. It appears that the selection pressure for compact mt genomes in bilateral animals favors small mt chromosomes when small mt chromosomes co-exist with the typical large mt chromosomes. Thus, small mt chromosomes may have selective advantages over large mt chromosomes in bilateral animals. Phylogenetic analyses of mt genome sequences of Psocodea (i.e. Psocoptera plus Phthiraptera) indicate that: 1) the order Psocoptera (booklice and barklice) is paraphyletic; and 2) the order Phthiraptera (the parasitic lice) is monophyletic. Within parasitic lice, however, the suborder Ischnocera is paraphyletic; this differs from the traditional view that each suborder of parasitic lice is monophyletic.


Introduction
The mitochondrial (mt) genomes of bilateral animals typically consist of a single circular chromosome that is 13 to 20 kb in size and contains 37 genes: 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes [1]. Exceptions to the typical mt chromosomes, however, have been discovered in the past decade. Mitochondrial genomes that consist of multiple chromosomes, i.e. multipartite mt genomes, have been found in mesozoa, nematodes, rotifers and parasitic lice [2][3][4][5][6][7]. The number of chromosomes in a multipartite mt genome varies from two in the rotifer, Brachionus plicatilis [5], to 20 in the human body louse, Pediculus humanus [6]. Each chromosome in a multipartite mt genome is circular and comprises a coding region and a non-coding region. In an extreme case in the human body louse, an mt chromosome contains only a single tRNA gene [6]. Sequences of the non-coding regions, or segments in the non-coding regions, are often highly conserved among all of the chromosomes in a multipartite mt genome, indicating the functional roles of non-coding regions, potentially in mt genome replication and gene transcription [2,5,6]. Different chromosomes in a multipartite mt genome are in unequal abundance [5]; furthermore, the relative abundance of an mt chromosome varies at different developmental stages [8].
Multipartite mt genomes (circular or linear) are not unique to bilateral animals, and have been found in plants [9], fungi [10], cnidarians [11], and various lineages of protists, including algae [12], ciliates [13], flagellates [14], and ichthyosporeans [15]. The multipartite mt genomes of non-bilateral animals exhibit more diversity than their counterparts in bilateral animals. For instance, the mt genome of the box jellyfish, Alatina moseri, consists of eight linear chromosomes [16]. The mitochondrial genome of Diplonema papillatum, a free-living diplonemids, is composed of more 100 circular chromosomes; some of these chromosomes contain only a segment of genes [14,17].
Why are mt genomes in pieces? Several possibilities have been raised, including Darwinian selective advantage, random genetic drift [18], loss of mitochondrial single-strand binding protein (mtSSB) [7], blood-feeding [6] and parasitic life-style [19]. None of these possibilities, however, has been investigated experimentally. Furthermore, two mechanisms have been proposed to explain how multipartite mitochondrial genomes are generated: 1) intramolecular homologous recombination [20]; and gene deletions [21]. None of these mechanisms, however, could explain why multipartite mt genomes should replace the typical single-chromosome mt genomes.
Booklice (order Psocoptera) in the genus Liposcelis are major pests to stored grains worldwide and are closely related to parasitic lice of birds and mammals (order Phthiraptera). The monophyly of the order Phthiraptera has been questioned recently [22,23]. There are two contradictory hypotheses: 1) the parasitic lice (order Phthiraptera) are monophyletic and booklice of the genus Liposcelis (order Psocoptera), are the sister-group to the parasitic lice [24,25]; and 2) the parasitic lice are paraphyletic and the booklice of the genus Liposcelis are the sister-group to one of the four suborders of the parasitic lice, the Amblycera [22,23,26]. Monophyly of another suborder of the parasitic lice, the Ischnocera, has also been questioned: some species in the Ischnocera have been shown to be more closely related to blood-sucking lice in the suborder Anoplura than to other Ischnocera species [7,27].
Mitochondrial genome sequences have been used extensively in inferring phylogenetic relationships among insects at different taxonomic levels [28][29][30]. Thus far, complete mt genome sequences have been determined for more than 300 species of insects (available at http://www.ncbi.nlm.nih.gov). Only one of these species, a barklouse, Lepidopsocid sp., however, is from the order Psocoptera [31]; no booklice have been sequenced for complete mitochondrial genomes. In this study, we sequenced the mt genome of the booklouse, Liposcelis bostrychophila Badonnel (Psocoptera: Liposcelididae). Here, we report the features of the mt genome of L. bostrychophila, and the phylogenetic relationships among the major lineages of the Psocodea, i.e. booklice, barklice and parasitic lice, inferred from mt genome sequences.

Ethics Statement
No specific permits were required for the described field studies, and no specific permissions were required for these locations/ activities. We confirm that these locations are not privately-owned or protected in any way and the field studies did not involve endangered or protected species.
Sample collection, DNA extraction, PCR amplification and sequencing L. bostrychophila were collected at grain storage facilities from six localities in five provinces in China (Table S1). These insects were identified to species by morphology [32] and internal transcribed spacer (ITS) sequences [33]. Total DNA was extracted from L. bostrychophila by the SDS method [34]. For real-time PCR, the total DNA was also extracted using DNeasy Blood & Tissue Kit (QIAGEN) from the Beibei strain of L. bostrychophila. Fragments of cox1and cob were amplified by PCR with conserved insect primers [35]. PCR products were purified with Gel Extraction Mini kits (Watson Biotechnologies). Purified PCR products were ligated into pGEM-T Easy vectors (Promega, Madison, WI) and introduced into Escherichia coli (T1, Transgen) followed by ampicillin selection. Positive clones were sequenced with a 3730, ABI Applied Biosystems sequencer. Then, species-specific primers were designed for long-PCR from cox1 to cob sequences. Long PCR reactions were carried out in a total volume of 25 ml, utilizing 1 ml of DNA, 1 ml each of two primers (10 mM; P1-P2, P3-P4; Figure 1), 4 ml of dNTPs (each 2.5 mM), 2.5 ml MgCl 2 (25 mM), 2.5 ml of 106 LA PCR reaction buffer, 12.75 ml ddH 2 O and 0.25 ml LA Taq DNA polymerase (5 U/ml, Takara). This was performed on Bio-RAD S1000 TM thermal cyclers and the PCR conditions were: 1 min at 96uC, 37 cycles of 40 sec at 96uC, 50 sec at 60uC and 9 min at 68uC, followed by 15 min at 68uC. The two overlapping PCR fragments (P1-P2, and P3-P4; Figure 1) were directly sequenced using primer walking method by the Beijing Genomics Institute (BGI) at Shenzhen, China. The nucleotide sequences were proofread and assembled into a contig with SeqMan (DNAStar). Initially, only seven protein-coding genes were found in the 8,530 bp contig that was amplified by long-PCR primers from cox1 and cob. So, we amplified and sequenced fragments of two  Table 1  other mitochondrial genes, rrnL and nad5, which were not in the first contig, with conserved insect mitochondrial primers [35]. Two overlapping fragments were amplified by long-PCR with the primers designed from rrnL and nad5 sequences (P5-P6, P7-P8; Figure 1). These two fragments were sequenced and a contig was assembled; six other protein-coding genes were found in this contig. Sequences of the primers used in this study are presented in Table 1.

Sequence analysis
All of the typical protein-coding genes, except atp8 and nad4L, were identified by BLAST searches of NCBI database. Two genes atp8 and nad4L were identified by comparison of putative amino acid sequences and hydrophilicity profiles with those of other insects in NCBI database. A highly conserved amino acid sequence motif, MPQMAPL, at the N-terminal of ATP8 protein of insects was also used to help find atp8 gene. Hydrophilicity profiles were generated with MacVector [36] ( Figure S1). rRNA genes were identified by BLAST searches of NCBI database. All of the typical tRNA genes of animals, except trnE and trnR, were identified by their clover-leaf secondary structure using tRNAscan-SE [37] and ARWEN [38]. trnE and trnR results were identified by manual inspection of potential anticodon sequences and by manual sequence alignment with trnE and trnR of other barklice and parasitic lice. Secondary structures in the largest non-coding regions (putative control regions) were predicted with program RNAstructure v. 5.0 [39]. Base compositions and codon usage were calculated with Mega 5 [40].

PCR verification of the multipartite mitochondrial genome of L. bostrychophila and quantification of the two mitochondrial chromosomes
We verified the size and the circular nature of the two contigs by PCR with two pairs of outbound primers: P9 and P10 in the cox1 gene, and P11 and P12 in the nad4 gene ( Figure 1 and Figure 2). The two contigs are hereafter referred to as mt chromosome I and chromosome II. We quantified the relative copy numbers of chromosome I and chromosome II of the Beibei strain of L. bostrychophila by real-time PCR with an Mx3000P thermal cycler (Stratagene). Each PCR reaction contained 1 ml of genomic DNA, 12.5 ml SYBRH Premix Ex Taq TM II (Takara) and 0.2 mM of each primer. A pair of primers, qcobF and qcobR, were designed from cob gene for mt chromosome I; another pair, qnad1F and qnad1R, were designed from nad1 gene for mt chromosome II ( Table 1). The real-time PCR conditions were: 95uC for 1 min, followed by 40 cycles of 95uC for 10 sec, 60uC for 30 sec and 72uC for 30 sec. A melting curve analysis from 55uC to 95uC was applied to all reactions to ensure consistency and specificity of the amplified product. A nuclear b-Actin gene was used as a reference [41]. The amplification Phylogenetic analysis of the mitochondrial genome sequences of booklice, barklice and parasitic lice (i.e. Psocodea) We inferred phylogenies using the mt genome sequences of the booklouse: L. bostrychophila (Psocoptera: Troctomorpha) (this paper), the barklouse, Lepidopsocid sp. RS-2001 (Psocoptera: Trogiomorpha) [31], and the parasitic lice: Bothriometopus macrocnemis (Phthiraptera: Ischnocera) [43], Campanulotes bidentatus (Phthiraptera: Ischnocera) [44], Heterodoxus macropus (Phthiraptera: Amblycera) [45] and P. humanus (Phthiraptera: Anoplura) [6]. Pachypsylla venusta (Hemiptera: Psylloidea) [46] was used as an outgroup. It was difficult to align the putative amino acid and the nucleotide sequences of atp6, atp8, nad4L, and the nucleotide sequences of the tRNA genes because the length of these genes varied substantially among the five species listed above. So, we excluded atp6, atp8, nad4L and the tRNA genes from our phylogenetic analyses.
Putative amino acid and nucleotide sequences were aligned using Clustal X with the default parameters [47]. Alignments were then imported into the Gblocks server (Http://molevol. cmima.csic.es/castresana/Gblocks_server.html) [48] to remove poorly aligned sites. Nucleotide sequences of the protein-coding genes were tested for substitution saturation using DAMBE 5.0.59 [49]. All of the protein-coding genes, except nad2 and nad3, passed this test: the second and third codon positions of nad2 were excluded, whereas the first and the third positions of nad3 were excluded from our phylogenetic analyses. Phylogenies were inferred from two alignments: (1) the putative amino acid sequences of 10 protein-coding genes (cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad5, nad6, and cob), and (2) the nucleotide sequences of the same 10 protein-coding genes plus the two rRNA genes (rrnS, rrnL). The amino acid and nucleotide sequence alignments had 2,133 and 10,173 positions, respectively (includes alignment gaps). The best-fit model for the amino acid sequence alignment was determined with MEGA 5 [40]: the RtREV model was selected. For the nucleotide sequence alignment, jModelTest 0.1.1 [50] was used to find a suitable model for nucleotide substitution using the Akaike Information Criterion: the GTR+I+G model were chosen. Phylogenies were inferred by Bayesian Inference (BI) with MrBayes v3.12 [51], Maximum Likelihood (ML) with PhyML 3.0 (http://www.atgcmontpellier.fr/phyml/) [52], and Neighbor-Joining (NJ) with MEGA 5 [40]. For Bayesian Inference, four independent Markov chains were simultaneously run for 500,000 generations with a heating scheme (temp = 0.5). Trees were sampled every 100 generations (samplefreq = 100) and the first 20% of the generations were discarded as burn-in and the remaining samples were used to compute the consensus tree. Stationarity was considered to be reached when the average standard deviation of split frequencies was below 0.01 [53]. There were 100 bootstrap replicates in ML analysis and 1000 replicates for NJ analysis.

Results
The Beibei strain of L. bostrychophila has a multipartite mitochondrial genome Two overlapping fragments, 3.1 and 5.8 kb in size, were amplified by PCR with the primer pairs P1and P2 on the one hand, P3 and P4 on the other hand ( Figure 1 and Figure 2). We sequenced these two PCR fragments; the sequences were assembled into a circle, 8,530 bp in size; we called this circle mt chromosome I hereafter. Unexpectedly, mt chromosome I has only half of the genes that are typically found in the mt genome of a bilateral animal (Figure 1).Thus, we sought the other mt genes with primers that were anchored in rrnL and nad5 (P5, P6 and P7, P8, Figure 1). PCR with these primers amplified a 2 kb and a 6.2 kb fragment, respectively ( Figure 2). The sequences of these two fragments were assembled into another circle, 7,933 bp in size; we called this circle mt chromosome II (Figure 1). Of the 35 mt genes typical of the mt genome of bilateral animals, 22 genes were found only in mt chromosome I whereas 16 genes were only in mt chromosome II (Figure 1, Table S2 and S3). Intriguingly, three tRNA genes, trnA, trnE and trnM, were found in both chromosomes. The longest non-coding region (485 bp), was present in both chromosomes; we called this non-coding region the control region hereafter (Figure 2). Sequences of trnA, trnE, trnM and the control region are identical between mt chromosome I and mt chromosome II. We could not find two

General features of the multipartite mitochondrial genome of the Beibei strain of L. bostrychophila
Genes were on both strands of the mt chromosomes of the Beibei strain of L. bostrychophila. For mt chromosome I, six protein-coding genes and five tRNA genes were on one strand whereas one protein-coding gene, one rRNA gene and nine tRNA genes were on the other strand ( Figure 1). For mt chromosome II, five protein-coding genes and four tRNA genes were on one strand whereas one protein-coding gene, one rRNA gene and five tRNA genes were on the other strand ( Figure 1). The gene-boundary trnI-trnL1 is present in L. bostrychophila and the parasitic screamer louse, B. macrocnemis (Ischnocera); this boundary is derived for insects and appears to have evolved independently in the lineages that led to L. bostrychophila and B. macrocnemis (Figure 3). The only gene-boundary that L. bostrychophila shares with all other insects is atp8-atp6, which is ancestral to insects (Figure 3).  contain the sites of initiation of replication and transcription. There are two putative pseudogenes in mt chromosome I: P-nad4 (IV, 216 bp) and P-nad5 (III, 207 bp); and two putative pseudogenes in mt chromosome II: P-cox3 (I, 322 bp) and P-nad5 (II, 108 bp) (Figure 1 and Figure S2). These putative pseudogenes have identical or near-identical sequences to parts of their full-length counterparts. The A+T content of mt chromosome I and mt chromosome II were 67.78% and 69.54%, respectively. The nucleotide composition, AT-skew, and GC-skew between the two chromosomes and among the genes were summarized in Table 2, Table S2, and Table S3. The codon usages for the 13 mt protein-coding genes of L. bostrychophila were summarized in Table S4.
PCR verification of the multipartite mitochondrial genome of L. bostrychophila and quantification of the two mitochondrial chromosomes To verify mt chromosomes I and II, we amplified the entire chromosome I with primers P9 and P10, and the entire mt chromosomes II with primers P11 and P12 (Figure 1). These PCR experiments amplified an 8.0 kb and a 7.4 kb fragment, respectively ( Figure 2). We digested the 8.0 kb fragment (P9-P10) with NdeI and XbaI and obtained three DNA fragments of the sizes we expected from our sequence analyses: 865 bp, 2,704 bp, and 4,529 bp ( Figure 6). We digested the 7.4 kb fragment (P11-P12) with NdeI and ApaI and obtained four DNA fragments of the sizes we expected: 771 bp, 833 bp, 1,054 bp, and 4,736 bp ( Figure 6). Our PCR test and restriction enzyme digestion confirmed the size and the circular structure of mt chromosome I and mt chromosome II of L. bostrychophila. We also amplified these two mt chromosomes from L. bostrychophila collected at five other locations in China (Table S1). PCR with primer pairs P9-P10 and P11-P12 amplified fragments of the same sizes as the fragments amplified from the Beibei strain of L. bostrychophila (Figure 7). Our real-time PCR quantification showed unequal copy numbers between the two mt chromosomes of L. bostrychophila: chromosome I was nearly twice (1.86960.077) as abundant as chromosome II ( Table 3).

Phylogeny of booklice, barklice and parasitic lice inferred from mitochondrial genome sequences
The topologies of the phylogenetic trees inferred from the nucleotide sequences and the putative amino acid sequences were identical. In all of the phylogenetic trees constructed with different methods: 1) the parasitic lice (order Phthiraptera) were monophyletic; 2) booklice and barklice (order Psocoptera) were paraphyletic; 3) L. bostrychophila was the sister-group to the parasitic lice rather than the sister-group to the barklouse, Lepidopsocid sp.; and 4) one of the suborders of parasitic lice, the Ischnocera, was paraphyletic ( Figure 8).

Evolution of multipartite mt genomes in bilateral animals
Multipartite mt genomes are now known from four lineages of bilateral animals: the Mesozoa, Nematode, Rotifera and Psocodea ( Figure 9) [2,[4][5][6][7]. What drove the fragmentation of a typical mt chromosome into two or more smaller chromosomes? How could multiple small mt chromosomes replace a typical large mt chromosome? It is widely held that there has been strong selection for small and compact mt genomes in animals [54]. This selection pressure could be seen in many ways, i.e. many of the genes in the ancestral mt genome have moved to the nuclear   I  I I  I  I I  I  I I  I  I I  I  I I  I  I   genome or have been lost [55]; intergenic regions and introns have been lost and tRNA genes have been truncated and adjacent genes may overlap in mt chromosomes [56][57][58]. However, the typical mt chromosome of bilateral animals appears to be as compact as it can be. We propose that when mt chromosomes of different sizes co-exist, the selection pressure may favor small mt chromosomes over the typical large mt chromosome. Thus, fragmentation of mt chromosomes might be one of strategies of size-reducing for streamlining mt chromosomes and small mt chromosomes may have selective advantages over the typical large mt chromosomes. Indeed, Hayashi et al. (1991) showed that, in cultured human cells, the proportion of an 11-kb circular mtDNA increased significantly whereas the proportion of the typical 16-kb wild-type mt chromosome decreased when these two forms of mtDNA molecules co-existed for 10 weeks [59]. Mitochondrial DNA molecules smaller than the typical mt chromosomes can be generated via deletion events, as observed in nematodes [21], humans [60] and mice [61]. In plants, intramolecular homologous recombination between repeated sequences in the mt maxicircle can also generate smaller mt DNA molecules, as observed in maize [20] and cabbages [62]. In all of these cases, however, the typical mt chromosome, or the master circle, is always present. The typical mt chromosome is apparently not present in L. bostrychophila, the rotifer B. plicatilis [5] or the human body louse P. humanus [6]. We propose that the two mt chromosomes of L. bostrychophila were likely generated by a series of gene-deletion events; the  Why do the two mitochondrial chromosomes of L. bostrychophila have unequal copy numbers?
The copy numbers of different chromosomes of a multipartite mt genome were known previously only for the rotifer, B. plicatilis [5]. One of the mt chromosomes of B. plicatilis is four times as abundant as the other mt chromosome. Our real-time PCR showed that the two mt chromosomes of the booklouse, L. bostrychophila, also differ in copy number: mt chromosome I is nearly twice as abundant as mt chromosome II. If mt genes are in equal copy numbers, as in the typical mt chromosome, then we would expect that the different mt chromosomes of a multipartite mt genome be in equal copy numbers. Why are the mt chromosomes in both L. bostrychophila and B. plicatilis in unequal abundance? We speculate that different types of the chromosomes of a multipartite mt genome may be linked in some way, i.e. analogous to the network of the kinetoplast DNA [63,64], they function and segregate as a unit, and this unit may contain different copies of each mt chromosome; or the multipartite mt chromosomes of L. bostrychophila may form a nucleoid structure that contains two copies of mt chromosome-I but one copy of mt chromosome-II [65]. Both of above cases will result in consistent, unequal copy numbers between different mt chromosomes.

Phylogeny of booklice, barklice and parasitic lice
There are two hypotheses for the evolution of parasitism in the lice (super order Psocodea). First, that parasitism evolved once in the most recent common ancestor (MRCA) of the Anoplura, Rhyncophthirina, Ischnocera, and Amblycera i.e. that the order Phthiraptera is monophyletic. Second, that parasitism evolved twice: once in the MRCA of the Anoplura, Rhyncophthirina, and Ischnocera and once in the Amblycera i.e. that the order Phthiraptera is paraphyletic. Most of the molecular phylogenetic studies using sequences of gene fragments reject the first hypothesis, that parasitism evolved only once in the Psocodea, in favor of the second hypothesis, that parasitism evolved twice in these insects [22,23,26,27]. Our phylogenies from mt genome sequences, however, reject the second hypothesis in favor of the first hypothesis. Our phylogenies thus indicate that parasitism evolved only once in the Psocodea and that the order Phthiraptera is monophyletic. There are also two hypotheses conceiving the Ischnocera, which is much the largest lineage of parasitic lice (over 3,120 described species on birds and mammals [66]). First, the Ischnocera is monophyletic [24] [67]. Second, the Ischnocera is paraphyletic [7]. In the present study, analyses of mt genome   sequences indicate, in contrast to previous molecular and morphological studies, that the Ischnocera is paraphyletic.