The Mitochondrial Genome of Paramphistomum cervi (Digenea), the First Representative for the Family Paramphistomidae

We determined the complete mitochondrial DNA (mtDNA) sequence of a fluke, Paramphistomum cervi (Digenea: Paramphistomidae). This genome (14,014 bp) is slightly larger than that of Clonorchis sinensis (13,875 bp), but smaller than those of other digenean species. The mt genome of P. cervi contains 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 2 non-coding regions (NCRs), a complement consistent with those of other digeneans. The arrangement of protein-coding and ribosomal RNA genes in the P. cervi mitochondrial genome is identical to that of other digeneans except for a group of Schistosoma species that exhibit a derived arrangement. The positions of some transfer RNA genes differ. Bayesian phylogenetic analyses, based on concatenated nucleotide sequences and amino-acid sequences of the 12 protein-coding genes, placed P. cervi within the Order Plagiorchiida, but relationships depicted within that order were not quite as expected from previous studies. The complete mtDNA sequence of P. cervi provides important genetic markers for diagnostics, ecological and evolutionary studies of digeneans.


Introduction
Paramphistomosis, due to paramphistomes (Trematoda: Digenea: Paramphistomidae), has recently emerged as a major cause of productivity loss in ruminants. Adult worms often inhabit the rumen and reticulum of cattle, water buffaloes, sheep and goats. Their presence in these sites may elicit few apparent signs or symptoms. However, acute parasitic gastroenteritis causing high morbidity and mortality may occur as large numbers of immature paramphistomes migrate through the intestine towards the rumen and reticulum. Severity of disease is greatest in young animals [1][2][3][4][5][6]. As a consequence of frequent under-diagnosis, the significance of subclinical infection in many animals remains unclear and economic losses may exceed those caused by many other helminth parasites [5,7]. Paramphistomosis is widespread [6][7][8][9][10], with different species predominating in different places. However, Paramphistomum cervi is perhaps the most widespread species, being reported from many parts of Eurasia [9,11] and North America [12]. Conventional diagnosis of paramphistomosis is based on the history and clinical signs of the disease. Further confirmation can be obtained by collection of fecal samples from the host and examination for parasite eggs. However, this can lead to misinterpretation or misdiagnosis because the presence of adult paramphistomes (hence their eggs) is not necessarily a cause of disease [7,13,14]. Early diagnosis of paramphistomosis is essential for prompt treatment before irreparable damage to the rumen and bile ducts occurs [9]. Immunological diagnosis may be a dependable means for monitoring the infection, and be supplemented by the finding of eggs. In order to develop this method, whole worm extract of adult P. cervi has been subjected to immunoblotting using sera from bovines infected with P. cervi. This method, however, has not been widely adopted [15].
The lack of knowledge of mt genomics for P. cervi is a major limitation for the development of molecular diagnostic techniques, for analyses of population and genetic variation within this species, and for phylogenetic studies of the Digenea in general.
In our present study, we determined the complete mt nucleotide sequence of P. cervi, which was collected from Qinghai Province, China. Phylogenetic analyses were performed using concatenated mt sequences of 12 protein-coding genes of digenean species available in GenBank to date. The new mt genome sequence may provide useful information on both genomics and the evolution of Paramphistomidae, because there are no complete (or nearly complete) mtDNA sequences available from any member of this family.

Ethics Statement
The yak from which P. cervi adults were collected was being processed at a local abattoir in Dari County, Qinghai Province, as part of the normal work of the abattoir.

Parasite and DNA extraction
Adult P. cervi (Zeder, 1790) were collected from the rumen of a naturally infected yak in Dari County, Qinghai Province of China. The flukes were washed extensively in physiological saline and identified to species in the Key Laboratory of Veterinary Parasitology, Gansu Province based on morphological characters (collection accession number: 20110101).
Total genomic DNA was extracted from one parasite using a Qiagen Blood and Tissue Kit (Qiagen, Germany) according to the manufacturer's instructions and eluted into 100 ml H 2 O, followed by RNase treatment step. The treated DNA sample was stored at 220uC until use.
Amplification, sequencing and assembling of mtDNA fragments Amplification, sequencing and assembly of mtDNA fragments was performed according to methods previously described [23,24].
Seven pairs of oligonucleotide primers were designed based on the conserved regions from published complete mtDNA sequences of Fasciola hepatica [27,28], Clonorchis sinensis [30,31], Opisthorchis felineus [30] and Paragonimus westermani (GenBank Accession No. AF219379) ( Table 1). These sets of primers amplified overlapping fragments to facilitate eventual assembly using Taq polymerase -KOD FX Neo (TOYOBO, Japan). The cycling conditions used were 94uC for 5 min (initial denaturation); then 94uC for 1 min (denaturation), 50uC for 35 s (annealing), 72uC for 1-3 min (extension) for 30 cycles and a final extension at 72uC for 10 min. Each PCR reaction yielded a single band detected in a 1.0% (w/v) agarose gel stained with ethidium-bromide [24]. PCR products were directly sequenced on an ABI 3370 DNA sequencer at Sangon Company (Shanghai, China) using a primer walking strategy. The complete mtDNA sequence of P. cervi was assembled using DNAStar software as a sequence editor [32].
Prediction of protein-coding genes, tRNAs and genes for rrnL and rrnS The ORF finder tool at NCBI (http://www.ncbi.nlm.nih.gov/ gorf/gorf.html) was used to find protein-coding gene sequences, which were subsequently used to search for homologous digenean sequences deposited in the GenBank TM by using tBLASTn. The rhabditophoran platyhelminth genetic code [33] was specified. Gene boundaries were confirmed based on comparison and alignment with other published mt genomes of species in Fasciolidae, Opisthorchiidae and Paragonimidae [27,28,30,31].
Putative tRNA genes were identified using the program tRNAscan-SE [34] and the online tool ARWEN [35] combined with observations and alignments by eye. Genes for large (rrnL) and small (rrnS) subunit ribosomal RNA genes were identified by comparison with the mt rRNA genes of F. hepatica, C. sinensis, O. felineus, P. westermani and other flatworms [27,28,31].
Phylogenetic analyses. DNA sequences of the 12 proteincoding genes were concatenated and imported into BioEdit [36]. After translation (using Translation Table 9 http://www.ncbi. nlm.nih.gov/Taxonomy/taxonomyhome.html/index. cgi?chapter = tgencodes#SG9), the concatenated amino acid sequences were aligned using Clustal [37], and then backtranslated into nucleotide sequences to improve alignment. Phylogenetic trees were constructed using Bayesian analyses in MrBayes v3.1 [39] of concatenated sequences (nucleotides and inferred amino acid sequences) of the protein-coding genes in the mt genomes of P. cervi and 11 other digenean species ( Table 2). The mt genome sequence of the cestode Echinococcus granulosus (NC_008075) was used as an outgroup.
In every case two runs, each of four chains, were specified. For the nucleotide alignment, the GTR+I+G model was as described previously [24,38], partitioned by codon position. Bayesian analysis was run for 5,000,000 generations and sampled every 1000 generations. The first 25% of trees were omitted as burn-in and the remaining trees were used to calculate Bayesian posterior probabilities [39].
For the amino-acid alignment, MrBayes was allowed to determine the most appropriate model (''prset aamodelpr =mixed''), 2,000,000 generations were run and trees sampled every 500. The first 25% of trees were omitted as burnin.

Results and Discussion
General features of the mt genome of P. cervi. Lengths cited for some ''complete'' digenean mt genomes in GenBank are incorrect. They are instead the lengths of the coding portions. For these species, amplification and sequencing of non-coding regions   , also a member of the Schistosomatidae, has a very short non-coding region and a total length of 14,838 bp [19,21]. S. turkestanicum is 14,755 bp [17]. The complete mtDNA sequence of P. cervi (deposited in GenBank, accession number KF475773) is 14,014 bp in length, Table 3. Positions and lengths of genes and regions of P. cervi mt genome, and start and stop codons for the protein-coding genes as well as anticodons for the tRNA genes (starting from cox3). within the range of typical sizes for metazoan mt genomes (14-18 kb). The mt genome of P. cervi is larger than that of C. sinensis (13,875 bp), but smaller than those of other digenean species available in GenBank TM to date ( Table 2). It contains 12 proteincoding genes (cox1-3, nad1-6, nad4L, atp6 and cytb), 22 transfer RNA genes and 2 ribosomal RNA genes (rrnL and rrnS) (Tables 2  and 3). All genes are transcribed in the same direction, which is consistent with other digeneans. The arrangement of proteinencoding genes in P. cervi is the same as that of the F. hepatica [27,28], O. felineus [30], P. westermani, S. turkestanicum [17], S. japonicum and S. mekongi mt genomes, but different from that seen in S. haematobium, S. mansoni and S. spindale [19]. A 40 bp overlap between the 39 end of nad4L and the 59 end of nad4 was noted in P. cervi, similar to that of other digeneans.

Protein-encoding genes
In total, 3,364 amino acids are encoded by the P. cervi mt genome. The nucleotide composition in P. cervi was biased toward G and T, which is similar to that of the digeneans F. hepatica, O. felineus, C. sinensis, P. westermani and the outgroup cestode, E. granulosus, but is slightly different from S. turkestanicum, S. japonicum and other schistosomes, which are biased toward A and T. In the protein-coding genes of P. cervi, strong bias against the usage of C (8.76%, on average) and strong bias in favor of the usage of T (47.77%, on average) were observed. The frequency of usage for G (27.46%, on average) was higher than that for A (16.01%, on average) ( Table 4).
The most common inferred start codon for mt proteinencoding genes of digenean species is ATG, followed by GTG (e.g. F. hepatica [27], O. felineus and C. sinensis [30], S. turkestanicum [17], S. spindale and S. haematobium [19]). GTG was also a frequent initiation codon (5/12) for the mt protein-encoding genes of P. cervi. It is interesting that the stop codon TAG was used for all the mt protein-coding genes of P. cervi. This is unusual, because another termination codon, TAA, is often observed in other digeneans.

Transfer RNA (tRNA) genes
Except for tRNA-Ser1 (AGN) and tRNA-Cys, all tRNA genes appear to exhibit the standard cloverleaf structure. The predicted secondary structure of the serine tRNA (AGN) contains the TYC arm but lacks the DHU arm (terminology follows Wolstenholme, 1992) [43], a situation which is also found in O. felineus [30] and some other digeneans. For the cysteine tRNA, a four-armed structure is feasible, but so is a three-armed structure with a DHU-replacement loop (Fig. 1). It is noteworthy that tRNA-Glu and tRNA-Gly have switched positions in P. cervi relative to the situation in F. hepatica, P. westermani and the opisthorchiids (O. felineus, O. viverrini and C. sinensis), suggesting that this change in tRNA gene position could provide an important phylogenetic signal [19].

Ribosomal RNA genes
The rrnL (16S ribosomal RNA) and rrnS (12S ribosomal RNA) genes of P. cervi were identified by sequence comparison with those of F. hepatica, O. felineus, and Schistosoma spp. These two genes are separated by tRNA-Cys. The sizes of the rrnL and rrnS genes were 986 and 749 bp, respectively, and their A+T content was 63.59% and 60.75%, respectively, which are the lowest among the digeneans studied to date (Tables 2 and 4).

Non-coding regions
Non-coding regions exist in the mt genomes of many parasitic flatworms, but the locations of these relative to major genes tend to be rather variable. It is usual to recognize two such non-coding regions in digeneans: long and short non-coding regions (LNR and SNR) that are often separated by one or more tRNA genes. A common feature of LNRs is the presence of long repeats. Such features are found in F. hepatica, most or all Schistosoma species (but await further characterization in several species) and in P. westermani (for which the LNR also awaits full characterization). In other species, notably in the genera Opisthorchis and Clonorchis, the non-coding regions lack strong structures, such as large repeats. There does not seem to be a strong phylogenetic element to length and structure of the LNR. In P. cervi, there is a short noncoding region (SNR) (58 nucleotides), lacking any notable features and located between cytb and nad4L. A long non-coding region (LNR) (499 nucleotides), is observed between tRNA-Glu and cox3   (Tables 3 and 4). Short homopolymer tracts (, 8 nt) and short microsatellite-like tracts -e.g. (AT) n -are present in this region, but there are no long direct or inverted repeats, nor any similarities with the SNR (Tables 3 and 4). Although the replication process(es) of mt DNA of digeneans is unclear, it is not difficult to predict that the AT-rich non-coding region might be involved in the initiation of replication [9,17,19,44].

Phylogenetic analyses
Some systematic and population genetic studies have been completed based on genetic markers in the mt genomes of flukes [16,17,19,21,24,26,30,45]. So far, the full-length mt genomes of 12 digenean species have been determined and characterized, and these have been used in the phylogenetic study. Using complete mt sequences for phylogenetic analyses is more reliable according to Figure 2. Inferred phylogenetic relationship among the digenean species. Trees were inferred using MrBayes v3.1. A, tree inferred from concatenated nucleotide sequences of 12 protein-coding genes, using the cestode E. granulosus as the outgroup. Posterior support values are given at nodes. See text for more details. B, tree inferred from concatenated amino acid sequences. Only the portion of the tree (members of the order Plagiorchiida) that differs from that in A is shown. C, tree of members of the Plagiorchiida according to phylogeny proposed by Olson et al (2003) [47]. doi:10.1371/journal.pone.0071300.g002 the study of Waeschenbach et al (2012), who confirmed that alignments of .10,000 nucleotides from mtDNAs can provide a rich resource for phylogeny construction, hypothesis-testing and interpretation of the evolution of the major lineages of tapeworms [46]. Now that we have a complete mt genome from a member of the Paramphistomidae, we can begin to explore this possibility for the digeneans. The tree inferred from concatenated nucleotide sequences of the 12 protein-coding genes is shown in Fig. 2A. All nodes are supported by very high posterior probabilities (100%). Two large clades are apparent: one contains seven members of the Family Schistosomatidae (Order Diplostomida -following [47]) and the other includes five members representing four families within the Order Plagiorchiida. Fig. 2B reveals the corresponding tree inferred from amino-acid sequences (only species within the Plagiorchiida are shown: the tree for the members of the Diplostomida was identical with that in Fig. 2A). MrBayes indicated that the most appropriate substitution model for the amino-acid alignment was ''cprev'', originally developed for proteins encoded by chloroplast genomes [48]. For the Diplostomida, phylogenetic relationships depicted in Fig. 2A exactly match those previously reported (e.g. Morgan et al, 2003) [49]. For members of the Plagiorchiida, the situation is a little more complicated. Fig. 2C depicts relationships among the four plagiorchiid families abstracted from the phylogeny in Olson et al (2003) [47]. According to this, the sequence of families in order of increasingly derived status is: Paramphistomidae, Fasciolidae, Opisthorchiidae and Paragonimidae. This arrangement was seen in the tree based on nucleotide sequences ( Fig. 2A), but not in the tree inferred from amino-acid sequences (Fig. 2B), in which a clade containing Fasciolidae and Paragonimidae was strongly supported and P. cervi was sister to this clade (in 70% of trees). In 30% of trees, P. cervi was placed basal to the other plagiorchiids (as in Fig. 2A), but even in these trees, the clade containing Fasciola and Paragonimus was strongly supported. Speculation as to the cause of this is premature. Our sampling of the Order Plagiorchiida is very sparse and sequences from many additional digenean mt genomes will be needed before we can be sure that we have stable and defensible phylogenetic hypotheses.

Summary
In conclusion, the present study determined the complete mt genome sequence of P. cervi, which possesses the same gene order (except for tRNA-Glu and tRNA-Gly) as most other digeneans, consisting of 12 protein-coding genes, 2 rRNA genes and 22 tRNA genes. Phylogenetic trees, based on sequences of protein-coding genes, could identify the two orders represented (Diplostomida and Plagiorchiida). For members of the Diplostomida, relationships are exactly as expected from other studies. For the five members of the Plagiorchiida, results are not as consistent, but our sampling of this clade is very sparse and additional sequences are needed. The complete mtDNA sequence of P. cervi will add the knowledge to digenean mitochondrial genomics. It will also provide an important resource for the studies of inter-and intra-specific variation of the Paramphistomidae and a resource for comparative mitochondrial genomics and systematic studies of digeneans.