Mitochondrial DNA Genomes Organization and Phylogenetic Relationships Analysis of Eight Anemonefishes (Pomacentridae: Amphiprioninae)

Anemonefishes (Pomacentridae: Amphiprioninae) are a group of 30 valid coral reef fish species whose phylogenetic relationships are still under debate. The eight available mitogenomes of anemonefishes were used to reconstruct the molecular phylogenetic tree; six were obtained in this study (Amphiprion clarkii, A. frenatus, A. percula, A. perideraion, A. polymnus and Premnas biaculeatus) and two from GenBank (A. bicinctus and A. ocellaris). The seven Amphiprion species represent all four subgenera, and P. biaculeatus is the only species of Premnas. The eight mitogenomes encoded 13 protein-coding genes, two rRNA genes, 22 tRNA genes and two main non-coding regions, with gene arrangement and translation direction essentially identical to those of other typical vertebrate mitogenomes. Among the 13 protein-coding genes, ND5 was 1,866 bp long in A. ocellaris (AP006017) and A. percula (KJ174497), three nucleotides shorter than in the other six anemonefishes. Both forms of ND5, however, could be translated into amino acids successfully. Only four mitogenomes had tandem repeats in the D-loop; the repeats were located downstream of the Conserved Sequence Block rather than upstream and were repeated in a simple way. Phylogenetic utility was tested with Bayesian and Maximum Likelihood methods using all 13 protein-coding genes. The results strongly supported the monophyly of the subfamily Amphiprioninae and indicated that P. biaculeatus should be assigned to the genus Amphiprion. Premnas biaculeatus and the percula complex were revealed to be the ancestral anemonefish taxon. The tree topologies of ND1, COIII, ND4, Cytb, Cytb+12S rRNA, Cytb+COI and Cytb+COI+12S rRNA were similar to that of the 13 protein-coding genes; we therefore suggest that Cytb may be a suitable single mitochondrial gene for phylogenetic analysis of anemonefishes.
Additional mitogenomes of anemonefishes with a combination of nuclear markers will be useful to substantiate these conclusions in future studies.


Introduction
Analyses that combined mt genes (12S rRNA, 16S rRNA, ATP6, ATP8, COI, Cytb and ND3) with single-copy nuclear genes (BMP4, RAG1 and RAG2) also inferred that the root clade of anemonefishes was the percula complex and P. biaculeatus, which dated back 19 million years [19][20][21]. Litsios et al. [22] inferred that host-specialist anemonefishes (the percula complex and P. biaculeatus) were environmental niche generalists based on seven nuclear markers (BMP4, Gylt, Hox6a, RAG1, Svep, S7 and Zic1). In contrast, an analysis based on Cytb, 16S rRNA and the first half of the mt control region placed the percula complex alone as the most basal group, excluding P. biaculeatus [23]. To date, no studies have used mitogenomes to elucidate the phylogenetic relationships among anemonefishes.
This study used mitogenomes (i.e. all 13 protein-coding genes) with Bayesian and Maximum Likelihood (ML) approaches to verify the phylogenetic relationships within the Amphiprioninae. Eight mitogenomes of anemonefishes were compared; six were obtained in this study (A. clarkii, A. frenatus, A. percula, A. perideraion, A. polymnus and P. biaculeatus) and two from GenBank (A. bicinctus and A. ocellaris). The seven Amphiprion species represent all four subgenera and five complexes, and P. biaculeatus is the only species of Premnas. We tested the monophyly of the Amphiprioninae, examined the evolutionary status of P. biaculeatus within the Amphiprioninae, and determined the ancestral species among anemonefishes.

Sample collection and identification
Specimens of six anemonefishes (A. clarkii, A. frenatus, A. percula, A. perideraion, A. polymnus and P. biaculeatus) were obtained from local aquariums (Xiamen, China), and the whole specimens were deposited in the College of Ocean and Earth Sciences, Xiamen University (Table 1). After species identification [8], dorsal muscle samples were preserved in absolute ethanol and stored at -20°C until DNA extraction. This study was carried out in accordance with the guidelines for the Care and Use of Laboratory Animals. All surgical procedures on anemonefishes were conducted under MS-222 (tricaine methanesulfonate) to induce sedation and anesthesia. The protocol was approved by the Animal Care and Use Ethics Committee of Xiamen University.

PCR amplification and sequencing
Total genomic DNA was extracted from the muscle by standard phenol-chloroform procedures [24]. The mitogenomes of anemonefishes were determined using eight consensus primer pairs with a long PCR technique (Table 2) [25]. PCR amplifications were carried out on an ABI 2700 Thermo Cycler (www.appliedbiosystems.com) in 25 μl reaction volumes, by using Takara

Sequence assembly and gene annotation
DNA sequences were assembled using Sequencher 4.1.4 (www.genecodes.com) to determine the mitogenomes of anemonefishes. The mitogenomes were annotated with MitoAnnotator [26], including protein-coding genes, rRNA genes and non-coding regions. In addition, the tRNA genes were scanned with tRNAscan-SE [27]. The annotation results were then submitted to NCBI using Sequin (http://www.ncbi.nlm.nih.gov/projects/Sequin/). The organization map of the mitogenomes was constructed with OrganellarGenomeDRAW [28].

Sequence analyses
Eight mitogenomes of anemonefishes were analyzed, including the six species assembled in this study and two species available from NCBI (GenBank accession numbers are listed in Table 3). Nucleotide compositions and pairwise sequence identities for the mitogenomes were calculated with the software of [29] and with MEGA 5.0 [30], respectively. For the control region, Tandem Repeats Finder [31] was used to detect the tandem repeats. One-way ANOVA was conducted to test for significant differences in sequence variability among regions at a significance level of 0.05.
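The two summary statistics named above can be illustrated with a minimal Python sketch. It is not the software used in this study: the function names and the toy sequences are assumptions made for illustration, and the identity measure shown (matches over ungapped aligned columns) is only one common definition that may differ from the one applied by MEGA.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, T, G and C among unambiguous bases."""
    counts = Counter(base for base in seq.upper() if base in "ATGC")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ATGC"}


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical, not real anemonefish sequences).
seq1 = "ATGGCACTAA"
seq2 = "ATGGCTCTAA"

comp = base_composition(seq1)
identity = pairwise_identity(seq1, seq2)
print(comp, identity)
```

Applied to whole H-strand sequences, the first function would give composition percentages of the kind reported in Table 3; averaging the second over all species pairs would give an overall similarity value.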

Phylogenetic relationships analyses
All eight available mitogenomes of anemonefishes (Labroidei: Pomacentridae, Amphiprioninae) and that of Abudefduf vaigiensis (Labroidei: Pomacentridae, Pomacentrinae) (GenBank accession number: AP006016) were used for the phylogenetic analyses. Chaetodon auripes (Percoidei: Chaetodontidae) (GenBank accession number: AP006004) was selected as the outgroup species. The 10 concatenated sequences (11,445 bp) of the 13 protein-coding genes were aligned with CLUSTAL X [32]. jModelTest 2.0 [33] was then used to infer the best-fitting nucleotide substitution model for the 13 protein-coding genes based on both the corrected Akaike Information Criterion (AICc) and the Bayesian Information Criterion (BIC). The best-fitting models for the 13 protein-coding genes are shown in Table 4. Phylogenetic analyses of the 13 protein-coding genes were performed under the Bayesian framework using MrBayes 3.2 [34]. We also analyzed each of the 13 protein-coding genes, 12S rRNA, and the combinations Cytb+12S rRNA, COI+12S rRNA, Cytb+COI and Cytb+COI+12S rRNA with the Bayesian method. Two independent analyses were run for several million generations with the Markov Chain Monte Carlo (MCMC) method, each using four chains (one cold and three heated) and sampling every 10 generations until the average standard deviation of split frequencies fell below 0.01 (Table 4). The first 10% of the sampled trees were discarded as burn-in, and the remaining trees were used to obtain a 50% majority-rule consensus tree with Bayesian Posterior Probabilities (BPP). Phylogenetic relationships based on the 13 protein-coding genes were also inferred under the ML criterion using TREEFINDER [35]. Each analysis was run with 1,000 replicates using a random starting tree and the proposed TVM+G model selected by AICc. Search replicates were ranked by their log-likelihood (lnL) scores, and only the replicate with the best score was retained. The 50% majority-rule consensus topology with bootstrap values was evaluated with 1,000 bootstrap replications.
The ML topology hypothesis was also tested with the Approximately Unbiased (AU) test [36].
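The model-selection step can be sketched in Python from the standard definitions AICc = 2k - 2 lnL + 2k(k+1)/(n - k - 1) and BIC = k ln(n) - 2 lnL. This is an illustration only, not jModelTest itself: the log-likelihood values are invented, the sample size n is assumed to be the alignment length, and counting the 17 branch lengths of an unrooted 10-taxon tree among the parameters is likewise an assumption.

```python
import math


def aicc(lnL: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion for k parameters and n sites."""
    return 2 * k - 2 * lnL + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnL: float, k: int, n: int) -> float:
    """Bayesian Information Criterion for k parameters and n sites."""
    return k * math.log(n) - 2 * lnL


N_SITES = 11445          # alignment length reported in the text
BRANCH_PARAMS = 2 * 10 - 3  # branch lengths of an unrooted 10-taxon tree (assumption)

# Hypothetical log-likelihoods; the second number is the count of free
# substitution-model parameters (rates, base frequencies, gamma shape).
candidates = {
    "HKY+G": (-52010.0, 5),
    "TVM+G": (-51950.0, 8),
    "GTR+G": (-51949.5, 9),
}

scores = {
    name: (aicc(lnL, k + BRANCH_PARAMS, N_SITES), bic(lnL, k + BRANCH_PARAMS, N_SITES))
    for name, (lnL, k) in candidates.items()
}
best_aicc = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
print(best_aicc, best_bic)
```

With these invented likelihoods, the small gain of GTR+G over TVM+G does not offset the penalty for its extra parameter, so both criteria prefer TVM+G; real likelihoods could rank the models differently.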

mtDNA organization and composition
The eight mitogenomes of anemonefishes encoded 13 protein-coding genes, two rRNA genes, 22 tRNA genes and two main non-coding regions, with gene arrangement and translation direction essentially identical to those of other typical vertebrate mitogenomes (Fig 2) [37]. The lengths of the eight mitogenomes varied between 16,645 and 16,976 bp (Table 3). The overall nucleotide similarity among the eight mitogenomes was high (91.12±2.9%). Among all 37 identified genes, most were encoded on the heavy strand (H-strand), with the exception of ND6 and eight tRNA genes, which were located on the light strand (L-strand) (Table 5). The overall H-strand nucleotide compositions of the eight mitogenomes were 29.30±0.24% A, 25.71±0.32% T, 15.78±0.36% G and 29.20±0.19% C, showing an anti-G bias (p<0.05) (Table 3).
The cumulative lengths of the 13 protein-coding genes ranged from 11,447 to 11,451 bp, accounting for 67.4% to 68.8% of the mitogenomes. The length of ND5 was 1,866 bp in A. ocellaris and A. percula, and 1,869 bp in the other six mitogenomes. This difference was attributed to a three-nucleotide deletion in the upstream region (30 nucleotides) of ND5; because the deletion removes exactly one codon, both forms could be translated into amino acids successfully. Protein-coding genes of the eight mitogenomes were mostly initiated by the typical start codon ATG (Table 5). There were exceptions: COI began with GTG, as in most fish mitogenomes [38,39] and in chicken [40]. In addition, ATP6 was initiated by ATG in A. percula and by CTG in A. frenatus, whereas the other six mitogenomes retained GTG. In the vertebrate mt genetic code, CTG is not a typical initiation codon; however, it is a common feature in groupers (Epinephelidae) [41,42]. For the termination codons (Table 5), COI, ATP8, ND4L, ND1 and ND6 were terminated by TAA or TAG; ND5 was terminated by AGA; and the other seven protein-coding genes were terminated by an incomplete codon, T or TA, which might be completed to the termination signal UAA by post-transcriptional polyadenylation [43]. The AGA stop codon in ND5 was presumably created from the ancestral TAG stop codon by deletion of the first nucleotide T and by use of R (A or G) as the third nucleotide, an event that occurred very early in the evolution of metazoans [44,45].
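The reading-frame and stop-codon reasoning above can be checked with a short Python sketch. The helper name and the toy coding sequences are hypothetical; the set of stop codons is that of the vertebrate mitochondrial genetic code (TAA, TAG, AGA, AGG).

```python
VERT_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}  # vertebrate mitochondrial stop codons


def completed_stop_codon(cds: str) -> str:
    """Return the final codon of a CDS, padding an incomplete one with A.

    An incomplete terminal codon (T or TA) is extended with adenines,
    mimicking completion to TAA (UAA in the mRNA) by polyadenylation.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        return cds[-3:]
    return cds[-remainder:] + "A" * (3 - remainder)


# Both ND5 lengths are multiples of three, so a 3-nt deletion keeps the frame.
nd5_lengths = (1866, 1869)
in_frame = all(length % 3 == 0 for length in nd5_lengths)

# Toy coding sequences (hypothetical) ending in T, TA and a complete AGA.
stops = [completed_stop_codon(s) for s in ("ATGGCCT", "ATGGCCTA", "ATGGCCAGA")]
print(in_frame, stops)
```

The two ND5 lengths correspond to 622 and 623 codons including the stop codon, which is consistent with the loss of a single codon.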
Of the two rRNA genes, 12S rRNA was located between tRNA-Phe and tRNA-Val and ranged from 948 to 951 bp in length, and 16S rRNA was located between tRNA-Val and tRNA-Leu (UUR) and ranged from 1,695 to 1,698 bp in length (Table 5). The 22 tRNA genes ranged from 66 to 74 bp in size (Table 5). With the exception of tRNA-Ser (AGY), which lacks the entire dihydrouridine (DHU) arm [46], the remaining 21 tRNAs could be folded into the typical clover-leaf secondary structure as determined by the tRNAscan-SE program [27].
The two main non-coding regions in the eight mitogenomes were the origin of L-strand replication (OL) and the control region (D-loop) (Table 5). The OL was located in the WANCY cluster [41] and varied from 31 to 34 bp in length (Table 5; Fig 2). The D-loop was located between tRNA-Pro and tRNA-Phe and ranged in size from 836 to 1,231 bp (Tables 5 and 6). The overall nucleotide compositions of the D-loop were 35.86±1.46% A, 30.52±1.07% T, 13.39±1.14% G and 20.23±1.50% C (Table 6), and its AT content (66.39±2.16%) was higher than that of the whole mitogenomes (55.02±0.47%) (Table 3). The D-loop in anemonefishes consisted of three parts: the Termination Associated Sequence (TAS), the Central Conserved Domain (CCD) and the Conserved Sequence Block (CSB) (Fig 3). The TAS sequence was TA(G)CATATATGTA, which contained the conserved TAS motif TA(G)CAT and the reverse-complement TAS (cTAS) motif ATGTA. The TAS motif can pair with the cTAS motif to form a stable hairpin loop, which presumably plays a significant role as a sequence-specific signal for termination of mtDNA replication [47,48]. Six CSBs (CSB-F to CSB-A) were identified in the CCD. In addition, three CSBs (CSB-1 to CSB-3) were found after the CCD, with the exception of P. biaculeatus, which had only CSB-3. In the downstream part of the D-loop, variable tandem repeats after the CSB were found in A. clarkii, A. frenatus, A. polymnus and P. biaculeatus (Table 6). As in the gobiids (Gobiidae), the tandem repeats in the downstream part of the D-loop were short fragments repeated in a simple way [49]. In contrast, in the tandem repeats located in the upstream part of the D-loop of other fishes, each repeat contains a conserved TAS motif and a cTAS motif, which together form the Extended Termination Associated Sequence (ETAS) [42,50].
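The relationship between the TAS motif and its reverse complement can be verified with a few lines of Python. This is a minimal sketch using the A variant of the TAS sequence quoted above; the helper names are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(seq: str, motif: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of motif."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


tas_region = "TACATATATGTA"  # TAS sequence from the text, A variant
tas_motif = "TACAT"
ctas_motif = reverse_complement(tas_motif)

tas_hits = find_all(tas_region, tas_motif)
ctas_hits = find_all(tas_region, ctas_motif)
print(ctas_motif, tas_hits, ctas_hits)
```

The TAS motif sits at the start of the region and the cTAS motif at its end, separated by two bases, which is the arrangement that allows the two to pair into a hairpin.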
Only the tree topology of ATP8 did not support the Amphiprioninae as a monophyletic group; therefore, ATP8 may not be a suitable single mt gene for phylogenetic analysis of anemonefishes. The earlier topologies, including that of Litsios et al. [22], were similar to type I, whereas the topology in Santini and Polacco [23] was close to type IV. The tree topology of the 13 protein-coding genes was consistent between the Bayesian and ML analyses, and the support values were robust: above 80% bootstrap support on the ML tree and a BPP of 1 on the Bayesian tree (Fig 4). The ML topology was tested with the AU test (p = 0.24) and was not rejected, which supported the reliability of the tree. First, the topology defined as type I supported the monophyly of the Amphiprioninae (Table 4; Fig 4), in agreement with previous studies based on partial mtDNA genes and nuclear genes [16][17][18][19][20][21][22][23]. Second, the genus Amphiprion was not a monophyletic group (Fig 4). On the one hand, the percula complex and P. biaculeatus were grouped into one clade, which formed the ancestral taxon of the anemonefishes, as documented in previous studies [16][17][18][19][20][21][22]. This contrasts with the finding of Santini and Polacco [23], who reported that the basal group of anemonefishes was the percula complex only; their analysis used the first half of the mt control region, which evolves rapidly and may therefore reduce resolution in phylogenetic analyses. On the other hand, the subgenera Amphiprion, Paramphiprion and Phalerebus formed the other clade, with A. clarkii at its base. Third, the subgenus Amphiprion (represented by A. clarkii, A. frenatus and A. bicinctus) was not monophyletic, as found in previous studies [16][17][18][19][20][21][22][23].
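The monophyly statements above can be made concrete with a small Python sketch that tests whether a set of taxa corresponds exactly to one clade of a rooted tree. The tree below is a simplified stand-in built only from relationships stated in the text, not a reproduction of Fig 4: branching orders that the text does not give are left as unresolved polytomies, and the function names are assumptions.

```python
def leaves(node) -> frozenset:
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the tip set of every node in the tree, tips included."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa) -> bool:
    """True if some clade of the rooted tree contains exactly these taxa."""
    return frozenset(taxa) in set(clades(tree))


# Simplified rooted tree; polytomies mark branching orders not given in the text.
tree = (
    "Chaetodon auripes",
    (
        "Abudefduf vaigiensis",
        (
            ("A. ocellaris", "A. percula", "P. biaculeatus"),
            ("A. clarkii", ("A. frenatus", "A. bicinctus", "A. perideraion", "A. polymnus")),
        ),
    ),
)

genus_amphiprion = {
    "A. ocellaris", "A. percula", "A. clarkii", "A. frenatus",
    "A. bicinctus", "A. perideraion", "A. polymnus",
}
amphiprioninae = genus_amphiprion | {"P. biaculeatus"}

subfamily_ok = is_monophyletic(tree, amphiprioninae)
genus_ok = is_monophyletic(tree, genus_amphiprion)
print(subfamily_ok, genus_ok)
```

In this stand-in tree the subfamily forms a single clade, whereas the genus Amphiprion does not, because P. biaculeatus is nested within it.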

Conclusions
Eight mitogenomes of anemonefishes were compared, including six newly sequenced in this study. The eight mitogenomes encoded 13 protein-coding genes, two rRNA genes, 22 tRNA genes and two main non-coding regions. Among the 13 protein-coding genes, ND5 was 1,866 bp long in A. ocellaris (AP006017) and A. percula (KJ174497), three nucleotides shorter than in the other six anemonefishes. Both forms of ND5, however, could be translated into amino acids successfully. Only four mitogenomes (A. clarkii, A. frenatus, A. polymnus and P. biaculeatus) had tandem repeats in the D-loop; the repeats were located downstream of the CSB rather than upstream and were repeated in a simple way. When the 13 protein-coding genes were applied to test the suggested taxonomic reorganization of the anemonefishes, the results supported the monophyly of the subfamily Amphiprioninae and identified the percula complex together with P. biaculeatus as the ancestral taxon of the anemonefishes. The tree topologies of ND1, COIII, ND4, Cytb, Cytb+12S rRNA, Cytb+COI and Cytb+COI+12S rRNA were similar to that of the 13 protein-coding genes; we therefore inferred that Cytb may be a suitable single mt gene for phylogenetic analysis of anemonefishes. In addition to offering insight into the evolution of the anemonefishes, the results of this work provide important molecular resources for further studies of the identification, conservation genetics and phylogenetic evolution of anemonefishes.