Genetic diversity of Entamoeba: Novel ribosomal lineages from cockroaches

Our current taxonomic perspective on Entamoeba is largely based on small-subunit ribosomal RNA genes (SSU rDNA) from Entamoeba species identified in vertebrate hosts, with minor exceptions such as E. moshkovskii from sewage water and E. marina from marine sediment. Other Entamoeba species have also been morphologically identified and described from non-vertebrate species such as insects; however, their genetic diversity remains unknown. To further explore the diversity of the genus, we investigated Entamoeba spp. in the intestines of three cockroach species: Periplaneta americana, Blaptica dubia, and Gromphadorhina oblongonota. We obtained 134 Entamoeba SSU rDNA sequences from 186 cockroaches by direct nested PCR using DNA extracts of cockroach intestines, followed by careful BLASTn screening and phylogenetic analyses. All the sequences identified in this study were distinct from those reported from known Entamoeba species and were considered novel Entamoeba ribosomal lineages. Furthermore, they were positioned at the base of the clade of known Entamoeba species and displayed a remarkable degree of genetic diversity, comprising nine major groups in the three cockroach species. This is the first report of the diversity of SSU rDNA sequences from Entamoeba in non-vertebrate host species, and it should help in understanding the genetic diversity of the genus Entamoeba.


Introduction
The genus Entamoeba is an important taxonomic group consisting of parasitic species that reside in a variety of vertebrate and invertebrate hosts, and potentially free-living species that have been isolated from the environment. E. histolytica is one of the major causes of diarrheal diseases in tropical regions; diarrheal diseases ranked fifth in disability-adjusted life years (DALYs) in 2015 [1]. Since other Entamoeba species generally lack virulence in humans, comparative biology, biochemistry, and genetics have been applied to the genus Entamoeba mainly in an attempt to discover virulence-related genes and to understand the evolution of Entamoeba pathogenicity in humans.
In order to better understand the genetic diversity of Entamoeba inhabiting invertebrate organisms, we investigated Entamoeba from cockroaches. Here we report SSU rDNA-based genetic diversity of Entamoeba from three cockroach species: one common house cockroach, Periplaneta americana, and two forest cockroaches, Blaptica dubia (orange-spotted cockroach, Guyana spotted cockroach, or Argentinian wood cockroach) and Gromphadorhina oblongonota (Madagascar forest hissing cockroach).

Cockroach collection and isolation of intestinal contents
Three cockroach species were used in this study: Periplaneta americana (American cockroach), Blaptica dubia (Argentinian forest cockroach, Dubia cockroach) and Gromphadorhina oblongonota (Madagascar hissing cockroach). P. americana were collected by manual capture from an apartment in Bangplee, located in an urban area of Samutprakarn, Thailand (13˚36' 0" N, 100˚36' 0" E), on April 21, 2016 and July 28, 2016. No specific permissions were required for field studies. The field studies did not involve endangered or protected species. Individual cockroaches were identified as P. americana by the yellowish circular marking on the prothorax and were collected in two sampling periods. B. dubia and G. oblongonota (3-5 cm in size) were purchased from a pet shop in Tokushima, Japan (34˚4' 0" N, 134˚34' 0" E), where they were domestically bred. The cockroaches were dissected in order to isolate and excise their intestines. For the first batch of P. americana collected (Pa_01 to Pa_30), intestines isolated from 4 individual cockroaches were combined and then ground with a sterile mortar and pestle in 2 ml of sterile normal saline; that is, sample Pa_01 contained the intestines of 4 cockroaches. For P. americana collected in the second period, B. dubia and G. oblongonota (Pa_31 to Pa_80, Bd_01 to Bd_22 and Go_01 to Go_14, respectively), the intestines were not combined and were ground separately.

DNA extraction and amplification of SSU rDNA derived from Entamoeba
DNA was extracted from approximately 500 μL of the ground intestine(s) using the DNeasy Blood and Tissue kit (QIAGEN, Tokyo, Japan). A fragment corresponding to Entamoeba SSU rDNA was amplified by nested PCR using DNA extracted from the isolated cockroach intestine(s). In the first round of PCR, an approximately 1,950 bp fragment corresponding to SSU rDNA was amplified using eukaryotic universal oligonucleotide primers specific for SSU rDNA (EukA: 5'-AACCTGGTTGATCCTGCCAGT-3' and EukB: 5'-TGATCCTTCTGCAGGTTCACCTAC-3'; [16]) with Tks Gflex DNA Polymerase (TaKaRa, Shiga, Japan). PCR conditions consisted of 30 cycles of denaturation at 94˚C for 22 seconds, annealing at 42˚C for 1 minute and extension at 72˚C for 1 minute. One μL of the PCR product was used as the template for the second round of PCR. In the second round, an approximately 1,900 bp fragment of Entamoeba SSU rDNA was selectively amplified using oligonucleotide primers specific for Entamoeba SSU rDNA (01F: 5'-GCCAGTATTATATGCTGA-3' and 01R: 5'-CCTTGTTACGACTTCTCCTT-3'). PCR conditions consisted of 30 cycles of denaturation at 94˚C for 22 seconds, annealing at 52˚C for 1 minute and extension at 72˚C for 1 minute.
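As a quick sanity check on the primers and cycling programme quoted above, the following Python sketch computes primer length, a rough melting temperature and the total time spent in the three cycling steps. It is an illustration only, not part of the authors' protocol: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse approximation intended for short oligonucleotides, the function names are hypothetical, and the time estimate ignores ramping and any initial or final hold steps, which the text does not specify.

```python
# Primer sequences as quoted in the text (5'->3'), written without internal spaces.
PRIMERS = {
    "EukA": "AACCTGGTTGATCCTGCCAGT",
    "EukB": "TGATCCTTCTGCAGGTTCACCTAC",
    "01F": "GCCAGTATTATATGCTGA",
    "01R": "CCTTGTTACGACTTCTCCTT",
}


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def cycling_seconds(cycles, denature_s, anneal_s, extend_s):
    """Time spent in the three cycling steps only (no ramping, no hold steps)."""
    return cycles * (denature_s + anneal_s + extend_s)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt, Wallace Tm ~", wallace_tm(seq), "C")
    # 30 cycles of 22 s + 1 min + 1 min, as stated for both PCR rounds.
    print("cycling time (s):", cycling_seconds(30, 22, 60, 60))
```

The reverse complement helper is included because reverse primers (EukB, 01R) are written 5'->3' and must be reverse-complemented before they can be located on the sense strand of a reference SSU rDNA sequence.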

Sequencing and screening of SSU rDNA of Entamoeba from cockroaches
The amplicons obtained from the second round of PCR were cloned into pCR™-Blunt II-TOPO (Thermo Fisher Scientific, Waltham, Massachusetts, USA), and the plasmids were transformed into competent Escherichia coli DH5α cells. Five to twenty colonies were examined by PCR using the universal oligonucleotide primers M13F/R (5'-GTAAAACGACGGCCAGTG-3' and 5'-CAGGAAACAGCTATGACCATG-3') to confirm that an insert was present in the plasmids from the bacterial colonies. After purification of the plasmids, the insert of each plasmid was fully sequenced in both directions with the M13F, M13R, M13Mid1 (5'-TACTTTGAATAAATACGAGTGTT-3'), and M13Mid2 (5'-TCCCGTGTTGAGTCAAATTAA-3') primers. The latter two primers correspond to the 18S rRNA gene. The sequences were examined by BLASTn [17] search against the non-redundant (nr) nucleotide database of NCBI with default parameters to verify that their highest similarity was only to Entamoeba. When needed, phylogenetic analysis (described below) was also used. Sequence reads were assembled using CLC Genomics Workbench Version 8.5.1 (Qiagen Aarhus A/S, Aarhus C, Denmark).
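The screening rule described here (keep a clone only if its best BLASTn matches are Entamoeba) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: the input is assumed to be a list of (query, organism, bit score) tuples already parsed from BLAST output, the function name `screen_queries` is hypothetical, and ties at the top score are treated conservatively, so every top-scoring hit must belong to the genus.

```python
from collections import defaultdict


def screen_queries(hits, genus="Entamoeba"):
    """Return {query_id: True/False}, True when every top-scoring hit is in `genus`.

    `hits` is an iterable of (query_id, organism_name, bitscore) tuples.
    This record layout is an assumption made for the example.
    """
    by_query = defaultdict(list)
    for query, organism, bitscore in hits:
        by_query[query].append((bitscore, organism))

    verdict = {}
    for query, scored in by_query.items():
        top_score = max(score for score, _ in scored)
        top_organisms = [org for score, org in scored if score == top_score]
        verdict[query] = all(
            org == genus or org.startswith(genus + " ") for org in top_organisms
        )
    return verdict


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    example_hits = [
        ("clone_1", "Entamoeba histolytica", 900.0),
        ("clone_1", "Blattella germanica", 300.0),
        ("clone_2", "Periplaneta americana", 1200.0),
        ("clone_2", "Entamoeba coli", 800.0),
    ]
    print(screen_queries(example_hits))
```

Clones that fail this kind of filter, or that give ambiguous results, are the cases for which the text falls back on phylogenetic analysis.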

Molecular phylogenetic analysis
Molecular phylogenetic analysis was performed to determine the relationship of cockroach-derived Entamoeba SSU rDNA with other eukaryotic organisms, including other known Entamoeba species and Archamoebae. Analyses were performed as follows: 1) sequences were aligned by MAFFT v7.187 [18]; 2) aligned nucleotide sites were selected by Gblocks [19] and manual inspection using SeaView 4.6 [20]; 3) a maximum-likelihood (ML) tree was inferred by RAxML 8.1.5 [21] with the General Time-Reversible (GTR) + gamma substitution model. Statistical confidence of ML trees was evaluated with bootstrap proportions from 100 or 1,000 replicates for screening and detailed analyses, respectively. In the screening, a sequence that was monophyletic with known Entamoeba species was considered to belong to the genus Entamoeba.
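The three steps above map onto three command-line tools. The sketch below only assembles plausible command lines in Python and does not run anything. The exact options the authors used are not given in the text, so everything beyond the tool names, the GTR + gamma model and the replicate counts is an assumption: `--auto` for MAFFT, `-t=d` (DNA) for Gblocks, and RAxML's rapid bootstrap mode (`-f a`) with arbitrary random seeds. File names and the helper name `build_commands` are hypothetical, and a format conversion between Gblocks output and RAxML input may be needed in practice.

```python
# Bootstrap replicate counts stated in the text for the two analysis modes.
REPLICATES = {"screening": 100, "detailed": 1000}


def build_commands(fasta_in, run_name, mode="detailed"):
    """Return (mafft_cmd, gblocks_cmd, raxml_cmd) as argument lists.

    All option choices are illustrative assumptions, not the authors' settings.
    """
    aligned = run_name + ".aln.fasta"   # MAFFT writes to stdout; redirect to this file
    trimmed = aligned + "-gb"           # Gblocks appends "-gb" to its input name

    mafft_cmd = ["mafft", "--auto", fasta_in]
    gblocks_cmd = ["Gblocks", aligned, "-t=d"]
    raxml_cmd = [
        "raxmlHPC",
        "-f", "a",                      # rapid bootstrap + best ML tree (assumed mode)
        "-m", "GTRGAMMA",               # GTR + gamma, as stated in the text
        "-p", "12345",                  # parsimony seed (arbitrary)
        "-x", "12345",                  # bootstrap seed (arbitrary)
        "-#", str(REPLICATES[mode]),
        "-s", trimmed,
        "-n", run_name,
    ]
    return mafft_cmd, gblocks_cmd, raxml_cmd


if __name__ == "__main__":
    for cmd in build_commands("cockroach_ssu.fasta", "entamoeba", mode="screening"):
        print(" ".join(cmd))
```

Keeping the replicate count as a single parameter mirrors the two-tier design in the text: a cheap 100-replicate pass to decide whether a sequence belongs to Entamoeba, and 1,000 replicates for the trees that are reported.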

Results and discussion
A total of 134 Entamoeba SSU rDNA sequences were obtained from 186 cockroaches
The workflow of acquisition and screening of Entamoeba SSU rDNA genes from cockroaches is summarized in Fig 1. In brief, we isolated and purified DNA from the intestines of 186 cockroaches (150 P. americana, 22 B. dubia, and 14 G. oblongonota), and SSU rDNA was amplified by nested PCR. Nested PCR was successful for 54, 16, and 8 samples, respectively. The plasmids that contained nested PCR products (256, 50, and 36 from each cockroach group) were obtained and sequenced. Subsequently, BLASTn search and phylogenetic analyses were performed to exclude non-Entamoeba SSU rDNA sequences. Finally, 77, 39, and 18 Entamoeba SSU rDNA sequences were subjected to further analyses (Table 1).
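The per-species counts in this workflow can be cross-checked against the reported totals with a short tally. The Python sketch below uses only the numbers quoted in the paragraph above; the dictionary layout and function names are illustrative choices, and the "retained fraction" (Entamoeba sequences divided by sequenced plasmids) is a derived quantity computed here for illustration, not a figure reported in the text.

```python
# Counts quoted in the workflow summary, per cockroach species.
COUNTS = {
    "P. americana": {"cockroaches": 150, "pcr_positive": 54, "plasmids": 256, "entamoeba_seqs": 77},
    "B. dubia": {"cockroaches": 22, "pcr_positive": 16, "plasmids": 50, "entamoeba_seqs": 39},
    "G. oblongonota": {"cockroaches": 14, "pcr_positive": 8, "plasmids": 36, "entamoeba_seqs": 18},
}


def total(field):
    """Sum one field across the three species."""
    return sum(row[field] for row in COUNTS.values())


def retained_fraction(species):
    """Fraction of sequenced plasmids that passed the Entamoeba screening."""
    row = COUNTS[species]
    return row["entamoeba_seqs"] / row["plasmids"]


if __name__ == "__main__":
    print("cockroaches:", total("cockroaches"))
    print("Entamoeba sequences:", total("entamoeba_seqs"))
    for species in COUNTS:
        print(species, round(retained_fraction(species), 2))
```

The totals agree with the 186 cockroaches and 134 sequences given in the heading.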
Entamoeba SSU rDNA sequences from cockroaches are extremely heterogeneous, divergent from the reported sequences of known Entamoeba species, and composed of nine major groups
All Entamoeba SSU rDNA sequences from cockroaches are divergent from the reported sequences from known Entamoeba species. An unrooted phylogenetic tree was inferred by the maximum-likelihood (ML) method using 134 cockroach-derived Entamoeba SSU rDNA sequences (Fig 2). The 134 sequences were segregated into 9 groups (A-I), each of which was supported by good bootstrap values (> 70%), with exceptions for branching at A-B/C-I (47%), F/G (66%), H/I (43%) and F-G/H-I (33%).

Phylogenetic position of Entamoeba SSU rDNA sequences from cockroaches in eukaryotes
To examine the phylogenetic position of these cockroach-derived Entamoeba sequences, the cladogram was reconstructed using an SSU rDNA dataset composed of major eukaryotic supergroups and eight representative sequences from Groups A to I of cockroach-derived Entamoeba (Fig 3; marked with green circles in Fig 2; groups D and G were omitted because of their high evolutionary rates). The monophyly of the clade comprising cockroach-derived Entamoeba (Pa_61-11, Bd_18-6, Pa_49-13, Pa_33-4, Bd_18-8, Go_10-1, Pa_27-2, and Bd_21-3) and other Entamoeba species was strongly supported (Fig 3; black arrow). This clade is nested within the node that contains other Archamoebae (Pelomyxa belevskii, Rhizomastix libera, Mastigamoeba balamuthi and Endolimax nana) and Dictyostelium discoideum, with high bootstrap support (Fig 3; black arrow). Although the monophyly of Amoebozoa was not supported by the bootstrap value, these data are consistent with the premise that the newly identified Entamoeba sequences are from novel Entamoeba ribosomal lineages.

Polymorphism of Entamoeba SSU rDNA sequences from cockroaches
As shown above, cockroach-derived Entamoeba SSU rDNA sequences were categorized into 9 groups (Fig 2). Groups A, B, D, E, H, and I were independent and well-separated clades. Group C represents the largest group of cockroach-derived Entamoeba and consists of 65 sequences (49% of all cockroach-derived Entamoeba sequences) from P. americana (28/77), B. dubia (20/24) and G. oblongonota (17/18). This group can be divided into three sub-groups: sub-group 1 consists of 20 sequences from B. dubia and three sequences from G. oblongonota, sub-group 2 consists of 28 sequences derived only from P. americana, and sub-group 3 consists of 14 sequences derived only from G. oblongonota (Fig 4). Note that the monophyly of sub-groups 1 and 3 is well supported by the highest bootstrap proportion, while sub-group 2 is not monophyletic and may consist of multiple divergent sub-groups.
Groups F and G were defined by a separate analysis using amoebae only from P. americana. In the tree excluding amoebae from G. oblongonota and B. dubia, each of groups F and G formed an independent clade with a high statistical support value (S1 Fig). In the tree including amoebae from G. oblongonota and B. dubia, by contrast, the monophyly of group F was not recovered; instead, amoebae of groups F and G were shown to be monophyletic with a weak statistical support value (66%). Since the branches leading to the amoebae of groups F and G are long, it is possible that these amoebae were attracted to each other in the tree in Fig 2 by a long branch attraction artifact.

The genetic diversity of cockroach-derived Entamoeba among all Entamoeba and Archamoebae
To obtain better resolution of all Entamoeba, including cockroach-derived amoebae, and Archamoebae species, the ML tree of the representative taxa was inferred (Fig 5). In the resulting tree, the monophyly of Entamoeba, comprising representative cockroach-derived Entamoeba and 9 known Entamoeba species (E. histolytica, E. moshkovskii, E. terrapinae, E. equi, E. gingivalis, E. marina, E. muris, E. coli, and E. polecki), is strongly supported by the bootstrap value (97%; gray arrow head). The monophyly of known Entamoeba is well supported (84%; magenta arrow head) and their inter-specific relationships are also unequivocally reconstructed (66% to 100% bootstrap values). The cockroach-derived Entamoeba form three major independent clades: Group A, Group B, and the rest, Groups C to I. All three clades are positioned basal to known Entamoeba species. Group A consists of the most basal ribosomal lineages of cockroach-derived Entamoeba, and the levels of observed divergence among them were relatively lower than those of other groups. Group B, on the other hand, comprises members isolated exclusively from P. americana and is a sister group to known vertebrate-derived Entamoeba, although its statistical support was weak (63%; green arrow head). Groups C to I form the single largest statistically supported clade, which is sister to the clade composed of group B and known Entamoeba (84%; cyan arrow head).

Polymorphism of Entamoeba identified in a single cockroach and presence of cockroach species-specific and common Entamoeba groups
For all the samples except for the first set of P. americana specimens (i.e., Pa_02 to Pa_27), single cockroaches were analyzed without pooling. Multiple groups were occasionally identified in a single P. americana (Pa_33 to Pa_80) sample (Table 2). The highest number of Entamoeba groups found in a single cockroach was 3 (Pa_49 and Pa_62), while 79% (22 of 28) of P. americana were found to harbor only a single Entamoeba group. Among B. dubia, 31% (5 of 16 cockroaches) had two Entamoeba groups (Table 3). In contrast, no G. oblongonota harboring multiple groups was found, although the sample size was small (8 cockroaches and 18 sequences; Table 4). Group C was the most common and most widely shared group, discovered in all three cockroach species. The 23 sequences constituting sub-group 1 of group C were mutually very similar (> 99.5% mutual positional identity; Table 5). In other words, almost identical Entamoeba sequences that belong to group C sub-group 1 were discovered from both forest cockroaches (B. dubia and G. oblongonota), suggesting conservation of the genetic traits of this sub-group despite distinct host species and geographic origins.

Discovery of novel Entamoeba ribosomal lineages in cockroaches expands our understanding of genetic diversity of Entamoeba
We have demonstrated that the genetic diversity of Entamoeba derived from three cockroach species exceeds that described in previous reports for species found in vertebrates, as well as for the potentially free-living species (E. moshkovskii and E. marina). Despite our repeated attempts, we were unable to cultivate cockroach-derived Entamoeba and thus could not obtain a sufficient amount of genomic DNA or RNA for whole-genome and transcriptome analyses. Hence, the genome of cockroach-derived Entamoeba remains to be elucidated.

Fig 1. Flow diagram depicting experimental procedures and the number of analyzed samples. The numbers in rectangles indicate those of samples from P. americana (first sampling), P. americana (second sampling), B. dubia and G. oblongonota, respectively. For samples from the first sampling of P. americana, the intestines from 4 cockroaches were pooled. https://doi.org/10.1371/journal.pone.0185233.g001

Fig 2. SSU rDNA-based phylogenetic tree of 134 Entamoeba sequences from cockroaches. SSU rDNA sequences were aligned using MAFFT v7.187. Unambiguously aligned sequences composed of 1,023 nucleotides were selected by Gblocks and manual inspection. Maximum-likelihood (ML) tree was inferred by RAxML 8.1.17 using GTRGAMMA model. The number of bootstrap pseudoreplicate trees was 1,000. ML tree was visualized using FigTree 1.4.0 and Keynote 6.6.2. Bootstrap values for major nodes are shown on each node. Nine groups (A to I) were shown to be monophyletic with high bootstrap support values. Representative sequences of each group used in Fig 3 or Fig 4 are indicated by green circles or magenta circles, respectively. https://doi.org/10.1371/journal.pone.0185233.g002

Fig 4. Phylogenetic tree of SSU rDNA of Group C sequences of cockroach-derived Entamoeba. SSU rDNA sequences were aligned using MAFFT v7.187. Unambiguously aligned sequences composed of 1,224 nucleotides were selected by Gblocks and manual inspection. Maximum-likelihood (ML) tree was inferred by RAxML 8.1.17 using GTRGAMMA model. The number of bootstrap pseudoreplicate trees was 1,000. ML tree was visualized using FigTree 1.4.0 and Keynote 6.6.2. Bootstrap values for major nodes are shown on each node. https://doi.org/10.1371/journal.pone.0185233.g004

Fig 5. Phylogenetic tree of SSU rDNA of representative cockroach-derived Entamoeba ribosomal lineages and other Archamoebae species. SSU rDNA sequences were aligned using MAFFT v7.187. Well-aligned 1,224 nucleotide positions were selected by Gblocks and manual inspection. Maximum-likelihood (ML) tree was inferred by RAxML 8.1.17 using GTRGAMMA model. The number of bootstrap pseudoreplicate trees was 1,000. ML tree was visualized using FigTree 1.4.0 and Keynote 6.6.2. Bootstrap values (over 60%) are shown on each branch. Monophyly of Entamoeba is strongly supported with a high bootstrap value (97%; gray arrow head). Commencing with Pa_27-2 and Bd_21-3, all cockroach-derived Entamoeba are positioned at the base of the Entamoeba clade.

S6 Fig. Multiple alignment using full-length sequences of group C. SSU rDNA sequences were aligned using MAFFT v7.187. The whole alignment was visualized by SeaView 4. The alignment indicates the exact positions of well-aligned sites and variant sites. (PDF)