Fig 1.

Phylogenetic tree of the Ehrlichia and Anaplasma genera shows three different clades.

A maximum likelihood tree of four Ehrlichia species (E. chaffeensis str. Arkansas, E. canis str. Jake, E. muris AS145, E. ruminantium str. Gardel), three Anaplasma species (A. phagocytophilum str. HZ, A. marginale str. Florida, A. centrale str. Israel) and the Wolbachia endosymbiont of D. melanogaster (outgroup) was reconstructed from a concatenated nucleotide alignment of the genes encoding the proteins shared by all species (core genome), with 100 bootstrap resamplings. The following are represented for each bacterium: genome size (red), number of ORFs (blue), number of predicted type IV effectors (green), number of unique predicted effectors (yellow) and known major hosts (black symbols).

Fig 2.

Comparison of the pools of predicted Type IV effectors among Anaplasmataceae species revealed strong conservation in Ehrlichia spp.

The colour gradient represents the similarity between sets of effectors (pale colours indicate high similarity). The species are ordered according to the phylogenetic tree (Fig 1). Clusters defined on the basis of similar effector repertoires are marked on the left by black lines.

Fig 3.

Network analysis of Anaplasmataceae species-specific pT4Es and host range suggests the existence of host-specific pT4Es.

This network of species-specific pT4Es was drawn using the hive plot algorithm, which is a rational visualization method for drawing networks based on their structural properties. Nodes are mapped to and positioned on radially distributed linear axes, and edges are drawn as curved links. Ehrlichia species-specific pT4Es are represented by nodes on the a1 axis of the hive plot. The size of each node is proportional to the number of species-specific pT4Es for a given Ehrlichia species, which is colour coded as shown in the upper left rectangle: E. chaffeensis str. Arkansas (blue), E. canis str. Jake (orange), E. ruminantium str. Gardel (green) and E. muris AS145 (red). The purple node is the combined subset of pT4Es specific to E. chaffeensis str. Arkansas and E. muris AS145. Anaplasma species-specific pT4Es are represented by nodes on the a3 axis, whose size and shade of grey depend on the Anaplasma species. The different hosts of these seven Anaplasmataceae bacteria are represented by grey nodes on the a2 axis. Curved links a1-a2 and a3-a2 show the putative host specificity of each bacterium. Links between a1 and a3 represent shared host-specific effectors. The colour of each link matches that of the node from which it emerges.

Fig 4.

Mapping of Ehrlichia spp. predicted Type IV effectors (pT4Es) and their homologies highlights the genomic plasticity of this genus.

Genomes of E. chaffeensis str. Arkansas (blue), E. canis str. Jake (orange), E. ruminantium str. Gardel (green) and E. muris AS145 (red) are represented in the outer circle of this Circos graph. The second and third circles represent the genes encoding the pT4Es (sense and antisense genes, respectively). The genes are colour coded according to the genome in which they originated, and species-specific genes are in black. Links drawn with a spectral colour scheme show homologies between pT4Es of the four genomes. These homologies are also represented by squares of the corresponding genome colour.

Fig 5.

The distribution of Ehrlichia type IV effectomes according to local gene density shows an enrichment of pT4Es in gene-sparse regions.

Distribution of E. canis str. Jake genes according to the length of their flanking intergenic regions (FIRs). All E. canis genes were sorted into two-dimensional bins according to the length of their 5′ (y-axis) and 3′ (x-axis) FIRs. The number of genes in each bin is represented by a colour-coded density graph. Genes whose FIRs were both longer than the median FIR length were considered gene-sparse region (GSR) genes. Genes whose FIRs were both shorter than the median were considered gene-dense region (GDR) genes. In-between region (IBR) genes are genes with a long 5′ FIR and a short 3′ FIR, or the inverse. For E. canis, the median value is 225 bp for 5′ FIRs and 304 bp for 3′ FIRs. The dashed line showing the median length of FIRs delimits the genes in GSRs, GDRs and IBRs. Candidate effectors predicted using the S4TE 2.0 algorithm were plotted on this distribution according to their own 3′ and 5′ FIRs. A colour was assigned to each of the following three groups: red to GDRs, orange to IBRs, and blue to GSRs. Species-specific pT4Es are represented by a dot circled in black. In the top right corner, a Circos graph shows the distribution of E. canis str. Jake putative effectors along the genome. The outermost and second circles (in grey) represent E. canis antisense and sense genes, respectively. The third and innermost circles represent pT4Es. The black, red, orange and blue colours of the putative T4 effectors correspond to species-specific effectors and to effectors located in GDRs, IBRs and GSRs, respectively.
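
The GSR/GDR/IBR rule described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `classify_by_fir`, the input layout (a dict mapping gene name to its 5′ and 3′ FIR lengths) and the treatment of genes whose FIR equals the median (placed in IBR here) are all assumptions.

```python
from statistics import median

def classify_by_fir(genes, med5=None, med3=None):
    """Label each gene as GSR, GDR or IBR from its flanking intergenic regions.

    genes: dict mapping gene name -> (5' FIR length, 3' FIR length) in bp.
    med5, med3: optional median FIR lengths; computed from `genes` if omitted.
    """
    if med5 is None:
        med5 = median(f5 for f5, _ in genes.values())
    if med3 is None:
        med3 = median(f3 for _, f3 in genes.values())

    labels = {}
    for name, (f5, f3) in genes.items():
        if f5 > med5 and f3 > med3:
            labels[name] = "GSR"   # both FIRs longer than the median
        elif f5 < med5 and f3 < med3:
            labels[name] = "GDR"   # both FIRs shorter than the median
        else:
            labels[name] = "IBR"   # mixed case; ties also land here (assumption)
    return labels
```

Passing `med5=225` and `med3=304` reuses the E. canis medians quoted in the caption instead of recomputing them from the input.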

Fig 6.

The distribution of Ehrlichia type IV effectomes according to local gene density and ΔGC content shows an enrichment of pT4Es in high GC content regions.

Distribution of E. canis str. Jake genes according to the length of their flanking intergenic regions (FIRs). All E. canis genes were sorted into two-dimensional bins according to the length of their 5′ (y-axis) and 3′ (x-axis) FIRs. For each gene, the ΔGC content was calculated as the GC content of the gene minus the average GC content of all the genes. The mean ΔGC of the genes in each bin is represented by a colour-coded density graph. GSRs, GDRs and IBRs were defined as described for the analysis of local gene density. A colour was assigned to each of the following three groups: red to GDRs, orange to IBRs, and blue to GSRs. Species-specific pT4Es are represented by a dot circled in black. In the top right corner, a density graph shows the density of pT4Es according to ΔGC content (red line) and the density of other genes (black line).
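
As a rough illustration of the ΔGC calculation, the sketch below computes the GC fraction of each gene and subtracts the mean over all genes. It assumes the average is an unweighted mean of per-gene GC fractions (the caption does not say whether it is length-weighted); `gc_content` and `delta_gc` are hypothetical helper names.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def delta_gc(gene_seqs):
    """Return gene name -> (gene GC fraction - mean GC fraction of all genes).

    gene_seqs: dict mapping gene name -> nucleotide sequence.
    The mean is unweighted across genes, which is an assumption.
    """
    gc = {name: gc_content(seq) for name, seq in gene_seqs.items()}
    mean_gc = sum(gc.values()) / len(gc)
    return {name: value - mean_gc for name, value in gc.items()}
```

A positive value then marks a gene that is more GC-rich than the average gene, and a negative value a gene that is more AT-rich.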

Fig 7.

Protein architecture network of Ehrlichia pT4Es shows a large number of interactions between protein domains.

This network of pT4E protein architectures was drawn using the hive plot algorithm, which is a rational visualization method for drawing networks based on their structural properties. Nodes are mapped to and positioned on radially distributed linear axes, and edges are drawn as curved links. Each node represents a specific domain or a list of specific domains (see table) found in Ehrlichia T4Es predicted by S4TE 2.0. Links between domains represent the association of these domains in the architecture of Ehrlichia pT4Es. Links between NLS, coiled-coil, EPIYA and E block domains (the most abundant protein domains in Ehrlichia) and other nodes are red, blue, green and orange, respectively. Other links are pale grey. The table of domains identifies the protein domains for each node (numbered) and their occurrences in Ehrlichia spp. predicted T4 effectomes. All the domains are ranked on the three axes (a1, a2, a3) according to the number of their links. Let X be the number of links between one domain and the others: domains with X < 15 are placed on a1, domains with 15 ≤ X ≤ 40 on a2, and domains with X > 40 on a3, the last group defining the most abundant domains. The split axes a1’, a2’ and a3’ were drawn to represent the links between domains located on the same axis.
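
The axis assignment can be written as a small function of the link count X. The sketch below is illustrative only: `link_counts` and `assign_axis` are hypothetical names, the edge list format is an assumption, and the thresholds are exposed as parameters so that other cut-offs can be used.

```python
from collections import Counter

def link_counts(edges):
    """Count the links attached to each domain in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

def assign_axis(x, low=15, high=40):
    """Map a link count X to a hive plot axis.

    X < low -> a1; low <= X <= high -> a2; X > high -> a3.
    """
    if x < low:
        return "a1"
    if x <= high:
        return "a2"
    return "a3"
```

For example, `assign_axis(link_counts(edges)["NLS"])` would return the axis for the NLS node of a given edge list.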

Fig 8.

Protein architecture network of putative effectors for rarely occurring domains.

This network of pT4E protein architectures was drawn using the hive plot algorithm, a rational visualization method based only on the structural properties of the network. Each node represents a specific domain or a list of specific domains (see table) found in Ehrlichia T4Es predicted by S4TE 2.0. Links between domains represent the association between these domains in the architecture of Ehrlichia pT4Es. Nodes representing the most abundant domains presented in Fig 7 (NLS, coiled-coil, EPIYA and E block domains) and their corresponding links to other nodes are not included in this graph, in order to highlight the less abundant protein domains in Ehrlichia spp. predicted T4 effectomes. The table of domains identifies the protein domains for each node (numbered) and their occurrences. All the domains are ranked on the three axes (a1, a2, a3) according to the number of their links. Let X be the number of links between one domain and the others: domains with X < 3 are placed on axis a1, domains with 3 ≤ X ≤ 5 on axis a2, and domains with X > 5 on axis a3.

Fig 9.

Ankyrin-containing predicted type IV effectors show diverse architectures and inter- and intragenic rearrangements in Ehrlichia spp.

A. Each Ehrlichia spp. pT4E whose architecture includes an ankyrin domain is represented. B. Relative time phylogenetic tree built from 11 nucleotide sequences of ankyrin-containing predicted type IV effectors (pT4Es). ECH_0653 was used as the outgroup. The number in front of each node represents the relative theoretical time from the putative common ancestor of two branches. Evolutionary analyses were conducted in MEGA7 [34]. C. Dot plot of regions of similarity between ECH-0684 (x-axis) and ERGA_CDS_03830 (y-axis). This graph was constructed with the dotmatcher software included in the EMBOSS package, in which all positions from the first input sequence are compared with all positions from the second input sequence using a specified substitution matrix, here with a window size of 50 and a threshold of 50. The two sequences are the axes of the rectangular dot plot. Wherever there is "similarity" between a position from each sequence, a dot is plotted.
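
The windowed comparison behind a dot plot can be illustrated with the simplified sketch below. It is not a reimplementation of dotmatcher: it scores each window with a plain match/mismatch count rather than a substitution matrix, and reporting a dot when the score is greater than or equal to the threshold is an assumption. The function name `dot_plot` and its parameters are hypothetical.

```python
def dot_plot(seq_x, seq_y, window=50, threshold=50, match=1, mismatch=0):
    """Return (i, j) start positions where two windows are similar enough.

    Every window of seq_x is compared with every window of seq_y.
    The window score is a simple match/mismatch sum (a simplification;
    dotmatcher itself uses a substitution matrix).
    """
    dots = []
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            score = sum(
                match if seq_x[i + k] == seq_y[j + k] else mismatch
                for k in range(window)
            )
            if score >= threshold:
                dots.append((i, j))  # x-axis position, y-axis position
    return dots
```

Runs of dots along a diagonal correspond to stretches of similarity, and parallel diagonals indicate repeated or rearranged segments. The nested loops make this quadratic in sequence length, which is acceptable for a pair of genes but not for whole genomes.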

Fig 10.

The strong domain diversity of HATPase_c-containing putative type IV effectors defines three conserved families of effectors in Ehrlichia spp.

A. Each protein in the Ehrlichia spp. type IV effectome whose architecture includes a HATPase_c domain is represented. B. Representation of the genomic context surrounding pT4E ERGA_CDS_03390, between positions 557733 and 563669 of the E. ruminantium str. Gardel genome. Pale grey arrows represent sense genes and dark grey arrows represent antisense genes. The GC content of this region of the genome was calculated using 200 bp windows and is represented by the black curve. The average GC content of this gene cluster (25.75% GC) is indicated by the horizontal red line.
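
A windowed GC profile of the kind plotted in panel B can be sketched as follows. The caption gives only the window size (200 bp); whether the windows overlap is not stated, so the non-overlapping default (`step=200`) is an assumption, and `windowed_gc` is a hypothetical helper name.

```python
def windowed_gc(seq, window=200, step=200):
    """Return (window start, GC percentage) pairs along a nucleotide sequence.

    Only full-length windows are reported; a trailing partial window is dropped.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile
```

Setting `step` smaller than `window` gives a sliding-window profile, which produces a smoother curve at the cost of correlated neighbouring values.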
