Figure 1.
Phylogeny of strain HS and related Sodalis-allied endosymbionts and free-living bacteria based on maximum likelihood analyses of a 1.46 kb fragment of 16S rRNA and a 1.68 kb fragment of groEL.
Insect endosymbionts that lack formal nomenclature are designated by the prefix “E”, followed by the name of their insect host. Numbers adjacent to nodes indicate maximum likelihood bootstrap values, shown only for nodes with bootstrap support >70%.
Figure 2.
Alignment between strain HS contigs (top) and chromosomes of SOPE (left) and S. glossinidius (right).
The draft strain HS contigs are depicted in an arbitrary color scheme (outer top ring). Contigs sharing <5 kb synteny with either the SOPE or S. glossinidius genome are uncolored. The uppermost plot (colored in purple and orange) depicts G+C skew, based on a 40 kb sliding window. For upper tracks, grey bars depict genes unique to strain HS whereas green bars depict strain HS genes that share orthologs with the aligned symbiont chromosome. For lower tracks, green and red bars represent (respectively) intact and disrupted orthologs of strain HS genes in the insect symbiont genomes, whereas blue bars highlight prophage and IS-element sequences in the insect symbiont chromosomes. Plots of pairwise nucleotide sequence identity are shown in the lower alignment following in silico removal of prophage and IS-elements from the SOPE and S. glossinidius sequences. Consensus oriC and dif sequences are labeled to indicate putative origins and termini of chromosome replication.
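As an illustration of the sliding-window plot described above, the following minimal sketch computes GC skew using the conventional definition (G − C)/(G + C). The caption specifies only the 40 kb window; the function name `gc_skew`, the step size, and the handling of windows without G or C are assumptions made for this example, not details taken from the study.

```python
def gc_skew(seq, window=40_000, step=10_000):
    """Return (window_start, skew) pairs, where skew = (G - C) / (G + C).

    window defaults to the 40 kb stated in the caption; step is an
    assumed value chosen only for illustration.
    """
    seq = seq.upper()
    results = []
    # If the sequence is shorter than one window, a single partial window is used.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Assumption: report 0.0 for a window with no G or C at all.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, skew))
    return results


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    for start, skew in gc_skew("GGGGCCCC", window=4, step=4):
        print(start, skew)
```

A sign change in the skew along the chromosome is commonly used to locate candidate replication origins and termini, which is consistent with the oriC and dif labels in the figure.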
Figure 3.
Retention of strain HS orthologs in S. glossinidius and SOPE according to COG functional category.
The dark shaded component of each bar refers to intact genes retained in both S. glossinidius and SOPE. The intermediate shaded component refers to intact genes retained in only S. glossinidius (upper bar) or SOPE (lower bar) and the lighter shaded component refers to genes that are either absent or disrupted in both S. glossinidius and SOPE. The COG categories are organized in five larger groups with red representing genes involved in information storage and processing, blue representing genes involved in cellular processes and signaling, black representing genes involved in metabolism, green representing genes with poorly characterized functions, and yellow representing components of phages and IS-elements.
Figure 4.
Alignments of three regions of the S. glossinidius, strain HS, and SOPE chromosomes.
The three regions correspond to SG0948–SG0977 (A), ps_SGL0466–SG0918 (B) and ps_SGL0318–ps_SGL0330 (C) in the most recent S. glossinidius annotation [25]. Putative ORFs and intergenic regions are drawn to scale, oriented according to their inferred direction of transcription and color-coded according to COG functional categories. While all of the depicted strain HS genes have intact reading frames, the status of their orthologs in S. glossinidius and SOPE is shown in the outer bars (green = intact, purple = inactivated). Nonsense mutations (premature stop codons) are depicted by purple diamonds, and frameshifting indels are depicted by purple triangles. Light grey connecting bars are syntenic nucleotide alignments, while brown bars illustrate IS-element acquisitions, which occur more frequently in SOPE.
Table 1.
General features of the strain HS, SOPE, and S. glossinidius genome sequences.
Figure 5.
Average size of strain HS orthologs classified as intact, pseudogenized, and absent in SOPE (green) and S. glossinidius (red).
The average size of all strain HS ORFs is also shown in orange. Error bars depict the standard errors of the mean.
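For readers unfamiliar with the error bars, the sketch below shows how a mean ORF size and its standard error (sample standard deviation divided by the square root of the sample size) could be computed. The ORF sizes are made-up illustrative values, not data from the study, and `mean_and_sem` is a hypothetical helper name.

```python
import math
import statistics


def mean_and_sem(sizes):
    """Return (mean, standard error of the mean) for a list of ORF sizes in bp."""
    if len(sizes) < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    mean = statistics.mean(sizes)
    # SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(sizes) / math.sqrt(len(sizes))
    return mean, sem


if __name__ == "__main__":
    # Hypothetical ORF sizes (bp), for illustration only.
    example_sizes = [900, 1100, 1000, 1200]
    mean, sem = mean_and_sem(example_sizes)
    print(f"mean = {mean:.1f} bp, SEM = {sem:.1f} bp")
```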
Figure 6.
Densities of disrupting mutations in SOPE and S. glossinidius pseudogenes.
The numbers of frameshifting and truncating indels and nonsense mutations were computed from alignments of strain HS, SOPE and S. glossinidius orthologs. Mutation densities were computed according to the original strain HS ORF sizes (left) or the current SOPE or S. glossinidius pseudogene sizes (right).
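The two normalizations described above differ only in the denominator. The sketch below makes this explicit, assuming densities are expressed as mutations per kilobase; the caption does not state the unit, and the function name and example numbers are hypothetical.

```python
def mutation_density(n_mutations, length_bp, per_bp=1000):
    """Disrupting mutations per `per_bp` bases (per kb by default; an assumed unit)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return n_mutations * per_bp / length_bp


if __name__ == "__main__":
    # Hypothetical pseudogene: 3 disrupting mutations, a 1,500 bp intact
    # ortholog in strain HS, and 1,200 bp remaining in the symbiont.
    n = 3
    print(mutation_density(n, 1500))  # relative to the original strain HS ORF size
    print(mutation_density(n, 1200))  # relative to the current pseudogene size
```

Because pseudogenes tend to shrink through deletions, the density relative to the current pseudogene size can only equal or exceed the density relative to the original ORF size when the same mutation count is used.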
Figure 7.
Numbers of cryptic pseudogenes in S. glossinidius and SOPE estimated using a Monte Carlo simulation.
The simulation was repeated with an increasing number of candidate pseudogenes until estimates of pseudogene number (red) and the size difference between pseudogenes and intact genes (blue) matched empirical values shown in Figure 5 and Table 1, as highlighted by bold bars. The densities of disrupting mutations in S. glossinidius and SOPE pseudogenes (which include cryptic pseudogenes) are shown in the upper left inset, corresponding to the data points highlighted in bold.
Figure 8.
Base composition bias and mutation rates observed in pairwise comparisons between strain HS, S. glossinidius and SOPE.
The evolutionary relationships between SOPE, strain HS and S. glossinidius are depicted by bold lines drawn to scale in accordance with levels of genome-wide divergence at 4-fold degenerate (GC4) sites. Upper boxes show genome-wide GC-percentages at 2nd codon position (GC2), GC4 and intergenic (GCI) sites. Lower boxes depict the number of substitutions per site for intact genes (dGC2 and dGC4) and pseudogenes (dGC2Ψ and dGC4Ψ). The data were obtained from pairwise analysis of point mutations in 1,355 intact genes and 1,376 pseudogenes shared between strain HS and S. glossinidius, and 1,414 intact genes and 1,194 pseudogenes shared between strain HS and SOPE.
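To make the GC2 and GC4 measures concrete, the following sketch computes both for a single in-frame coding sequence. It assumes the standard (bacterial) genetic code, in which the eight codon families CTN, GTN, TCN, CCN, ACN, GCN, CGN and GGN are 4-fold degenerate at the third position. The function name and the toy sequence are hypothetical, and the study's own genome-wide pipeline is not described in this caption.

```python
# First two bases of codon families that are 4-fold degenerate at the
# third position in the standard genetic code (Leu, Val, Ser, Pro, Thr,
# Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc2_gc4(cds):
    """Return (GC2, GC4) fractions for an in-frame coding sequence.

    GC2: G or C fraction at 2nd codon positions.
    GC4: G or C fraction at 3rd positions of 4-fold degenerate codons.
    A value is None when no site of that class is present.
    """
    cds = cds.upper()
    gc2 = n2 = gc4 = n4 = 0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if set(codon) - set("ACGT"):
            continue  # skip codons with ambiguous bases
        n2 += 1
        gc2 += codon[1] in "GC"
        if codon[:2] in FOURFOLD_PREFIXES:
            n4 += 1
            gc4 += codon[2] in "GC"
    return (gc2 / n2 if n2 else None, gc4 / n4 if n4 else None)


if __name__ == "__main__":
    # Toy CDS: ATG GCC CTA GGG TAA
    print(gc2_gc4("ATGGCCCTAGGGTAA"))
```

Sites that are 4-fold degenerate are often used as a proxy for neutral base composition because any substitution there leaves the encoded amino acid unchanged.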
Table 2.
Allelic spectrum of pseudogene mutations in strain HS orthologs found in SOPE and S. glossinidius.