Figure 1.
Position of GM041182 in the Mycobacterium tuberculosis complex genome phylogeny.
The phylogenetic tree is based on the mapping of Illumina data for representative strains of the complex to the newly generated M. africanum GM041182 genome. The phylogeny was reconstructed using the neighbor-joining method and is based on 9,699 positions that are variable in at least one strain. One thousand bootstrap pseudo-replicates were used to assess clade reliability. All nodes had more than 90% support. The positions of RD900h and RD900a are indicated. Numbers on branches refer to the corresponding number of SNPs inferred. Lineage names are according to Hershberg et al. 2008, while lineage numbers are according to Comas et al. 2010. A review of the nomenclature and a comparison with other typing systems can be found in Coscolla and Gagneux 2010 [70] and Comas et al. 2009 [71].
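As an illustration of the two steps named in the legend above (selecting positions that vary in at least one strain, then drawing bootstrap pseudo-replicates from them), here is a minimal Python sketch. It is not the pipeline used for the figure: the toy alignment, the strain names and the function names are assumptions made for the example, and tree building itself is left out.

```python
import random


def variable_positions(alignment):
    """Return indices of alignment columns where at least one strain differs."""
    length = len(next(iter(alignment.values())))
    return [
        i for i in range(length)
        if len({seq[i] for seq in alignment.values()}) > 1
    ]


def bootstrap_replicate(alignment, columns, rng):
    """Resample the given columns with replacement to form one pseudo-replicate."""
    sampled = [rng.choice(columns) for _ in columns]
    return {
        name: "".join(seq[i] for i in sampled)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; real input would be the mapped SNP calls.
toy = {
    "strain_A": "ACGTACGT",
    "strain_B": "ACGTACGA",
    "strain_C": "ACCTACGA",
}

cols = variable_positions(toy)
rng = random.Random(0)  # fixed seed so the resampling is reproducible
replicates = [bootstrap_replicate(toy, cols, rng) for _ in range(1000)]
```

Each pseudo-replicate has the same number of columns as the set of variable positions. In a full analysis a tree would be rebuilt from every replicate, and clade support would be the fraction of replicate trees that contain a given clade.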
Figure 2.
Number of pseudogene events that occurred in the different lineages of the M. tuberculosis complex.
As the initial number of events was inferred from a three-way comparison (see text for details), the number observed is a subset of all possible events that have happened in the evolution of the MTBC. Colours in the figure represent the different lineages of the MTBC as defined in Figure 1. Green: M. africanum West African 2 (Lineage 6), Brown: M. africanum West African 1 (Lineage 5), Red: M. tuberculosis Euro-American (Lineage 4), Purple: M. tuberculosis India and East Africa (Lineage 3), Blue: M. tuberculosis East Asia (Lineage 2), Pink: M. tuberculosis The Philippines and Rim of Indian Ocean (Lineage 1).
Figure 3.
Distribution of the number of pseudogene events by functional category.
Clusters of Orthologous Groups (COG) categories were derived from the NCBI M. tuberculosis H37Rv annotation, while the essential/non-essential classification was derived from transposon mutagenesis experiments [72], [73].
Figure 4.
A. Comparison of the RD900 locus in M. bovis, M. africanum and M. tuberculosis.
The RD900 deletion, present in M. bovis (AF212297) and M. tuberculosis (H37Rv), was identified through a genome comparison with M. africanum (GM041182). Figure adapted from the output of the Artemis Comparison Tool. B. Alignment of PknH1 and PknH2 in M. africanum (GM041182). The first two thirds of PknH1 and PknH2 have a high level of sequence identity except for two distinct regions. The first is an INDEL region spanning codons 194–214 in PknH1. The second is a substitution region where PknH1 has 53 amino acids (green) in place of a region of 23 amino acids in PknH2 (red). The substitution region allows us to identify two different RD900 deletions: RD900h in M. tuberculosis and RD900a in M. bovis. C. Alignment of the substitution region of the PknH gene of M. tuberculosis and M. bovis. A different composite PknH gene has resulted from each of the two RD900 deletions described in B.
Figure 5.
The RD900 locus in other mycobacteria.
The RD900 region is deleted in both M. tuberculosis (H37Rv) and M. bovis (AF212297) but not in M. africanum (GM041182), M. canettii (CIPT140010059), M. marinum (M) or M. ulcerans (Agy99). In M. marinum an additional PknH gene is present (designated PknH3). PknH3 is also present in M. ulcerans, where the PknH1 locus and a 5′ CDS (grey) have been duplicated/translocated elsewhere in the genome (position 4951200–4955915). The entire “ancestral” PknH locus is not present. PknH genes were identified through comparative analysis of available shotgun sequences.