The Beijing family is a successful group of M. tuberculosis strains, often associated with drug resistance and widely distributed throughout the world. Polymorphic genetic markers have been used to type particular M. tuberculosis strains. We recently identified a group of polymorphic DNA repair, replication and recombination (3R) genes. It was shown that the evolution of M. tuberculosis complex strains can be studied using 3R SNPs, and a high-resolution tool for strain discrimination was developed. Here we investigated the genetic diversity of Beijing strains and propose a phylogeny for them by analyzing polymorphisms in 3R genes.
A group of 3R genes was sequenced in a collection of Beijing strains from different geographic origins. Sequence analysis and comparison with the sequences of non-Beijing strains identified several SNPs. These SNPs were used to type a larger collection of Beijing strains and allowed identification of 26 different sequence types for which a phylogeny was constructed. Phylogenetic relationships established by sequence types were in agreement with evolutionary pathways suggested by other genetic markers, such as Large Sequence Polymorphisms (LSPs). A recent Beijing genotype (Bmyc10), which included 60% of strains from distinct parts of the world, appeared to be predominant.
We found SNPs in 3R genes associated with the Beijing family, which enabled discrimination of different groups and the proposal of a phylogeny. The Beijing family can be divided into different groups characterized by particular genetic polymorphisms that may reflect pathogenic features. These SNPs are new, potential genetic markers that may contribute to a better understanding of the success of the Beijing family.
Citation: Mestre O, Luo T, Dos Vultos T, Kremer K, Murray A, Namouchi A, et al. (2011) Phylogeny of Mycobacterium tuberculosis Beijing Strains Constructed from Polymorphisms in Genes Involved in DNA Replication, Recombination and Repair. PLoS ONE 6(1): e16020. doi:10.1371/journal.pone.0016020
Editor: Anil Kumar Tyagi, University of Delhi, India
Received: September 13, 2010; Accepted: December 2, 2010; Published: January 20, 2011
Copyright: © 2011 Mestre et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work is part of the TB-ADAPT (LSHP-CT-2006-037919) and TB-VIR (Grant agreement n° 200973) projects supported by the European Commission under the Health Cooperation Work Programme of the 6th and 7th Framework Programme, respectively. This work was also supported by the Key Project of Chinese National Programs (2008ZX10003-010). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Mycobacterium tuberculosis is one of the most successful human pathogens, infecting nearly one third of the world's population. Despite efforts to combat the disease, tuberculosis (TB) remains a major public health problem, causing over 9 million new cases and 1.7 million deaths each year. Polymorphic genetic markers have been used to discriminate and subtype M. tuberculosis strains to identify outbreaks. IS6110 restriction fragment length polymorphism typing is one of the most widely used methods; however, this technique is time consuming, technically demanding and insufficiently discriminatory for isolates containing fewer than five copies of IS6110. This has led to the development of other methods based on the polymorphism of repetitive sequences, either the direct repeat (DR) region (spoligotyping) or minisatellites (variable number of tandem repeats (VNTR) typing). Various M. tuberculosis families, such as the Beijing family, have been defined using these typing techniques. The Beijing family represents a global threat to TB control. It is estimated that more than a quarter of worldwide TB cases are caused by Beijing strains. These strains have frequently been associated with drug resistance, and their emergence and wide distribution suggest that they have selective advantages over other M. tuberculosis strains. Beijing strains have a characteristic spoligotype pattern, and VNTRs have frequently been used to type these strains, with discriminatory ability differing per VNTR locus.
The availability of whole-genome sequences has enabled comparative genomic analysis to identify single nucleotide polymorphisms (SNPs). SNPs have been used to differentiate between clinical isolates and are preferred over repeats for the construction of phylogenetic trees, because recombination events that could occur independently at the level of repetitive sequences are avoided. Large numbers of SNPs have been identified and used to genotype worldwide strain collections. This supported the grouping of M. tuberculosis into major families and provided useful information about the evolutionary history of this monomorphic bacterium. As an example, the phylogeny of M. tuberculosis was recently established by sequencing 89 genes. Nevertheless, detailed phylogenies of the various M. tuberculosis lineages are still lacking.
We recently identified a group of highly polymorphic genes involved in DNA replication, recombination and repair (3R) in a set of geographically diverse M. tuberculosis strains. We showed that the evolution of M. tuberculosis could be studied using SNPs in 3R genes, and developed a potential new high-resolution tool for strain discrimination. Here we investigated the genetic diversity among Beijing family strains and searched for new polymorphisms in this family by sequencing 3R genes in a collection of Beijing strains from different geographic origins, in order to resolve the phylogeny of the Beijing family.
Results and Discussion
A collection of 58 clinical isolates with a Beijing spoligotype was used to search for variations in 3R genes. These isolates had different geographic origins: Madagascar (19), USA (18), The Netherlands (6), South Korea (2), South Africa (2), China (3), Malaysia (1), Mongolia (2), Thailand (2), Philippines (1), Singapore (1) and Russia (1) (Table S1). These Beijing isolates included the four different sublineages defined by previously described large sequence polymorphisms (LSPs) (Figure 1B). Two non-Beijing M. tuberculosis strains were also included in this study: Myc1, which corresponds to the laboratory strain H37Rv, and Myc2, a clinical strain that belongs to Gutacker's cluster VI.
This phylogenetic network was constructed using the median-joining algorithm with the final set of 48 SNPs characterized by sequencing 22 3R genes in 58 Beijing isolates plus one non-Beijing isolate (Myc2). Isolates are color coded according to their geographic origin (A), large sequence polymorphisms (LSPs) (B), and variations in the mutT2, mutT4 and ogt genes (C). The reference strain M. tuberculosis H37Rv (Myc1) was also included. The numbers on each branch correspond to SNPs (Table 1) that enabled discrimination of sequence types. Node sizes are proportional to the number of isolates belonging to the same sequence type: Bmyc4 node (2); Bmyc12 node (3); Bmyc13 node (3); Bmyc19 node (2); Bmyc16 node (7); Bmyc10 node (23). See Table S1 for details about strains belonging to each node. Mv represents a median vector created by the software and can be interpreted as possibly extant unsampled sequences or extinct ancestral sequences.
Of the 56 described genes encoding 3R components, 22 were previously demonstrated to be polymorphic among Beijing strains. These 22 genes (Table S2) were sequenced for each of the 58 Beijing isolates and the non-Beijing strain Myc2, resulting in approximately 1.6 Mbp of sequence data. Comparative analysis with the M. tuberculosis H37Rv (Myc1) genome sequence identified 48 SNPs (Table S2). Forty-one (85%) SNPs appeared to be specific for Beijing strains, as these were absent from the non-Beijing strain included in this study (Myc2) (Table S2), and also from the 86 non-Beijing M. tuberculosis strains included in a previous study. Nineteen (46%) of these SNPs corresponded to new variations, not previously described in Beijing strains. Thirty of the 41 Beijing-specific SNPs (Table 1) enabled discrimination of 24 different sequence types, for which a phylogenetic network was constructed using the Network software (Figure 1A). Based on the inferred proteins, the number of non-synonymous SNPs (nsSNPs) was twice the number of synonymous SNPs (sSNPs) (Table 1). Phylogenetic relationships established by sequence types were in agreement with evolutionary pathways suggested by LSPs and by SNPs in the putative DNA repair genes mutT2, mutT4 and ogt (Figure 1B and 1C). However, sequencing of the 22 genes was more discriminatory than LSPs: 24 sequence types versus four sublineages defined by the LSPs.
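The distinction between nsSNPs and sSNPs drawn above can be illustrated with a minimal Python sketch that translates a codon before and after a single-base change using the standard genetic code. The function name and the example codons are hypothetical and are not taken from Table 1; the sketch only shows the classification rule, not the analysis pipeline used in this study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (last position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, position_in_codon: int, alt_base: str) -> str:
    """Return 'sSNP' or 'nsSNP' for a single-base change within one codon.

    position_in_codon is 0, 1 or 2 (hypothetical helper, for illustration).
    """
    ref_codon = ref_codon.upper()
    alt_codon = (
        ref_codon[:position_in_codon]
        + alt_base.upper()
        + ref_codon[position_in_codon + 1:]
    )
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "sSNP"
    return "nsSNP"


# Illustrative codons only: GGA (Gly) -> CGA (Arg) changes the amino acid,
# whereas GGA (Gly) -> GGC (Gly) does not.
print(classify_snp("GGA", 0, "C"))
print(classify_snp("GGA", 2, "C"))
```

A glycine-to-arginine change of this kind would be counted as an nsSNP, while a third-position change that preserves glycine would be counted as an sSNP.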
Next we investigated the set of 30 polymorphic SNPs (Table 1), discovered by sequence analysis of the 3R genes, in a larger collection of Beijing strains including 192 Beijing clinical isolates from China and 55 Beijing strains isolated in South Africa (Table S1). The M. tuberculosis Beijing strain GC 1237, responsible for a tuberculosis epidemic in Gran Canaria, Spain, was also included.
A phylogenetic network was constructed from this larger set of isolates (Figure 2). Certain SNPs that were previously found in a single isolate were confirmed in this larger sample. Overall, fourteen SNPs were found in more than one isolate and were therefore informative (Table 1). Two new sequence types (Bmyc25 and Bmyc26) were identified (Figure 2).
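As a rough illustration of how sequence types and informative SNPs can be derived from typing results, the following Python sketch groups isolates with identical SNP profiles and flags SNPs whose variant allele occurs in more than one isolate. The isolate names, alleles and function names are hypothetical toy data, not values from Table 1 or Table S1.

```python
from collections import defaultdict


def assign_sequence_types(profiles):
    """Group isolates that share an identical SNP profile.

    profiles maps isolate name -> string of alleles, one character per SNP.
    Returns a dict mapping each distinct profile to a sorted list of isolates.
    """
    groups = defaultdict(list)
    for isolate, profile in profiles.items():
        groups[profile].append(isolate)
    return {profile: sorted(names) for profile, names in groups.items()}


def informative_snps(profiles, reference):
    """Return indices of SNPs whose non-reference allele occurs in >1 isolate."""
    counts = [0] * len(reference)
    for profile in profiles.values():
        for i, (allele, ref_allele) in enumerate(zip(profile, reference)):
            if allele != ref_allele:
                counts[i] += 1
    return [i for i, n in enumerate(counts) if n > 1]


# Hypothetical toy data: four isolates typed at three SNP positions.
reference = "GCA"
profiles = {
    "isolate_A": "ACA",
    "isolate_B": "ACA",
    "isolate_C": "ATA",
    "isolate_D": "GCA",
}
types = assign_sequence_types(profiles)
print(len(types), "sequence types")
print("informative SNP indices:", informative_snps(profiles, reference))
```

In this toy set the first SNP is shared by three isolates and is therefore informative, whereas the second is seen in a single isolate only.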
This phylogenetic network was constructed using the median-joining algorithm with the set of SNPs identified in the 3R genes analyzed on the final collection of 305 Beijing isolates. Isolates are color coded according to their geographic origin. M. tuberculosis strains Myc1 (H37Rv) and Myc2 are included as non-Beijing strains. The numbers in each branch correspond to SNPs (Table 1) that enabled discrimination of SNP types. Node sizes are proportional to the number of isolates belonging to the same SNP type: Bmyc1 node (2); Bmyc2 node (14); Bmyc4 node (13); Bmyc6 node (7); Bmyc25 node (28); Bmyc26 node (13); Bmyc12 node (3); Bmyc13 node (13); Bmyc16 node (7); Bmyc19 node (2); Bmyc10 node (188). See Table S1 for details about strains belonging to each node. Mv represents a median vector created by the software and can be interpreted as possibly extant unsampled sequences or extinct ancestral sequences. The relative proportion of isolates in each node, of a given geographic origin, may not reflect the population structure of the Beijing family of that geographic region.
The Beijing family can be divided into different groups characterized by particular SNPs. However, a recent sequence type, represented by the Bmyc10 node, appeared to be predominant in this family (Figure 2). Sixty-two percent of the isolates belonged to this group. This sequence type was found not only in China, where the Beijing family is highly prevalent, but also in other countries where the Beijing family is less prevalent, such as Madagascar, The Netherlands and South Africa. In a recent study, a group of Beijing strains characterized by the RD181 deletion and polymorphisms in mutT4 and mutT2 appeared to be predominant in a collection of strains isolated in Italy. Strains belonging to the Bmyc10 node also had the RD181 deletion and the same SNPs in the mutT4 and mutT2 genes (SNP6 and SNP12). This suggests that this might indeed be a prevalent group of Beijing strains which can be found in different parts of the world. The effect on enzyme characteristics of the variation in the mutT2 gene (SNP12, a characteristic of all isolates found in the R1 node; Figure 2) has been investigated. The results revealed significant changes in enzyme properties caused by a single amino acid substitution that leads to protein destabilization. It was suggested that this altered MutT2 enzyme may contribute to the success of strains due to an increase in nucleotide-dependent reactions. This suggests that the SNPs that we have discovered may have an effect on protein function and consequently confer advantageous phenotypes. Considering the high percentage of nsSNPs found (Table 1), it may be informative to investigate which of these variants might have a functional effect. They may confer advantageous phenotypes on certain Beijing genotypes and play an important role in the evolution of the family. Our results showed that the Bmyc25 group might represent another predominant group of Beijing strains. This includes the Gran Canaria TB outbreak strain GC 1237.
These observations suggest that several Beijing subtypes may be the result of the resurgence of tuberculosis in different regions.
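The proportions discussed above can be recomputed from the node sizes given in the Figure 2 legend. The short Python sketch below does only this arithmetic; the variable names are illustrative, and the totals of 305 isolates and 26 sequence types are those stated in the text.

```python
# Node sizes as listed in the Figure 2 legend (isolates per sequence type).
node_sizes = {
    "Bmyc1": 2, "Bmyc2": 14, "Bmyc4": 13, "Bmyc6": 7,
    "Bmyc25": 28, "Bmyc26": 13, "Bmyc12": 3, "Bmyc13": 13,
    "Bmyc16": 7, "Bmyc19": 2, "Bmyc10": 188,
}
TOTAL_ISOLATES = 305  # final collection size stated in the text
TOTAL_TYPES = 26      # number of sequence types stated in the text


def share(node: str) -> float:
    """Percentage of the whole collection that falls in one node."""
    return 100 * node_sizes[node] / TOTAL_ISOLATES


listed_isolates = sum(node_sizes.values())
unlisted_isolates = TOTAL_ISOLATES - listed_isolates
unlisted_types = TOTAL_TYPES - len(node_sizes)

print(f"Bmyc10: {share('Bmyc10'):.1f}% of isolates")
print(f"Bmyc25: {share('Bmyc25'):.1f}% of isolates")
print(f"{unlisted_isolates} isolates remain for {unlisted_types} unlisted types")
```

The Bmyc10 share rounds to the 62% reported above. The number of isolates not covered by the legend equals the number of sequence types not listed there, which would be consistent with each of those types being represented by a single isolate.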
When compared to other pathogens, M. tuberculosis complex strains are highly clonal, sharing 99% similarity at the nucleotide level. In recent years, SNPs have been identified and used to gain a more detailed insight into the evolutionary history of this organism. SNP analysis is a simple and relatively fast way to compare organisms and trace back the evolutionary history of strains, as some SNPs are highly informative. The increasing number of genome sequencing projects is making SNP analyses more and more attractive. This will provide important data, particularly relevant to understanding the genetic basis for strain differences in pathogenesis. Allelic variation in 3R genes seems to be an important mechanism in the evolution and adaptation of microorganisms. Therefore, defective 3R systems could potentially increase genomic variability due to higher mutation rates. Strains with higher mutation rates (mutators) may, under certain conditions, have a selective advantage. For example, a strain may acquire mutations that induce antibiotic resistance or facilitate evasion of the host immune response. The evolutionary history of a collection of 305 Beijing isolates was investigated by analyzing polymorphisms in 3R genes. We found SNPs in 3R genes associated with the Beijing family. These SNPs enabled discrimination of 26 different groups, from which a phylogenetic network was constructed. The Beijing family can be divided into different groups presenting specific polymorphisms that may reflect pathogenic features. These new SNPs are potential genetic markers for Beijing strains that may contribute to a better understanding of the role of the Beijing family in the worldwide epidemic of tuberculosis.
Materials and Methods
M. tuberculosis Beijing clinical isolates included in this study are listed in Table S1. DNA from the 58 Beijing isolates used to search for variations in 3R genes was provided by the Madagascar Pasteur Institute (MG), RIVM, The Netherlands (NL), and the Scientific Institute of Public Health, Belgium (BE), and was used to amplify the 22 3R genes with primers listed in Table 2. These fragments were sequenced by the dideoxy chain-termination method using the Big Dye Terminator v3.1 cycle sequencing Kit (Perkin Elmer Applied Biosystems, Courtaboeuf, France) according to the manufacturer's instructions. Sequencing products were run on an ABI prism 3100 Genetic Analyser (Applied Biosystems). Sequencing was also performed for SNP analysis of the non-Beijing strain (Myc2), the Beijing isolates from South Africa (ZA) and the GC 1237 strain (DNA provided by the NRF Centre of Excellence in Biomedical Tuberculosis Research/MRC Centre for Molecular and Cellular Biology, South Africa, and available in our laboratory).
Sequences were analysed using the software Genalys obtained at http://software.cng.fr/. The genome sequence of M. tuberculosis H37Rv was obtained from the Institut Pasteur at http://genolist.pasteur.fr and used for detection of SNPs.
A mismatched PCR method, using one wild-type primer and one containing the SNP which matched/mismatched the template DNA at the 3′-end of the primer (Table 2), was used to detect SNPs in the Beijing isolates from China (CN).
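The principle of the mismatched PCR described above, in which the two primers differ only at their 3′-terminal base, can be sketched in Python as follows. This is an illustration under stated assumptions: the template sequence, the primer length and the function names are hypothetical, the primers actually used are those in Table 2, and real primer design also has to consider factors such as melting temperature that are ignored here.

```python
def allele_specific_primers(template, snp_index, alt_base, length=20):
    """Return (wild_type_primer, snp_primer), both ending at the SNP position.

    template  : forward-strand sequence around the SNP (5'->3')
    snp_index : 0-based index of the SNP within template
    alt_base  : variant nucleotide
    length    : primer length in nucleotides (an illustrative default)
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    wild_type = template[start:snp_index + 1].upper()
    variant = wild_type[:-1] + alt_base.upper()
    return wild_type, variant


def primer_matches(primer, template, snp_index):
    """True if the primer matches the template exactly with its 3' end on the SNP."""
    start = snp_index - len(primer) + 1
    return start >= 0 and template[start:snp_index + 1].upper() == primer


# Hypothetical wild-type template; the SNP position carries an A here.
template = "ACGTACGTACGTACGTACGTACGT"
wt_primer, snp_primer = allele_specific_primers(template, 20, "G", length=10)
print(wt_primer, primer_matches(wt_primer, template, 20))
print(snp_primer, primer_matches(snp_primer, template, 20))
```

On a wild-type template only the wild-type primer matches at its 3′ end, whereas on a template carrying the variant base the situation is reversed, which is what allows the two alleles to be distinguished by amplification.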
SNPs were concatenated, resulting in one character string (nucleotide sequence) for each clinical isolate analyzed. A FASTA file was created to run in the Network software to build a phylogeny based on the median-joining method. This software assumes that there is no recombination between genomes.
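The concatenation step described above can be sketched in a few lines of Python. The isolate names, SNP labels and alleles below are hypothetical, and the sketch covers only the construction of the FASTA text, not the median-joining analysis itself.

```python
def concatenate_snps(calls, snp_order):
    """Join one isolate's alleles into a single string, in a fixed SNP order."""
    return "".join(calls[snp] for snp in snp_order)


def to_fasta(isolates, snp_order):
    """Build FASTA text with one record per isolate."""
    records = [
        f">{name}\n{concatenate_snps(calls, snp_order)}"
        for name, calls in isolates.items()
    ]
    return "\n".join(records) + "\n"


# Hypothetical alleles at three SNP positions for two isolates.
snp_order = ["SNP1", "SNP2", "SNP3"]
isolates = {
    "isolate_A": {"SNP1": "G", "SNP2": "C", "SNP3": "T"},
    "isolate_B": {"SNP1": "A", "SNP2": "C", "SNP3": "T"},
}
fasta_text = to_fasta(isolates, snp_order)
print(fasta_text)
```

Using the same SNP order for every isolate keeps the strings aligned, so that each column of the resulting file corresponds to one SNP.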
Full list of 48 SNPs identified in this study. The first line indicates the gene and the second line indicates the position in that gene where polymorphisms were identified in relation to the M. tuberculosis H37Rv strain (bottom). Polymorphisms that characterized and allowed discrimination of the 26 sequence types (Figure 2 and Table 1) are marked in red.
Conceived and designed the experiments: OM TDV JM QG BG. Performed the experiments: OM TL CJ JR. Analyzed the data: OM TL TDV AN QG BG. Contributed reagents/materials/analysis tools: KK AM AN PB RW VR. Wrote the paper: OM TDV KK AM AN RW VR QG TL BG.
- 1. Anonymous (2008) WHO report 2008: global tuberculosis control—surveillance, planning, financing. Geneva, Switzerland: World Health Organization.
- 2. Van Soolingen D (2001) Molecular epidemiology of tuberculosis and other mycobacterial infections: main methodologies and achievements. J Intern Med 249: 1–26.
- 3. Kremer K, Glynn JR, Lillebaek T, Niemann S, Kurepina NE, et al. (2004) Definition of the Beijing/W lineage of Mycobacterium tuberculosis on the basis of genetic markers. J Clin Microbiol 42: 4040–4049.
- 4. European Concerted Action on New Generation Genetic Markers and Techniques for the Epidemiology and Control of Tuberculosis (2006) Beijing/W genotype Mycobacterium tuberculosis and drug resistance. Emerg Infect Dis 736–743.
- 5. Bifani PJ, Mathema B, Kurepina NE, Kreiswirth BN (2002) Global dissemination of the Mycobacterium tuberculosis W-Beijing family strains. Trends Microbiol 10: 45–52.
- 6. Kremer K, Au BK, Yip PC, Skuce R, Supply P, et al. (2005) Use of variable-number tandem-repeat typing to differentiate Mycobacterium tuberculosis Beijing family isolates from Hong Kong and comparison with IS6110 restriction fragment length polymorphism typing and spoligotyping. J Clin Microbiol 43: 314–320.
- 7. Zhang L, Chen J, Shen X, Gui X, Mei J, et al. (2008) Highly polymorphic variable-number tandem repeats loci for differentiating Beijing genotype strains of Mycobacterium tuberculosis in Shanghai, China. FEMS Microbiol Lett 282: 22–31.
- 8. Achtman M (2008) Evolution, population structure, and phylogeography of genetically monomorphic bacterial pathogens. Annu Rev Microbiol 62: 53–70.
- 9. Baker L, Brown T, Maiden MC, Drobniewski F (2004) Silent nucleotide polymorphisms and a phylogeny for Mycobacterium tuberculosis. Emerg Infect Dis 10: 1568–1577.
- 10. Filliol I, Motiwala AS, Cavatore M, Qi W, Hazbon MH, et al. (2006) Global phylogeny of Mycobacterium tuberculosis based on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set. J Bacteriol 188: 759–772.
- 11. Gutacker MM, Smoot JC, Migliaccio CA, Ricklefs SM, Hua S, et al. (2002) Genome-wide analysis of synonymous single nucleotide polymorphisms in Mycobacterium tuberculosis complex organisms: resolution of genetic relationships among closely related microbial strains. Genetics 162: 1533–1543.
- 12. Hershberg R, Lipatov M, Small PM, Sheffer H, Niemann S, et al. (2008) High functional diversity in Mycobacterium tuberculosis driven by genetic drift and human demography. PLoS Biol 6: e311.
- 13. Dos Vultos T, Mestre O, Rauzier J, Golec M, Rastogi N, et al. (2008) Evolution and diversity of clonal bacteria: the paradigm of Mycobacterium tuberculosis. PLoS One 3: e1538.
- 14. Tsolaki AG, Gagneux S, Pym AS, Goguet de la Salmoniere YO, Kreiswirth BN, et al. (2005) Genomic deletions classify the Beijing/W strains as a distinct genetic lineage of Mycobacterium tuberculosis. J Clin Microbiol 43: 3185–3191.
- 15. Rad ME, Bifani P, Martin C, Kremer K, Samper S, et al. (2003) Mutations in putative mutator genes of Mycobacterium tuberculosis strains of the W-Beijing family. Emerg Infect Dis 9: 838–845.
- 16. Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 16: 37–48.
- 17. Caminero JA, Pena MJ, Campos-Herrero MI, Rodriguez JC, Garcia I, et al. (2001) Epidemiological evidence of the spread of a Mycobacterium tuberculosis strain of the Beijing genotype on Gran Canaria Island. Am J Respir Crit Care Med 164: 1165–1170.
- 18. Rindi L, Lari N, Cuccu B, Garzelli C (2009) Evolutionary pathway of the Beijing lineage of Mycobacterium tuberculosis based on genomic deletions and mutT genes polymorphisms. Infect Genet Evol 9: 48–53.
- 19. Moreland NJ, Charlier C, Dingley AJ, Baker EN, Lott JS (2009) Making sense of a missense mutation: characterization of MutT2, a Nudix hydrolase from Mycobacterium tuberculosis, and the G58R mutant encoded in W-Beijing strains of M. tuberculosis. Biochemistry 48: 699–708.
- 20. Sreevatsan S, Pan X, Stockbauer KE, Connell ND, Kreiswirth BN, et al. (1997) Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Natl Acad Sci U S A 94: 9869–9874.
- 21. Denamur E, Matic I (2006) Evolution of mutation rates in bacteria. Mol Microbiol 60: 820–827.