There is limited available data on the strain diversity of M tuberculosis in Peru, though there may be interesting lessons to learn from a setting where multidrug resistant TB has emerged as a major problem despite an apparently well-functioning DOTS control programme.
Spoligotyping was undertaken on 794 strains of M tuberculosis collected between 1999 and 2005 from 553 community-based patients and 241 hospital-based HIV co-infected patients with pulmonary tuberculosis in Lima, Peru. Phylogenetic and epidemiologic analyses permitted identification of clusters and exploration of spoligotype associations with drug resistance.
Mean patient age was 31.9 years, 63% were male and 30.4% were known to be HIV+. Rifampicin mono-resistance, isoniazid mono-resistance and multidrug resistance (MDR) were identified in 4.7%, 8.7% and 17.3% of strains respectively. Of 794 strains from 794 patients there were 149 different spoligotypes. Of these there were 27 strains (3.4%) with novel, unique orphan spoligotypes. 498 strains (62.7%) were clustered in the nine most common spoligotypes: 16.4% SIT 50 (clade H3), 12.3% SIT 53 (clade T1), 8.3% SIT 33 (LAM3), 7.4% SIT 42 (LAM9), 5.5% SIT 1 (Beijing), 3.9% SIT 47 (H1), 3.0% SIT 222 (clade unknown), 3.0% SIT1355 (LAM), and 2.8% SIT 92 (X3). Amongst HIV-negative community-based TB patients no associations were seen between drug resistance and specific spoligotypes; in contrast HIV-associated MDRTB, but not isoniazid or rifampicin mono-resistance, was associated with SIT42 and SIT53 strains.
Two spoligotypes were associated with MDR particularly amongst patients with HIV. The MDR-HIV association was significantly reduced after controlling for SIT42 and SIT53 status; residual confounding may explain the remaining apparent association. These data are suggestive of a prolonged, clonal, hospital-based outbreak of MDR disease amongst HIV patients but do not support a hypothesis of strain-specific propensity for the acquisition of resistance-conferring mutations.
Citation: Sheen P, Couvin D, Grandjean L, Zimic M, Dominguez M, Luna G, et al. (2013) Genetic Diversity of Mycobacterium tuberculosis in Peru and Exploration of Phylogenetic Associations with Drug Resistance. PLoS ONE8(6): e65873. https://doi.org/10.1371/journal.pone.0065873
Editor: Igor Mokrousov, St. Petersburg Pasteur Institute, Russian Federation
Received: December 20, 2012; Accepted: April 29, 2013; Published: June 24, 2013
Copyright: © 2013 Sheen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: DAJM was supported by Wellcome Trust Fellowships (064672/Z/01/Z and 078067/Z/05/Z; www.wellcome.ac.uk). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Molecular fingerprinting of M. tuberculosis (MTB) permits investigation of the epidemiology of tuberculosis to a previously unattainable level of detail, revealing insights into the differential transmission success of strains whilst observation and analysis of this epidemiology can generate testable hypotheses about strain biology . There is limited data on the epidemiology and strain diversity of M tuberculosis in Peru , though there may be interesting lessons to learn from a setting where multidrug resistant TB has emerged as a major problem despite an apparently well-functioning DOTS control programme . Here we report the results of an exercise to spoligotype all the strains of a large bank of well-characterized MTB strains derived from research projects conducted in Lima, Peru between 1999 and 2005, conduct spoligotype-based phylogenetic analyses and explore phylogenetic associations with HIV infection and drug resistance.
The sampling framework for this study was opportunistic, making use of a strain bank of anonymised but phenotypically well characterized strains of M tuberculosis collected in the course of four clinical research studies conducted amongst adults with pulmonary tuberculosis in hospital and community-based studies in Lima, Peru. All studies had been reviewed and approved by the Institutional Review Board of Universidad Peruana Cayetano Heredia (Lima, Peru). Spoligotyping was undertaken on 794 strains of M tuberculosis (1 per patient) which came from (1) an unselected community-based cohort from south Lima (Feb 1999–May 2002) (n = 329), (2) a hospital-based cohort from an HIV unit (May 1999–Feb 2005) (n = 241), (3) an unselected community-based cohort from north Lima (June 2003–July 2004) (n = 155), (4) a selected community-based cohort from east Lima of TB patients reporting conventional risk factors for drug-resistant TB (Nov 2003–Oct 2005) (n = 69). Recruitment for each of these studies has been reported previously , , . Strains were stored at −70°C in the Laboratorio de Investigación de Enfermedades Infecciosas of Universidad Peruana Cayetano Heredia (UPCH) in Lima. Available patient data was limited to gender, age and HIV status; strain data included first line drug susceptibility profile and name of source study with date of collection of original clinical sample.
Spoligotyping was undertaken at UPCH in batches over several months and films were interpreted by two independent readers; for the rare occasions where there was lack of independent agreement and subsequent failure to resolve discrepancies between both readers spoligotyping was repeated and the new film was used. Phylogenetic analyses and the construction of phylogenetic trees and spoligoforests (drawn using the SpolTools software available through http://www.emi.unsw.edu.au/spolTools; , ) permitted identification of clusters and orphan strains by comparison with the SITVIT2 database (Institut Pasteur de Guadeloupe). The minimum spanning tree (MST), based on spoligotyping patterns, was drawn using BioNumerics software version 3.5. The MST is an undirected connected graph which links all the strains together with the fewest possible linkages between nearest neighbours. Contrarily to the MST, the spoligoforest trees are directed graphs which only evolve by loss of spacers. In these trees, nodes are not necessarily all connected (indeed, in case of too many changes between two strains, there are no edges linking them. In combination with SpolTools software, GraphViz software (http://www.graphviz.org)  was used to colour the orphan strains on the spoligoforest trees. Strains were categorized into spoligotype international types (SIT) and clades for the purpose of reporting strain diversity within the strain bank. In a univariate analysis odds ratios were computed for associations between strain groupings (by clade and by SIT separately) and patient gender, HIV status, strain year of origin, isoniazid mono-resistance, rifampicin mono-resistance and multidrug resistance (with each compared to drug susceptible reference group); in subsequent multivariate logistic regression only those clade or SIT associations with a p value<0.1 on univariate analysis were included in the model.
Mean patient age was 31.9 years (range 15–78, no gender difference), 63% were male and 30.4% were known to be HIV+. HIV infection was significantly more frequent amongst males (OR 3.00, 95% CI 2.1–4.3). Rifampicin mono-resistance, isoniazid mono-resistance and multidrug resistance (MDR) were identified in 4.7%, 8.7% and 17.3% of tested strains respectively (12 rifampicin and 20 isoniazid results were unavailable).
Of 794 strains from 794 patients there were 149 different spoligotypes identified. Of these there were 27 strains with novel, unique orphan spoligotypes, 718 which mapped to spoligotypes already described at least twice, and 49 which were newly created shared types either within the present study or after a match with an orphan in the database. Descriptions of the orphan strain spoligotypes and the demographics of the source patients are given in table 1. In table 2 the 122 already known spoligotype international types (SITs) and corresponding lineages detected in this strain set are shown along with the frequency that each occurred in this strain set as compared with the comparison global database. (The complete original dataset of 794 spoligotypes with accompanying clinical data is made available in table S1; the comparative frequency of the predominant SITs in this study with those reported elsewhere in Latin America is shown in table S2).
498 strains (62.7% of all 794) were clustered in the nine most common spoligotypes; 16.4% SIT 50 (clade H3), 12.3% SIT 53 (clade T1), 8.3% SIT 33 (LAM3), 7.4% SIT 42 (LAM9), 5.5% SIT 1 (Beijing), 3.9% SIT 47 (H1), 3.0% SIT 222 (clade unknown), 3.0% SIT1355 (LAM), and 2.8% SIT 92 (X3) (table 3).
The phylogenetic relationships between strains are illustrated in minimum spanning trees (Figure 1) which demonstrate that PGG2/3 (H, LAM, T, X, S) strains are highly predominant (representing 83.8% of all strains), most notably H (n = 228, 28.7%), LAM (n = 225, 28.3%) and T (n = 161, 20.3%). (figure S1 includes SIT numbers which can be seen by zooming in on pdf file). The spoligoforests shown in Figure 2 (and figure S2) highlight (regardless of layout technique) that SIT50 (H3) is the largest node (n = 130), followed by SIT53 (T1, n = 98), SIT33 (LAM3, n = 66), SIT42 (LAM9, n = 59) and SIT1 (Beijing, n = 44).
The phylogenetic tree connects each genotype based on degree of changes required to go from one allele to another. The structure of the tree is represented by branches (continuous vs. dotted lines) and circles representing each individual pattern. Note that the length of the branches represents the distance between patterns while the complexity of the lines (continuous, black dotted and gray dotted) denotes the number of allele/spacer changes between two patterns: solid lines, 1 or 2 changes (thicker ones indicate a single change, while the thinner ones indicate 2 changes); dotted lines, three or more changes (black dotted for 3, and grey dotted for 4 or more changes). The color of the circles is proportional to the number of clinical isolates in our study, illustrating unique isolates (sky blue) versus clustered isolates (Blue, 2–5 strains; dark blue, 6–9 strains; Bordeaux, 10–19 strains; Red, 20 and more). Note that orphan patterns are circled with the letter “o” in red. Patterns marked by an asterisk (*) indicate a strain with an unknown signature (unclassified).
http://www.emi.unsw.edu.au/spolTools; Reyes et al. 2008 ). Each spoligotype pattern from the study is represented by a node with area size being proportional to the total number of isolates with that specific pattern. Changes (loss of spacers) are represented by directed edges between nodes, with the arrowheads pointing to descendant spoligotypes. In this representation, the heuristic used selects a single inbound edge with a maximum weight using a Zipf model. Solid black lines link patterns that are very similar, i.e., loss of one spacer only (maximum weight being 1.0), while dashed lines represent links of weight comprised between 0.5 and 1, and dotted lines a weight less than 0.5. Orphan isolates, indicated in cyan, are isolated strains without interconnections with the other strains. This presentation illustrates for example the parental links for PGG2/3 strains such as SIT53 and SIT42, showing how SIT53 may be considered as the precursor of all other modern PGG2/3 patterns. SIT53 leads to SIT50/H3 by the loss of spacer 31, and it leads to SIT42 by the loss of four spacers (spacers 21–24), which in turn leads to SIT1355/LAM via SIT64/LAM6 then SIT95/LAM6. Through other spacer deletions, SIT53 leads to SIT91/X3 via SIT119/X1 and SIT92/X3. Lastly, SIT222/Unknown has no parental SITs in our study.
Strain clade associations
There was no predominant spoligotype associated with MDR amongst TB patients without HIV co-infection (table 4) - the odds of MDR were highest in those with disease caused by the LAM9 spoligotype SIT 42 though this was not statistically significant. Amongst patients with HIV co-infection this spoligotype was associated with by far the highest odds of MDR (87.5% of HIV patients with SIT42 disease had MDR); the T1 spoligotype SIT 53 was also associated with a increased odds of MDR, though only amongst patients with HIV (60.0% of HIV patients with SIT53 disease had MDR compared to 14.0% of HIV uninfected patients).
HIV infection was strongly associated with MDRTB in this analysis. This association was significantly reduced (though incompletely mitigated) after adjustment for confounders, an effect largely mediated by inclusion of SIT42 and SIT53 (Table 5). Indeed even after adjustment for HIV and other covariates SIT42 and SIT53 were independently associated with MDRTB, though not with either isoniazid or rifampicin resistance. On univariate analysis male gender was associated with MDRTB but this effect was entirely driven by the increased HIV prevalence in males and disappeared after adjustment in the multivariate model. Neither study site nor year of sample collection were associated with drug resistance in the multivariate model.
In this report of strain diversity from Peru covering the period 1999–2005 3.4% of spoligotypes observed were from novel, orphan strains. The nine most frequently observed spoligotypes (out of 149 observed) accounted for over 60% of all disease and the eight of these also featured amongst the nine most frequently observed in a previous study in north Lima ; 5.5% were SIT1/Beijing family. Strains belonging to Haarlem, T, LAM and Beijing families predominated, and drug-resistance was not shown to be associated with any specific family, including Beijing, findings consistent with the single previous report from Peru .
With the exception of the Beijing family strains, recently examined in greater detail  very few PGG1 strains (AFRI, BOV, EAI, Manu) were found in this study (n = 9, 1.1%). One may notice that these PGG1 strains are not located at central positions on the trees (spoligoforests and MST). Instead, they mostly occupied terminal leaves of the trees (in the MST), or were isolated with few or no connections with other strains in the spoligoforests.
However, PGG2/3 group (n = 665, 83.8%) strains which are predominant in the study occupied a more visible and central position on the trees. Spoligoforest trees have been used to highlight the predominance of some specific well known shared types (SIT). These trees can also give us an overview on the parental links that probably exist between strains belonging to different lineages. For example, one may notice on the top left of the hierarchical layout figure (2A), that SIT19/EAI2-Manila may lead to SIT1/Beijing, through loss of many spacers.
The MST very well shows the similarity (or the distance) between each strain, and clearly defines the major evolution of the MTB lineages present in the study. For example, one can notice that in Figure 1, the Beijing family group is very far from the strains present in the central nodes of strains belonging to the PGG2/3 group (Euro American). At the very bottom of the MST, we can note the presence of the only two strains belonging to EAI lineage (SIT19/EAI2-Manila and SIT11/EAI3-IND).
The spoligoforest tree demonstrates that most of the orphans belong to the modern PGG2/3 group (H, T, LAM, T, X, S). The orphan strains are mostly located at terminal positions on the trees or are located in the top right layer of the hierarchical layout as isolated strains without interconnections with the other strains. Indeed, none of the orphans explicitly belonged to the PGG1 group (considering that some orphans have an unknown lineage).
Given the celebrated performance of the Peruvian National Tuberculosis Control Programme in demonstrating the effectiveness of DOTS in bringing about a reduction in TB incidence , the emergence of MDRTB in Peru as a major threat might be viewed as surprising ; conventional wisdom suggests MDR is driven by weak healthcare systems. We were therefore interested in exploring whether other factors, such as strain-specific biological propensity for resistance, might be relevant. In unselected community-based TB patients (largely HIV-negative) there was no association observed between drug resistance and specific spoligotypes. However amongst patients with HIV recruited from a hospital setting MDR was particularly frequently seen amongst the SIT42 and SIT53 strains. After multivariate analysis to control for the effects of HIV infection, gender and year the effect size increased; given the lack of such an association in the community one hypothesis to explain this would be that this is highly suggestive of a prolonged nosocomial clonal outbreak with strains of these two spoligotypes. The alternative hypothesis of a biological predisposition of these specific strains to acquire drug resistance-conferring mutations is much less likely given the absence of an association with isoniazid or rifampicin mono-resistance. It is noteworthy that the association of HIV with MDR, though diminished after adjustment for SIT42 and SIT53, remained significant indicating that an outbreak with strains from these two spoligotypes is insufficient to explain the whole HIV-MDR association. We cannot exclude the possibility of residual confounding as the explanation for this apparent association.
There are acknowledged limitations of the data presented here. Most importantly our sampling strategy was opportunistic, making use of a strain bank derived from several studies with different designs so the study populations differed and though all relevant subgroups (community and hospital based, HIV infected and uninfected) were included the sample could not be considered representative. There are advantages in having a strain bank which is delinked from patient identifiers but the drawback is that only limited clinical data is available and returning to clinical notes for further detail is not possible – it would have been interesting to differentiate between new and retreatment cases and to investigate patient outcomes by strain, for example. Finally, because the samples were all from studies in adults we were unable to describe the strains causing paediatric disease thus missing an opportunity to clearly identify currently/recently circulating strains, and because all strains were from pulmonary TB patients we were unable to investigate whether extrapulmonary disease phenotype was associated with any particular strain in Peru as has been suggested elsewhere .
There are important strengths in our strain bank: (1) each sample evaluated here is a single strain from a unique patient – though serial strains are available in the bank, for this analysis we were careful to only examine one strain per patient, (2) availability of drug susceptibility data and HIV status for every strain enabled the analysis we report here with very few missing values, in contrast to an earlier report for which only 70% of strains had drug susceptibility data and HIV status was not reported , (3) the spread of strains includes diverse but well characterized patient demographic groups which are also geographically spread across metropolitan Lima (home to one third of the population of Peru and more than 75% of the incident TB), (4) the collection reported here span a time period of 6 years (indeed the bank continues to accumulate strains to the present day, extending the collection to more than 13 years) enabling investigation of temporal trends (none were found here).
In summary, we report the strain distribution of M tuberculosis isolates in Lima, Peru, highlight a significant proportion of novel spoligotypes, and hypothesize a prolonged, clonal, hospital-based outbreak of MDR disease amongst HIV patients but no evidence to support a hypothesis of strain-specific propensity for the acquisition of resistance-conferring mutations.
MST from manuscript Figure 1 presented in PDF format which (through zooming in) enables reading of SIT labels.
Discrete spoligotypes relationships for all isolates (n = 794) presented through a Fruchterman Reingold spoligoforest tree drawn using the SpolTools software (available through http://www.emi.unsw.edu.au/spolTools; Reyes et al. 2008 ). Each spoligotype pattern from the study is represented by a node with area size being proportional to the total number of isolates with that specific pattern. Changes (loss of spacers) are represented by directed edges between nodes, with the arrowheads pointing to descendant spoligotypes. In this representation, the heuristic used selects a single inbound edge with a maximum weight using a Zipf model. Solid black lines link patterns that are very similar, i.e., loss of one spacer only (maximum weight being 1.0), while dashed lines represent links of weight comprised between 0.5 and 1, and dotted lines a weight less than 0.5. Orphan isolates, indicated in cyan, appear at terminal positions on the tree, as isolated strains without interconnections with the other strains.
Detailed genotyping and drug-resistance data and demographic information on M. tuberculosis strains (n = 794) isolated from adults with pulmonary tuberculosis in Lima, Peru.
A comparison of the proportion of the most predominant SITs found in Peru as compared to neighbouring countries (Brazil, Colombia) and regions (Central America and Caribbean), recorded in the SITVIT2 database as consulted on 9 April 2013.
We thank Jorge Coronel and the late and sorely missed Luz Caviedes for their contribution to the original laboratory work.
Conceived and designed the experiments: PS LG MZ NR DM. Performed the experiments: PS DC MD GL NR. Analyzed the data: PS DC LG MZ MD NR DM. Contributed reagents/materials/analysis tools: PS DC LG RG NR DM. Wrote the paper: PS LG MZ GL RG NR DM.
- 1. Demay C, Liens B, Burguiere T, Hill V, Couvin D, et al. (2012) SITVITWEB–a publicly available international multimarker database for studying Mycobacterium tuberculosis genetic diversity and molecular epidemiology. Infect Genet Evol 12: 755–766.
- 2. Taype CA, Agapito JC, Accinelli RA, Espinoza JR, Godreuil S, et al. (2012) Genetic diversity, population structure and drug resistance of Mycobacterium tuberculosis in Peru. Infect Genet Evol 12: 577–585.
- 3. Suarez PG, Watt CJ, Alarcon E, Portocarrero J, Zavala D, et al. (2001) The dynamics of tuberculosis in response to 10 years of intensive control effort in Peru. J Infect Dis 184: 473–478.
- 4. Kawai V, Soto G, Gilman RH, Bautista CT, Caviedes L, et al. (2006) Tuberculosis mortality, drug resistance, and infectiousness in patients with and without HIV infection in Peru. Am J Trop Med Hyg 75: 1027–1033.
- 5. Moore DA, Evans CA, Gilman RH, Caviedes L, Coronel J, et al. (2006) Microscopic-observation drug-susceptibility assay for the diagnosis of TB. N Engl J Med 355: 1539–1550.
- 6. Moore DA, Mendoza D, Gilman RH, Evans CA, Hollm Delgado MG, et al. (2004) Microscopic observation drug susceptibility assay, a rapid, reliable diagnostic test for multidrug-resistant tuberculosis suitable for use in resource-poor settings. J Clin Microbiol 42: 4432–4437.
- 7. Tang C, Reyes JF, Luciani F, Francis AR, Tanaka MM (2008) spolTools: online utilities for analyzing spoligotypes of the Mycobacterium tuberculosis complex. Bioinformatics 24: 2414–2415.
- 8. Reyes JF, Francis AR, Tanaka MM (2008) Models of deletion for visualizing bacterial variation: an application to tuberculosis spoligotypes. BMC Bioinformatics 9: 496.
- 9. Ellson J, Gansner E, Koutsofios L, North SC, Woodhull G (2002) Graphviz - Open Source Graph Drawing Tools. In: Mutzel P, Jünger M, Leipert S, editors. Heidelberg: Springer-Verlag Berlin. pp. 483–484.
- 10. Iwamoto T, Grandjean L, Arikawa K, Nakanishi N, Caviedes L, et al. (2012) Genetic diversity and transmission characteristics of Beijing family strains of Mycobacterium tuberculosis in Peru. PLoS One 7: e49651.
- 11. Zignol M, van Gemert W, Falzon D, Sismanidis C, Glaziou P, et al. (2012) Surveillance of anti-tuberculosis drug resistance in the world: an updated analysis, 2007–2010. Bull World Health Organ 90: 111–119D.
- 12. Svensson E, Millet J, Lindqvist A, Olsson M, Ridell M, et al. (2011) Impact of immigration on tuberculosis epidemiology in a low-incidence country. Clin Microbiol Infect 17: 881–887.