In-Depth Molecular Characterization of Mycobacterium tuberculosis from New Delhi – Predominance of Drug Resistant Isolates of the ‘Modern’ (TbD1−) Type

Background India has the highest estimated burden of tuberculosis in the world, accounting for 21% of all tuberculosis cases world-wide. However, due to lack of systematic analysis using multiple markers the available information on the genomic diversity of Mycobacterium tuberculosis in India is limited. Methodology/Principal Findings Thus, 65 M. tuberculosis isolates from New Delhi, India were analyzed by spoligotyping, MIRU-VNTR, large deletion PCR typing and single nucleotide polymorphism analysis (SNP). The Central Asian (CAS) 1 _DELHI sub-lineage was the most prevalent sub-lineage comprising 46.2% (n = 30) of all isolates, with shared-type (ST) 26 being the most dominant genotype comprising 24.6% (n = 16) of all isolates. Other sub-lineages observed were: East-African Indian (EAI)-5 (9.2%, n = 6), EAI6_BGD1 (6.2%, n = 4), EAI3_IND, CAS and T1 with 6.2% each (n = 4 each), Beijing (4.6%, n = 3), CAS2 (3.1%, n = 2), and X1 and X2 with 1 isolate each. Genotyping results from five isolates (7.7%) did not match any existing spoligopatterns, and one isolate, ST124, belonged to an undefined lineage. Twenty-six percent of the isolates belonged to the TbD1+ PGG1 genogroup. SNP analysis of the pncA gene revealed a CAS-lineage specific silent mutation, S65S, which was observed for all CAS-lineage isolates (except two ST26 isolates) and in 1 orphan. Mutations in the pncA gene, conferring resistance to pyrazinamide, were observed in 15.4% of all isolates. Collectively, mutations in the rpoB gene, the katG gene and in both rpoB and katG genes, conferring resistance to rifampicin and isoniazid, respectively, were more frequent in CAS1_DELHI isolates compared to non-CAS_DELHI isolates (OR: 3.1, CI95% [1.11, 8.70], P = 0.045). The increased frequency of drug-resistance could not be linked to the patients' history of previous anti-tuberculosis treatment (OR: 1.156, CI95% [0.40, 3.36], P = 0.79). Fifty-six percent of all new tuberculosis patients had mutations in either the katG gene or the rpoB gene, or in both katG and rpoB genes. Conclusion CAS1_DELHI isolates circulating in New Delhi, India have a high frequency of mutations in the rpoB and katG genes. A silent mutation (S65S) in the pncA gene can be used as a putative genetic marker for CAS-lineage isolates.


Introduction
The Indian subcontinent has been a global hotspot for the growth and spread of the TB epidemic in recent times [1] and has served as the corridor of early world-wide dissemination of M. tuberculosis during the ancient era [2]. Mycobacterium tuberculosis genotypes from the Indian subcontinent have largely been described using, in most cases, single techniques to define bacterial diversity [3][4][5][6][7]. Such studies, although important, underestimate the clonal diversity of M. tuberculosis. IS6110 restriction-fragment length polymorphism (RFLP) has been the gold standard for population-based molecular epidemiological studies of TB with the purpose of identifying pathways and predictors of ongoing TB transmission [8,9]. However, some M. tuberculosis isolates, especially those belonging to ancient lineages, have few or no IS6110 copies [10,11]. Moreover, systematic fingerprinting of all M. tuberculosis isolates by IS6110 RFLP is time-consuming, cumbersome and the results have poor inter-laboratory portability [10,12]. Thus, a combination of two rapid PCR-based molecular typing methods; spoligotyping and variable number tandem repeats of mycobacterial interspersed repetitive unit (MIRU-VNTR) has become an attractive alternative to IS6110 RFLP. A study by Hanekom et al [13] suggests that there are discriminatory limitations to the available genotyping methods in high TBincidence areas and that the choice of appropriate MIRU-VNTR loci require further investigation in diverse M. tuberculosis lineages in low-and high TB-endemic countries. A few recent studies that have analyzed tubercle bacilli from India using multiple markers [2,14,15] reveal a clear predominance of two important genogroups (TbD1 2 and TbD1 + groups of strains), confined to North and South India, respectively. While the TbD1 + East African Indian (EAI) genofamily and the TbD1 2 hypervirulent and multidrug-resistant (MDR) -associated Beijing strains predominates in the southern part of the country, the TbD1 2 Delhi types, which, by spoligotyping are characterized by lack of at least spacers 4-7 and 23-34 [16] are overwhelmingly represented in Delhi and its adjoining states [7,17,18]. While EAI genotypes have been dissected at genomic levels [14], no studies till date have been performed to investigate the genetic composition of the Delhi type/Central Asian (CAS) 1 isolates. Our study therefore, aimed at analyzing the genetic makeup of circulating M. tuberculosis isolates in metropolitan Delhi with an in-depth characterization of major lineages and sub-lineages.

Patients and isolates
Following study approval by the Ministry of Health (reference: V.21011/20/2002-CCD), India and the Lala Ram Sarup (LRS) Institute of Tuberculosis and Respiratory Diseases, New Delhi, isolates were obtained from pulmonary tuberculosis (TB) patients attending the TB clinic of the LRS Institute. Sputum samples were collected as part of the routine diagnostic examination of suspected pulmonary TB patients at LRS. Verbal consent for analysis of M. tuberculosis isolates from sputum samples were obtained from all patients. Additionally, as part of the routine clinical examination at the LRS Institute, information on current and previous history of M. tuberculosis infection, history of contact with a TB case and a history of receiving previous anti-tuberculosis treatment was available. None of the study participants could be identified based on the analyses undertaken on the M. tuberculosis isolates analyzed in this study. Of 100 culture-positive acid-fast isolates, 23 isolates were not viable upon sub-culturing. Of 77 isolates that were available for this study, 9 were by the Hain's kit (Hain Lifescience, Nehren, Germany) and 16S rRNA sequencing found to be atypical mycobacteria and excluded from further analyses. Thus, isolates from 68 patients were included in this study. Forty-eight percent of the patients included were male and the mean age was 30.6 years with 84.6% of the patients ,45 years of age. Twentyfour (38.1%) patients were previously treated for TB.

DNA isolation
Chromosomal DNA was isolated as described by van Embden et al [9] or by heat-extraction using a loopful of cells suspended in 200 ml of TE buffer (10 mM Tris-Cl, 1 mM EDTA), and heatkilled by incubation at 95uC for 20 min. The supernatant containing the extracted DNA was collected by centrifugation at 12000 rpm for 10 min.

Spoligotyping
Spoligotyping was performed with commercially available activated Biodyne C membranes with the 43 synthetic oligonucleotides covalently bound to the membranes (Isogen Bioscience BV, Maarssen, The Netherlands), as recommended by the manufacturer. The hybridization patterns were converted into binary and octal formats according to Dale et al [19] and compared with previously reported strains in the SpolDB4 [16] database.

MIRU-VNTR analysis
Isolates were genotyped by MIRU-VNTR by amplification of 15 loci as described by Supply et al [20]. PCR amplification was performed in a total volume of 20 ml containing 1 ml DNA, 0.04-0.4 mM of all 15 primer sets, and Hotstart Taq Plus polymerase Master Mix (Qiagen). All reactions were subjected to 95uC for 5 min, followed by 30 cycles of 30 sec at 95uC, 1 min at 55uC, 1.5 min at 72uC and terminated by 7 min at 72uC. Genotyping was performed using multiplex PCR with a Rox-labeled MapMarker 1000 size standard (PE Applied Biosystems) for sizing of the PCR products. The PCR fragments were analyzed using a capillarybased electrophoresis sequencer (ABI 3700), and sizing of the various VNTR alleles were done using the Peak Scanner Software v1.0 (PE Applied Biosystems). The number of repeats present in each locus was determined and alleles were assigned numerical values accordingly. Furthermore, isolates with identical MIRU-VNTR genotype were defined as belonging to the same cluster.

Single Nucleotide Polymorphisms
Isolates were assigned to Principal Genetic Groups (PGG) after PCR and subsequent SNP analysis of katG 463 and gyrA 95 using primers as described in Sreevatsan et al [21] (Table 1). Other mutations in the rpoB, embB, katG, pncA and gyrA genes of the M. tuberculosis isolates were determined by PCR and sequencing using primers listed in Table 1. PCR was performed in 20 ml reactions containing 12.5 ml Taq Polymerase (GoTaq Mastermix, Promega), 0.04 mM of each primer, 9.5 ml H 2 O and 1 ml DNA. The PCR reaction consisted of an initial 2 min for denaturation at 95uC, 30 cycles of 95uC for 1 min, 55-65uC, for 1 min, 72uC for 2 min and elongation time of 72uC for 5 min. Sequencing reaction consisted of an initial 5 min for denaturation at 96uC, 25 cycles of 96uC for 15 sec, 50uC for 10 sec, 60uC for 4 min and elongation time of 72uC for 15 min using the BigDye 3.1 Terminator Cycle Sequencing Kit (Applied Biosystems Inc). The sequencing was carried out in an ABI 377 automatic DNA sequencer (Applied Biosystems Inc., Foster, CA) and nucleotide sequences were analyzed using DNASTAR software (DNASTAR, Inc., USA).
An isolate was defined as being MDR if mutations in both the rpoB and katG genes were observed.

TbD1 analysis
The presence or absence of TbD1 was analyzed by PCR using 2 primer sets complementary to the sequences of the deleted region or complementary to the internal sequences of the intact region [22]. Isolates containing the TbD1 region (TbD1+) produced a PCR product using the internal primers, whereas isolates that lacked the TbD1 region (TbD12) produced a PCR product using the flanking primers only.

Cluster analysis
All results from this study were entered into the Bionumerics program (Bionumerics version 3.1, Applied Maths, Saint-Martens-Latem, Belgium). A dendrogram of the MIRU-VNTR profiles were produced following Pearson correlation and unweighted pair group method with arithmetic average analysis.

Statistical analysis
The Hunter-Gaston discriminatory index (HGDI) was calculated for each of the 15 MIRU-VNTR loci as described previously [23]. The discriminatory nature of each loci were, based on HGDI values, considered as highly .0.6, moderately 0.3-0.6 and poorly ,0.3 as suggested by Sola et al. [24]. Differences between proportions were analyzed using the chi-square exact test (with Yates' correction for continuity) and expressed as odds ratios (ORs) with 95% confidence interval (95%CI). The patient's history of previous anti-TB treatment was taken into consideration when assessing a link between mutations conferring anti-TB drugresistance and the CAS1_DELHI sub-lineage isolates using simple and multivariable logistic regression analyses using SPSS version 14.0 (Softonic). A P value of ,0.05 was taken to represent statistical significance.

MIRU-VNTR
Of the 68 isolates analyzed by MIRU-VNTR, 3 isolates were suspected to be of mixed M. tuberculosis sub-populations and were excluded from further analysis. Of the remaining 65 isolates, 21 isolates could, by MIRU-VNTR, be distributed into 10 clusters. Three clusters comprising 6 isolates could be discriminated by SNP analysis of katG/rpoB genes. Two clusters comprising 2 isolates each could be further discriminated by spoligotyping, whereas in one cluster comprising 3 isolates, two isolates of the same genotype were discriminated by SNP analysis of rpoB/embB genes. The third isolate in this cluster belonged to a different shared-type ( Figure 1). Eight isolates comprising 4 clusters could not be discriminated by a combination of MIRU-VNTR and SNP analyses of katG, rpoB, embB, or pncA. No epidemiologic link could be established between the patients. MIRU26 was the most discriminatory locus for isolates belonging to the CAS1_DELHI sub-lineage with a HGDI value of 0.788506, whereas QUB-26 was the most discriminatory locus for EAI5 isolates with a HGDI of 0.8. MIRU-VNTR discriminated only moderately between isolates of the EAI6_BGD1 sub-lineage with ETR-A and QUB-4156c giving the highest HGDI value of 0.5.

Single Nucleotide Polymorphism Analysis
PCR and subsequent SNP analysis was performed on 65 isolates for the genes katG 364 and gyrA 95 . Eight isolates (12.3%) had a pointmutation at position katG 463 , whereas no mutations were observed for gyrA 95 and hence, the isolates were assigned to PGG2. Six of the 8 isolates with a mutation in katG 463 belonged to the T1 (n = 4), X1 (n = 1) and X2 (n = 1) sub-lineages, whereas 2 isolates belonged to previously undescribed lineages. All other isolates (87.7%) analyzed were assigned to the genogroup PGG1 ( Figure 1, Table 3). SNP analysis was also performed for 4 genes know to be involved in resistance to the anti-TB drugs isoniazid [INH] (katG), rifampicin [RIF] (rpoB), ethambutol (embB) and pyrazinamide [PZA] (pncA). The point-mutations observed for all 4 genes are described in Table 3. Thirty-two isolates had a mutation in at least the rpoB gene of which 23 had combined katG and rpoB mutations (MDR-TB). Of the 23 MDR-TB isolates, 11 belonged to the CAS1_DELHI sub-lineage, 3 to the Beijing lineage, 3 to the EAI3_IND sub-lineage, 2 to the CAS lineage, 2 to the EAI5 sublineage, 1 to the CAS2 sub-lineage, and 1 to the X2 sub-lineage. Nine isolates had combined mutations in the katG, rpoB and embB genes. Three isolates had a mutation in the katG gene only, whereas 1 isolate had combined katG and embB mutations. Of the 10 isolates with a mutation in the pncA gene conferring resistance to PZA, 7 also had mutations in both rpoB and katG, 1 isolate had mutations in rpoB and embB, whereas the 2 remaining isolates with a pncA mutation were wild-type for the other 3 genes analysed. One isolate had a mutation in the embB gene only.
The most frequently observed mutation was a silent C-.T mutation in codon 65 of the pncA gene. This mutation was observed in all but two isolates of the CAS lineage and in 1 orphan, whereas none of the other isolates had this mutation (Table 3, Figure 1). The most frequently observed mutation in katG was a G-.C substitution at position 944, resulting in an S315T substitution in 22 isolates. Twelve isolates had a C-.T mutation in position 1350 of rpoB, resulting in an S531L substitution, whereas 6 isolates had an A-.G mutation at position 917 in embB, causing an M306I substitution (Table 3).

Discussion
Genomic plasticity and phenotypic versatility of pathogenic bacteria are the two major forces that stratify potentially virulent and benign types [25]. Our understanding of mycobacterial evolution is partly based on this paradigm where pathogenic 'specialist' forms are thought to have evolved from their 'generalist' predecessors [26]. While globally, the core gene pools of such bacteria are highly clonal [21], there may be certain plastic regions that experience frequent genomic recasting [22]. Accurate analysis of the repertoire of such elements in a given population based on multiple genome profiling approaches could reveal significant aspects of ongoing evolution (and transmission) and the available magnitude of genetic diversity among the strains could thus be unraveled. This ultimately helps one understand transmission cum virulence potentials of strains and should ideally constitute an important agenda for epidemic control programs.
M. tuberculosis strains of the so called PGG1 [21] have been prevalent in India with CAS (TbD1 2 ) and EAI (TbD1 + ) being the major genogroups. While EAI strains have been the focus of attention for some time owing to their presumed 'low virulent' characteristics [27,28] and their links to ancestral African predecessors [21,29], the origins and routes of the spread of the CAS strains are not clear. Although these lineages are prevalent in countries of Central-and West-Asian regions it is not evident why they are specifically predominant in the Delhi region. Despite the characteristic deletion of the TbD1 region, CAS strains have never been linked to any institutionalized outbreaks as seen for example in the case of the TbD1 2 Beijing-lineage isolates [3,30]. The spread of the CAS strains through the western corridors and subsequent adaptation to the North Indian populations is an imminent possibility. However, more in-depth analyses involving multiple genetic loci are essential in order to delineate the origin and spread route of the CAS strains. Our study was partly undertaken with this objective.
The genotypic analyses employed in this study showed that the majority of isolates (73.8%) belonged to the TbD1 2 PGG1 genogroup, whereas only 26.2% belonged to lineages with an intact TbD1 region. The relatively low number of TbD1 + isolates observed in this study is in line with previous studies from other parts of North India, which show a predominance of TbD1 2   strains in the North, whereas Central-and South India are dominated by TbD1 + strains [2,7,14,18,31]. Spoligotyping results from this study confirmed previous studies [5,18,32] which show that the CAS1_DELHI sub-lineage dominates in the Delhi region and that ST26 is one of the most frequently observed genotypes in this area. Strains of the ST1 genotype (Beijing lineage) are reportedly the most prevalent genotype world-wide [16]. In contrast to other urban areas in South-East Asia [33], for this study, only 4.6% (n = 3) of the isolates belonged to the ST1 genotype compared to 24.6% (n = 16) of ST26 genotype. The low frequency of Beijing isolates in this region is in concordance with other studies from Karachi, Pakistan [34] and Delhi, India [7]. Like isolates of the hypervirulent and MDR-associated ST1 genotype, the ST26 isolates belong to the TbD1 2 PGG1 genogroup. Isolates of the Beijing lineage have been shown to be less prevalent in North India compared to the central and southern parts [18]. All ST1 isolates and 31.3% of ST26 isolates in this study had mutations in at least the rpoB and katG genes. Unlike the higher frequency of MDR reported for isolates belonging to the ST1 genotype, no significant increase in the frequency of combined mutations in the rpoB and katG genes in the ST26 genotype could be established. Furthermore, in agreement with previously published results [4,34], isolates belonging to the CAS1 sub-lineages did not show an increased rate of MDR. However, our study showed that mutations conferring resistance to both RIF and INH, key first-line anti-TB drugs, were more frequent in CAS1_DELHI isolates compared to non-CAS_DELHI isolates (OR: 3.1, P = 0.045). Mutations (deletions and insertions) in the rpoB and katG genes are considered to be synonymous with resistance to RIF [35] and INH [36], respectively. The estimated frequency of spontaneous mutations conferring resistance to INH are 3.5610 26 and 3.1610 28 for RIF [37]. The risk of a double spontaneous mutation is very low, estimated to occur at a frequency of 9610 214 [38]. For this study, mutations in the rpoB gene were observed in 34 out of 65 (52.3%) of the isolates, of which 79.4% (n = 27) of the M. tuberculosis isolates belonged to TbD12 genotypes. Sixty percent of the isolates with a mutation in the rpoB gene belonged to the CAS1_DELHI sub-lineage. The MIRU-VNTR results clearly show that although similar (26 of 36 CAS isolates had .90% similarity in their MIRU-VNTR types), the isolates were not clones from a recent outbreak. There was no apparent association between patients having received previous treatment for TB and mutations conferring drug-resistance to RIF and INH for CAS1_DELHI sublineages, suggesting that there may be a higher frequency of circulating drug-resistant CAS1_DELHI isolates in New Delhi.
Other genotypic studies from India [7,18] are based solely on spoligotypes. In our study, high resolution genotyping, based on the multiple markers employed, was able to provide clues with regard to the evolutionary history of the CAS sub-lineages. The secondary sequence typing approach (gyrA/katG and pncA genes) revealed a CAS-lineage specific silent C-.T mutation in codon 65 of the pncA gene. This mutation was present in all observed ST's within the CAS lineage, except for 2 out of 16 ST26 (CAS1_DELHI) isolates, suggesting that the mutating lineages are descendents of the ancestral CAS1_DELHI type. Previous studies [39][40][41] show that there does not seem to be any association between this mutation and resistance to PZA. In order to determine if this mutation is generalizable to CAS isolates from other geographical locations, isolates from Myanmar (n = 15) and South Africa (n = 4) were also analyzed. All isolates from South Africa and Myanmar, except two isolates from Myanmar belonging to the ST26 and ST1378 genotypes, had the C-.T mutation in codon 65 of the pncA gene. This mutation could potentially be utilized to develop a rapid CAS specific PCR-based genotyping assay, which could be useful in phylogeographical studies and in epidemiological investigations of TB outbreaks.
In addition to the silent mutation in codon 65 of the pncA gene, 9 other point-mutations in the pncA gene were observed ( Figure 1 and Table 3). The pncA gene codes for the enzyme pyrazinamidase, which converts the pro-drug PZA, to its active form. PZA is one of the cornerstone drugs retained in the treatment of MDR-TB. Previous studies [42][43][44] have shown that mutations in the structural gene or promoter region of pncA, leading to an amino acid substitution, confer resistance to PZA in 72-97% of the cases. We have recently shown that a high proportion of South African MDR M. tuberculosis isolates are resistant to PZA [42]. In this study, we show that 30.4% of the MDR isolates were also resistant to PZA. Further studies with systematic testing for PZA are required to assess if these findings are representative for MDR isolates from other states in India.
Careful monitoring of circulating strains is required to detect new and emerging genotypes. Isolates belonging to the CAS1_-DELHI sub-lineage should be fully characterized with regards to their level of virulence, transmissibility and mutability to determine if they share certain characteristics with the hypervirulent and highly transmissible Beijing lineage. The association between the CAS1_DELHI isolates and drug-resistance observed for this study may be attributed to a low sample size and the fact that the isolates were collected from a single hospital in an urban setting. Thus larger studies, including isolates from more remote areas, are required in order to fully delineate the association of the CAS1_DELHI isolates with drug-resistance. Additionally, screening for the pncA S65S mutation in isolates from other shared-types of the CAS-lineage may aid in the reconstruction of the phylogenetic relationships of genotypes within this lineage.