Whole Genome Sequencing Reveals Complex Evolution Patterns of Multidrug-Resistant Mycobacterium tuberculosis Beijing Strains in Patients

Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and molecular drug resistance tests. Furthermore, high resolution WGS analysis is necessary to accurately detect exogenous re-infection as classical genotyping lacks discriminatory power in high incidence settings.


Introduction
Infection with strains of the Mycobacterium tuberculosis complex (MTBC), the causative agent of tuberculosis (TB), is responsible for approximately 1.4 million deaths annually [1]. Even more worrisome is the worldwide emergence of resistant, multidrug-resistant (MDR), or extensively drug-resistant (XDR) MTBC strains that pose a serious challenge for global TB control and make successful treatment difficult or impossible [1]. MDR strains are resistant to at least isoniazid (H) and rifampicin (R), the most effective anti-TB first-line drugs. XDR is defined as MDR with additional resistance to a fluoroquinolone, e.g. levofloxacin (Lfx), moxifloxacin (Mfx), ofloxacin (Ofx), plus resistance to any of the second-line injectable drugs capreomycin (Cm), amikacin (Am) or kanamycin (Km). Recently, the WHO estimated 650.000 cases (including 150,000 deaths) of MDR-TB with an estimated XDR-TB rate of 9% in 2010 [1].
Drug resistant TB infection can be caused by transmission of already resistant strains (primary resistance) or by selection of resistance conferring mutations during inadequate therapy (secondary resistance). Our own work [2] and a recent meta analysis [3] indicated that the success of MDR treatment is low in high incidence settings. Even during effective second-line treatment, the development of additional resistances led to XDR-TB in a significant number of cases. Transmission and acquisition of drug resistance appear to depend on the genetic background (phylogenetic lineage) of a strain population circulating in a particular region. Different genetic backgrounds might compensate for the fitness costs of acquired resistance leading to more efficient transmission of resistant strains or to rapid emergence of resistance in the course of the therapy due to higher mutation rates [4,5]. This notion is strongly supported by the fact that in areas with the highest proportions of MDR strains (mainly in Eastern Europe), MDR-TB has been shown to be strongly associated with infecting strains belonging to the Beijing lineage of the MTBC [6][7][8][9].
Following earlier studies on the occurence of E.coli [10,11] and P. aeruginosa [12] mutator phenotypes (with up to 1,000 fold increased mutation rates) in changing environmental conditions, de Steenwinkel and colleagues hypothesized a higher mutation rate for Beijing strains under antibiotic therapy. The authors showed that Beijing genotype strains exhibit significantly higher (1.6 × 10 −5 to 5.4 × 10 −3 versus 6.3 × 10 −8 to 3.8 × 10 −4 , p = 0.003) mutation rates during experimental rifampicin treatment compared to strains of the East African Indian (EAI) lineage [5]. This observation was confirmed by Ford et al. who reported a more rapid acquisition of drug resistance in vitro for lineage 2 strains (comprising Beijing isolates) compared to lineage 4 strains, although on a much lower level [13]. While these studies pointed to a higher mutation rate of lineage 2 strains that might explain an enhanced resistance acquisition during standard therapy, earlier studies using fluctuation assays did not show an enhanced mutation capacity of Beijing strains [14].
Considering these controversial findings, direct investigations of the genome evolution of clinical isolates in transmission chains or in patients undergoing longitudinal therapy are desirable. Recent studies applying Next Generation Sequencing (NGS) technologies revealed that M. tuberculosis genomes are highly stable during human to human transmission and evolve with a mutation rate of approximately 0.5 single nucleotide polymorphisms (SNPs) per genome per year [15,16]. So far, only two studies have used NGS to analyze the genomes of serial isolates obtained from patients during longitudinal treatment [17,18]. Saunders et al. 2011 determined a very low mutation rate in a non-Beijing strain (personal communication) and concluded that its genome is relatively stable within the host showing lower variability as suggested by studies on emergence of drug resistance and hypermutability [5,13]. However, Sun et al. showed for serial isolates of three patients infected with Beijing strains that the evolution in the patient might be more complex and the population structure within a patient can be diverse [18]. The authors identified 8-41 SNPs per sample, the majority of which (82.7%) were only found at frequencies below 20%, including four to five different resistance conferring variants [18]. This might go in line with higher mutation rates of Beijing strains in general leading to more complex evolutionary patterns and faster acquisition of drug resistance. The authors also used specific analysis tools to define low level variants (5%) that explained a significant number of variations in their study.
To follow up on these findings, we investigated whether a higher mutation rate is a general Beijing genotype characteristic that holds true for Beijing strains from other areas with high rates of MDR-TB. Therefore, we carried out whole genome sequencing (WGS) of serial isolates derived from three MDR-TB patients infected with Beijing strains undergoing long-term treatment of up to 32 months. All isolates belonged to the W-Beijing family and were closely related to strain types that were described as successful MDR clones in Karakalpakstan (Uzbekistan) [7], Kazakhstan [6], Turkmenistan [7] and Abkhazia (Republic of Georgia) [8] (see Figure 1). Special attention was given to the way mutations occurred, e.g. in a stepwise manner related to the emergence of resistance to additional drugs as suggested by Saunders and colleagues [17], or by mutational burst during antibiotic treatment. A transient or stable MTBC mutator phenotype, caused by higher replication failure rates, might have substantially increased mutation rates (10 to 1,000 fold increased) as indicated by previous studies [5,10]. This in turn would co-select numerous mutations, e.g. in non-coding regions and non-functional mutations that hitchhike together with new resistance mutations. To investigate this question we screened the genomes for hitchhiking mutations that are likely to emerge in a hypermutable clone.

Patient derived isolates
The three patients (A, B, C) were part of epidemiological studies in Germany and Eastern Europe. Standardized protocols for mycobacterial growth, DNA extraction, drug susceptibility testing (DST), RFLP fingerprint, 24-loci MIRU-VNTR and spoligotyping can be found in respective publications (Patient A -Germany [19], Patient B -Abkhazia (Republic of Georgia) [8,20], Patient C -Karakalpakstan (Uzbekistan) [2]). DNA was isolated from subcultures grown on Löwenstein-Jensen slants from sputum culture without single colony passage to allow for investigation of the population diversity within one isolate.

Identification of Polymorphisms
Genomic DNA of selected isolates was sequenced on Illumina instruments at GATC Biotech (Konstanz, Germany). For Patient A and C derived isolates we used 72 bp paired end reads, and for Patient B 46 bp paired end reads, respectively. Reads were mapped to the M. tuberculosis H37Rv genome (GenBank ID: NC_000962.2) employing the exact alignment program SARUMAN [21]. For all isolates, over 96% of the M. tuberculosis H37Rv genome was covered with at least one read. Paired end reads from all analyzed isolates were deposited in the NCBI sequence read archive (SRA) and can be found under the accession number SRA060746.
Genomic regions featuring high GC content or repetitive elements (e.g. PE/PPE/PGRS gene families, ESAT-6 like, lppA, lppB, pks12) offer a serious challenge to WGS and data analysis, which is often indicated by a low variant frequency (<50%). To avoid the inclusion of false positives in our data set and to detect the fixation of possible beneficial variants within the population, we extracted polymorphisms from mapped reads by in house Perl scripts using rather conservative quality threshold levels, i.e. a minimum coverage of ten reads and a minimum allele frequency of 75%. Detected polymorphisms of all analyzed isolates were combined and majority nucleotides for positions without a variant call were inferred from mapping data ( Table S7 in File S1). We screened serial patient derived isolates subsequently for fixed mutations that occurred during the treatment with at least 90% allele frequency and adjusted coverage thresholds, and one outlier allowed. All polymorphisms differentiating serial isolates and genes associated with acquired drug resistances were further validated by Sanger sequencing using an ABI3130xl Sequencer. In addition, we applied VarDetect (vers. 200601251500) [22] to analyze peak height ratios of heterogeneous SNPs in order to estimate the frequency of different dominating subpopulations in all serial patient derived isolates (see also Method S1).

Results
Based on our large in house database comprising IS6110 RFLP and spoligotyping patterns of MTBC strains analyzed in previous studies, we selected three TB patients infected with W-Beijing strains exhibiting a massive acquisition of additional drug resistances during treatment, finally leading to XDR-TB. In each case, exogenous re-infection was excluded in the underlying study by identical IS6110 fingerprint patterns of the serial isolates ( Figure 1). The three patients have been enrolled in studies performed in Germany [19], Abkhazia (Republic of Georgia) [20], and Karakalpakstan (Uzbekistan) [2] .
Three isolates from Patient A and four isolates from Patient B and C, respectively, covering the whole treatment period, have been subjected to WGS. Obtained reads were processed with our in house software pipeline described in Material and Methods and Figure 2. SNPs and small deletions of subpopulations predominating in selected isolates and genes associated with resistance acquisition were further examined by Sanger sequencing in up to 15 serial isolates and then linked to the acquisition of phenotypic resistance to first and second-line drugs. Furthermore, we applied VarDetect [22] to calculate peak height ratios for heterogeneous SNPs and to roughly estimate the nucleotide mixture and the frequency of each subpopulation, respectively, in all patient derived isolates.

Intra-host genome evolution of Patient A and B isolates was restricted to acquisition of drug resistance conferring mutations
Patient A was dually infected with an H and streptomycin (S) resistant non-Beijing strain and a MDR W-Beijing strain that was resistant to H, R, S and ethambutol (E). The initial treatment was effective against the non-Beijing strain but inappropriate for the W-Beijing type strain. Outdated DST data led to the development of resistance to six additional drugs (including pyrazinamide (Z), prothionamide (Pto), PAS, Am, Cm, Lfx) after clearance of the non-Beijing strain, resulting in a "hyper-resistant" phenotype after 32 months of MDR-TB treatment [19].
In comparison to H37Rv, we identified 1072 polymorphisms by WGS, four of which were variable among the three selected serial isolates, including a large deletion (3.1kb) in one isolate (Table S1 in File S1). With respect to previously described drug resistance associated mutations, all isolates had the following variants: S315T in the gene katG known to confer resistance to H [23], K43R in the gene rpsL shown to confer resistance to S [24], and M306I in the gene embB linked with resistance to E [25] (Table S1 in File S1). No SNP was present in the 81bp "hot-spot" region of the gene rpoB ranging from codons 426-452 in H37Rv (GenBank ID: NC_000962.2), but all isolates  carried the mutations V170F in rpoB and V483A, E1092D in rpoC. The mutation V483A in rpoC has been described to compensate for fitness costs of classical rpoB "hot-spot" mutations causing resistance to R [26]. The mutation E1092D in rpoC was also found in R susceptible Beijing strains [27], thus most likely representing a phylogenetic rather than a resistance causing variant. Hence, the mutation V170F in rpoB is the most likely cause of resistance to R in this Beijing strain that might have acquired enhanced virulence by the compensatory mutation V483A in rpoC. All isolates exhibited the nonsense mutation Q215* in the gene ethA (monooxygenase) previously described to be involved in resistance to Pto [28,29]. However, phenotypic resistance to Pto was not detected till month 15 ( Figure 3C) and coincided with the mutation R59H in Rv0565c, encoding a putative monooxygenase.
To further examine the exact course of genome evolution, we analyzed the four detected polymorphisms differentiating selected serial isolates and genes associated with resistance acquisition to Z, Pto, PAS, Am, Cm, Lfx (pncA, ethA, thyA, rrs, gyrA, respectively) by Sanger sequencing for eleven Patient A derived isolates (Table S2 in File S1). This analysis revealed additional heterogeneous SNPs indicating the simultaneous presence and competition of three subpopulations (A1, A2 and A3) emerging in Patient A and coinciding with resistance to Z and Pto ( Figure 3A). Nucleotide variants, characterizing each subpopulation, were found for the genes pncA associated with resistance to Z [24,30] and Rv0565c. The clone pncA L159V (A1) was found in one patient derived isolate at month 15 only ( Figure 3A). In contrast, clones pncA T100P (A2) and pncA -(A3), exhibiting a 3.1kb deletion affecting Rv2042 to Rv2045 (including pncA) in combination with R59H in Rv0565c, were detected by PCR analysis until month 36 ( Figure 3B). During the course of the treatment, clones pncA -(A3) and pncA T100P (A2) independently acquired the mutation 1401a>g in the gene rrs conferring cross-resistance to Am and Cm [24,31]. Subsequently, 34 months after starting the treatment, clone pncA T100P (A2) acquired either the mutation D94G or D94N in the gene gyrA, while clone pncA -(A3) acquired the mutation D94A in the gene gyrA ( Figure 3A), all leading to resistance to Lfx [24,32].
Patient B was initially enrolled in a prospective crosssectional study between 2003 and 2005, which targeted determinants of drug-resistant tuberculosis in Abkhazia (Republic of Georgia) [8,20]. The patient was integrated into the study as non-MDR case, while the baseline isolate was tested resistant to H, S and Km. During the course of first-line antibiotic treatment (13 months) comprising a regimen of H, R, E and Z, additional resistances to R and E were detected and Patient B was classified as MDR case. According to the latest DST results the treatment was changed to ethionamide (Eto), Cm, Ofx, Cycloserine (Cs) and PAS (month 0 of MDR therapy). During the course of another 21 months of MDR treatment, 15 serial isolates were obtained with alternating episodes of R, E, Ofx, Eto resistant and susceptible isolates. All isolates were initially tested susceptible to Cm, Cs and PAS, respectively. Based on the sequencing results, DST was repeated for selected isolates and we could confirm cross resistance between Km and Cm, as well as resistance acquisition to Cs and PAS during treatment ( Figure 4B). WGS analysis identified 1575 polymorphisms compared to H37Rv, six of which were variable among the four selected  Table S2 in File S1). isolates (Table S3 in File S1). All isolates featured the mutation katG (S315T) in combination with inhA (-15C>T) conferring resistance to H [23] and low level resistance to Eto [33,34], rpsL (K43R) conferring resistance to S [24], pncA (L151S) associated with resistance to Z [24,30] and rrs (1401a>g) conferring cross resistance to Km and Cm [24,31] respectively. Furthermore, we found the mutation L452P in the "hot-spot" region of rpoB that has been controversially described to be associated with low level and high level resistance to R [35] and the mutation G406D in the gene embB, shown to confer resistance to E [36]. Drug resistance to E was not detected in the first two isolates. However, resistance to E was found by susceptibility testing of the following isolates.
Again we screened all Patient B derived serial isolates (n=15) by Sanger sequencing for the six mutations identified in selected specimens analyzed by WGS plus genes associated with acquired resistance to R, E, Ofx, Eto, Cs, PAS (rpoB, embB, gyrA, ethA, alr and thyA) ( Table S4 in File S1).  Table S4 in File S1). Overall eight SNPs and one 1bp frameshift mutation emerged during treatment in correlation with acquisition of additional resistances and associated with the presence of five subpopulations (B0, B1, B2, B2.1, B2.2). The subpopulations were initially distinguished by different mutations in rpoB: A435G (B1) and A584G (B2), both correlating with an R resistant phenotype ( Figure 4A, 4B). The MDR baseline R susceptible clone (B0) could still be detected on month twelve exhibiting no additional mutations. Clone rpoB A584G (B2) was identified on month eight for the first time and later independently acquired either the mutation D94Y (B2.1) or D94G (B2.2), respectively, in gyrA conferring resistance to Ofx [32]. The clone B2.1 exhibiting the combination rpoB A584G and gyrA D94Y further acquired the mutation S22L in the gene alr (Rv3423c, D-alanine racemase) on month eleven and was the only clone found in isolates after 13 months of MDR-TB therapy, thus outcompeting all other clones ( Figure 4A). Subsequently, clone B2.1 further acquired mutations in the genes thyA (Rv2764c, thymidylate synthase), thyX (Rv2754c, thymidylate synthase) and Rv1887 (hypothetical protein). The mutation S22L in the gene alr coincided with a Cs resistant phenotype, whereas a frame shift mutation in thyA was linked with PAS resistance [37]. Interestingly, the disruption of the thyA gene by a 1 bp deletion (A217fs) was linked with a mutation in the upstream region of thyX (-3G>A). On month 20 of MDR therapy we detected another descendant of B2.1 carrying the mutations -19C>G upstream of thyX (instead of -3G>A) in combination with Q245R in the gene Rv1887, representing another clone possibly compensating for the loss of ThyA ( Figure 4A).

Whole Genome Sequencing Revealed an Exogenous re-Infection with an XDR Strain in Patient C
Patient C was selected from an MDR-TB treatment cohort of 87 patients treated in Karakalpakstan (Uzbekistan). In the initial study, we investigated the development of Ofx resistance and extensively drug-resistant tuberculosis during MDR-TB treatment and controlled for exogenous re-infection by IS6110 RFLP typing of the serial isolates obtained in case of treatment failure [2].
The baseline isolate was tested phenotypically resistant to H, R, S, Z and Pto. After eleven months of MDR-TB therapy, the isolate (month 11) was also tested resistant to Ofx, Cm and Am, thus classifying the isolate as XDR. On month 15 after starting the treatment, the patient derived isolate was tested resistant to E. The DST pattern was confirmed by a final isolate four months later ( Figure 5B). All four isolates exhibited the same IS6110 and spoligotype pattern, as well as identical mutations conferring resistance to H (katG S315N), E (embB M306V), R (rpoB S450L) and S (rpsl K88R) ( Figure 5A, Table  S5 in File S1), suggesting treatment failure with development of resistance to four (respectively three, when considering the possibility that isolates of month 0 and 11 could already have been resistant to E) additional antibiotics during 15 months of MDR-TB therapy.
In contrast to the findings in patients A and B showing a low variability and stepwise acquisition of few SNPs over time, we identified eight SNPs and three deletions unique to the initial isolate and eleven SNPs unique to the following isolates ( Figure 5A, Table S5 and S6 in File S1), clearly separating the latter three isolates (month [11][12][13][14][15][16][17][18][19] from the baseline isolate. All three isolates obtained after month 11 exhibited no further polymorphisms and had identical SNP profiles with known drug resistance conferring mutations that coincided with an XDR phenotype ( Figure 5A, 5B), e.g. the mutation D94N in the gene gyrA conferring resistance to Ofx [31,32], and the mutation 1401a>g in the gene rrs linked with Am and Cm cross resistance [24,31].
Likewise, mutations associated with resistance to Z [24,30] were different between baseline (pncA g.395_527del) and follow up isolates (pncA L120R), further contradicting an evolutionary link between the latter isolates and the baseline isolate ( Figure 5A). Based on this data, we concluded that Patient C was re-infected with an XDR strain exhibiting the same IS6110 pattern as the first MDR strain that was successfully treated by the applied MDR regimen [2]. This notion was supported by MIRU-VNTR typing performed later that revealed a difference in one out of 24 MIRU-VNTR loci (Figure 1).

Discussion
In this study, we used WGS to determine the genome evolution of W-Beijing strains under longitudinal MDR-TB treatment of up to 32 months. Despite constant exposure to antibiotic stress during the whole treatment period, we could not find any indications for a burst of mutations e.g. by induction of erroneous or loosened replication control in the isolates investigated. Such a mutator type is expected to acquire a considerable number of additional mutations, e.g. non-functional mutations or intergenic mutations co-segregating with new drug resistance conferring mutations. However, nearly all mutations that were fixed in the population had a clear association with drug resistance mechanisms and occurred in a stepwise manner in strict correlation with the acquisition of new phenotypic resistances (see Table 1). This gradual acquisition of resistance conferring mutations and the absence of hitchhiking mutations contradicts the hypothesis of a stress induced mutational burst as proposed by others [5,38].
In line with previous studies [18,39,40], our data demonstrate the longitudinal co-existence of different clonal subpopulations within one patient. We observed a complex clonal population structure with a variety of subpopulations displaying unique resistance conferring mutation profiles as well as a different composition of predominant subpopulations in distinct isolates taken at the same time point. Target sequencing of up to 15 serial isolates of the whole treatment period extended these findings and revealed the longitudinal parallel evolution of few dominating clones within one patient. This intra-patient diversity is likely to influence the performance of molecular and phenotypic drug resistance tests and needs to be considered for interpretation of routine diagnostic results [41,42]. New NGS based diagnostic algorithms need to consider that the presence of mixed populations in MDR-TB patients might be the normal situation rather than the exception, which poses special challenges for NGS based resistance diagnostics as those algorithms must be optimized to unravel distinct subpopulations.
The intra-patient evolution observed in patients A and B was characterized by several phases of ineffective drug regimens during which clonal competition drove the selection towards highly resistant phenotypes that outcompeted the less resistant or less fit variants. This evolutionary process led to the acquisition of additional resistance mutations and/or possible compensatory mutations, e.g. as found in the upstream region of thyX (-19C>G, -3G>A) in PAS resistant isolates in Patient B. Importantly, these mixed populations not only comprised compositions of susceptible and resistant alleles that might be expected when resistant cells are selected and replace susceptible ancestors, but they also demonstrated the parallel selection and evolution of different resistant subpopulations. This underlines the ability of MTBC strains to adapt and evolve within the patient. However, more studies are necessary to answer the intriguing question of how the phylogenetic background of clinical isolates can influence treatment outcome.
Our data demonstrated that second-line drugs used for treatment are indeed acting on MDR strains by exerting an evolutionary pressure that selects for drug resistance conferring mutations. For the first-line drug rifampicin, one interesting question is the importance of mutations outside the rpoB "hot-spot" region (codons 426-452, H37Rv annotation), which is usually interrogated by molecular resistance assays, for the development of resistance. For Patient B, our data indicated a stepwise acquisition of resistance to rifampicin that involved the initial acquisition of a probable low level resistance mutation (rpoB L452P) [35], followed by additional rpoB mutations: A435G (low level resistance) [43] and A584G (not reported so far), which finally conferred high level resistance to rifampicin. Furthermore, we found a yet unknown mutation V170F in the gene rpoB, in combination with the mutation V483A in the gene rpoC, suggested to be a potential compensatory mutation [26], which might explain the transmissibility of the MDR strain that infected Patient A [19].
Molecular mechanisms for the oral bacteriostatic second-line drugs PAS and Cs are only partly understood [24]. A previous study suggested that mutations in the gene thyA play a role in resistance development to PAS [37]. We could link a frameshift mutation in thyA (A217fs) to the emergence of PAS resistance in Patient B. This is in line with a recent report demonstrating that the absence of functional ThyA confers resistance to PAS [44]. Interestingly, mutations in the upstream region of thyX (-19C>G, -3G>A) emerged in the ThyA defective isolates. We suggest that these potential thyX promoter mutations compensate for the likely loss of function of ThyA. This notion is supported by findings determined in a recent study from Fivian-Hughes et al. [44]. The authors showed that ThyA and ThyX are different unrelated types of thymidylate (dTMP) synthases; while deletion of thyA conferred resistance to PAS, ThyX most likely compensated the loss of ThyA with a low dTMP synthase activity [44]. Thus, up-regulation of ThyX might represent an important compensatory mechanism contributing to fitness levels of PAS resistant strains.
Furthermore, our analysis linked resistance to cycloserine to mutations in the gene alr (S22L). This is in line with reports suggesting the mycobacterial alanine racemase (Alr) as target for Cs [24,45] and overexpression experiments of the D-alanine racemase that led to Cs resistance in Mycobacterium smegmatis [46]. Cross resistance to the prodrugs ethionamide (Eto) and prothionamide (Pto) was suggested to be mediated by alterations in the gene ethA (monooxygenase) [28] or by upregulation of the target protein InhA [33,34]. The data obtained for Patient A isolates let us assume that the disruption (Q215*) of ethA alone did not confer resistance to Pto. As suggested previously [29], the activation of Eto and Pto may include additional enzymes. Here, our data indicated the involvement of the acquired mutation R59H in the putative monooxygenase Rv0565c that coincided with resistance acquisition to Pto ( Figure 3A, 3B). Ambiguous DST results for Eto in Patient B derived isolates are most likely the result of low level resistance conferred by the mutation fabG1/inhA -15 C>T.
We are confident that we identified an exogenous reinfection of a MDR patient with an XDR strain indistinguishable by IS6110 fingerprint patterns and with identical resistance mutations to four first-line drugs. However, a mixed infection at baseline cannot be completely ruled out by the data obtained in the study. The XDR clone might have been absent in the baseline specimen by chance or only been present at very low frequency (not detectable by the applied 1% proportion method for DST). Importantly, we can exclude the clonal relatedness of both strains, and thereby highlight the risk of XDR-TB transmission in high incidence MDR-TB regions. Recent studies applying WGS on outbreak isolates defined the expected variations between clinical isolates from confirmed transmission chains to a maximum of five SNPs [15,16]. In contrast, baseline and latter isolates obtained from Patient C were distinguished each by eleven unique polymorphisms, and also displayed a single locus variant in the 24-loci MIRU-VNTR patterns. Early WGS based analysis of the genomes of closely related Beijing isolates already demonstrated that standard genotyping methods obviously lack discriminatory power. This makes the interpretation of molecular epidemiological studies on isolates from Eastern European high incidence settings, applying only such markers, rather difficult [47]. These observations have clear consequences for the interpretation of molecular epidemiological studies especially in high incidence areas and suggest that the analysis of ongoing transmission by standard molecular epidemiological techniques may establish false links between patients. This might lead to an overestimation of ongoing MTBC transmission and underestimation of the complexity of the pathogen population [47].

Conclusion
Our results showed that the intra-patient evolution of W-Beijing strains was characterized by a stepwise acquisition of resistance conferring mutations selecting for the most resistant phenotype during inadequate MDR-TB therapy. This incremental process without an accumulation of mutations hitchhiking with prominent resistance-associated mutations does not support the hypothesis of a stress induced mutational burst. However, the evolution of MTBC strains within the patient during drug treatment can be complex and mixtures of susceptible and resistant subpopulations as well as mixtures of subpopulations with different resistance conferring mutations can co-exist over long time periods. This diversity gives rise to MTBC strains which can adapt and evolve within the patient and therefore underlines the importance of a correct identification of mixed infections and treatment failures, arguing strongly for the application of genome based molecular epidemiology.

Supporting Information
File S1. Tables S1-S7. The Tables S1-S7 include detailed results on genotypic and phenotypic drug resistance acquisition and applied regimens for Patient A, B and C. Individual tables of the filtered NGS data are given for each Patient as well as the combined table with all analyzed genome positions.