Whole Genome Sequencing Investigation of a Tuberculosis Outbreak in Port-au-Prince, Haiti Caused by a Strain with a “Low-Level” rpoB Mutation L511P – Insights into a Mechanism of Resistance Escalation

The World Health Organization recommends diagnosing Multidrug-Resistant Tuberculosis (MDR-TB) in high burden countries by detection of mutations in Rifampin (RIF) Resistance Determining Region of Mycobacterium tuberculosis rpoB gene with rapid molecular tests GeneXpert MTB/RIF and Hain MTBDRplus. Such mutations are found in >95% of Mycobacterium tuberculosis strains resistant to RIF by conventional culture-based drug susceptibility testing (DST). However routine diagnostic screening with molecular tests uncovered specific “low level” rpoB mutations conferring resistance to RIF below the critical concentration of 1 μg/ml in some phenotypically susceptible strains. Cases with discrepant phenotypic (susceptible) and genotypic (resistant) results for resistance to RIF account for at least 10% of resistant diagnoses by molecular tests and urgently require new guidelines to inform therapeutic decision making. Eight strains with a “low level” rpoB mutation L511P were isolated by GHESKIO laboratory between 2008 and 2012 from 6 HIV-negative and 2 HIV-positive patients during routine molecular testing. Five isolates with a single L511P mutation and two isolates with double mutation L511P&M515T had MICs for RIF between 0.125 and 0.5 μg/ml and tested susceptible in culture-based DST. The eighth isolate carried a double mutation L511P&D516C and was phenotypically resistant to RIF. All eight strains shared the same spoligotype SIT 53 commonly found in Haiti but classic epidemiological investigation failed to uncover direct contacts between the patients. Whole Genome Sequencing (WGS) revealed that L511P cluster isolates resulted from a clonal expansion of an ancestral strain resistant to Isoniazid and to a very low level of RIF. Under the selective pressure of RIF-based therapy the strain acquired mutation in the M306 codon of embB followed by secondary mutations in rpoB and escalation of resistance level. 
This scenario highlights the importance of subcritical resistance to RIF for both clinical management of patients and public health, and provides support for introducing rpoB mutations as a proxy for MICs into laboratory diagnosis of RIF resistance. This study illustrates that WGS is a promising multi-purpose genotyping tool for high-burden settings, as it provides both “gold standard” sequencing results for prediction of drug susceptibility and high-resolution data for epidemiological investigation in a single assay.

Introduction
Multidrug resistant tuberculosis (MDR-TB) is defined by resistance to at least two anti-tuberculosis drugs, Rifampin (RIF) and Isoniazid (INH). MDR-TB cases do not respond to the standard therapy and require prolonged and expensive treatment with toxic second-line antibiotics [1]. Resistance to RIF is frequently accompanied by resistance to INH and is used as a surrogate marker for MDR-TB [2,3].
Furthermore, the World Health Organization (WHO) recommends treating patients infected with RIF mono-resistant strains with the same second-line antibiotics as MDR-TB patients [4]. Therefore, rapid and accurate laboratory diagnosis of resistance to RIF is a key factor in selecting an appropriate treatment regimen and limiting transmission of drug-resistant disease.
Traditional culture-based Drug Susceptibility Testing (DST) is based on monitoring growth of Mycobacterium tuberculosis (MTB) in medium supplemented with a critical concentration of RIF, currently set at 1 μg/ml [5]. Due to the slow growth rate of the organism, the turnaround time for laboratory diagnosis of RIF resistance lies between one and two months. The past several years have seen a change in the paradigm of testing for RIF resistance with the increased use of rapid molecular tests based on the detection of mutations in the 81 bp Rifampin Resistance Determining Region (RRDR) of the rpoB gene. Such mutations are found in >95% of MTB strains resistant to RIF by culture-based DST [6,7]. Because molecular tests PCR-amplify bacterial DNA, they can be utilized directly on clinical samples to diagnose resistance to RIF within days (MTBDRplus) or hours (GeneXpert MTB/RIF).
Numerous studies demonstrated high sensitivity of molecular tests for early detection of phenotypically RIF-resistant cases [8][9][10]. However their use for routine diagnostic screening uncovered unexpected rpoB mutations in some MTB strains phenotypically susceptible to RIF. Cases with discrepant genotypic (resistant) and phenotypic (susceptible) RIF susceptibility results account for at least 10% of all RIF-resistant diagnoses obtained with the molecular tests [11][12][13][14] and present a novel challenge for interpretation of laboratory findings. Some of the discrepant strains harbor "low-level" rpoB mutations that confer MICs above the background level determined for fully susceptible strains but below the critical concentration of 1 μg/ml currently used in standardized culture-based susceptibility tests [14][15][16].
There is increasing evidence that the subcritical level of resistance to RIF can negatively impact clinical outcomes of treatment with standard RIF-based TB regimens [11,17]. However, the scope of the problem it poses for public health remains unknown. Isolates with rpoB mutations causing a low level of resistance to RIF were reported to be rare in clinical practice and difficult to propagate in vivo [6,11,15]. They were mostly documented in treatment failure and relapse cases [11] or in immune-compromised patients [18]. These findings supported the hypothesis that "low-level" rpoB mutations impose a high fitness cost on MTB strains, which limits their infectivity and transmissibility [19][20][21][22].
The Mycobacteriology Laboratory at the Groupe Haïtien d'Etude du Sarcome de Kaposi et des Infections Opportunistes (GHESKIO) in Port-au-Prince, Haiti, detected several types of "low level" rpoB mutations in clinical MTB isolates with discrepant phenotypic and genotypic susceptibility results during routine screening of diagnostic samples with molecular tests [14]. Strains with a particular low-level resistance mutation obtained from independent patients were often clustered by spoligotyping, which suggested that they might be circulating in the community. We applied Whole Genome Sequencing (WGS) to investigate one suspected outbreak of 8 cases with a "low level" rpoB mutation L511P.

Ethics statement
The study was approved by the Institutional Review Board of Weill Cornell Medical College (New York, USA) and the Institutional Review Board of GHESKIO Centres (Port-au-Prince, Haiti). Clinical and epidemiological data were extracted from patients' charts. As this was a retrospective clinical chart review, the requirement for informed consent was waived by the institutional review boards.

Laboratory procedures
Primary specimens and MTB isolates in the GHESKIO Mycobacteriology laboratory were routinely screened for mutations conferring resistance to RIF with MTBDRplus (Hain Life Sciences, Nehren, Germany) and/or GeneXpert MTB/RIF (Cepheid, CA, USA) assays. Isolates found resistant with molecular tests were further analyzed by spoligotyping and by sequencing of genes linked to resistance to RIF (rpoB), INH (katG, inhA, ahpC), ethambutol (EMB) (embB), pyrazinamide (PZA) (pncA) and fluoroquinolones (gyrA). Isolates were also tested with conventional phenotypic DST on solid and liquid media. The Minimal Inhibitory Concentration (MIC) of RIF was determined with the Alamar Blue Broth Microdilution assay in 96-well plates. Laboratory procedures have been described in detail elsewhere [14].

Mycobacterial isolates
Eight MTB isolates from the previous report [14] were found to have rpoB mutation L511P and were further investigated in this study. Patients' IDs correspond to the strain IDs used in a previous report as following:

TB treatment in Haiti
Patients are treated using Directly Observed Therapy according to the WHO guidelines [23][24][25][26]. The Category I regimen consists of four drugs: RIF, INH, EMB and PZA. Patients with drug-susceptible TB with recurrence, treatment failure or default receive Category II treatment, which includes the addition of streptomycin. Those with MDR-TB receive an individualized WHO Category IV regimen including at least 4 active drugs and one injectable agent (kanamycin or capreomycin) based on the DST results.

Classic epidemiological investigation
Clinical data were retrospectively extracted from patients' charts, including epidemiological questionnaires administered to all individuals treated in GHESKIO's MDR-TB treatment hospital. The questionnaires were adapted for use in Haiti from Gardy et al. [27]. Notification and testing of close contacts were done with the consent of the patient.

Whole Genome Sequencing and data analysis
Frozen primary MGIT cultures were re-grown on Löwenstein-Jensen slants; DNA was extracted with the CTAB method [28] and additionally purified with the Qiagen DNeasy Blood & Tissue Kit (QIAGEN, Hilden, Germany). One microgram of purified DNA was used to prepare DNA insert libraries of 150-250 bp with the Illumina Genomic sample kit (Illumina, Inc, San Diego, CA). Sequencing was performed on an Illumina HiSeq 2000 analyzer according to the manufacturer's instructions. Single-end 50 bp reads were aligned to the MTB reference genome H37Rv (GenBank NC_000962.2) with the Novoalign package (Novocraft Inc.; v.2.0.7). Single Nucleotide Polymorphism (SNP) and short INDEL variant discovery, genotyping and filtering were carried out with the Genome Analysis Toolkit version 1.6-6 [29]. SNP and INDEL discovery and genotyping were done on each sample and on all 7 samples simultaneously using standard hard filtering parameters or variant quality score recalibration [30].
Short reads were assembled into contigs with Velvet (v.1.2.03) using a k-mer size of 41, chosen after comparing results for k-mer sizes 27 to 47. Total assembled contig length amounted to 98% of the reference genome. The contigs were mapped to the reference genome with BLAT and BLAST. SNP positions in highly similar, paralogous gene families (PPE, PE_PGRS and wag22) and those where one or more isolates displayed an ambiguous residue with over 20% match with reference alleles were excluded. Mutation effects on protein and RNA genes were determined with Variant Effect Predictor [31] based on H37Rv annotations. All sequence data used in this work have been deposited into SRA under accession number "experiment SRX750472". Twenty-three primer pairs for confirmatory Sanger sequencing (3730XL) were designed with BLAST using 1000 bp upstream and downstream of the SNP. All sequencing was performed in the Cornell University Bio Resource Center, Ithaca, NY.
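The two exclusion rules for SNP positions can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `VariantSite` record, its `ref_fractions` field and the example sites are hypothetical, and only the excluded gene families and the 20% threshold are taken from the text.

```python
from dataclasses import dataclass, field

# Paralogous gene families named in the text; matched here by gene-name prefix
# (prefix matching is an assumption made for this sketch).
EXCLUDED_FAMILIES = ("PPE", "PE_PGRS", "wag22")
# Threshold from the text: over 20% of reads matching the reference allele.
MAX_REF_FRACTION = 0.20


@dataclass
class VariantSite:
    position: int   # coordinate on the H37Rv reference
    gene: str       # annotated gene name, "" if intergenic (hypothetical field)
    # Fraction of reads matching the reference allele, one value per isolate
    # in which a variant was called at this site (hypothetical field).
    ref_fractions: list = field(default_factory=list)


def keep_site(site: VariantSite) -> bool:
    """Return True if the site passes both exclusion rules."""
    if site.gene.startswith(EXCLUDED_FAMILIES):
        return False
    return all(f <= MAX_REF_FRACTION for f in site.ref_fractions)


# Invented example sites, for illustration only.
sites = [
    VariantSite(1000, "geneA", [0.00, 0.02]),   # clean call: kept
    VariantSite(2000, "PE_PGRS9", [0.00]),      # paralogous family: dropped
    VariantSite(3000, "geneB", [0.01, 0.35]),   # ambiguous in one isolate: dropped
]
kept = [s.position for s in sites if keep_site(s)]
print(kept)  # [1000]
```

Filtering by site rather than by isolate means that one ambiguous call removes the position for every isolate, which matches the "one or more isolates" wording above.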

L511P MTB cluster
From March 2008 to June 2012, 153 individual MTB isolates with mutations in the rpoB RRDR were found by routine screening of clinical specimens and isolates in the GHESKIO TB laboratory in Port-au-Prince, Haiti [14]. Eight of them (5.2%) harbored a "low level" rpoB mutation L511P (cTg->cCg). Five isolates with a single L511P mutation and two isolates with the double mutation L511P&M515T had RIF MICs between 0.125 and 0.5 μg/ml and tested susceptible to RIF in culture-based DST. The eighth isolate, with the double mutation L511P&D516C, was phenotypically RIF-resistant (Table 1). All eight strains shared the same spoligotype, SIT 53, and were tentatively designated as an L511P cluster.
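The discordance between genotypic and phenotypic results described here can be expressed as a small rule, sketched below in Python. The function name and the return layout are hypothetical. The sketch assumes that an isolate is phenotypically resistant only when its MIC exceeds the critical concentration of 1 μg/ml, and that any RRDR mutation yields a "resistant" molecular result; the value 16.0 is a stand-in for a high MIC, not a measurement from this study.

```python
CRITICAL_CONCENTRATION = 1.0  # μg/ml RIF, critical concentration cited in the text


def classify_rif(mic_ug_per_ml: float, has_rrdr_mutation: bool) -> dict:
    """Compare a culture-based call with a molecular call for one isolate."""
    # Assumption: growth is inhibited at the critical concentration
    # whenever the MIC does not exceed it.
    phenotype = "resistant" if mic_ug_per_ml > CRITICAL_CONCENTRATION else "susceptible"
    genotype = "resistant" if has_rrdr_mutation else "susceptible"
    return {
        "phenotype": phenotype,
        "genotype": genotype,
        "discordant": phenotype != genotype,
    }


# MIC bounds reported for the seven phenotypically susceptible isolates.
print(classify_rif(0.125, True))
print(classify_rif(0.5, True))
# Stand-in value for a phenotypically resistant isolate (hypothetical MIC).
print(classify_rif(16.0, True))
```

Under these assumptions, any isolate with an RRDR mutation and an MIC at or below 1 μg/ml is flagged as discordant, which is the situation of seven of the eight L511P isolates.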

Patients-clinical history and epidemiological investigation
Clinical and epidemiological characteristics of the 8 patients from the L511P cluster are presented in Table 1 and Fig 1d. Remarkably, none of the interviewed patients reported contact with any other patient in the group; however, we identified a few possible cases of indirect exposure.
Patient A was referred to the GHESKIO center in 2010 after failing Category I and Category II TB treatments. Her first cousin once removed, patient B, independently entered MDR-TB treatment with her 10-month-old son (patient C) in 2011. Both women belonged to the same extended family with a significant history of pulmonary tuberculosis and treatment failure (Fig 1c). However, they lived in different cities 50 miles apart and never knowingly met before admission to the GHESKIO MDR-TB hospital. Patient A's mother was diagnosed with TB in 1996. Patient A lived with and took care of her mother until the latter died in 1997. Their house was frequented by extended family. Subsequently four close family members developed TB. Two of them, who were HIV co-infected but not on antiretroviral therapy (ART), died while receiving Category I TB treatment in 1998 and 1999. Two others, who were HIV-negative, were successfully cured with Category I treatment in 2007 and 2009. Patient B reported seeing patient A's mother when she had active TB in 1996.
HIV co-infected patient D was treated for TB with the Category I regimen in different health care facilities in 2005 and 2006. In 2010 he had recurrent TB after discontinuing ART, was treated with the Category I regimen in GHESKIO, failed and was switched to the Category II regimen. He showed a delayed response to Category II treatment, with the AFB smear staying positive after 3 months. After 5 months his symptoms improved, the AFB smear converted to negative and he currently remains symptom-free. This patient lived in the same densely populated poor neighborhood of Port-au-Prince as patient A but did not knowingly have social interaction with her or her extended family.
Patient G was treated for TB with the Category I regimen in 2008 and was declared cured. However, shortly afterwards, in 2009, he relapsed and died before his drug regimen could be adjusted. Patients E and F presented with their first episodes of TB in 2010 and 2011 respectively. Patient E was cured with Category I treatment; patient F was placed on Category IV treatment following a RIF-resistant result of the MTBDRplus assay and was also cured. Of note, patient F reported living in the same neighborhood as patient G back in 2008, when the latter was having TB symptoms. The eighth patient, H, who was diagnosed in 2008, was lost to follow-up and her epidemiological data were unavailable.

Whole Genome Sequencing of the L511P MTB cluster isolates
While epidemiological investigation uncovered very few links between the 8 cases with rpoB mutation L511P, the common spoligotype SIT 53 suggested that they might belong to an outbreak. Spoligotyping separates MTB isolates into "types" according to the presence or absence of 43 spacers in the Direct Repeat locus. However, this genotyping method is not sufficiently discriminative to demonstrate that the isolates are closely related, especially since SIT 53 is common in Haiti and accounts for 6-7% of circulating MTB strains [32]. Therefore we subjected MTB isolates from 6 patients (A, B, C, D, E and F) to Whole Genome Sequencing (WGS) to test the hypothesis that the L511P cluster was an outbreak and to try to delineate its evolutionary history. MTB isolates obtained from patients G and H could not be analyzed because they were not viable in 2012.
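As an aside on why spoligotyping has limited resolution, a 43-spacer presence/absence pattern is commonly compressed into a 15-digit octal code: the first 42 spacers are read as 14 binary triplets and the last spacer is appended as a single digit. The Python sketch below illustrates that conversion. The function name is hypothetical and the example pattern (spacers 33-36 absent) is chosen for illustration only; it is not taken from the study data.

```python
def spoligotype_octal(pattern: str) -> str:
    """Convert a 43-character '1'/'0' spacer pattern to a 15-digit octal code."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("expected 43 characters, each '0' or '1'")
    # Spacers 1-42: fourteen groups of three, each read as one octal digit.
    digits = [str(int(pattern[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 stands alone and is written as 0 or 1.
    digits.append(pattern[42])
    return "".join(digits)


# Illustrative pattern: spacers 1-32 present, 33-36 absent, 37-43 present.
example = "1" * 32 + "0" * 4 + "1" * 7
print(spoligotype_octal(example))  # 777777777760771
```

Because the whole genotype is captured by 43 bits, unrelated strains can easily share a pattern, which is why a common spoligotype alone cannot demonstrate recent transmission.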
An unrelated MDR-TB isolate with the same spoligotype SIT 53 but without the L511P rpoB mutation was also selected for WGS. This isolate is referred to as "outgroup" in the text and Fig 1. Alignment of sequencing reads to the 4.41 Mb of the reference MTB genome H37Rv confirmed the SIT 53 spoligotype pattern in all 7 isolates and identified a total of 755 positions that had SNPs, short insertions or deletions in at least one of them (Fig 1a). The 6 isolates from the suspected outbreak were separated from the reference strain H37Rv by 527-531 sequence variants. They were closer to the Haitian outgroup isolate with the same spoligotype SIT 53 with distances between 383 and 390 sequence variants.
At the same time only 22 "non-identical SNPs" and 1 short deletion of 4 nucleotides were identified between the 6 L511P cluster isolates, all of which were subsequently confirmed by Sanger sequencing (Fig 1B). Individual members of the cluster demonstrated two (F), four (B, C and E), eight (A) and nine (D) non-identical sequence variants. These results confirmed that TB infection of the 6 patients resulted from the clonal expansion of a single ancestral strain and that the investigated cluster of cases represented an outbreak.
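The comparison of isolates by their non-identical variants can be sketched as a pairwise distance over variant sets. This Python example is an illustration only: the isolate labels X, Y and Z and the variant positions are invented, and the distance used here (the number of variants present in one isolate but not the other) is one simple choice rather than the method used in the study.

```python
from itertools import combinations


def pairwise_distances(profiles: dict) -> dict:
    """Count, for every pair of isolates, the variants not shared by both."""
    return {
        (a, b): len(profiles[a] ^ profiles[b])  # symmetric difference of sets
        for a, b in combinations(sorted(profiles), 2)
    }


# Invented variant profiles: each set holds reference positions at which
# the isolate differs from the inferred common ancestor.
profiles = {
    "X": {101, 202, 303},
    "Y": {101, 202, 303},   # identical to X
    "Z": {101, 404},        # shares one variant with X and Y
}
distances = pairwise_distances(profiles)
print(distances)  # {('X', 'Y'): 0, ('X', 'Z'): 3, ('Y', 'Z'): 3}
```

A distance of zero corresponds to indistinguishable isolates, while shared variants between otherwise different isolates point to a more recent common ancestor.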
Isolates from the mother B and her baby C appeared identical. Both had 4 SNPs and shared 2 of them with patient A (a family member) and patient D (a possible neighbor of A). Therefore WGS confirmed the putative links established by the classic epidemiological investigation and indicated that patients A, B, C and D shared a more recent common ancestor.
Conserved DNA extracts from the 2 isolates unavailable for WGS were amplified with the primer pairs used to confirm the L511P cluster's 23 non-identical sequence variants and analyzed with Sanger sequencing (Fig 1B). We discovered that patient G's isolate had an SNP profile identical to that of the isolate from patient F. Therefore WGS established a novel connection between the two patients, not previously identified by the classic epidemiological investigation.

SNPs in the L511P MTB cluster associated with drug resistance
Twenty-three unique and 528 shared sequence variants found in the 6 outbreak isolates were examined for known links to resistance to anti-tuberculosis drugs (S1 Table) [33][34][35]. Variants found in the outgroup isolate are not discussed here.
WGS confirmed mutations in katG (S315T, all 6 strains), embB (M306I, strains A, B, C and D) and rpoB (L511P in all 6 strains, D516C in strain A, M515T in strains B and C) identified with Sanger sequencing. We found no mutations in rpoA or rpoC genes known to compensate for the fitness cost in RIF-resistant strains [34][35][36].
Mutations in gyrA (E21Q) and gyrB (K526Q) were found in all sequenced strains. However, they were not associated with resistance to fluoroquinolones, as the strains tested susceptible to Ofloxacin in a concentration range of 0.031 to 1 μg/ml. The full list of annotated sequence variants is available upon request.

Discussion
Use of molecular tools improves the accuracy and resolution of classic epidemiological investigations [37], which is especially needed in high-burden settings, where transmission often occurs outside of the household through social interactions untraceable by traditional methods [38][39][40]. In particular this applies to Haiti, where 1.5 million people in the densely populated Port-au-Prince metropolitan area were displaced into tent camps after the earthquake of January 2010 [41]. Traditional MTB genotyping methods (IS6110 RFLP, spoligotyping and MIRU-VNTR) each capture changes in only a small part of the MTB genome and are often used in combination [42,43]. As the cost of WGS is rapidly declining, it is increasingly utilized to investigate outbreaks [27,44,45], to predict resistance to drugs [35] and to delineate the evolution of MDR-TB strains [37]. Arguably, WGS could be a powerful tool in countries like Haiti where the capacity for genotyping is inadequate or altogether non-existent.
Our results demonstrated the superior resolution of WGS typing in a high-prevalence setting when compared to classic epidemiological investigation: it was able to confirm a possible indirect link between patients (A and B) and to uncover unknown links (D with A and B; F with G). At least four patients with a history of treatment failure would likely have been misclassified as cases with acquired resistance, while WGS unequivocally demonstrated that they were cases of primary MDR-TB. Our unpublished data indicate that primary MDR-TB is an important factor driving MDR-TB epidemics in Haiti, and so resources need to be allocated not only for MDR-TB diagnostics and treatment but also for the active tracing and screening of contacts of known TB patients. The eight isolates characterized in this study probably represent only the tip of the iceberg in the outbreak, since mycobacterial culture is not routinely performed for diagnosis of TB in Haiti, where the GHESKIO laboratory is the only facility with culture capacity. For that reason the outbreak transmission chain could not be reconstructed and the index case was not identified.
Recent WGS studies estimated the evolution rate in settings with low TB and HIV prevalence as 0.3-0.5 SNPs per genome per year, resulting in accumulation of a maximum of 5 genetic changes in three years and 10 genetic changes in 10 years [45,46]. Based on the time of the earliest microbiologically diagnosed case, we estimate that the outbreak started in or before 2007. Individual isolates harbored 2 to 8 sequence variants not associated with drug resistance. Isolates with >5 SNPs came from patients with a history of treatment with anti-tuberculosis drugs. Multiple treatment episodes, as in the case of patient D, create a "bottleneck" effect, killing part of the bacteria and selecting those with mutations providing any degree of survival advantage [44]. Overall the low number of SNPs found in our cases is in agreement with the estimate for the molecular clock obtained in developed countries and is consistent with long periods of latent infection and few steps in the infection transmission chain in the L511P cluster.
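The arithmetic behind the molecular clock comparison can be made explicit with a short Python sketch. It assumes a constant substitution rate within the cited range of 0.3-0.5 SNPs per genome per year and returns the expected number of SNPs for a given time span; the function name is hypothetical, and the 5-year span (2007 to 2012) is used only as an illustration of the period covered by the outbreak. The published maxima of 5 and 10 changes are upper bounds reported by the cited studies and are not reproduced by this simple calculation.

```python
def expected_snps(years: float, rate_low: float = 0.3, rate_high: float = 0.5) -> tuple:
    """Expected SNP range per genome after `years`, assuming a constant rate."""
    return (round(rate_low * years, 2), round(rate_high * years, 2))


# Illustrative span from the estimated start of the outbreak to the last isolate.
print(expected_snps(5))   # (1.5, 2.5)
# Ten-year horizon, for comparison with the range cited in the text.
print(expected_snps(10))  # (3.0, 5.0)
```

Under this assumption an isolate would be expected to accumulate only a few SNPs over the outbreak period, so counts above 5 suggest an additional influence such as the treatment-associated selection described above.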
To our knowledge this is the first report that delineates the fate of infections involving MTB strains with a RIF MIC of 0.125 μg/ml, which is eight times lower than the currently accepted critical concentration of the drug. WGS confirmed that at least 7 out of the 8 cases in the L511P cluster resulted from the clonal expansion of the same ancestral strain. Although the strain fitness was not determined in this study, it did not prevent transmission from mother to child, among distant family members, and between casual social contacts. Immuno-compromised patient status was not a prerequisite for transmission, since six of the eight patients who provided MTB isolates were HIV-negative.
Under the selective pressure of treatment with first-line TB drugs, the strain, initially resistant only to INH and to a very low level of RIF, acquired an embB mutation at position M306. "Canonical" mutations in the M306 codon of embB are necessary but not sufficient for development of high-level resistance to EMB [47,48]. However, they are strongly associated with emergence of resistance to RIF and to other anti-tuberculosis drugs by altering the properties of the efflux pump and effectively lowering their concentration inside the cells [49]. Accordingly, the M306I embB mutation preceded the acquisition of a secondary rpoB mutation in two independent instances. Both times the RIF resistance level escalated from the baseline of 0.125 μg/ml: to an intermediate level of 0.5 μg/ml in patients B and C (L511P & M515T) and to a high resistance level of >8 μg/ml in patient A (L511P & D516C).
In conclusion, this study reports the first results obtained from a project to genotype MDR-TB isolates in Haiti with WGS. While only a very limited number of strains were sequenced, we have already gained novel insights into the public health significance of MTB strains with sub-critical resistance to RIF by demonstrating their transmissibility and potential for escalation of resistance level under conditions of treatment with RIF-based regimens. We expect that introduction of routine WGS of MDR-TB strains in Haiti will provide the data necessary to improve diagnosis, deepen understanding of genetic determinants of resistance and help shape more effective public health policies to combat the disease.
Supporting Information
S1 Table. L511P cluster: mutations found in genes linked to resistance to drugs and conventional DST results. List of drugs and genes adopted from [35]. Drugs with conventional susceptibility result are shown in bold. NT, not tested. * One of eight isolates with double mutations L511P&D516C was resistant to RIF. ** Position in E. coli genome shown in parentheses. (DOCX)

Acknowledgments
We are very grateful to the Expand TB WHO Program for supplying the GHESKIO laboratory with materials and reagents, to Foundation Merieux for helping to build and maintain the Biosafety Level 3 facility, and to the staff of the Bacteriology Laboratory of the New York State Department of Health for continuing assistance in staff training and quality control. Our special thanks go to the director of the Cornell Genomic Core Facility, Peter Schweitzer, for his valuable advice and help in organizing sequencing of our isolates.