Impact of Replacing Smear Microscopy with Xpert MTB/RIF for Diagnosing Tuberculosis in Brazil: A Stepped-Wedge Cluster-Randomized Trial

Betina Durovni and colleagues evaluated whether implementation of Xpert MTB/RIF increased the notification rate of laboratory-confirmed pulmonary tuberculosis and reduced the time to tuberculosis treatment initiation in 14 Brazilian primary care laboratories.


Introduction
The battle against tuberculosis (TB), a leading cause of death worldwide [1], has been hampered by a lack of accurate and rapid diagnostic tests, including those for drug resistance. The automated real-time PCR-based Xpert MTB/RIF assay (Xpert; Cepheid, Sunnyvale, California, US) can detect in 2 h the presence of a Mycobacterium tuberculosis-specific sequence of the rpoB gene as well as mutations in this gene responsible for most cases of phenotypic rifampicin resistance [2]. Xpert has proved to be feasible [3,4], accurate [5], and cost-effective [6-8] under field conditions in different settings, including at point of care in peripheral clinics. Since the World Health Organization (WHO) endorsed the use of Xpert in populations with high rates of drug-resistant TB and HIV co-infection [9] in 2010, more than 85 peer-reviewed papers have reported the assay's accuracy for different specimens and populations [10].
However, the true clinical and public health performance of diagnostic tests is influenced by the treatment decisions made based on test results, and on delays in processing samples and reporting results [11]. For scaling up new diagnostics, decision-makers need pragmatic randomized controlled trials with patient-relevant endpoints, such as time to treatment initiation and treatment outcomes [12,13]. A recent randomized controlled trial in sub-Saharan Africa [4] was the first to demonstrate that despite a higher proportion of TB confirmation for Xpert than for smear microscopy (83% versus 50%), overall TB detection did not increase, because of the high rates of empirical treatment. In addition, Xpert diagnosis did not result in decreased morbidity at 2 and 6 mo of treatment. Studies conducted in different populations (populations with high HIV co-infection rates [14] and hospitalized patients [15]) also failed to show improvement in clinical outcomes, despite the reduced time to TB diagnosis with Xpert [14,15]. In both cases, rates of empirical treatment were also very high.
Moreover, substantial controversies remain about where Xpert capability should be located (peripheral clinics versus centralized laboratories), Xpert's role in increasing detection of drug-susceptible as well as multidrug-resistant (MDR) TB, and the optimal management of Xpert rifampicin-resistant cases before confirmatory phenotypic drug susceptibility testing (DST) results are available [10]. Also, the benefits of implementation of Xpert in routine medical care remain to be established. Despite the limited available evidence of the programmatic benefits of the adoption of the assay, by September 2013, 95 out of the 145 countries eligible for concessional prices had procured cartridges for the public sector [16].
In the context of a pilot rollout project in Brazil, we conducted a pragmatic trial to evaluate the effect of replacing two-sample smear examinations by one-sample Xpert on pulmonary TB notification to the national notification system and time to treatment initiation in routine public health practice. The trial's design, analysis, and reporting adhered to the principles of the CONSORT statement for pragmatic trials [17].

Ethics
The study was approved by the Brazil National Ethics Commission (CONEP #494/2011), the Rio de Janeiro Municipal Health Department Review Board (CEP SMS #236/11), and the Tropical Medicine Foundation of Amazonas Review Board (CEP FMT/HVD, 24 November 2011). The need for informed consent was waived by the ethical boards because this was a pilot implementation of a diagnostic test in routine practice, and only routine reporting data were used for the analysis.

Study Setting and Participants
With 82,775 TB patients notified to Brazil's national notification system (Sistema de Informação de Agravos de Notificação [SINAN]) in 2012, Brazil is one of the 22 high-TB-burden countries [18]. Sputum smear examination (stained for acid-fast bacilli) is the mainstay of pulmonary TB diagnosis, with mycobacterial culture and DST recommended for specific subpopulations only, in particular for previously treated patients [19]. An estimated 26% of new patients start treatment on clinical/radiological grounds, without bacteriological confirmation, and in over 70% of retreatment patients no culture or DST is performed [20]. The Brazilian National TB Program recommends empirical treatment while awaiting culture results if, despite a course of broad antibiotics, symptoms persist and there is a high clinical suspicion despite negative smear results (Figure S1) [19]. Rates of co-infection with HIV (9.7%) and of rifampicin resistance (<2% in 2010; Draurio Barreira, Director of the Brazilian National TB Program, personal communication) are relatively low.
The study was conducted in the cities of Manaus and Rio de Janeiro, which notified 1,315 and 4,959 new pulmonary TB cases [21], respectively, in 2011. In Rio de Janeiro (2010 population: 6,320,446) [22], Xpert was introduced in all 11 public primary care laboratories. In Manaus (2010 population: 1,802,014) [22], Xpert was introduced in three public laboratories, including an HIV referral hospital and a TB referral center. These laboratories cover 70% of TB diagnoses in both cities.
Patients whose sputum samples were sent to the study laboratories for diagnosis of pulmonary TB between 4 February and 4 October 2012 were eligible. There were no exclusion criteria, but in the Xpert arm, samples considered insufficient or inadequate for Xpert processing according to the manufacturer's guidance [23] were tested only by smear examination. A sputum sample was considered insufficient for Xpert testing if its volume was less than 1 ml, and was considered inadequate for Xpert testing if on macroscopic examination it did not contain sputum or was blood-stained (as this may inhibit the PCR reaction) [24].

Study Design
This trial was a group-based comparison with phased introduction of Xpert to replace sputum smears as the initial diagnostic test for new pulmonary TB (Figure 1) [25]. A stepped-wedge design was chosen as it allowed a randomized comparison within a pilot project before national rollout. The units of comparison were TB laboratories and the clinics that use their services. The 14 trial laboratories were randomly assigned to the order in which they entered the intervention. To prevent imbalanced randomization with respect to important confounding variables as a result of the relatively small number of units, we applied restricted randomization based on the size of the monthly case load (low [n = 2], intermediate [n = 10], and high [n = 2]) of the laboratories, and on the estimated HIV prevalence (low [n = 12] and high [n = 2]) among the patients [26]. Allocation was not concealed, but laboratory staff and physicians were blinded to the order of entry into the intervention until Xpert was introduced.
In the smear microscopy arm, up to two sputum smears per patient were examined by conventional light microscopy based on direct Ziehl-Neelsen staining, as per routine. In the Xpert arm, usually only one of the submitted samples was examined. The second sample was processed for Xpert testing only if the first one was inadequate or insufficient, or if an error in processing occurred. If the first sample was processed successfully, the second sample was discarded. Results were reported back to the requesting clinic. Patients with an Xpert rifampicin-resistant result were referred to a referral center, and provisionally started on first-line treatment while awaiting confirmation by phenotypic DST, in line with existing Brazilian National TB Program guidelines (Löwenstein-Jensen medium or Mycobacteria Growth Indicator Tube [BD Microbiology Systems, Cockeysville, Maryland, US], according to the referral laboratory routine). This policy at the time of the study was based on the low drug resistance prevalence in Brazil and the expected low positive predictive value (PPV) of an Xpert rifampicin resistance result.
Sample size was calculated based on a laboratory-confirmed TB notification rate of 50/100,000/year, an average cluster population of 500,000, a coefficient of variation of 0.25, and an additional design effect due to the cluster design of 1.5 [26,27]. The study was powered to be able to detect a 60% increase in laboratory-confirmed pulmonary TB with a 5% type I error and 80% power in the 8-mo study period.
All laboratories started off providing samples in the smear microscopy arm. Two laboratories then switched overnight to the Xpert arm every month, so that in the eighth (final) month of the trial, all units were in the Xpert arm (Figure 1). Fourth generation Xpert cartridges (G4) were used.
Primary endpoints were (1) the notification rate of laboratory-confirmed pulmonary TB to SINAN by any of the clinics relying on study laboratories' services, measured by the difference and the ratio of rates in the intervention versus the baseline period, and (2) time to treatment initiation, estimated by the notification date minus the laboratory result date. In Brazil, notification of TB to SINAN is mandatory, and is done at the time of treatment initiation, such that notification can be considered to indicate that, and when, a patient started treatment.
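The crossover schedule described above can be sketched in a few lines. This is only an illustration: the function name `stepped_wedge_schedule` is hypothetical, and laboratories are numbered in index order here, whereas in the trial the order of entry was randomized.

```python
def stepped_wedge_schedule(n_labs=14, n_months=8, labs_per_step=2):
    """Return schedule[lab][month] as 'smear' or 'xpert' (months are 1-indexed)."""
    schedule = {}
    for lab in range(n_labs):
        # Labs 0-1 cross over in month 2, labs 2-3 in month 3, and so on.
        switch_month = 2 + lab // labs_per_step
        schedule[lab] = {
            month: "xpert" if month >= switch_month else "smear"
            for month in range(1, n_months + 1)
        }
    return schedule


schedule = stepped_wedge_schedule()
# Number of laboratories in the Xpert arm in each study month.
xpert_per_month = [
    sum(schedule[lab][month] == "xpert" for lab in schedule)
    for month in range(1, 9)
]
print(xpert_per_month)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Under this schedule every laboratory contributes at least one month to each arm, which is what allows a within-laboratory comparison.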
Secondary endpoints were the following notification rates: for pulmonary TB despite a negative test result, for pulmonary TB without any laboratory result reported, and for overall pulmonary TB irrespective of laboratory test result. Additional endpoints were the rate of Xpert tests positive for rifampicin resistance and the proportion of patients with a rifampicin-resistant Xpert result confirmed by conventional DST (PPV).

Data Collection, Management, and Analysis
Data were collected from the routine laboratory reporting system (Gerenciamento de Ambiente Laboratorial [GAL]) and SINAN.
GAL contains details and results of all diagnostic tests ordered in the public laboratory system, entered by the laboratories. SINAN contains demographic and clinical data on all patients starting TB treatment, entered by the treating physicians or nurses. Entries in GAL were checked periodically for discrepancies against the regular TB laboratory logbooks and Xpert machines' logs; errors were corrected directly in GAL. When the sample collection period was completed, GAL records related to diagnostic testing were extracted and allocated to the smear microscopy or Xpert arm according to sample processing dates. For the smear microscopy arm, any (first or second) positive test was considered a positive result. Pulmonary TB notifications for the study period were extracted from SINAN and checked manually for inconsistencies.
For Rio de Janeiro, pulmonary TB cases notified by clinics outside the municipal primary care network were excluded. For Manaus, all notified pulmonary TB cases were included. Pulmonary TB cases in Manaus that were notified in SINAN but not identified in GAL were assigned to one of the three participating laboratories by adding to each laboratory a number of notified cases proportional to the number notified with laboratory confirmation, stratified by month, sex, and age group.
The databases were linked using RECLINK [28] by name, date of birth, and sex; additional manual linkage was performed using the following algorithm in Stata version 12 (Stata Corp, College Station, Texas, US): patients were considered identical if (1) sex, clinic, and date of birth were the same, and name was similar except for missing given names, abbreviations, or different spelling, or (2) sex and clinic were the same; name was similar except for missing given names, abbreviations, or different spelling; there was a 0- to 14-d difference in result report date in GAL and start of treatment according to SINAN; and date of birth differed for day only, month only, or year only, or the date and month were swapped (e.g., 11 April and 4 November).
Culture and DST results for patients with Xpert rifampicin-resistant samples were obtained from the Brazilian MDR TB reporting system by manual linkage [29].
Analyses were performed in Stata version 12. Numbers of laboratory diagnoses of TB and TB notifications were calculated for the smear microscopy and Xpert arms, stratified by municipality, age group, sex, HIV co-infection, and study month. Since the trial did not follow cohorts of patients, the units of analysis were not individual patients but populations with their number of notified cases. We therefore constructed an aggregated database of the number of TB notifications and population denominators for each of the 896 strata combining laboratory (n = 14), study month (n = 8), sex (n = 2), and age group (n = 4). For calculation of diagnostic and notification rates, population denominators took into account projected growth during the study period based on age- and sex-specific projected growth rates (separately for Rio de Janeiro and Manaus) [30] and were adjusted for variations in monthly number of days clinics were open by weighting the number of person-months for the proportion of patients with suspected TB with samples examined each month out of the total number examined by the laboratory during the whole study period, stratified by sex and age group.
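The second manual linkage rule can be sketched as follows. This is a minimal illustration, not the Stata code used in the study: the function names are hypothetical, name similarity is passed in as a precomputed flag because the text does not specify how it was scored, and the 0- to 14-d window is applied here to the absolute difference between the two dates, which is an assumption about an ambiguous phrase.

```python
from datetime import date


def dob_near_match(a: date, b: date) -> bool:
    """True if the dates differ in exactly one of day, month, or year,
    or if day and month are swapped within the same year."""
    components_differing = sum(
        [a.day != b.day, a.month != b.month, a.year != b.year]
    )
    if components_differing == 1:
        return True
    return (
        a != b
        and a.year == b.year
        and a.day == b.month
        and a.month == b.day
    )


def rule_two_match(same_sex, same_clinic, name_similar,
                   gal_report, sinan_start, dob_gal, dob_sinan):
    """Rule (2): same sex and clinic, similar name, dates within 14 d,
    and a near-matching date of birth."""
    delay_days = abs((sinan_start - gal_report).days)  # assumed to be symmetric
    return (
        same_sex and same_clinic and name_similar
        and delay_days <= 14
        and dob_near_match(dob_gal, dob_sinan)
    )


# The example from the text: 11 April versus 4 November of the same year.
print(dob_near_match(date(1980, 4, 11), date(1980, 11, 4)))  # True
```

Identical dates of birth are deliberately not accepted by `dob_near_match`, since that case is already covered by rule (1).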
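The layout of the aggregated database can be illustrated with a short sketch. The age-group labels are placeholders (the group boundaries are not listed here), and the field names `notifications` and `person_months` are assumptions made for the example.

```python
from itertools import product

laboratories = range(1, 15)   # 14 laboratories
study_months = range(1, 9)    # 8 study months
sexes = ("female", "male")
age_groups = ("age_1", "age_2", "age_3", "age_4")  # placeholder labels

# One record per stratum, to be filled with counts and weighted person-time.
strata = {
    key: {"notifications": 0, "person_months": 0.0}
    for key in product(laboratories, study_months, sexes, age_groups)
}
print(len(strata))  # 896
```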
The primary analyses were cluster-averaged, i.e., they compared the means of the cluster-specific notification rates between the 14 Xpert and the 14 smear microscopy cluster periods by their ratios and differences. Since the cluster-averaged method does not allow likelihood-based approaches for multivariable analysis, we adjusted the resulting rate ratio for potential confounding variables using a population-averaged quasi-likelihood method [26]. This method consisted of fitting a multivariable Poisson regression model that included all covariates except trial arm (i.e., the effect of interest), and then comparing the model residuals for both trial arms by t-test or Wilcoxon rank sum test as appropriate [26]. The covariates included municipality (Rio de Janeiro or Manaus), age, sex, and rate of positive smear examinations observed in the first month, when all study laboratories were using smear examination. The last covariate was included to adjust for the baseline level of the endpoint parameter because with small numbers of randomization units (clusters), the randomization may not result in balanced distribution between the trial arms with respect to the expected endpoint. This cluster-averaged analysis approach provides the most robust results when the number of clusters is small, but has low statistical power [26]. Therefore, as a secondary analysis, we fitted mixed multilevel Poisson regression models to the overall aggregated data, specifying laboratory as the level of clustering to correct for within-laboratory correlation [31]. The multivariable models included municipality, age, sex, baseline smear positivity rate, and calendar time as covariates.
For all notified patients for whom a sputum sample had been submitted for laboratory testing, time to standard first-line treatment initiation was calculated as days between sputum processing date (obtained from GAL) and date of notification (obtained from SINAN) as a proxy for treatment initiation date. Cluster-averaged mean time intervals between sputum processing and start of treatment were compared with the Wilcoxon signed-rank test. For Xpert rifampicin-resistant cases, time to initiation of treatment with second-line drugs could be assessed only in the intervention arm.
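The unadjusted cluster-averaged comparison can be sketched as below. This assumes the effect measures are the ratio and the difference of the arm-level means of cluster-specific rates, as the wording above suggests; the function name and the four example laboratories are hypothetical, and the covariate adjustment and significance tests are not shown.

```python
from statistics import mean


def cluster_averaged_effect(smear_rates, xpert_rates):
    """Compare arm-level means of cluster-specific rates (per 100,000/year)."""
    mean_smear = mean(smear_rates)
    mean_xpert = mean(xpert_rates)
    return {
        "rate_ratio": mean_xpert / mean_smear,
        "rate_difference": mean_xpert - mean_smear,
    }


# Hypothetical rates for four laboratories, one value per arm per laboratory.
smear_rates = [20.0, 30.0, 40.0, 30.0]
xpert_rates = [40.0, 45.0, 50.0, 45.0]
effect = cluster_averaged_effect(smear_rates, xpert_rates)
print(effect)  # {'rate_ratio': 1.5, 'rate_difference': 15.0}
```

Each laboratory carries equal weight in this comparison regardless of its size, which is why the approach is robust with few clusters but gives up some statistical power.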
We calculated the crude proportion of diagnostic samples that tested positive for rifampicin resistance by Xpert, as well as the PPV compared to DST. All notification rates are expressed per year.
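The PPV calculation amounts to the share of Xpert rifampicin-resistant results that were confirmed among those with a DST result. The sketch below uses hypothetical counts, and the Wilson score interval is added only as one common choice of confidence interval for a proportion; the interval method used in the study is not stated here.

```python
from math import sqrt


def positive_predictive_value(confirmed, with_dst_result):
    """PPV = confirmed resistant by DST / Xpert-resistant cases with a DST result."""
    if with_dst_result <= 0:
        raise ValueError("at least one case with a DST result is required")
    return confirmed / with_dst_result


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (an assumed choice)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 9 of 10 Xpert-resistant cases confirmed by DST.
ppv = positive_predictive_value(9, 10)
low, high = wilson_interval(9, 10)
print(f"PPV {ppv:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With so few cases the interval is wide, which is the usual situation in a setting with low drug resistance prevalence.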
We excluded any laboratory diagnosis made in the Xpert arm by smear examination, for two reasons. First, this approach would best reflect the situation of only Xpert being available, and therefore would best quantify the impact on TB notification of replacing smear examination with Xpert testing. Second, this approach would err on the conservative side with regard to the magnitude of increase in notification of laboratory-confirmed TB due to the use of Xpert. Unlike trials in which the endpoint is derived by dividing the number of diagnosed patients (numerator) by the number of tested patients (denominator), the endpoint in the present trial had as the denominator the population served by the study laboratories. Therefore, excluding patients diagnosed by smear microscopy in the intervention arm from the numerator did not affect the denominator, such that after excluding the smear-diagnosed patients, the notification rate for the intervention arm would by definition be lower than that without this exclusion, bringing the notification rate ratio for the intervention compared to the baseline arm closer to one. For the primary endpoints, we also show the intention-to-treat (ITT) analysis, including laboratory diagnoses made through smear examination in the Xpert arm.

Results
During the study period, the 14 laboratories examined 34,758 sputum specimens. Excluded were 4,731 (28.8%) specimens examined in the smear microscopy arm, and 5,800 (31.7%) examined in the Xpert arm, mostly those obtained for treatment follow-up and duplicate samples (Figure 2). The number of duplicate samples excluded was larger in the smear microscopy arm because often two samples per patient were examined. In total, 11,705 specimens in the smear microscopy arm and 12,522 in the Xpert arm were included in the primary analysis.
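The specimen counts above can be cross-checked with a few lines of arithmetic. The counts are taken from the text; the variable names are only for illustration.

```python
included = {"smear": 11705, "xpert": 12522}
excluded = {"smear": 4731, "xpert": 5800}

# All examined specimens: included plus excluded, across both arms.
total_examined = sum(included.values()) + sum(excluded.values())

# Excluded specimens as a percentage of those examined in each arm.
percent_excluded = {
    arm: round(100 * excluded[arm] / (included[arm] + excluded[arm]), 1)
    for arm in included
}
print(total_examined)    # 34758
print(percent_excluded)  # {'smear': 28.8, 'xpert': 31.7}
```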

Primary Endpoints
Over the study period, 4,660 patients were notified with pulmonary TB to SINAN by clinics served by the study laboratories. Of these, 2,216 patients (47.6%) could be linked to positive test results (76.0% of all 2,914 positive test results) and 529 (11.4%) to negative test results (2.5% of 21,234 negative results). The remaining 1,915 (41.1%) notified patients could not be linked to any study period test result (Figure 2). Conversely, 695 positive tests could not be linked to cases in the notification system, 303 (26.6% [95% CI = 24.0%, 29.2%]) in the baseline and 392 (22.1% [95% CI = 20.2%, 24.0%]) in the intervention arm (p = 0.003). We do not know whether these patients were treated without notification.
There was no difference in sex, age, or TB treatment history among these three groups (positive, negative, or no test results) or between the smear microscopy and Xpert arms. In both arms, HIV-infected patients were notified with laboratory-negative pulmonary TB more often than patients with negative or unknown HIV status were (Table 1).
Table 2 shows the unadjusted results for the notification rate of laboratory-confirmed TB (primary endpoint). The cluster-averaged laboratory-confirmed TB notification rate was 30.5/100,000/year in the smear microscopy arm versus 48.7/100,000/year in the Xpert arm, for a notification rate ratio of 1.59 (95% CI = 1.31, 1.88) and a notification rate difference of 18.2/100,000/year (95% CI = 9.4, 26.8) favoring the Xpert arm. Notification rates for laboratory-confirmed TB were higher for men than for women, and highest in the age group 15-59 y; these patterns were consistent across both arms. In the ITT analysis (Tables S1 and S2), which includes laboratory diagnosis by smear microscopy in the Xpert arm, the TB notification rate ratio was slightly higher: 1.67 (95% CI = 1.39, 1.96). The reasons for examining these 2,170 Xpert-arm specimens by smear microscopy were as follows: insufficient volume (1,151; 53.0%), inadequate material (e.g., saliva) (200; 9.2%), and logistical obstacles (819; 37.7%). The last referred to a single laboratory where the problems were solved after the first month of implementation. Thirteen of the 14 laboratories showed an increase in laboratory-confirmed TB notification rate with the switch to Xpert (Table 3), although the difference was not significant for five of these laboratories. The laboratory-specific notification rate ratios ranged from 0.95 (95% CI = 0.65, 1.37) to 2.95 (95% CI = 1.48, 5.56), with a median of 1.53.
Possible changes over time in the effectiveness of the intervention were examined by plotting the notification rate ratio of laboratory-confirmed TB against the number of months since the switch from smear examination to Xpert (Figure S2) and by plotting the difference of the notification rate ratio in the intervention and baseline arms (Figure S3). In every study month, the laboratory-confirmed TB notification rate based on Xpert exceeded that based on smear examination (Figure S3), and the notification rate for laboratory-confirmed TB based on Xpert remained stable over time (Figure S2). The notification rate for overall pulmonary TB increased for 11 laboratories and decreased for three, with the laboratory-specific notification rate ratio ranging from 0.76 to 1.75 (median 1.16; Figure S4).
We performed sensitivity analyses in which we assumed that an increasing proportion of the laboratory-positive patients for whom no notification record was found was identified as "not notified" because of failed linkage between the GAL and SINAN databases. Of these analyses, the one assuming 100% failed database linkage equals the analysis comparing between the intervention and baseline arms the rates of positive laboratory tests irrespective of notification. There was no significant change in rate ratio (Figure 3). This suggests that the missing notifications of patients with positive laboratory results, whether because of failed database linkage or because of failed notification, occurred at random and did not bias our primary endpoints.
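The logic of this sensitivity analysis can be sketched with pooled counts. All numbers below are hypothetical and chosen only so that the example runs; the trial analysis was cluster-averaged, which this simplified sketch does not reproduce, and the function name is an assumption.

```python
def rate_ratio_under_linkage_failure(linked, unlinked, person_years, p):
    """Rate ratio (Xpert versus smear) if a fraction p of unlinked
    laboratory-positive patients had in fact been notified."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    rates = {
        arm: (linked[arm] + p * unlinked[arm]) / person_years[arm]
        for arm in ("smear", "xpert")
    }
    return rates["xpert"] / rates["smear"]


# Hypothetical counts and person-time, not taken from the trial.
linked = {"smear": 800, "xpert": 1400}
unlinked = {"smear": 300, "xpert": 400}
person_years = {"smear": 1_000_000, "xpert": 1_000_000}

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    rr = rate_ratio_under_linkage_failure(linked, unlinked, person_years, p)
    print(f"assumed linkage failure {p:.0%}: rate ratio {rr:.2f}")
```

At p = 1 the numerator in each arm is simply the number of positive laboratory tests, so the result coincides with the rate ratio for positive tests irrespective of notification.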
Treatment was initiated before sputum sample processing in 417 (36.5%) and 585 (36.5%) of those patients notified with a bacteriological test result in the smear microscopy and Xpert arms, respectively. Overall, the cluster-averaged time interval between sputum processing (generally the same day as sputum collection) and start of treatment decreased from a median of 11.

Secondary Endpoints
In contrast to the increase in the cluster-averaged laboratory-confirmed notification rate with the switch to Xpert, there were no significant changes in the cluster-averaged notification rate of laboratory-negative pulmonary TB (notification rate ratio 0.61, 95% CI = <0.01, 1.23), non-tested pulmonary TB (notification rate ratio 0.97, 95% CI = 0.63, 1.30), or overall pulmonary TB (notification rate ratio 1.15, 95% CI = 0.94, 1.37; Table 4). With multivariable adjustment, only the laboratory-negative TB notification rate ratio decreased significantly (0.52, 95% CI = 0.21, 0.84, p = 0.004; Table 4). Results for secondary endpoints stratified by patients' and laboratories' characteristics are presented in Tables 5-8. Across the laboratories, the population-based rates of positive test results regardless of notification (Figure S5) showed a distribution similar to that of the laboratory-confirmed TB notifications (Figure S4).

Secondary Analyses
Multivariable adjustment of the mixed multilevel models resulted in some increase in notification rate ratios for all endpoints (Table S3). The model for laboratory-confirmed TB showed a significant interaction between trial arm and baseline smear-positive rate: the notification rate ratio for Xpert versus smear microscopy decreased from 1.97 (95% CI = 1.58, 2.46) in laboratories with the lowest baseline smear-positive rates to 1.28 (95% CI = 1.01, 1.61) in laboratories with the highest baseline rates (Table S4). The decreased notification rate ratio for laboratory-confirmed TB with increasing smear positivity rate was not associated with the proportion of smears graded as scanty (i.e., 1-9 bacilli per 100 fields). Restricting the analyses to the period during which laboratories contributed to both arms (months 2 to 7) also did not affect the results (Table S5; Figure S4).

Discussion
This pragmatic trial showed that in a setting where laboratory diagnosis for pulmonary TB is largely restricted to sputum smear examination, implementing Xpert on a single sputum specimen increased laboratory-confirmed TB rates by 59% (31%-88%) and reduced time to treatment initiation from 11 to 8 d. The increase in TB confirmation was robust to potential confounding as well as to potential selection bias due to non-linkage of laboratory and notification databases, and was sustained over the study period. However, the overall notification rate of pulmonary TB, regardless of test results, did not increase significantly. In addition, because of the characteristics of the test, Xpert could accurately and promptly detect rifampicin resistance, which necessitates second-line drug treatment. Even in this setting with low drug resistance prevalence, the PPV for rifampicin resistance was high, although more than one-third of patients had no confirmatory DST result. Since culture with phenotypic DST is not routinely done in the country, these resistant cases would probably only be detected after treatment failure.
Our findings suggest that in this primary care setting, Xpert placed in laboratories is more useful for confirming pulmonary TB than for increasing TB detection. TB confirmation is relevant because it can potentially prevent many patients with respiratory symptoms who do not have TB from receiving unnecessary treatment and from having their true diagnosis delayed, although this remains to be demonstrated. The absence of a significant effect on detection of overall TB, a finding also recently reported from sub-Saharan Africa [4,32], is probably largely explained by the large proportion of patients who were started on empirical treatment [33]. Empirical treatment for TB is a medical decision that depends on pretest probability, the patient's clinical condition, and test availability. The sensitivity and specificity of empirical treatment based on the WHO algorithm [34] varies substantially [35]. The pooled specificity for this algorithm in smear-negative patients was 69% in a meta-analysis [35], suggesting that a great number of patients falsely diagnosed with TB on clinical and radiological grounds could benefit from better diagnosis. In our study, 44
There are alternative explanations for at least some of the non-laboratory-confirmed notifications. A positive test result may have been issued outside the study laboratories, such as at hospitals or small primary care laboratories, or database linkage may have failed despite the various algorithms used to address the lack of unique patient identifiers. Indeed, incomplete linkage has been described in previous studies in Brazil [36,37]. Finally, dropout between diagnosis and treatment may be underreported [38]. Incomplete linkage and notification are also likely causes of the high proportion of positive laboratory results for which no disease notification could be found [36,37]. We have no details about the patients who had a positive test but could not be retrieved in the notification database.
The crude proportion of these missing patients decreased from the baseline to the intervention period. The rate ratio for laboratory-positive notifications (1.59, 95% CI = 1.31, 1.88; Figure 4) was similar to the rate ratio for positive laboratory results (1.60, 95% CI = 1.33, 1.86; Figure S5), and we observed similarity in age and sex distribution between patients with and without laboratory results, both suggesting that missing data happened at random. Together these data suggest that dropout between diagnosis and treatment was lower in the intervention than in the baseline arm, although the true magnitude of the difference cannot be established. Although initial loss to follow-up declined, it still was substantial, and likely added to transmission in the community. However, Xpert implementation has the potential to diminish transmission by reducing time to treatment initiation and initial loss to follow-up.
Despite its limitations, reliance on routine reporting data allowed a highly pragmatic trial design, closely resembling routine clinical practice. In particular, it obviated the need for individual informed consent, which would have made a trial of this size unfeasible and would have carried a risk of non-participation. In addition, the stepped-wedge trial design had the advantage of allowing an assessment of the effectiveness of this diagnostic intervention during its implementation with a limited number of laboratories, while a parallel cluster-randomized design would have required a larger number of randomization units [26]. The design has, however, potential for bias, in particular when assignment of the outcome to a study arm is not straightforward, such as with delayed treatment effects, or when conditions that affect the outcome change over time [39]. We believe that neither possibility for bias applies to our study. The primary endpoint of disease notification occurred within weeks after the diagnostic test, so that mis-assignment is unlikely.
Because of the small number of clusters in our study, we opted for the most robust and conservative statistical approach based on cluster-averaged rather than overall rates [26]. This primary analysis did not allow adjustment for time effects. An increase in effectiveness over time after the switch to Xpert could indicate a learning curve effect, while a decrease over time would suggest that the excess cases notified in the Xpert arm compared to the smear microscopy arm reflect detection of a temporary "backlog" of prevalent TB cases not identified by smear examination, rather than recent incident cases. However, Figures S2 and S3 show no consistent pattern of change with time since the start of using Xpert, making it unlikely that these effects occurred. Furthermore, the secondary analysis based on overall rates supported the primary analysis results, and adjustment for time effects only increased the effect of Xpert on notification rates.
Another possible source of bias was the unexpectedly high proportion of insufficient-volume samples in the Xpert arm, which had to be examined microscopically. However, the ITT analysis, in which such smear examinations were included when positive, only increased the effect of Xpert, and only to a limited extent. Indeed, recent data suggest that Xpert sensitivity is unaffected by sputum volume [40].
There was substantial variation in notification rate ratios for laboratory-confirmed TB across the 14 study laboratories. The largest relative and absolute increases were observed for the laboratories with relatively low notification rates in the smear microscopy arm. Since the sensitivity of Xpert is less operator-dependent than that of smear examination [3], especially for specimens with low bacterial load, these differences between the laboratories probably reflect differences in the operator-dependent sensitivity of smear examination. This difference would imply that Xpert improves TB case detection most where smear laboratory performance is suboptimal, e.g., because of high workload or inexperienced technicians, even though this may not translate into improvement of case finding for reasons such as empirical treatment and dropout between diagnosis and treatment.
The relatively low proportion of Xpert rifampicin-resistant cases with culture and DST results available in the public sector calls for more active follow-up of those patients during scale-up. The high PPV of Xpert for rifampicin resistance in this low drug resistance setting confirms the very high specificity of the G4 test [41] and its high PPV in programmatic settings, as recently reported in South Africa [42]. The PPV may be even higher if we take into account that clinically and epidemiologically relevant rifampicin resistance may not be detected by phenotypic DST [43]. Moreover, the high PPV for rifampicin resistance indicates that appropriate treatment decisions could be made as soon as Xpert rifampicin resistance results are available. However, because of the small number of rifampicin-resistant cases in this study, the wide confidence intervals, and the large amount of missing information, caution in interpreting this finding is needed.
Further investigation during the scale-up of Xpert in the country is recommended in order to determine the appropriateness of changing the regimen upon two consecutive Xpert tests positive for rifampicin resistance, as recently recommended by WHO [10]. Negative predictive values could not be estimated from our data because of the study design. Results for negative predictive values are conflicting: while a recent study from India [44] suggested that in that setting, Xpert may miss a substantial proportion of rifampicin-resistant cases, in South Korea, the negative predictive value of Xpert was high (94.9%) [45].
Both smear microscopy and Xpert are, in principle, same-day tests. In a recent pragmatic trial, more patients had a same-day diagnosis and treatment initiation with Xpert than with smear microscopy [4]. This can be relevant if same-day diagnosis prevents dropout before treatment initiation. In our study, despite the significant reduction in time to treatment of patients with drug-susceptible TB from 11 to 8 d, a delay of more than a week is still unacceptable for a same-day test. This delay could partly be explained by the need to transport samples to the laboratory. Also, clinic routines in which patients are requested to come back after 1 or 2 wk to hear the results of diagnostic tests may not yet have been adjusted so soon after the introduction of Xpert. Comprehensive health-care interventions addressing all factors contributing to delays in treatment initiation, but also improvement of uptake of Xpert testing (i.e., reduction in treatment initiation without laboratory testing), are necessary if the population is to fully benefit from this new diagnostic technology, as found in South Africa [32]. We discuss operational implications in more detail in another article [46].
Studies having treatment outcomes as endpoints are needed to evaluate the possible effects of Xpert beyond its immediate advantages. The 59% (95% CI = 31%, 88%) increase in laboratory-confirmed diagnoses could have a substantial impact on patient adherence to TB treatment. From the patient's perspective, motivation may be higher to engage in a long-term treatment with documented evidence of the presence of M. tuberculosis.
In conclusion, this programmatic study showed the effectiveness of replacing smear microscopy with Xpert for TB case confirmation and reduction of time to treatment initiation at the population level. These results support the Brazilian Ministry of Health's decision to adopt Xpert as a replacement for smear microscopy in 92 municipalities that cover more than 55% of new TB cases countrywide. However, important challenges remain in order to take full advantage of the potential of this technology in pragmatic conditions, such as reducing treatment initiation delays more dramatically and avoiding unnecessary empirical treatment.
Table 4. Cluster-averaged notification rates, rate differences, and rate ratios for laboratory-confirmed TB, TB with negative test result, TB with no testing, and overall pulmonary TB.

Editors' Summary
Background. Tuberculosis, a contagious bacterial disease that usually infects the lungs, is a global public health problem. Each year, about 8.6 million people develop active tuberculosis and at least 1.3 million people die from the disease, mainly in resource-limited countries. Mycobacterium tuberculosis, the bacterium that causes tuberculosis, is spread in airborne droplets when people with active disease cough or sneeze. The characteristic symptoms of tuberculosis include cough, weight loss, and night sweats. Diagnostic tests for tuberculosis include sputum smear microscopy (microscopic analysis of mucus coughed up from the lungs), the growth (culture) of M. tuberculosis from sputum samples, and molecular tests (for example, the Xpert MTB/RIF test) that rapidly and accurately detect M. tuberculosis in sputum and determine its antibiotic resistance. Tuberculosis can be cured by taking several antibiotics daily for at least six months, although the emergence of multidrug-resistant tuberculosis is making the disease increasingly hard to treat.
Why Was This Study Done? Quick, accurate diagnosis of active tuberculosis is essential to reduce the global tuberculosis burden, but in most high-burden settings diagnosis relies on sputum smear analysis, which fails to identify many infected people. Mycobacterial culture correctly identifies more infected people but is slow, costly, and rarely available in resource-limited settings. In late 2010, therefore, the World Health Organization recommended the routine use of the Xpert MTB/RIF assay (Xpert) for tuberculosis diagnosis, and several resource-limited countries are currently scaling up the use of Xpert in their national tuberculosis control programs. However, although Xpert works well in ideal conditions, little is known about its performance in routine (real-life) settings. In this pragmatic stepped-wedge cluster-randomized trial, the researchers assess the impact of replacing smear microscopy with Xpert for the diagnosis of tuberculosis in Brazil, an upper-middle-income country with a high tuberculosis burden. A pragmatic trial asks whether an intervention works under real-life conditions; a stepped-wedge cluster-randomized trial sequentially and randomly rolls out an intervention to groups (clusters) of people.
What Did the Researchers Do and Find? The researchers randomly assigned 14 tuberculosis diagnosis laboratories in two cities to switch at different times from smear microscopy to Xpert for tuberculosis diagnosis. Specifically, at the start of the eight-month trial, all the laboratories used smear microscopy for tuberculosis diagnosis. At the end of each month, two laboratories switched to using Xpert, so that in the final month of the trial, all the laboratories were using Xpert. During the trial, 11,705 samples from patients with symptoms consistent with tuberculosis were examined using smear microscopy (baseline arm), and 12,522 samples were examined using Xpert (intervention arm). The researchers obtained the results of these tests from a database of all the diagnostic tests ordered in the Brazilian public laboratory system, and they obtained data on tuberculosis notifications during the trial period from the national notification system.
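The rollout described above can be illustrated with a minimal sketch. It assumes only what the summary states (14 laboratories, eight months, two laboratories crossing over at the end of each month); the laboratory labels, the function names, and the random seed are hypothetical and are not taken from the trial protocol.

```python
import random


def stepped_wedge_schedule(labs, n_months=8, per_step=2, seed=0):
    """Map each lab to the first month (1-based) in which it uses Xpert.

    Illustrative only: the crossover order is drawn with an arbitrary seed,
    not the trial's actual randomization.
    """
    labs = list(labs)
    # Every lab must have crossed over by the final month.
    assert len(labs) == per_step * (n_months - 1)
    order = labs[:]
    random.Random(seed).shuffle(order)  # random crossover order
    schedule = {}
    for i, lab in enumerate(order):
        step = i // per_step        # 0-based crossover step
        schedule[lab] = step + 2    # switch at end of month step+1
    return schedule


def arm(schedule, lab, month):
    """Return the diagnostic arm a lab contributes to in a given month."""
    return "Xpert" if month >= schedule[lab] else "smear"


labs = [f"lab_{i:02d}" for i in range(1, 15)]  # hypothetical labels
schedule = stepped_wedge_schedule(labs)
for month in range(1, 9):
    n_xpert = sum(arm(schedule, lab, month) == "Xpert" for lab in labs)
    print(f"month {month}: {n_xpert} of {len(labs)} labs using Xpert")
```

Under these assumptions, no laboratory uses Xpert in month 1 and all 14 do in month 8, so every laboratory contributes observation time to both arms.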
In total, 9.7% and 14.2% of the tests in the baseline and intervention arm, respectively, were positive, and the laboratory-confirmed tuberculosis notification rate was 1.59 times higher in the Xpert arm than in the smear microscopy arm. However, the overall notification rate (which included people who began treatment on the basis of symptoms alone) did not increase during the trial. The time to treatment (the time between the laboratory test date and the notification date, when treatment usually starts in Brazil) was about 11 days and eight days in the smear microscopy and Xpert arms, respectively.
What Do These Findings Mean? The findings indicate that, in a setting where laboratory diagnosis for tuberculosis was largely restricted to sputum smear examination, the implementation of Xpert increased the rates of laboratory-confirmed pulmonary (lung) tuberculosis notifications and reduced the time to treatment initiation, two endpoints of public health relevance. However, implementation of Xpert did not increase the overall notification rate of pulmonary tuberculosis (probably because of the high rate of empiric tuberculosis treatment in Brazil), although it did facilitate accurate and rapid detection of rifampicin resistance. The accuracy of these findings may be limited by certain aspects of the trial design, and further studies are needed to evaluate the possible effects of Xpert beyond diagnosis and the time to treatment initiation. Nevertheless, these findings suggest that replacing smear microscopy with Xpert has the potential to increase the confirmation (but not detection) of pulmonary tuberculosis and to reduce the time to treatment initiation at the population level in Brazil and other resource-limited countries.