
A Multi-Country Non-Inferiority Cluster Randomized Trial of Frontloaded Smear Microscopy for the Diagnosis of Pulmonary Tuberculosis



More than 50 million people around the world are investigated for tuberculosis using sputum smear microscopy annually. This process requires repeated visits and patients often drop out.

Methods and Findings

This clinical trial of adults with cough ≥2 wk duration (in Ethiopia, Nepal, Nigeria, and Yemen) compared the sensitivity/specificity of two sputum samples collected “on the spot” during the first visit plus one sputum sample collected the following morning (spot-spot-morning [SSM]) versus the standard spot-morning-spot (SMS) scheme. Analyses were per protocol analysis (PPA) and intention to treat (ITT). A sub-analysis compared just the first two smears of each scheme, spot-spot and spot-morning.

In total, 6,627 patients (3,052 SSM/3,575 SMS) were enrolled; 6,466 had culture and 1,526 were culture-positive. The sensitivity of SSM (ITT, 70.2%, 95% CI 66.5%–73.9%) was non-inferior to the sensitivity of SMS (ITT, 65.9%, 95% CI 62.3%–69.5%). Similarly, the specificity of SSM (ITT, 96.9%, 95% CI 93.2%–99.9%) was non-inferior to the specificity of SMS (ITT, 97.6%, 95% CI 94.0%–99.9%). The sensitivity of spot-spot (ITT, 63.6%, 95% CI 59.7%–67.5%) was also non-inferior to spot-morning (ITT, 64.8%, 95% CI 61.3%–68.3%), as the difference was within the selected −5% non-inferiority limit (difference ITT = 1.4%, 95% CI −3.7% to 6.6%). Patients screened using the SSM scheme were more likely to provide the first two specimens than patients screened with the SMS scheme (98% versus 94.2%, p<0.01). The PPA and ITT analyses gave similar results.


The sensitivity and specificity of SSM are non-inferior to those of SMS, with a higher proportion of patients submitting specimens. The scheme identifies most smear-positive patients on the first day of consultation.

Trial Registration

Current Controlled Trials ISRCTN53339491


Editors' Summary


Every year, nearly 10 million people develop tuberculosis—a contagious bacterial infection that usually affects the lungs (pulmonary tuberculosis)—and about 1.7 million people die from the disease. Mycobacterium tuberculosis, which causes tuberculosis, is spread in airborne droplets when people with the disease cough or sneeze. Thus, to control tuberculosis, it is essential that infected individuals are rapidly identified and treated. The “gold standard” diagnostic test for tuberculosis is mycobacterial culture, in which laboratory staff try to grow M. tuberculosis from sputum (mucus brought up from the lungs by coughing). However, although this test is sensitive (it detects most patients with tuberculosis) and has a high specificity (a low rate of false-positive results), it is too slow to produce results and too complex for routine use in the low- and middle-income countries where tuberculosis mainly occurs. In these countries, patients are usually investigated using direct sputum smear microscopy, a cheaper but less sensitive test in which multiple sputum samples treated with the acid-fast Ziehl-Neelsen stain are examined for the presence of M. tuberculosis bacilli.

Why Was This Study Done?

In most national tuberculosis control programs, patients provide an “on the spot” specimen during their initial consultation, a specimen collected at home the next morning, and another on-the-spot specimen when they bring their morning specimen to the clinic (a “spot-morning-spot,” or SMS, collection scheme). Unfortunately, patients often fail to return with their morning sample. Furthermore, the examination of three samples strains the limited laboratory resources of developing countries. Based on several recent reviews, the World Health Organization recently recommended that only two samples need be examined, a policy change that reduces the laboratory workload but does not avoid the problems of collecting a morning sample and patient drop-out during the diagnostic process. In this non-inferiority, cluster randomized trial, the researchers compare the sensitivity and specificity of a spot-spot-morning (SSM; two on-the-spot specimens collected during the first clinic visit an hour apart, and a third specimen collected at home the next morning) scheme for tuberculosis diagnosis with those of the standard SMS scheme. A non-inferiority trial investigates whether an intervention is not worse than a control intervention; a cluster randomized trial randomly assigns groups of patients rather than individual patients to the test and control interventions.

What Did the Researchers Do and Find?

The researchers enrolled 6,627 patients in Ethiopia, Nepal, Nigeria, and Yemen who had had a cough for more than two weeks (a characteristic symptom of tuberculosis). A quarter of the patients had culture-positive tuberculosis. The centers participating in the study were randomly assigned each week for a year to use either the SMS or the SSM sample collection scheme. Compared to mycobacterial culture, the sensitivities of the SSM and SMS schemes were 70.2% and 65.9%, respectively, which indicates that the new scheme was non-inferior to the SMS scheme. Similarly, the specificity of SSM (96.9%) was non-inferior to that of SMS (97.6%). Importantly, the sensitivity of diagnosis using just the first two samples collected in the SSM scheme was also non-inferior to the sensitivity of diagnosis using the first two samples collected in the SMS scheme (63.6% versus 64.8%; the researchers defined non-inferiority of SSM as a difference in its sensitivity compared to that of SMS of less than −5%). Finally, patients tested using the SSM scheme were more likely to provide the first two samples than patients tested using the SMS scheme (98% versus 94.2%).

What Do These Findings Mean?

These findings suggest that a sputum collection scheme in which two samples are collected one hour apart followed by a morning specimen could identify as many smear-positive patients as the standard SMS scheme. Importantly, they also indicate that examination of the first two specimens alone identifies most smear-positive patients independently of which scheme is used. These findings suggest that the SSM scheme might be more suitable for tuberculosis diagnosis than the SMS scheme in locations where patients are likely to drop out of the diagnosis process (for example, in low- and middle-income countries, where patients often live a long way from clinics). However, for an SSM scheme to work effectively, an on-site laboratory with a same-day turn-around service will be essential, and tuberculosis clinics will need to minimize contact between patients waiting to provide their second on-the-spot specimen.



Nine million people developed tuberculosis (TB) and 1.7 million died from the disease in 2008 [1], with over 90% of cases occurring in low- and middle-income countries (LMICs) [1]. Most patients in LMICs are investigated by direct sputum smear microscopy, which, although widely available, has low sensitivity [2] and requires the examination of multiple specimens over several days to maximise the identification of cases [3],[4]. Most national TB programmes (NTPs) collect specimens using a spot-morning-spot (SMS) scheme, whereby patients provide one “on the spot” specimen at the time of consultation, one specimen produced at home the morning of the following day, and a third specimen on the spot when the patient brings the morning specimen to the service. This scheme became widely adopted after a study by Andrews and colleagues in the 1950s concluded that this combination identified the highest number of patients with the lowest number of visits [5]. Since the scheme requires at least two visits, however, patients often abandon the diagnostic process [6]–[8].

A recent systematic review indicated that the first two sputum specimens identified 95%–98% of the smear-positive cases identified from three specimens [9]. Thus, given the excessive workloads of many laboratories in LMICs, the World Health Organization (WHO) recommended a reduction (from three to two) of the number of specimens examined under certain circumstances [10],[11]. Although this change may reduce laboratory workloads, patients would still make the same number of visits to the diagnostic centres because the collection of a morning specimen obliges all patients to come back to the centre the next day.

Andrews and colleagues' study had reported that the yield of the three specimens was independent of the order in which these specimens were collected [5], and a scheme collecting three specimens as spot-spot-morning (SSM) has recently been reported to result in the same yield as the standard SMS scheme [12]. Although this distinction seems small, this finding may be important, as most patients with positive smear microscopy are identified with the first two smears, and examining two on-the-spot smears may identify most smear-positive cases on the first day of consultation, thus avoiding the need for patients to return the next day. The latter report, however, had collected four specimens as spot-spot-morning-spot to examine different permutations [12]–[14] and did not explore whether a spot-spot-morning scheme would reduce the number of patients defaulting from the diagnostic process.

Schemes that collect specimens in an accelerated fashion may have the potential to improve diagnostic services in LMICs by reducing the number of visits and the number of patients defaulting [15]–[17]. We therefore conducted a trial to assess whether the sensitivity and specificity of schemes collecting most of the on-the-spot specimens on the day of first consultation are non-inferior to those of the current standard scheme. A complementary study that evaluates LED fluorescence microscopy within the context of these schemes is also published in this issue [18]. These schemes would be an important step towards making the diagnostic process more efficient and less onerous for patients.


Study Design

This was a prospective multi-country, randomized non-inferiority trial conducted in Ethiopia, Nepal, Nigeria, and Yemen to determine whether a scheme collecting two sputum specimens on the first day of consultation plus a morning specimen on the following day (SSM) had sensitivity and specificity that were non-inferior to those of the standard SMS scheme for the diagnosis of pulmonary TB (see Text S1).

Study Sites

Participants in Ethiopia were enrolled in Bushullo Major and Awassa Health Centres in the Southern Region. These are the main health service providers for Awassa District. Sputum specimens for culture were sent under appropriate conditions to the Armauer Hansen Institute, in Addis Ababa. In Abuja, Nigeria, patients were enrolled in Wuse District Hospital, and specimens were processed in Zankli Medical Centre, a private hospital endorsed as a diagnostic centre through contractual arrangement with the NTP. In Nepal, patients were enrolled from the DOTS Centre, Tribhuvan University Teaching Hospital, and the Dirgh Jeevan Health Care and Research Centre, Kathmandu. Specimens were processed in Tribhuvan University Health Research Laboratory. In Yemen, patients were enrolled at the Tuberculosis Institute in Sana'a. This institute, which houses the NTP and the national TB reference laboratory, also provides diagnostic services to the surrounding population and referred patients.

Inclusion and Exclusion Criteria

Patients ≥18 y old with cough ≥2 wk duration who had not received anti-TB treatment in the previous month presenting at the study site health service providers between 1 January 2008 and 30 March 2009 were eligible to participate.

Study Interventions and Randomization

Participants were asked to submit sputum specimens using the standard SMS or the new SSM scheme (Figure 1). The SMS scheme required one on-the-spot specimen at the time of the first visit, one specimen collected at home the following morning, and one on-the-spot specimen collected when the patient brought the morning sample to the laboratory. The SSM scheme required one on-the-spot specimen collected at the time of the first visit, a second on-the-spot specimen collected one hour later, and one morning specimen collected at home the following morning (see CONSORT statement and STARD checklist in Texts S2 and S3). All patients were requested to bring specimens on consecutive days, independently of the scheme.

The scheme to be used each week by each centre was allocated by block randomization over a period of 12 mo. After generating a list of random numbers ranging from 1 to 5 using Minitab Statistical Software, the scheme to be used in a specific week in each centre was allocated using a permuted block design. Blocks of fixed size were used to permute the week allocations (allocated as AABB, ABAB, BABA, ABBA, and BAAB, where A was the standard and B the frontloaded scheme). The schemes allocated were distributed to study centres concealed in sealed envelopes. The study coordinators were unaware of the block size and were only allowed to open the envelope at the start of each week. The decision to randomize by week was taken to test the study hypothesis within a context of a systems change, and it was considered that randomization of individuals was not feasible. The envelopes, thus, were not used to randomize individual participants, but provided a focal point to ensure all staff were aware of the scheme being used in a particular week. A total of 222 randomized weeks were distributed across the study sites, and of these, 114 were allocated to the standard and 108 to the frontloaded scheme, with a median number of patients per week of 24 and 26 patients for the SSM and SMS schemes, respectively (p>0.2).
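The weekly allocation described above can be sketched in Python. This is only an illustration under stated assumptions, not the Minitab procedure used in the trial: the function name `allocate_weeks`, the seed, and the 52-week horizon for a single centre are hypothetical, and only the five block patterns and the 1-to-5 draw come from the text.

```python
import random

# The five block patterns listed in the text (A = standard SMS, B = frontloaded SSM).
BLOCKS = ["AABB", "ABAB", "BABA", "ABBA", "BAAB"]


def allocate_weeks(n_weeks, seed=None):
    """Return one scheme letter per week by concatenating randomly drawn blocks."""
    rng = random.Random(seed)  # seeded generator so a schedule can be reproduced
    schedule = []
    while len(schedule) < n_weeks:
        # Draw a number from 1 to 5 and map it to the corresponding block.
        block = BLOCKS[rng.randint(1, 5) - 1]
        schedule.extend(block)
    return schedule[:n_weeks]  # truncates the last block if n_weeks is not a multiple of 4


# Hypothetical example: 52 weeks for one centre, arbitrary seed.
weeks = allocate_weeks(52, seed=2008)
print("".join(weeks))
print("standard weeks:", weeks.count("A"), "frontloaded weeks:", weeks.count("B"))
```

Because every block contains two A and two B weeks, the two schemes stay balanced within each four-week window, whatever blocks are drawn.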

Patients were not compensated for participating in the trial and underwent the same routine procedures undertaken under operational conditions. Patients were screened using the routine procedures of the outpatient clinics and were examined by a large number of clinicians. Cough registers were not kept, given the heavy workload of staff. Therefore, although the number of patients with chronic cough attending the clinics should have been similar to the number examined using smear microscopy, a small proportion may have been treated with antibiotics or may have left the service without notifying the staff, and are therefore unaccounted for.

Collection and Processing of Sputum Specimens

Standardised instructions for specimen collection, based on those used by Khan et al. [19], were given to all patients, and specimens were collected in pre-labelled pots. The number of patients who stopped attending and/or submitting specimens was recorded. Specimens were assessed macroscopically, and smears were stained using the hot Ziehl-Neelsen technique [3]. Slides were assigned study numbers and routine laboratory numbers, which were covered with wrap-around stickers for blinding. All smears were then mixed before examination (1,000×) and graded by laboratory technicians following the International Union Against Tuberculosis and Lung Disease system [3]. The stickers were removed only when another technician entered the results into a logbook. One sample per patient was processed for culture. The morning sample was selected for the majority of cases. If the patient did not submit a morning sample, then one of the spot specimens kept in the fridge was cultured. The specimen for culture was concentrated (Petroff's method) and cultured on solid medium using the standard operating procedures of the NTP. In Yemen and Nepal, specimens were cultured in Ogawa medium, while specimens in Nigeria and Ethiopia were cultured in Lowenstein-Jensen medium. Positive cultures were confirmed as acid fast bacilli (AFB), and standard bacteriological tests (niacin test) were used to identify the Mycobacterium tuberculosis complex.

Quality Assurance Procedures

The protocol was implemented in accordance with standard operating procedures and in compliance with good clinical practice/good clinical laboratory practice, and the STARD checklist is available (Text S3). A lot quality assurance sampling scheme was used to determine the sample size for external quality assessment (EQA), and EQA was conducted by two WHO/International Union Against Tuberculosis and Lung Disease Supranational Reference Laboratories. The sample size for selecting slides for EQA was based on the expected smear positivity rate and the number of negative smears examined in a year in each of the four laboratories, to assure a sensitivity of 90% relative to the controller and an accepted discrepancy number of 2 [20]. Sampling for EQA was performed before, during, and at the end of the study, and all sites met the pre-set quality standard. The information recorded by the interviewer was checked at the end of the day, and an attempt was made to contact the patient if any data were missing. If these attempts were unsuccessful and data was not recorded in the study laboratory logbooks, information that was missing was specified in the text.

Sample Size and Statistical Analysis

The sample size was calculated to establish that the SSM scheme was not inferior to the standard SMS scheme and to achieve 90% power to detect a non-inferiority margin difference of 5% between the proportions of patients with positive culture detected by the schemes. It was assumed that the standard approach would identify 50% of culture-positive patients, which was the yield observed in previous studies. The proportion of smear-positive cases identified by the SSM scheme was assumed to be 45% (or greater) of culture-positive patients under the null hypothesis of inferiority. The test statistic (one-sided unpooled z-test) was computed for the case scenario of the actual treatment group proportion being 50%. The significance level was targeted at 0.05 (5%). As the sample size was computed for culture-positive patients and it was assumed that 50% of patients undergoing screening would be culture-positive, the number of patients to be screened was 6,784.
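The calculation described above can be approximated with the textbook formula for a one-sided, unpooled z-test of non-inferiority between two proportions. The sketch below is an assumption about how such a figure might be derived, not the authors' actual computation; the function name `n_per_arm` is hypothetical. With the inputs stated in the text it gives a total of the same order as the 6,784 reported, although not an exact match, which may reflect differences in the software or rounding used.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_standard, p_new, margin, alpha=0.05, power=0.90):
    """Culture-positive patients needed per arm (one-sided test, unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_standard * (1 - p_standard) + p_new * (1 - p_new)
    # Distance between the assumed true difference and the inferiority boundary.
    effect = (p_new - p_standard) + margin
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Inputs taken from the text: both schemes detect 50%, margin 5%, 90% power, alpha 0.05.
per_arm = n_per_arm(p_standard=0.50, p_new=0.50, margin=0.05)
culture_positive_total = 2 * per_arm
# Assumed culture positivity among screened patients (50%, as stated in the text).
to_screen = ceil(culture_positive_total / 0.50)
print(per_arm, culture_positive_total, to_screen)
```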

The statistical analysis was carried out using the Stata9 statistical computer package [21]; “svy” commands were used where possible to adjust for clustering both within sites (countries) and within blocks. The study staff, however, were not blind to the scheme allocation, and the statistical analysis was not blinded. Variables were summarised as frequency counts/percentages (with two-sided confidence intervals), with the exceptions of age and cough duration, which were summarised using means/standard deviations and were compared between the two study arms to explore whether the randomization had worked and the characteristics of the patients were similar. Smears were classified as positive when ≥1 AFB in 100 fields were detected. Patients were considered smear-positive if they had ≥1 smear with ≥1 AFB, following current WHO definitions [10],[11]. Culture was the reference standard for the calculation of sensitivity and specificity. Analysis was based on both intention to (diagnose and) treat (ITT) analysis and per protocol analysis (PPA). Patients with missing smears were classified according to the results available (e.g., patients with negative first spot, negative morning, and missing second spot were classified as smear-negative) for the ITT analysis, but were excluded from the PPA. The ITT analysis is presented in the narrative of the text results for the sake of clarity, and both ITT and PPA are described in the tables for completeness. Eight patients allocated to the SSM scheme were examined with the SMS scheme, and three patients allocated to the SMS scheme were examined using the SSM scheme. These protocol violations represent <0.2% of the participants and mostly occurred at the start of the week. Given that their specimens were examined blindly, it was decided to include these patients in the scheme under which they were examined. Two further sub-analyses were conducted. 
One comprised the analysis of the first two smears of each scheme (spot-spot [SS] versus spot-morning [SM]) using culture as the reference standard, and the second included individuals who volunteered information about their HIV status. HIV counselling and testing were offered in Nigeria and Ethiopia, following national guidelines and routine testing conditions, and patients were asked whether they had been tested for HIV. Only patients who volunteered this information were categorised as HIV-positive or -negative, and the tests used varied across study sites. HIV testing procedures for TB programmes therefore varied by site. Testing was not available for all patients in Yemen and Nepal, and the uptake of HIV testing varied significantly between Ethiopia and Nigeria. The sub-analysis stratified by HIV was therefore admittedly underpowered and prone to self-selection bias, and it is difficult to interpret. The results, however, are included to provide preliminary information on the potential performance of the schemes in this patient population.
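The smear-positivity definition and the ITT/PPA handling of missing smears described above can be expressed as two small rules. The following Python sketch is illustrative only; the function names and the three example patients are hypothetical, with `None` standing for a smear that was not submitted.

```python
from typing import Optional, Sequence


def smear_positive(afb_counts: Sequence[Optional[int]]) -> bool:
    """Smear-positive if any available smear shows >= 1 AFB per 100 fields."""
    return any(c is not None and c >= 1 for c in afb_counts)


def in_per_protocol(afb_counts: Sequence[Optional[int]]) -> bool:
    """The PPA keeps only patients who submitted every requested smear."""
    return all(c is not None for c in afb_counts)


# Hypothetical patients: AFB per 100 fields for each of three smears.
patients = {
    "P1": [0, 0, None],      # example from the text: ITT smear-negative, excluded from PPA
    "P2": [0, 3, 12],        # complete and smear-positive
    "P3": [None, None, 1],   # ITT smear-positive on the one result available
}
for pid, smears in patients.items():
    print(pid, "ITT smear-positive:", smear_positive(smears),
          "| included in PPA:", in_per_protocol(smears))
```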

Protocol Registration and Ethical Approval

The protocol (International Standard Randomized Controlled Trial Number Register ISRCTN53339491) was approved by the WHO Ethics Review Committee, the Liverpool School of Tropical Medicine Ethics Research Committee, and the national and institutional ethics committees of the four countries. Consent and information sheets were translated, and informed witnessed written or oral consent was obtained.


Characteristics of Participants

A total of 6,627 patients (1,909 in Ethiopia, 630 in Nepal, 1,238 in Nigeria, and 2,850 in Yemen) were enrolled. Of these, 3,053 (46.1%) were screened with the frontloaded and 3,574 (53.9%) with the standard scheme. The characteristics of the participants are shown in Table 1, stratified by study arm. There were no statistically significant differences between the participants assigned to the SSM and SMS study arms in any of the four sites (countries). Patients in Yemen and Nepal were older than patients in Ethiopia and Nigeria, while patients in Ethiopia and Yemen (47.7% and 49.2%) were more likely to come from rural areas than patients in Nepal and Nigeria (10.2% and 13.2%). Patients from Nepal had longer cough duration at the time of consultation than patients enrolled at the other sites. The most frequent symptoms, besides cough, were chest pains (79.1%), weight loss (70.5%), and fever (70.2%). Only 1.1% of patients in Yemen, 2.6% in Nepal, 22.6% in Ethiopia, and 54.1% in Nigeria knew their HIV status. Among patients who reported their HIV status, patients in Nigeria were more likely to be HIV-positive (71.3%) than those in Ethiopia (23.7%), Nepal (12.5%), and Yemen (3.0%).

Table 1. Demographic and clinical characteristics of participants.

Completeness of Specimen Submission and Sputum Grades

Figure 2 presents the number of patients who submitted one, the first two, and all three specimens requested, by collection scheme. Patients following the SSM scheme were more likely to submit the first two specimens than patients following the SMS scheme (97.6% versus 94.2%; difference 3.4%, 95% CI 2.3%–4.6%). Although the waiting time for the second specimen in the SSM scheme was only one hour, some patients still left the clinic without submitting this specimen. The proportion of patients submitting all three smears did not differ significantly for patients screened with the SSM and SMS schemes (94.1% versus 92.8%, respectively; difference 1.2%, 95% CI −0.4% to 2.8%). The proportion of patients with one or more positive smear results by study site and scheme is shown in Table 1. Overall, 582 (19.1%) of the 3,053 patients examined using the SSM scheme were smear-positive, compared to 642 (18.0%) of 3,574 examined using the SMS scheme (p = 0.5). Spot specimens were more likely to have low smear grades (“scanty” [<10 AFB per 100 fields] or “+” [10–99 AFB per 100 fields]) (9.0%/9.0% for SSM and 9.0%/8.1% for SMS, respectively) than the morning specimens (6.9% for both schemes). The lower AFB grades of the spot specimens resulted in a slightly higher proportion of morning specimens being graded as positive (16.4% and 16.8% of the morning specimens versus 14.6% and 15.3% of first-spot specimens of the SSM and SMS schemes, respectively).

Figure 2. Number of patients submitting the first, the first two, and all three specimens, by scheme.

Error bars represent the upper limits of the 95% confidence intervals.

Sensitivity and Specificity of SSM and SMS Schemes

Sputum specimens of 6,467 (97.6%) of the patients enrolled were cultured, and 1,561 (24.1%) were culture-positive. The ITT analysis was conducted in 6,358 patients (2,929 SSM and 3,429 SMS), as shown in Table 2. SSM identified 500 of 712 culture-positive patients (sensitivity 70.2%, 95% CI 66.5%–73.9%), and SMS identified 559 of 849 culture-positive patients (sensitivity 65.9%, 95% CI 62.3%–69.5%). The difference in sensitivity (SSM minus SMS) was 4.3% (95% CI for the difference  = 0.6% to 9.0%), indicating that the sensitivity of SSM was non-inferior to the sensitivity of the SMS scheme. SSM had a specificity of 96.9% (95% CI 93.2%–99.9%), with 2,149 of 2,217 culture-negative patients being smear-negative, compared to 2,518 of 2,580 (97.6%, 95% CI 94.0%–99.9%) patients examined with the SMS scheme. The difference was −0.7% (95% CI −1.9% to 0.4%), indicating that the specificity of SSM was non-inferior to the specificity of SMS.
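To make the non-inferiority logic concrete, the sensitivity comparison can be recomputed from the ITT counts given above. The sketch below uses a simple unadjusted Wald interval, which is an assumption made for illustration: the trial analysis adjusted for clustering with survey commands, so the limits printed here are not expected to match the reported ones. The function name `wald_difference_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_difference_ci(x_new, n_new, x_std, n_std, level=0.95):
    """Difference in proportions (new minus standard) with an unadjusted Wald interval."""
    p_new, p_std = x_new / n_new, x_std / n_std
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p_new - p_std
    return diff, diff - z * se, diff + z * se


MARGIN = -0.05  # predefined non-inferiority limit from the text

# ITT counts from the text: smear-positive patients among culture-positive patients.
diff, lower, upper = wald_difference_ci(x_new=500, n_new=712, x_std=559, n_std=849)
non_inferior = lower > MARGIN  # non-inferior if the lower limit stays above the margin
print(f"sensitivity difference {diff:.1%} (95% CI {lower:.1%} to {upper:.1%})")
print("non-inferior:", non_inferior)
```

The decision rule is the one stated in the text: the new scheme is declared non-inferior when the lower confidence limit of the difference does not fall below −5%.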

Table 2. Sensitivity and specificity of two and three smears, stratified by scheme by intention to treat and per protocol analysis.

The same 6,358 patients were included in the ITT analysis of the first two smears. SS identified 453 of 712 culture-positive patients (sensitivity  = 63.6%, 95% CI 59.7%–67.5%), while SM identified 550 of 849 culture-positive patients (sensitivity  = 64.8%, 95% CI 61.3%–68.3%). The difference in sensitivity (SS minus SM) was −1.2% (95% CI for the difference  = −3.9% to 6.4%), indicating that the sensitivity of SS was non-inferior to that of the SM scheme, as the lower limit of the 95% CI does not exceed the predefined non-inferiority limit of −5%. The specificity of SS (97.4%, 95% CI 93.5%–99.9%) was not inferior to the specificity of the SM scheme (97.8%, 95% CI 94.3%–99.9%). The difference in specificity between the schemes was −0.4% (95% CI −1.4% to 0.6%), indicating that the specificity of SS was non-inferior to the specificity of SM. The PPA results were similar to those of the ITT analysis (Table 2). If the smears collected only on the first day were included (SS for SSM and S for SMS), the sensitivity of SS (63.6%, 95% CI 59.7%–67.5%) was higher than the sensitivity of the first S specimen of the SMS scheme (466/845; 55.1%, 95% CI 51.7%–58.5%), while the specificities were similar (97.4%, 95% CI 93.5%–99.9%, and 98.5%, 95% CI 97.9%–98.9%, respectively).

The numbers of patients that would be missed or correctly identified by the schemes for each 1,000 patients screened are indicated in Table 3. These parameters were calculated using the 24.1% culture positivity obtained and the sensitivity and specificity of the PPA and ITT analysis. Using the ITT parameters, three specimens collected with the SSM and SMS schemes would result in 90.4% and 90.0% of patients, respectively, being correctly classified, while the SS and SM smears would result in 89.2% and 89.8% of patients, respectively, being correctly classified. The results obtained with the PPA were similar to those obtained with the ITT analysis.
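The projection per 1,000 patients screened follows directly from the culture positivity, sensitivity, and specificity. A minimal Python sketch is given below, using the rounded ITT percentages quoted in the text; because the inputs are rounded, the outputs may differ from Table 3 by about a tenth of a percentage point. The function name `per_thousand` is hypothetical.

```python
def per_thousand(prevalence, sensitivity, specificity, screened=1000):
    """Expected classification of `screened` patients against culture."""
    positives = screened * prevalence   # culture-positive patients
    negatives = screened - positives    # culture-negative patients
    true_pos = positives * sensitivity
    true_neg = negatives * specificity
    return {
        "true_positive": true_pos,
        "missed": positives - true_pos,
        "true_negative": true_neg,
        "false_positive": negatives - true_neg,
        "correct_pct": 100 * (true_pos + true_neg) / screened,
    }


# Rounded ITT (sensitivity, specificity) pairs quoted in the text.
ITT = {
    "SSM": (0.702, 0.969),
    "SMS": (0.659, 0.976),
    "SS": (0.636, 0.974),
    "SM": (0.648, 0.978),
}
results = {scheme: per_thousand(0.241, sens, spec) for scheme, (sens, spec) in ITT.items()}
for scheme, r in results.items():
    print(scheme, {k: round(v, 1) for k, v in r.items()})
```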

Table 3. Number of patients that would be correctly identified by each smear microscopy scheme using the PPA and ITT analysis.

In total, 1,059 patients reported their HIV status. The sensitivity (ITT) of the three-smear schemes decreased from 81.5% among HIV-negative to 71.2% among HIV-positive patients under the SSM scheme and from 68.8% to 51.8% under the SMS scheme. HIV co-infection thus seemed to reduce the sensitivity of smear microscopy independently of the scheme, although, as stated, the study was underpowered for this sub-analysis.


TB is a disease of poverty and a global public health emergency [1]. Patients with chronic cough are the main source of infection, and their early identification and treatment are key to effective control [1]. Simple diagnostics suitable for community-based health services would significantly improve TB control activities but unfortunately are not expected to be available in the near future [16]. Smear microscopy thus remains the test most widely used for diagnosis in LMICs.

Examining sputum smears is relatively simple, and approximately 50 million patients are investigated by smear microscopy each year. Between 70% and 90% of these examinations take place in 22 high-TB burden countries [21], where 60% of the population lives on less than US$2.00/day. Patients in these settings often travel to diagnostic centres, where they are faced with a process lasting several days and necessitating further expenditure. Although drop-out rates among patients undertaking smear microscopy are infrequently reported, 13% of patients in Chennai, India, 37% in Lilongwe, Malawi, and 95% in Lusaka, Zambia, fail to complete the process [6]–[8]. Furthermore, between 5% (e.g., Pakistan [22]) and 52% (e.g., Cape Town, South Africa [23]) of cases with smear-positive pulmonary TB default the diagnostic process after submission of the first specimen [22]–[27]. These percentages are likely to be underestimates, as most reports used the old case definition requiring two or more positive smears to classify a case as smear-positive. Defaulting patients have a high mortality rate in low-resource settings [23],[27], and reducing the number of visits required could increase the acceptability of and adherence to the diagnostic process and reduce mortality. Currently, most patients requested to provide SMS specimens receive the results of all sputum examinations on the day after consultation, or later. If the first sputum examination is positive but the patient does not return the next day, the patient does not receive any of the results and is lost. Schemes that facilitate the identification of the majority of smear-positive patients and provide these results on the first day of consultation therefore have the potential to benefit large numbers of patients.

The proportion of patients who failed to complete the submission of all specimens during this study was lower than under operational conditions. Even so, participants in the SSM arm were more likely to submit their first two sputum samples compared to participants in the SMS arm, and the scheme could be implemented in a way that allows patients to return later in the day to receive laboratory results and referral for treatment. As most smear-positive patients were identified by the first two smears, a higher number of patients could be referred for treatment, with significant operational advantages in locations where many patients abandon the diagnostic process.

The SSM scheme has sensitivity and specificity that are non-inferior to those of the SMS scheme used in most LMICs. Similarly, the two-specimen SS scheme sensitivity was not inferior to that of the two-specimen SM scheme, and the losses in sensitivity from the three- to the two-specimen scheme are within the range predicted by a systematic review (0%–10% losses) [9]. Using two spot specimens in this study resulted in up to 7% lower sensitivity than using the SSM scheme. NTPs with overburdened laboratories that screen patients following the WHO recommendation of examining two smears [28] thus could consider collecting two on-the-spot specimens if a significant proportion of patients default from the diagnostic process, as the lower drop-out in patient numbers would at least compensate for the loss in sensitivity. In addition, programmes that decided to continue collecting three specimens could use an SSM scheme and identify most smear-positive cases the first day of consultation. This seemingly small intervention therefore has the potential to reduce losses to follow-up in areas where a significant proportion of patients fail to return for the second day of sputum submission. SS schemes may also be very useful in combination with new diagnostics with higher sensitivity, such as the new automated nucleic acid amplification tests [29]. The WHO-endorsed Xpert tests (Cepheid) are likely to be used mostly in patients with negative smear microscopy, and the rapid screening of patients that require further testing would be key for avoiding delays in the diagnostic process.

There are limitations and aspects that merit further monitoring to ascertain whether the trial results can be replicated under more realistic programmatic conditions. There has been considerable discussion in the scientific literature as to whether non-inferiority trials provide the same quality of evidence as superiority trials [30]. Non-inferiority trials often require smaller sample sizes, and systematic biases usually influence the results towards finding non-inferiority [31]. Superiority trials, in turn, require larger sample sizes than non-inferiority studies and therefore require resources that are often out of reach for interventions with low commercial value. Although the study was conducted using randomization procedures that were homogeneous across study sites and strict blinding of smear gradings, non-quantified biases beyond the control of the investigators may have influenced these results. There was also an unequal number of participants in the two schemes, which resulted from nine “frontloaded” weeks falling on weeks with public holidays (Christmas/New Year and Eid) compared to only one of the “standard” weeks. Recruitment was very low during these weeks, and because of this, together with the fact that an additional six weeks were allocated to the standard scheme, we enrolled unequal numbers. Further still, the study may have been underpowered, given the assumption that 50% of screened cases would be culture-positive, whereas only 24% were positive. Despite these limitations, the strikingly similar differences in sensitivity and specificity observed under the ITT analysis and PPA strengthen the conclusions that can be drawn from the findings.
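The effect of the lower-than-expected culture positivity on precision can be sketched as follows. This is an illustration under stated assumptions and not the trial's sample size calculation: it uses a simple normal approximation for a single proportion, ignores the cluster design, and applies the 50% planning assumption to the number of patients who actually had culture (6,466, of whom 1,526 were culture-positive, as reported in the abstract). The sensitivity value of 0.65 is a hypothetical placeholder; the ratio computed does not depend on it.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Normal-approximation half-width of a 95% CI for a proportion p
    estimated from n independent observations (no cluster adjustment)."""
    return z * math.sqrt(p * (1 - p) / n)


cultured = 6466             # patients with culture (reported)
observed_positives = 1526   # culture-positive patients (reported)
assumed_fraction = 0.50     # planning assumption for culture positivity

# Simplifying assumption: the planned number of culture-positives is the
# planning fraction applied to the number of patients actually cultured.
planned_positives = cultured * assumed_fraction

illustrative_sensitivity = 0.65  # hypothetical; cancels out in the ratio
widening = (
    ci_half_width(illustrative_sensitivity, observed_positives)
    / ci_half_width(illustrative_sensitivity, planned_positives)
)

print(f"observed culture positivity: {observed_positives / cultured:.1%}")
print(f"CI half-width inflation factor: {widening:.2f}")
```

Because sensitivity is estimated among culture-positive patients only, fewer culture-positives widen its confidence interval roughly in proportion to the square root of the shortfall, which makes it harder for the interval to stay within a fixed non-inferiority margin.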

HIV testing was not offered routinely to all study participants. Unfortunately, this service was not available for all patients in Yemen and Nepal, and the uptake of HIV testing varied significantly between Ethiopia and Nigeria. Patients with severe TB symptoms were more likely to take up HIV testing than patients with milder clinical presentation; thus, patients accepting these tests in Ethiopia and Nigeria may have had more advanced stages of TB. The data analysis stratified by HIV status is therefore prone to bias and needs to be considered as only a rough indication of the performance of the scheme in this population. Although the data suggest that the two schemes had similar sensitivity in the study patients, the sensitivity could have been different if all patients had been tested for HIV.

The decision to adopt these schemes needs to consider the greater bacillary yield of the morning sample, the potential burden on laboratory technicians to provide results the same day to start treatment, and the potential inconvenience and nosocomial exposure involved in having the patient wait at the health care facility an hour to give a second spot sample. While providing results the same day sounds attractive, further evidence is needed to demonstrate that it is feasible to implement the scheme under programmatic conditions. This would be especially challenging in rural and remote areas and primary health care clinics where potential TB patients may find it hardest to access clinics and where capacity to undertake smear examinations may not be available on site. The implementation of an SSM scheme would therefore require changes to the provision of services, including an on-site laboratory with a same-day turn-around time, services that enable the initiation or referral of patients for treatment at the time that results become available, and, in high HIV prevalence areas, services to minimize contact between patients, and thus minimize the risk of nosocomial infection, by collecting the first specimen during the triage stage, collecting the second specimen one hour later, and asking the patient to return to the clinic in the afternoon.

Operational research is also needed to monitor the performance of the schemes. For example, smears were examined blindly to ensure data quality, but this routine also prevented staff from copying the results of a patient's previous slides for that patient's subsequent slides, which is a practice widely suspected in overburdened laboratories and reported anecdotally [32]. As technicians may remember the results of uncovered smears collected in quick succession, they may be less motivated to examine a second or third smear if the first sputum is negative. There is also no evidence about whether early diagnosis results in increased treatment uptake, decreased mortality, or improved uptake of HIV testing or ART for HIV [33]–[36].

In summary, a scheme consisting of two smears collected one hour apart followed by a morning specimen identifies as many smear-positive patients as the standard SMS scheme, and patients were more likely to submit the first two specimens. The examination of the first two specimens identified the majority of smear-positive patients, independently of the scheme. Two spot specimens did not have sensitivity inferior to two specimens collected as spot and morning, and the former combination could be more suitable for locations where patients are likely to default from the diagnostic process. Smear microscopy had reduced sensitivity in patients co-infected with HIV, but this loss seemed to be independent of the scheme.

The identification of the majority of smear-positive patients may require no more than one patient visit, and the scheme presented here has the potential to improve the diagnosis of pulmonary TB in LMICs [16]. A single-visit diagnosis would represent a substantial opportunity to improve the delivery of TB services, particularly to the poor.

Supporting Information


We are grateful to the National TB Reference Laboratory, Korean Institute of Tuberculosis, Seoul, Korea, and the National TB Reference Laboratory, Bureau of Tuberculosis, Bangkok, Thailand, for re-examining the smears for the EQA. We are also grateful to Drs. Gillian Mann (Liverpool School of Tropical Medicine, UK), Jailson Barros Correia (Instituto Materno Infantil Prof. Fernando Figueira, Brazil), and Veronique Vincent (Stop TB Department, WHO, Switzerland) for their input into the research planning workshop held in Geneva in 2007. Luis Eduardo Cuevas, Carl-Michael Nathanson, Jean Joly and Andrew Ramsay are staff members of the World Health Organization.

Author Contributions

ICMJE criteria for authorship read and met: LEC MAY NAS LL IA NAA JBS AAA ENE YM MIO JOO MA AA GH RMAdC KK DvS CMN JJ BF SBS AR. Agree with the manuscript's results and conclusions: LEC MAY NAS LL IA NAA JBS AAA ENE YM MIO JOO MA AA GH RMAdC KK DvS CMN JJ BF SBS AR. Designed the experiments/the study: LEC MAY JJ BF AR. Analyzed the data: LEC MAY JBS RMAdC KK DvS BF AR. Collected data/did experiments for the study: NAS LL IA NAA JBS ENE YM MIO JOO MA AA GH RMAdC KK. Enrolled patients: NAS IA NAA JBS AAA ENE MIO JOO. Wrote the first draft of the paper: LEC MAY AR. Contributed to the writing of the paper: LEC MAY NAS LL NAA JBS AA RMAdC KK DvS CMN BF SBS AR. Principal Investigator: LEC. Initiators of the study: LEC MAY. Co-principal Investigator: MAY. Responsible for and supervised study in Yemen: NA-S. Responsible for and supervised study in Nigeria: LL. Responsible for and supervised the study in Bushulo Major Health Centre, Ethiopia: IA. Responsible for and supervised study in Yemen: NA-A. Responsible for and supervised study in Nepal: JBS. Responsible for the laboratory experiments conducted for the study in Nigeria: ENE. Contributed to laboratory testing, culture of specimens, and implementation of the EQA in Ethiopia: YM AA. Collection and examination of sputum samples in the laboratory: JOO. Contributed to EQA: GH. Contribution to overall management of study at international level: AR.


  1. World Health Organization (2009) Global tuberculosis control: epidemiology, strategy, financing. Geneva: World Health Organization.
  2. Perkins MD, Cunningham J (2007) Facing the crisis: improving the diagnosis of tuberculosis in the HIV era. J Infect Dis 196 (Suppl 1): S15–S27.
  3. International Union Against Tuberculosis and Lung Disease (2000) Technical guide. Sputum examination for tuberculosis by direct microscopy in low-income countries. Paris: International Union Against Tuberculosis and Lung Disease. 5th edition.
  4. World Health Organization (1998) Tuberculosis handbook. Geneva: World Health Organization.
  5. Andrews RH, Radhakrishna S (1959) A comparison of two methods of sputum collection in the diagnosis of pulmonary tuberculosis. Tubercle 40: 155–162.
  6. Chandrasekaran V, Ramachandran R, Cunningham J, Balasubramaniun R, Thomas A, et al. (2005) Factors leading to tuberculosis diagnostic drop-out and delayed treatment initiation in Chennai, India. Int J Tuberc Lung Dis 9: 172.
  7. Kemp J, Squire SB, Nyirenda IK, Salaniponi FML (1996) Is tuberculosis diagnosis a barrier to care? Trans R Soc Trop Med Hyg 90: 472.
  8. Nota A, Ayles H, Perkins M, Cunningham J (2005) Factors leading to tuberculosis diagnostic drop-out and delayed treatment initiation in urban Lusaka. Int J Tuberc Lung Dis 9: 305.
  9. Mase SR, Ramsay A, Ng V, Henry M, Hopewell PC, et al. (2007) Yield of serial sputum specimen examinations in the diagnosis of pulmonary tuberculosis: a systematic review. Int J Tuberc Lung Dis 11: 485–495.
  10. World Health Organization (2007) Definition of a new sputum smear-positive TB case. Geneva: World Health Organization.
  11. World Health Organization (2007) Reduction of number of smears for the diagnosis of pulmonary TB. Geneva: World Health Organization.
  12. Ramsay A, Yassin MA, Cambanis A, Hirao S, Almotawa A, et al. (2009) Front-loading sputum microscopy services: an opportunity to optimise smear-based case-detection of tuberculosis in high prevalence countries. J Trop Med 2009: 398767.
  13. Cambanis A, Yassin MA, Ramsay A, Squire SB, Arbide I, et al. (2006) A one-day method for the diagnosis of pulmonary tuberculosis in rural Ethiopia. Int J Tuberc Lung Dis 10: 230–232.
  14. Hirao S, Yassin MA, Khamofu HG, Lawson L, Cambanis A, et al. (2007) Same-day smears in the diagnosis of tuberculosis. Trop Med Int Health 12: 1459–1463.
  15. Cambanis A, Yassin MA, Ramsay A, Bertel Squire S, Arbide I, et al. (2005) Rural poverty and delayed presentation to tuberculosis services in Ethiopia. Trop Med Int Health 10: 330–335.
  16. Keeler E, Perkins MD, Small P, Hanson C, Reed S, et al. (2006) Reducing the global burden of tuberculosis: the contribution of improved diagnostics. Nature 444 (Suppl 1): 49–57.
  17. Liu X, Thomson R, Gong Y, Zhao F, Squire SB, et al. (2007) How affordable are TB diagnosis and treatment in rural China? An analysis from community and TB patient perspectives. Trop Med Int Health 12: 1464–1471.
  18. Cuevas LE, Al-Sonboli N, Lawson L, Yassin MA, Arbide I, et al. (2011) LED Fluorescence Microscopy for the Diagnosis of Pulmonary Tuberculosis: A Multi-Country Cross-Sectional Evaluation. PLoS Med 8: e1057.
  19. Khan MS, Dar O, Sismanidis C, Shah K, Godfrey-Faussett P (2007) Improvement of tuberculosis case detection and reduction of discrepancies between men and women by simple sputum-submission instructions: a pragmatic randomised controlled trial. Lancet 369: 1955–1960.
  20. Aziz MA, Ba F, Becx-Bleumink M, Bretzel G, Humes R, et al. (2002) External quality assessment for AFB smear microscopy. In: Ridderhof J, Humes R, Boulahbal F, editors. Washington (District of Columbia): Association of Public Health Laboratories.
  21. Reichman LB, Herschfield ES, editors. (2006) Tuberculosis: a comprehensive international approach, 3rd ed. New York: Informa Healthcare USA.
  22. Khan MS, Khan S, Godfrey-Fausset P (2009) Default during TB diagnosis: quantifying the problem. Trop Med Int Health 14: 1437–1441.
  23. Botha E, Den Boon S, Verver S, Dunbar R, Lawrence KA, et al. (2008) Initial default from tuberculosis treatment: how often does it happen and what are the reasons? Int J Tuberc Lung Dis 12: 820–823.
  24. Botha E, Den Boon S, Lawrence KA, Reuter H, Verver S, et al. (2008) From suspect to patient: tuberculosis diagnosis and treatment initiation in health facilities in South Africa. Int J Tuberc Lung Dis 12: 936–941.
  25. Creek TL, Lockman S, Kenyon TA, Makhoa M, Chimidza N, et al. (2000) Completeness and timeliness of treatment initiation after laboratory diagnosis of tuberculosis in Kabore, Botswana. Int J Tuberc Lung Dis 4: 956–961.
  26. Harries AD, Rusen ID, Chiang CY, Hinderaker SG, Enarson DA (2009) Registering initial defaulters and reporting on their treatment outcomes. Int J Tuberc Lung Dis 13: 801–803.
  27. Squire SB, Belaye AK, Kashoti A, Salaniponi FM, Mundy CJ, et al. (2005) ‘Lost’ smear-positive pulmonary tuberculosis cases: where are they and why did we lose them? Int J Tuberc Lung Dis 9: 25–31.
  28. World Health Organization (2007) New WHO policies. Revised WHO policy guidelines for tuberculosis. Geneva: World Health Organization.
  29. Boehme CC, Nabeta P, Hillemann D, Nicol MP, Shenai S, et al. (2010) Rapid molecular detection of tuberculosis and rifampin resistance. N Engl J Med 363: 1005–1015.
  30. Nunn AJ, Phillips PP, Gillespie SH (2008) Design issues in pivotal drug trials for drug sensitive tuberculosis (TB). Tuberculosis (Edinb) 88 (Suppl 1): S85–S92.
  31. Garattini S, Bertele V (2007) Non-inferiority trials are unethical because they disregard patients' interests. Lancet 370: 1875–1877.
  32. Van Deun A, Zwahlen M, Bola V, Lebeke R, Bahati E, et al. (2007) Validation of candidate smear microscopy quality indicators, extracted from tuberculosis laboratory registers. Int J Tuberc Lung Dis 11: 300–305.
  33. Edginton ME, Wong ML, Hodkinson HJ (2006) Tuberculosis at Chris Hani Baragwanath Hospital: an intervention to improve patient referrals to district clinics. Int J Tuberc Lung Dis 10: 1018–1022.
  34. Houyang H, Chepote F, Gilman RH, Moore DA (2005) Failure to complete the TB diagnostic algorithm in urban Perú: a study of contributing factors. Trop Doct 35: 120–121.
  35. Thongraung W, Chongsuvivatwong V, Pungrassamee P (2008) Multilevel factors affecting tuberculosis diagnosis and initial treatment. J Eval Clin Pract 14: 378–384.
  36. Zhang T, Tang S, Jun G, Whitehead M (2007) Persistent problems of access to appropriate, affordable TB services in rural China: experiences of different socio-economic groups. BMC Public Health 7: 19.