Facemask against viral respiratory infections among Hajj pilgrims: A challenging cluster-randomized trial

Background In this large-scale cluster-randomized controlled trial (cRCT) we sought to assess the effectiveness of facemasks against viral respiratory infections. Methods and results Over three consecutive Hajj seasons (2013, 2014, 2015) pilgrims’ tents in Makkah were allocated to ‘facemask’ or ‘no facemask’ group. Fifty facemasks were offered to participants in intervention tents, to be worn over four days, and none were offered to participants in control tents. All participants recorded facemask use and respiratory symptoms in health diaries. Nasal swabs were collected from the symptomatic for virus detection by reverse transcription polymerase chain reaction. Clinical symptoms and laboratory results were analyzed by ‘intention-to-treat’ and ‘per-protocol’. A total of 7687 adult participants from 318 tents were randomized: 3864 from 149 tents to the intervention group, and 3823 from 169 tents to the control group. Participants were aged 18 to 95 (median 34, mean 37) years, with a male to female ratio of 1:1.2. Overall, respiratory viruses were detected in 277 of 650 (43%) nasal/pharyngeal swabs collected from symptomatic pilgrims. Common viruses were rhinovirus (35.1%), influenza (4.5%) and parainfluenza (1.7%). In the intervention arm, respectively 954 (24.7%) and 1842 (47.7%) participants used facemasks daily and intermittently, while in the control arm, respectively 546 (14.3%) and 1334 (34.9%) used facemasks daily and intermittently. By intention-to-treat analysis, facemask use did not seem to be effective against laboratory-confirmed viral respiratory infections (odds ratio [OR], 1.4; 95% confidence interval [CI], 0.9 to 2.1, p = 0.18) nor against clinical respiratory infection (OR, 1.1; 95% CI, 0.9 to 1.4, p = 0.40).
Similarly, in a per-protocol analysis, facemask use did not seem to be effective against laboratory-confirmed viral respiratory infections (OR 1.2, 95% CI 0.9–1.7, p = 0.26) nor against clinical respiratory infection (OR 1.3, 95% CI 1.0–1.8, p = 0.06). Conclusion This trial was unable to provide conclusive evidence on facemask efficacy against viral respiratory infections most likely due to poor adherence to protocol.


Introduction
Viral respiratory infections are a major public health burden, causing serious disease especially in vulnerable populations. Influenza-associated lower respiratory tract disease alone causes over 54 million infections per year, eight million cases of severe illness, and 145,000 deaths across all age groups [1]. Ever-increasing and faster international travel intensifies the transmission of respiratory infections, especially in the setting of mass gatherings such as Hajj pilgrimage in Makkah [2]. The rites of Hajj are performed over five or six days, beginning on the eighth day and ending on the thirteenth day of the last month of the Islamic calendar. Coming from over 180 countries, pilgrims converge on Makkah to join a procession of two to three million people who perform a series of physically demanding rituals. Such religious and other mass gatherings amplify the transmission of respiratory viruses by up to eight times [3] and may even accelerate the progression of a pandemic, as occurred during the 2009 influenza A (H1N1) pandemic following the Iztapalapa Passion Play mass gathering in Mexico in April 2009 [4]. The current outbreak of coronavirus disease 2019 (COVID-19) is an example of how travel accelerates the spread of respiratory viral infection [5].
Non-pharmacological interventions, such as facemask use and hand washing, have been used to complement pharmacological measures in the prevention and control of viral respiratory infections at mass gatherings, but with no documented efficacy [6]. There is clinical and experimental evidence that surgical masks and respirators reduce transmission of drug-resistant tuberculosis and influenza from infected patients [7,8], but randomized trials examining the effectiveness of facemasks against viral respiratory infections in household, community or healthcare settings have been either conflicting or inconclusive [9][10][11][12][13][14][15], though at least one randomized controlled trial has suggested protection against influenza by facemasks and hand washing, if applied early after exposure [13].
Inadequate sample size is thought to be an important cause of this discrepancy [16][17][18]. Therefore we designed a large cluster-randomized controlled trial (cRCT) over three years among pilgrims at Hajj to evaluate the efficacy of facemasks against laboratory-confirmed viral respiratory infections. The rationale of the cluster design was to increase administrative efficiency.

Trial design
Our study was an open-label cRCT conducted during Hajj in Mina, Greater Makkah, Saudi Arabia among pilgrims from Saudi Arabia, Australia and Qatar over three Hajj seasons, 2013 to 2015. Mina is an uninhabited valley at the outskirts of Makkah and has about 30,000 tents to accommodate pilgrims for up to five days as part of Hajj rituals. Generally, 50 to 150 pilgrims occupy each large tent, allocated by gender and country of origin, but tents with a much smaller number also exist. Pilgrims in each tent sleep close to each other, head-to-head, have meals and perform rites together, and hence are considered a cluster. A pilot trial was conducted among Australian pilgrims in 2011 to examine the feasibility of such a study and inform power calculations [19]. Following the Consolidated Standards of Reporting Trials (CONSORT) guidelines (S1 Checklist) for cRCTs, participants in respective tents were allocated to the intervention or control group as per the trial protocol (S2 Appendix) [20,21].

Procedure
Hajj pilgrims aged ≥18 years from participating countries, staying in allocated tents and able to provide signed informed consent were included. Pilgrims aged <18 years, or ≥18 who had a known contraindication to mask use, had participated in another randomized trial investigating a medical intervention, refused or were unable to sign the consent were excluded.
Agreement was secured from 318 Hajj tour group leaders for 346 tents, occupied by pilgrims from Saudi Arabia, Qatar and Australia to facilitate the study. The randomization unit of this trial was the accommodation tent. We planned to stratify the randomization by country and gender. Although per protocol computer-generated random number allocation by an offsite research coordinator had been planned, this proved impractical in this field study due to poor internet/mobile phone network at the study sites. Because real-time, effective and smooth communication with the offsite research coordinator responsible for random allocation generation was not possible, coin-tossing by an individual who was not a member of the research team (i.e., a fellow pilgrim who was not a participant in the trial, a tour operator or a medical volunteer at Hajj who was not a study team member) was used. As the intervention of wearing a facemask was visible to participants and investigators, the trial could not be blinded; laboratory staff could be, and were blinded to the intervention.
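For orientation, the computer-generated allocation that the protocol had planned, stratified by country and gender, could be sketched roughly as follows. This is an illustrative sketch only: the trial itself used coin-tossing in the field, and the function name, the alternating assignment within each stratum, the seed and the example tent list are all assumptions introduced here rather than details taken from the protocol.

```python
import random
from collections import defaultdict


def allocate_tents(tents, seed=2013):
    """Assign each tent to 'facemask' or 'control', stratified by (country, gender).

    `tents` is a list of (tent_id, country, gender) tuples. Within each
    stratum the tents are shuffled and then assigned alternately, so the two
    arms differ by at most one tent per stratum. (Assumed scheme, not the
    trial's documented procedure.)
    """
    rng = random.Random(seed)  # fixed seed keeps the allocation list reproducible
    strata = defaultdict(list)
    for tent_id, country, gender in tents:
        strata[(country, gender)].append(tent_id)

    allocation = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        # Randomly choose which arm receives the first tent in this stratum.
        first = rng.choice(["facemask", "control"])
        second = "control" if first == "facemask" else "facemask"
        for i, tent_id in enumerate(ids):
            allocation[tent_id] = first if i % 2 == 0 else second
    return allocation


# Hypothetical tent list, for illustration only.
example_tents = [
    ("T01", "Saudi Arabia", "male"),
    ("T02", "Saudi Arabia", "male"),
    ("T03", "Saudi Arabia", "male"),
    ("T04", "Saudi Arabia", "male"),
    ("T05", "Australia", "female"),
    ("T06", "Australia", "female"),
    ("T07", "Australia", "female"),
]

if __name__ == "__main__":
    for tent, arm in sorted(allocate_tents(example_tents).items()):
        print(tent, arm)
```

Unlike independent coin tosses, a scheme of this kind keeps the number of tents per arm nearly equal within every stratum.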
Trained research team members approached adult pilgrims aged 18 years or older in their assigned tents and explained the study in detail on the first day of each Hajj (October 13th 2013, October 2nd 2014 and September 22nd 2015). Individual research team members were assigned about 15 participants from the first day of Hajj. The researchers gave pilgrims an information sheet and answered their queries. Written informed consent was obtained from pilgrims who agreed to participate in the study. As per the study design, no minor, i.e., person under 18 years of age, was recruited in this study. All participants were asked to complete a baseline questionnaire and were provided with a health diary (S1 Appendix) in their preferred language (Arabic or English) which they were to fill out daily during the trial. Each participant was identified with a unique barcode number on their consent form, baseline questionnaire, health diary and any clinical specimens taken. A post-Hajj diary (S1 Appendix) for an additional three days was planned but a negligible return rate prevented this information being included in the analysis.
The consent forms and the baseline questionnaires were collected on day one, whereas participants retained diaries for completion over the next four days of Hajj rituals while they were being actively followed (Fig 1).
Each participant in the intervention group was provided with 50 surgical facemasks (3M™ Standard Tie-On surgical mask, Cat No: 1816) as well as verbal and printed instructions and demonstration of appropriate facemask usage (S1 Appendix). Pilgrims in the control group were not provided with facemasks and instructions but could use their own masks if they chose to do so. All pilgrims in both study arms were asked to record their facemask usage (including number of masks used and hours worn each study day) in their health diary daily for four consecutive days. Although facemasks were to be worn for 24 hours daily per protocol if possible, for the analysis, pilgrims who used at least one facemask each day during Hajj were considered to have used a facemask during that day, counter to the planned design.

Measures
As the primary objectives of our trial were to assess the role of facemasks in preventing the acquisition of laboratory-confirmed viral respiratory infections and symptomatic respiratory infection, the first primary endpoint was the efficacy of facemasks against laboratory-confirmed viral respiratory infections, and the second primary endpoint was the efficacy of facemasks against clinical respiratory infections in participants.
A total of 464 volunteer researchers were trained by the principal investigators before the study period. Training activities included how to approach pilgrims, the trial processes, explanation and demonstration of facemask use, data collection, follow-up, and sample collection and storage. Study team members were oriented to good clinical practice guidance for the conduct of clinical trials according to the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use.
The research team visited the study tents twice daily during the study period to ask if the participants had developed respiratory or systemic symptoms and collected a nasal swab (FLOQSwabs™; COPAN Diagnostics Inc., Murrieta, CA) from those who developed subjective fever together with one respiratory symptom, or two or more respiratory symptoms without fever. Swabs were placed into UTM™ (COPAN) viral transport media. Swabs labelled with the participant's unique barcode number were stored in an icebox at -20˚C before being re-stored by day's end in a -80˚C freezer at the laboratory of the Hajj Research Center at Umm Al-Qura University, Makkah. After Hajj, these swabs were shipped in refrigerated or cold containers to the Centre for Infectious Disease and Microbiology Laboratory Services, Westmead Hospital, NSW, Australia. There, nucleic acid was extracted with the Qiagen bioROBOT EZ instrument (Qiagen, Valencia, CA), and amplification was performed using the Roche LC 480 (Roche Diagnostics GmbH, Mannheim, Germany) instrument. Respiratory viruses were detected using a real-time, multiplex reverse transcription polymerase chain reaction assay targeting human coronaviruses (OC43, 229E and NL63), influenza A and B viruses, respiratory syncytial virus (RSV), parainfluenza viruses 1-3, human metapneumovirus, rhinovirus, enterovirus and adenovirus as described elsewhere [22,23]. Middle East respiratory syndrome coronavirus (MERS-CoV) assay targeting the upstream region of the E gene (upE) was also performed as described previously [24].
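The swabbing criterion stated at the start of the paragraph above can be written as a small predicate. This is a minimal sketch: the function and argument names are hypothetical, and it assumes that "one respiratory symptom" together with fever means at least one.

```python
def eligible_for_swab(subjective_fever: bool, respiratory_symptoms: int) -> bool:
    """Return True if a participant meets the swabbing criterion.

    Criterion as described in the text: subjective fever together with one
    respiratory symptom, or two or more respiratory symptoms without fever.
    Assumption: with fever, "one" is read as "at least one".
    """
    if respiratory_symptoms < 0:
        raise ValueError("respiratory_symptoms must be non-negative")
    if subjective_fever:
        return respiratory_symptoms >= 1
    return respiratory_symptoms >= 2


if __name__ == "__main__":
    # Fever plus cough qualifies; cough alone does not.
    print(eligible_for_swab(True, 1))
    print(eligible_for_swab(False, 1))
```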
Symptomatic pilgrims were given generic medications for fever and pain, usually acetaminophen.

Statistical analysis and power calculation
Data from baseline questionnaires and health diaries were entered by trained research staff into customized web-based forms (WUFOO®, https://www.wufoo.com), and extracted into Excel sheets. Data checking against paper records was undertaken by four dedicated researchers. Statistical analysis was performed using SPSS Statistics® v25 (IBM, Chicago, IL, USA) and checked by a statistician using SAS V.9.3.
Assuming that the prevalence of symptomatic viral respiratory infections is 30% in controls and the prevalence of laboratory-confirmed viral respiratory infections in controls is approximately 12%, the intervention was considered clinically worthwhile if it reduced the prevalence of clinical respiratory infection or laboratory-confirmed viral respiratory infections by 50%.
Assuming a moderate intra-cluster correlation of 0.1 and a mean of 75 participants per cluster (tent), and inflating the sample by a design effect of 8.4 to account for clustering [25], the sample size required for a cRCT to detect a reduction from 12% to 6% with 80% power at 5% significance is 2976 per arm. An additional inflation factor of 1.2 was included to allow for up to 15% loss to follow-up or incomplete outcome data. This resulted in a sample size of approximately 3500 participants per arm, making a total of 7000.
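The arithmetic behind these figures can be approximated as follows. This is a sketch under stated assumptions, not the trial's own calculation: it uses the standard design effect 1 + (m - 1) × ICC and a simple unpooled two-proportion formula, whereas the exact formula used by the authors is not given in the text, so the result is close to, but not identical with, the reported 2976 per arm. All function names are illustrative.

```python
import math
from statistics import NormalDist


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def n_per_arm_individual(p_control: float, p_intervention: float,
                         alpha: float = 0.05, power: float = 0.80) -> float:
    """Per-arm sample size for comparing two proportions (two-sided test),
    assuming individual randomization and the unpooled variance formula."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    return (z_alpha + z_beta) ** 2 * variance / (p_control - p_intervention) ** 2


# Inputs taken from the text: 75 participants per tent, ICC 0.1, 12% vs 6%.
deff = design_effect(75, 0.1)
n_individual = n_per_arm_individual(0.12, 0.06)
n_cluster = math.ceil(n_individual * deff)      # per arm, allowing for clustering
n_with_attrition = math.ceil(n_cluster * 1.2)   # further inflated by 1.2

if __name__ == "__main__":
    print(f"design effect: {deff:.1f}")
    print(f"per arm, individual randomization: {n_individual:.0f}")
    print(f"per arm, cluster randomization: {n_cluster}")
    print(f"per arm, after inflation by 1.2: {n_with_attrition}")
```

With these inputs the design effect is 8.4, as in the text, and the per-arm total after the 1.2 inflation is in the region of the "approximately 3500" quoted above.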
A descriptive analysis compared the characteristics of participants in the two arms (intervention and control), both at tent level and at individual participant level, as appropriate. Categorical variables were described using frequencies and percentages, and were compared, where appropriate, by using the Chi-squared test. Continuous data were described using the mean and standard deviation, and were compared by the Student's t-test. The number or proportion of participants with missing data were reported for all variables, but comparisons between groups only included known values, except where otherwise specified. P values and 95% confidence intervals (CIs) were presented without adjustment.
The first and second primary endpoints were analyzed by intention-to-treat analysis according to the participants' randomized treatment group regardless of treatment actually received. Outcomes were analyzed using a generalized estimating equation statistical model that accounted for the binary distribution of the data and the correlation between participants in the same tent, assuming an exchangeable correlation structure.
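As a rough point of reference for the model-based estimates, a crude odds ratio with a Wald confidence interval can be computed directly from arm-level counts. The sketch below is not the generalized estimating equation analysis described above: it ignores clustering by tent entirely. The event counts (354 and 322 clinical infections) are those quoted in the Discussion; using all randomized participants (3864 and 3823) as denominators is an assumption made here for illustration, since the exact analysis denominators are not given in the text.

```python
import math
from statistics import NormalDist


def crude_odds_ratio(events_a: int, n_a: int, events_b: int, n_b: int,
                     level: float = 0.95):
    """Crude odds ratio (arm A vs arm B) with a Wald confidence interval.

    Ignores clustering, so the interval is expected to be narrower than one
    from a model that accounts for within-tent correlation.
    """
    a, b = events_a, n_a - events_a   # arm A: events, non-events
    c, d = events_b, n_b - events_b   # arm B: events, non-events
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Clinical respiratory infection, intervention vs control.
# Denominators are assumed (all randomized participants).
or_clinical, ci_low, ci_high = crude_odds_ratio(354, 3864, 322, 3823)

if __name__ == "__main__":
    print(f"crude OR {or_clinical:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the crude estimate rounds to the same value as the reported intention-to-treat OR of 1.1 for clinical respiratory infection, and its interval likewise includes 1.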
Exploratory multivariable analyses examined the effect of randomized treatment on outcomes in models adjusted for demographic factors, facemask usage and compliance with treatment. Subgroup analyses were conducted to compare the effect of treatment between groups of participants: male vs. female, those with known risk factors vs. those without risk factors or risk status unknown for viral respiratory infections, vaccinated against influenza vs. unvaccinated (or vaccination status reported as unknown), smoker vs. non-smoker, compliance with daily facemask use vs. non-compliance, and by pilgrim's country of origin.
Subgroup analyses used the same statistical model as the primary outcomes and included an interaction term between randomized treatment and subgroup. If the p value for the interaction term was < 0.05, the effect of treatment was described separately within each subgroup. Primary analysis included all participants who reported symptoms at any time during the study period. Eighteen participants who failed to report symptoms were excluded, but those (n = 675) who reported symptoms only at baseline (i.e., on day one) were included. A per-protocol analysis was undertaken amongst participants who were compliant with instructions and reported symptoms daily after the baseline time point. An analysis was also performed on participants who were symptom-free at baseline (i.e., on day one) and who completed symptom reports at least once after baseline.
For the primary analysis, pilgrims who did not report symptoms daily were assumed to have no change in their symptoms compared to the most recent reporting day. Those who reported no symptom on any study day were assumed to have never developed symptoms while those who reported symptoms on any day were considered symptomatic.
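The handling of missing diary days described above amounts to carrying the last report forward. A minimal sketch follows, assuming a diary is stored as a list with one entry per study day (True for symptomatic, False for symptom-free, None for no report); this representation and the function names are assumptions for illustration, and days before the first report are simply left as None.

```python
from typing import List, Optional, Sequence


def carry_forward(diary: Sequence[Optional[bool]]) -> List[Optional[bool]]:
    """Fill unreported days with the most recent reported status."""
    filled: List[Optional[bool]] = []
    last: Optional[bool] = None
    for entry in diary:
        if entry is not None:
            last = entry
        filled.append(last)
    return filled


def ever_symptomatic(diary: Sequence[Optional[bool]]) -> Optional[bool]:
    """Classify a participant from the reported days only.

    Returns None when no day was reported at all (such participants were
    excluded from the primary analysis), True if symptoms were reported on
    any day, and False if every reported day was symptom-free.
    """
    reported = [entry for entry in diary if entry is not None]
    if not reported:
        return None
    return any(reported)


if __name__ == "__main__":
    example = [False, None, True, None]  # hypothetical four-day diary
    print(carry_forward(example))
    print(ever_symptomatic(example))
```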

Participants
A total of 318 tents that housed 11,227 pilgrims were recruited in both study arms on the first day of each Hajj over three years (13th October in 2013, 2nd October in 2014 and 22nd September in 2015) and followed for the next four days. The number of occupants in each tent varied according to the size of the tent, ranging from 6 to 150 pilgrims per tent. The total number of participants across all study years was 7687, with an average participation rate of 68.5% (7687 of 11,227; range, 10 to 100%). Of the total 7687 participants, 3864 from 149 tents were assigned to the intervention group and 3823 from 169 tents were assigned to the control group, with an overall participation rate of respectively 68% (3864 of 5686) and 69% (3823 of 5541). Their age ranged from 18 to 95 years (median, 34; mean, 37; standard deviation, 12.3 years), with 53.9% female. Of all participants, 6998 (91%) were from Saudi Arabia and Qatar, and the rest were from Australia. A large proportion of pilgrims, 57.6% (4428 of 7687), were recruited in the third year, 2015. The baseline characteristics are shown in Table 1.
Overall facemask use was low, even in the intervention tents, with only 24.7% of participants using facemasks daily. In the control tents, 14.3% of participants used facemasks daily. More participants in the intervention group than in the control group had used a facemask at any time in the weeks before the actual Hajj (27.4% vs. 24.2%, p < 0.01).
Slightly more pilgrims in the intervention group than in the control group reported frequent hand washing during Hajj, including their ritual ablutions (69.1% vs. 65.9%, p < 0.01).
The proportion of participating pilgrims who used facemasks ranged between 0 and 50% per tent, and the proportion who reported developing clinical respiratory infection ranged between 0 and 46% per tent. However, among participants who recorded wearing a facemask for at least 4 hours daily, the number in each wearing-time subgroup was consistently greater in the intervention group than in the control group (Fig 2).
The most common side effects of facemask use were difficulty in breathing (26.2%) and discomfort (22%); a small minority (3%) reported feeling hot, sweating, a bad smell or blurred vision with eyeglasses (Fig 3).
In a per-protocol analysis (including only participants allocated to the intervention group who used facemasks daily, and participants allocated to the control group who never used any facemasks) there was no benefit of facemasks in preventing laboratory-confirmed viral respiratory infections (OR, 1.2; 95% CI 0.9 to 1.7; p = 0.26) or clinical respiratory infection (OR, 1.3; 95% CI 1.0 to 1.8; p = 0.06) (Table 4).

Discussion
This randomized trial, like other smaller trials [9][10][11][12][13][14][15], failed to provide conclusive evidence on facemask efficacy against laboratory-confirmed or clinical respiratory infections. Inconclusiveness of this and previous studies might be attributed in part to respiratory pathogens having multiple routes of transmission including contact with contaminated surface [26,27] and fecal-oral transmission of some respiratory viruses [28].
The large sample size in our cRCT enabled the comparison of a much larger number of clinical infections (intervention: control = 354: 322) and many more laboratory-confirmed infections (intervention: control = 96: 60) with higher power than the other randomized trials combined. Although unvaccinated pilgrims in the intervention group had a higher rate of clinical respiratory infection than their counterparts in the control group (13% vs. 10%, p = 0.03), this difference is unexplained. However, in previous studies, the prevalence of influenza-like illness among Hajj pilgrims was inversely proportional to their influenza vaccination uptake [29], and vaccinated Hajj pilgrims had a 43% reduction in the probability of proven influenza infection [30]. A meta-analysis of six observational studies has shown influenza vaccine to be significantly protective against laboratory-confirmed influenza (relative risk, 0.56; 95% CI, 0.41 to 0.75) [31]. Females in the intervention group of our cRCT were at higher risk of acquiring laboratory-confirmed viral respiratory infections than those in the control group (44% vs. 29%; p < 0.01). The reason is unclear, though one possible explanation is that Muslim women prefer a loose face cover to a fitted facemask. Over 70% of female pilgrims use a face veil during Hajj: one fifth of them use both face veil and mask [32]. Female pilgrims who used a face cover only occasionally (43.2%) tended to have higher rates of clinical respiratory infection compared to those who used it most of the time (36%) [33]. Although we did not assess face cover use by female pilgrims in our study, given that most pilgrims used facemasks only occasionally, the higher rate of viral respiratory infections among women might be due to intermittent use of face cover or even contamination of their masks [9,33].
When they become wet, facemasks may even increase the transmission of infection, either through becoming more porous or by allowing passage of virus through the mask which is transmitted when the mask is handled [34].
The detection rate of respiratory viruses (43%) in our trial was higher than that reported in other studies (4 to 15%) [35], possibly due to the active case ascertainment strategy employed, including close follow-up of the symptomatic participants. However, the distribution of the viruses was similar to that in other studies: i.e., predominance of rhinovirus, followed by influenza and parainfluenza virus, these three viruses accounting for 97% (269/277) of viruses detected in our study.
No MERS-CoV was detected among the studied participants. Since the emergence of MERS-CoV in Saudi Arabia in 2013, multiple surveillance studies among >10,000 pilgrims from various countries have been undertaken without identifying a Hajj-related case [36,37]. Now that there is a grave fear of massive acceleration of COVID-19 spread via large population movements [38], other preventive measures including hand and food hygiene, safe drinking water and physical distancing should be encouraged in addition to facemasks [39,40]. We note that 0.5% of the symptomatic pilgrims tested positive for seasonal coronavirus indicating coronavirus transmission may occur in this setting and that, without the possibility of cross-
Table 4. Per-protocol analysis: Effect of facemasks against clinical and laboratory-confirmed viral respiratory infections.
a Analysis includes only participants from intervention group who used facemasks daily (n = 828) and those from control group who did not use facemasks (n = 1497).
b Analysis includes only participants from intervention group who used facemasks daily (n = 93) and those from control group who did not use facemasks (n = 122).
The most important limitation of this RCT was that despite much effort to encourage adherence to our protocol, compliance was limited. On the other hand, many pilgrims randomized to the control group used facemasks, contrary to the research protocol. The Saudi Arabian Ministry of Health had issued advice to pilgrims to use facemasks while MERS-CoV was circulating in Saudi Arabia during our study period [48], though this was without definitive advice on how to wear masks and the duration of their use. It would have been unethical to counter the Saudi authority's official advice on facemask use. Our trial does indicate, however, that as a public health intervention, facemask use is not practicable. Another important limitation of the study is that nasal swabs were not performed on the first day when subjects were enrolled. We depended on the reporting of clinical symptoms from day one and followed up directly for four days but did not validate the asymptomatic state with virological testing, i.e., some asymptomatic pilgrims could have been virus positive. Longer follow-up was attempted through post-Hajj surveillance, but the low compliance precluded any meaningful analysis. Pilgrims moving from place to place to accomplish Hajj rites made it difficult for researchers to follow them as well as planned. While our study protocol required a clinical sample to be collected from participants with clinical respiratory infection, sampling was performed in some who did not meet the clinical respiratory infection definition whilst others were not swabbed when symptomatic. Although the trial had a cRCT design and complied as far as possible with CONSORT guidance, not all occupants in the selected tents participated. On average 69% of tent occupants participated in the trial, ranging widely from 10 to 100%, with possible dilution of the effect of facemask use.
The trial was conducted over three years, with uneven recruitment over the years, and most participants (58%) recruited in the final year (2015). The rate of clinical respiratory infection over the study years was 15.5% in 2013, 8.3% in 2014, and 10.7% in 2015, reflecting the known seasonal variability of respiratory viruses, which may have affected the outcome of our study. About 9% of participants in each study arm failed to return their diaries or returned diaries without recording symptoms, and another 9% in each group were excluded for being symptomatic on day one of our trial.
Another limitation is that the number of participating tents differed between the study arms (intervention: control = 149: 169) because some tents, although designated as separate units, were later found to be parts of a larger tent. This was more common when several small tour groups were managed under a large tour operator and communal activities (e.g., meals, congregational prayers, sermons) were combined in one large tent. It is also possible that some of these tents were allocated to both intervention and control arms, which may have further contributed to dilution of the magnitude of the effect or to cluster contamination.
The failure of post-Hajj reporting was another limitation of our study. However, the median incubation periods of the three most common viruses (rhinovirus, influenza and parainfluenza) are <3 days [49], indicating that the majority of the detected infections were acquired after enrolment. Although it is dangerous to presume that there was very little symptomatic disease post-Hajj, this is one possible explanation for non-adherence to post-Hajj follow-up.
Our cRCT, which was a field study in real time, was unable to refute our null hypothesis. Lack of facemask efficacy observed in this trial could be attributed to limited facemask use by participants (only 24.7% used daily and 47.7% used intermittently in the intervention group), the substantial proportion of participants in the control group who used facemasks, the inability to follow participants after Hajj or the likely contamination of masks [9,26]. Though more in the intervention group consistently wore masks for defined periods daily, facemask use by controls further reduced the ability of the study to detect differences in infection rates between the study arms.
Owing to the lack of blinding, pilgrims in control tents may have reported even mild illness because they knew that they had not received the intervention. Research team members may also have been biased towards swabbing such participants, assuming them to be less protected, which could have led to an overestimate of the effect of the intervention.
The high rate of facemask use (almost 80%) observed among French Hajj pilgrims during the 2009 influenza pandemic year [50], compared to about 55% in a non-pandemic year [3], and in pilgrims from Southeast Asia (e.g., 73% among Malaysian pilgrims in 2007) seems to be due to cultural differences and heightened awareness during a pandemic [51]. The lower uptake of facemasks among participants in our cRCT is similar to the uptake among Saudi Arabian (35 to 57%) and Australian pilgrims (53 to 57%) observed in previous surveys [31,32,52,53], while uptake as low as 0.02% was also recorded among some international pilgrims [54]. Although 78% in the intervention group of our cRCT used facemasks, only a quarter used them regularly. The most common reasons for non-compliance, difficulty in breathing and a feeling of discomfort, which were also found in previous surveys among Hajj pilgrims [52], limited the use of facemasks in this cRCT. These might be strong and important limitations to effective facemask use against respiratory infections since, in a close-contact mass gathering setting, around-the-clock protection would be necessary.
Given the number of people who attended Hajj and the close proximity within and among the living quarters, contact transmission from direct contact and indirect contact with contaminated surfaces may have been important and perhaps another reason why these results should be considered inconclusive. There is also the possibility of exposure and infection during travel and prior to Hajj itself. This RCT demonstrates the difficulties of participants' adherence to instructions and protocol even with active supervision.
Several previous observational studies at Hajj have failed to show the effectiveness of facemasks in preventing respiratory infections [50][51][52], possibly also due to poor adherence to instructions, although a recent study showed that changing facemasks every four hours reduced the chance of upper respiratory tract infections among Hajj pilgrims (adjusted OR 0.56; 95% CI 0.34 to 0.92; p = 0.02) [55]. In our cRCT, though pilgrims in both intervention and control groups were close to each other day and night, none wore a mask for 24 hours as advised. This may have been an unrealistic expectation. Mask wearing during the COVID-19 pandemic has highlighted the importance of effective and realistic health messaging.
Additional studies with an even larger sample size and more intense supervision would better test the efficacy of facemasks and the role of other interventions, such as hand hygiene, in a mass gathering setting. These will be especially important to evaluate such interventions during the COVID-19 pandemic.

Conclusion
This trial failed to provide definitive evidence for the effectiveness of facemasks during the Hajj. This was likely due to poor compliance with facemask use. We report difficulties in implementing a large cRCT evaluating the effectiveness of facemasks against viral respiratory infections, including participants' poor compliance with the protocol despite active explanation and support.