Validation of the STOP-Bang Questionnaire as a Screening Tool for Obstructive Sleep Apnea among Different Populations: A Systematic Review and Meta-Analysis

Background Diagnosing obstructive sleep apnea (OSA) is clinically relevant because untreated OSA has been associated with increased morbidity and mortality. The STOP-Bang questionnaire is a validated screening tool for OSA. We conducted a systematic review and meta-analysis to determine the effectiveness of STOP-Bang for screening patients suspected of having OSA and to predict its accuracy in determining the severity of OSA in the different populations.
Methods A search of the literature databases was performed. Inclusion criteria were: 1) Studies that used the STOP-Bang questionnaire as a screening tool for OSA in adult subjects (>18 years); 2) The accuracy of the STOP-Bang questionnaire was validated by polysomnography, the gold standard for diagnosing OSA; 3) OSA was clearly defined as apnea/hypopnea index (AHI) or respiratory disturbance index (RDI) ≥5; 4) Publications in the English language. The quality of the studies was explicitly described and coded according to the Cochrane Methods group on the screening and diagnostic tests.
Results Seventeen studies including 9,206 patients met criteria for the systematic review. In the sleep clinic population, the sensitivity was 90%, 94% and 96% to detect any OSA (AHI ≥5), moderate-to-severe OSA (AHI ≥15), and severe OSA (AHI ≥30) respectively. The corresponding NPV was 46%, 75% and 90%. A similar trend was found in the surgical population. In the sleep clinic population, the probability of severe OSA with a STOP-Bang score of 3 was 25%. With a stepwise increase of the STOP-Bang score to 4, 5, 6 and 7/8, the probability rose proportionally to 35%, 45%, 55% and 75%, respectively. In the surgical population, the probability of severe OSA with a STOP-Bang score of 3 was 15%. With a stepwise increase of the STOP-Bang score to 4, 5, 6 and 7/8, the probability increased to 25%, 35%, 45% and 65%, respectively.
Conclusion This meta-analysis confirms the high performance of the STOP-Bang questionnaire in the sleep clinic and surgical population for screening of OSA. The higher the STOP-Bang score, the greater is the probability of moderate-to-severe OSA.


Introduction
Obstructive sleep apnea (OSA) is a prevalent sleep breathing disorder affecting 9-25% of the general adult population. [1] It is associated with cardiovascular diseases, cerebrovascular diseases, metabolic disorders and impaired neurocognitive function. [2][3][4] It has been estimated that up to 80% of individuals with moderate-to-severe OSA may remain undiagnosed. [5] The prevalence is higher in the surgical population, [6,7] with prevalence rates as high as 70% in bariatric surgical patients. [8,9] The majority of surgical patients with OSA remain undiagnosed and, consequently, untreated at the time of presentation for surgery. [7] Given the important adverse consequences associated with untreated OSA, prompt diagnosis and treatment of unrecognized OSA is critical. The gold standard for diagnosis of OSA is an overnight polysomnogram (PSG). However, PSG is time consuming, labor intensive, and costly. Moreover, PSG requires the expertise of sleep medicine specialists, which may not be readily available at many hospitals and medical centers. Therefore, a simple and reliable method of identifying patients who are at high-risk of OSA and triaging them for prompt diagnosis and treatment is clinically relevant. A number of screening tests have been developed to identify high-risk patients. [10][11][12][13][14][15][16] However, many of these screening tests are lengthy and complicated, or require an upper airway assessment, which makes them inconvenient to use and may increase variability among the clinicians performing the assessment.
The STOP-Bang questionnaire was first developed in 2008. [17] It is a simple, easy to remember, and self-reportable screening tool, which includes four subjective items (STOP: Snoring, Tiredness, Observed apnea and high blood Pressure) and four demographic items (Bang: BMI, age, neck circumference, gender). [17] The STOP-Bang questionnaire was originally validated to screen for OSA in the surgical population. The sensitivity of a STOP-Bang score ≥3 as the cut-off to predict any OSA (apnea hypopnea index (AHI) >5), moderate-to-severe OSA (AHI >15) and severe OSA (AHI >30) was 83.9%, 92.9% and 100% respectively. [17] Due to its ease of use and high sensitivity, the STOP-Bang questionnaire has been widely used in preoperative clinics [17][18][19], sleep clinics [20][21][22][23][24][25][26][27][28][29][30], the general population [31] and other special populations [32,33] to detect patients at high-risk of OSA. The purpose of this systematic review and meta-analysis is to determine the accuracy of the STOP-Bang questionnaire in screening patients for OSA and to evaluate the relationship between the STOP-Bang score and the probability of OSA among different patient populations.
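The eight items described above can be tallied as in the following minimal sketch. It assumes one point per positive item and the thresholds commonly given for the questionnaire (BMI >35 kg/m2, age >50 years, neck circumference >40 cm, male gender); the age threshold is not stated in this text and is an assumption here, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StopBangAnswers:
    snoring: bool              # S: loud snoring
    tiredness: bool            # T: daytime tiredness
    observed_apnea: bool       # O: observed apnea during sleep
    high_blood_pressure: bool  # P: high blood pressure
    bmi: float                 # body mass index, kg/m2
    age: int                   # years
    neck_cm: float             # neck circumference, cm
    male: bool                 # gender item


def stop_bang_score(a: StopBangAnswers) -> int:
    """Return the STOP-Bang score (0-8), one point per positive item."""
    items = [
        a.snoring,
        a.tiredness,
        a.observed_apnea,
        a.high_blood_pressure,
        a.bmi > 35,      # threshold as assumed in the lead-in
        a.age > 50,      # assumed threshold, not stated in this text
        a.neck_cm > 40,  # threshold as assumed in the lead-in
        a.male,
    ]
    return sum(items)


# Hypothetical patient: snores, hypertensive, BMI 36, age 55, neck 38 cm, male
patient = StopBangAnswers(True, False, False, True, 36.0, 55, 38.0, True)
print(stop_bang_score(patient))
```

A score at or above the standard cut-off of 3 would then flag the patient as screen-positive.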

Literature search strategy and study selection
We identified and reviewed published articles in which the STOP-Bang questionnaire was assessed as a screening tool for OSA among different patient populations. The literature search was performed according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines and the search strategy was implemented with the help of an expert librarian familiar with the literature search.
Electronic searches. All queries started in 2008 when the STOP-Bang was first published. [17] With the goal of completeness, a systematic search of the literature was carried out using multiple sources, including MEDLINE (from 2008 to January 2015), Medline-in-process & other non-indexed citations (up to January 2015), Embase (from 2008 to January 2015), Cochrane Central Register of Controlled Trials (up to January 2015), Cochrane Database of Systematic Reviews (from 2008 to January 2015), Google Scholar, Web of Science (from 2008 to January 2015), Scopus (from 2008 to January 2015) and PubMed (from 2008 to January 2015) using the search strategy that was designed for each database. The search strategy included the following free-text and index terms: 'obstructive sleep apnea', 'obstructive sleep apnea syndrome', 'obstructive sleep apnoea', 'obstructive sleep apnoea syndrome', 'sleep disordered breathing', 'obesity hypoventilation syndrome', 'apnea or apnoea', 'hypopnea or hypopnoea', 'STOP-Bang', 'STOP Questionnaire'.
Searching other resources. A citation search was also conducted by performing a manual review of references from the final articles analyzed as well as the related review articles.
Selection of studies. Two reviewers (M.N., F.C.) independently screened the titles and abstracts of the search results. After excluding the irrelevant articles, full-text articles of the remaining publications were retrieved and carefully evaluated to determine if they met the following inclusion criteria: 1) The study evaluated the STOP-Bang questionnaire as a screening tool for OSA in adult subjects >18 years; 2) The diagnosis of OSA was confirmed by the results of a PSG (either laboratory or portable); 3) OSA and its severity were defined by an AHI or a respiratory disturbance index (RDI); and 4) The full-text papers were written in the English language.

Data extraction and management
The two independent reviewers (M.N. & F.C.) extracted the data with a standard data collection form. For each study, a 2x2 contingency table was constructed using the predictive parameters for each AHI or RDI cut-off. Studies were excluded if there was inadequate information to create the 2x2 contingency tables or if there was an inadequate description of the methodology. The duplicates were removed and any disagreements were resolved by consulting with another author (P.L.). Unless specifically defined, the standard cut-off of the STOP-Bang questionnaire (STOP-Bang ≥3) was adopted. An AHI ≥5 or RDI ≥5 was considered the diagnostic cut-off for OSA, an AHI ≥15 or RDI ≥15 the diagnostic cut-off for moderate-to-severe OSA, and an AHI ≥30 or RDI ≥30 the diagnostic cut-off for severe OSA.
The following information was collected from each study: author, year of publication, type of study, type of patients (surgical patients, sleep clinic patients, general population, renal failure patients and highway bus drivers), sample size, validation process and tool, OSA definition and number of patients in each of the following categories: mild (AHI ≥5), moderate-to-severe (AHI ≥15 or RDI ≥15) and severe OSA (AHI ≥30 or RDI ≥30). The following clinical data were also extracted: age, gender, Body Mass Index (BMI), neck circumference, the STOP-Bang score, mean AHI/RDI and minimum SpO2.
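As an illustration of how such a 2x2 table relates the index test to the reference test, the sketch below cross-classifies patient-level pairs of STOP-Bang score and AHI. It assumes patient-level data are available, which may not match how the tables were actually built from published summaries; the records and the names `TwoByTwo` and `contingency_table` are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class TwoByTwo(NamedTuple):
    tp: int  # screen-positive, OSA present at the chosen AHI cut-off
    fp: int  # screen-positive, OSA absent
    fn: int  # screen-negative, OSA present
    tn: int  # screen-negative, OSA absent


def contingency_table(
    records: Iterable[Tuple[int, float]],
    score_cutoff: int = 3,
    ahi_cutoff: float = 5.0,
) -> TwoByTwo:
    """Cross-classify (STOP-Bang score, AHI) pairs against two cut-offs."""
    tp = fp = fn = tn = 0
    for score, ahi in records:
        test_positive = score >= score_cutoff
        disease_present = ahi >= ahi_cutoff
        if test_positive and disease_present:
            tp += 1
        elif test_positive:
            fp += 1
        elif disease_present:
            fn += 1
        else:
            tn += 1
    return TwoByTwo(tp, fp, fn, tn)


# Hypothetical (score, AHI) pairs, for illustration only
records = [(5, 32.0), (2, 3.0), (4, 4.0), (1, 18.0), (3, 16.0)]
for cutoff in (5.0, 15.0, 30.0):
    print(cutoff, contingency_table(records, ahi_cutoff=cutoff))
```

One table is produced per AHI or RDI cut-off, matching the three severity definitions used in this review.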

Assessment of methodological quality
The methodological quality of each study was assessed and any disagreements were resolved by consulting another author (P.L.). The validity criteria assessing the internal and external validity were explicitly described and coded according to the Cochrane Methods group on the screening and diagnostic tests. [34] The internal validity included the following factors: study design, definition of the disease, blind execution of the index test (STOP-Bang questionnaire) and the reference test (PSG), valid reference test, avoidance of verification bias, and independent interpretation of the test results. The external validity consisted of the following items: disease spectrum, clinical setting, demographic information, previous screening or referral filter, explicit cutoffs, percentage of missing patients, missing data management, and subject selection for PSG.

Statistical analysis
The continuous data are presented as mean and standard deviation and categorical data as frequency and percentage. Using 2x2 contingency tables, we recalculated the following predictive parameters in each study: prevalence, sensitivity and specificity, positive predictive value (PPV) and negative predictive value (NPV), and diagnostic odds ratio (DOR). The area under the receiver operating characteristic (ROC) curve was calculated by logistic regression. The pooled predictive parameters (sensitivity and specificity, positive and negative predictive value, DOR and area under the ROC curve) were obtained to assess the performance of each STOP-Bang score for the different AHI cut-offs (AHI ≥5, AHI ≥15 and AHI ≥30). The probabilities of moderate and severe OSA at the various STOP-Bang scores were pooled and presented as a bar graph.
The meta-analysis was carried out with Review Manager Version 5.3 (Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014) and Meta-Disc version 1.4 (Hospital Ramón y Cajal, Madrid, Spain). The parameters were assessed separately for each population with similar characteristics (i.e. sleep clinic population, surgical population). The parameters from pooled data of each population were calculated and forest plots were created for the predictive parameters using a random-effects model. DOR and ROC curve analyses were presented to assess the diagnostic ability of the STOP-Bang questionnaire. Inconsistency was assessed using the Cochran Q test (P value <0.05: heterogeneity present) and the I² statistic (I² >33%: heterogeneity present).
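The predictive parameters named above follow directly from the four cells of a 2x2 table. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical, and the 0.5 continuity correction applied to the DOR when a cell is zero is one common convention rather than something stated in this review.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute prevalence, sensitivity, specificity, PPV, NPV and DOR."""
    n = tp + fp + fn + tn
    # Assumed convention: add 0.5 to every cell if any cell is zero,
    # so that the diagnostic odds ratio stays defined.
    a, b, c, d = (float(x) for x in (tp, fp, fn, tn))
    if 0 in (tp, fp, fn, tn):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return {
        "prevalence": (tp + fn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "dor": (a * d) / (b * c),
    }


# Hypothetical counts, for illustration only
for name, value in diagnostic_metrics(tp=90, fp=40, fn=10, tn=60).items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes every row and column total is non-zero; a study with no screen-positive or no diseased patients would need separate handling.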
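For readers unfamiliar with the heterogeneity statistics, the sketch below pools log diagnostic odds ratios with the DerSimonian-Laird estimator and reports Cochran's Q and I². This is only an illustration of one common random-effects approach: the text does not state which estimator the software applied, the 0.5 added to every cell is an assumed convention, and all function names are hypothetical. The P value for Q (a chi-square test with k-1 degrees of freedom) is omitted to keep the example within the standard library.

```python
import math
from typing import Dict, List, Tuple


def log_dor_and_variance(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Log diagnostic odds ratio and its variance (0.5 added to each cell)."""
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def random_effects_pool(effects: List[float], variances: List[float]) -> Dict[str, float]:
    """DerSimonian-Laird pooling with Cochran's Q and I² (in percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "pooled": pooled,
        "se": math.sqrt(1 / sum(w_re)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "heterogeneous": i2 > 33,  # threshold used in this review
    }


# Hypothetical study tables (tp, fp, fn, tn), for illustration only
tables = [(90, 40, 10, 60), (45, 30, 5, 20), (120, 35, 20, 75)]
pairs = [log_dor_and_variance(*t) for t in tables]
result = random_effects_pool([p[0] for p in pairs], [p[1] for p in pairs])
print({k: round(v, 3) if isinstance(v, float) else v for k, v in result.items()})
```

Exponentiating `pooled` returns the estimate to the DOR scale.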

Methodological quality of the included studies
All included studies used PSG as a valid reference test to verify the accuracy of the STOP-Bang questionnaire, confirming internal validity (S2 Appendix). For validation purposes, 12 studies used laboratory PSG [17,19-26,28,29,32], while three used portable PSG (level 3 PSG [27,33] or level 2 PSG [31]) and two studies used laboratory or portable level 2/3 PSG [18,30]. All included studies had specific information to clearly evaluate the risk of bias during the validation process of the STOP-Bang questionnaire. The following aspects were available in the chosen publications: 1) Blinded interpretation of the PSG and STOP-Bang questionnaire (i.e. those who scored the PSG were unaware of the results of the STOP-Bang questionnaire and vice versa); and 2) Interpretation of the PSG results was performed independent of the patient's clinical history. In terms of the external validity, all studies adequately met the appraisal items with one minor exception [32] (S3 Appendix). All studies clearly mentioned the inclusion and exclusion criteria.
The following abbreviations were used to evaluate the internal and external validity of the studies. F: fully meeting criteria; P: partially meeting criteria; U: unsure if meeting criteria in subgroups; N: not meeting criteria in subgroup; N/A: not applicable.
Predictive parameters of the STOP-Bang questionnaire in highway bus drivers. The STOP-Bang questionnaire was evaluated to detect moderate-to-severe OSA in highway bus drivers by Firat et al. [32] The prevalence of moderate-to-severe OSA among the highway bus drivers was 54%. The sensitivity and specificity of a STOP-Bang score ≥3 as the cut-off to detect moderate-to-severe OSA were 87% and 49% respectively, whereas the positive and negative predictive values were 66% and 76% respectively. The DOR was 6.3 and the area under the ROC curve was 0.68.
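The predictive values for the bus drivers can be checked from the reported prevalence, sensitivity and specificity using Bayes' rule. The short sketch below does this with a hypothetical helper function; because the inputs are rounded percentages, the outputs are expected to land close to, but not exactly on, the reported 66%, 76% and 6.3.

```python
from typing import Tuple


def predictive_values(
    prevalence: float, sensitivity: float, specificity: float
) -> Tuple[float, float, float]:
    """Return (PPV, NPV, DOR) implied by prevalence, sensitivity, specificity."""
    tp = sensitivity * prevalence               # true-positive fraction
    fn = (1 - sensitivity) * prevalence         # false-negative fraction
    tn = specificity * (1 - prevalence)         # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)   # false-positive fraction
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    dor = (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)
    return ppv, npv, dor


# Reported values for highway bus drivers, moderate-to-severe OSA
ppv, npv, dor = predictive_values(prevalence=0.54, sensitivity=0.87, specificity=0.49)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}, DOR {dor:.2f}")
```

The same function shows why predictive values differ between populations: with sensitivity and specificity held fixed, a lower prevalence lowers the PPV and raises the NPV.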
Predictive parameters of the STOP-Bang questionnaire in renal failure patients. In renal failure patients, the prevalence of moderate-to-severe OSA (RDI ≥15) and severe OSA (RDI ≥30) was 42% and 29% respectively. [33] The sensitivities for a STOP-Bang score ≥3 as the cut-off to detect moderate-to-severe OSA (RDI ≥15) and severe OSA (RDI ≥30) were 93.1% and 98% respectively. The corresponding negative predictive values were 86% and 97%. The specificity was 30% and 27%. The PPV was 49% and 35% respectively. [33]
Predictive performance of various STOP-Bang scores. The predictive parameters of the various STOP-Bang score cut-offs were analyzed in six studies (n = 2807) [18,19,21,27,29,30]. Data from four studies (n = 1980) [21,27,29,30] from the sleep clinic population, and two (n = 827) [18,19] from the surgical population were pooled separately (Table 4 & Fig 2). In the sleep clinic population, as the STOP-Bang score cut-off increased from 3 to 8, the specificity increased from 52% to 100%, and the PPV increased continuously from 93% to 100% for any OSA (AHI ≥5). Similarly, for moderate-to-severe OSA (AHI ≥15) the specificity increased from 32% to 100%, and the PPV increased from 73% to 95%. For severe OSA (AHI ≥30) the specificity increased from 23% to 100% and PPV increased from 48% to 86% (Table 4). In the surgical population, as the STOP-Bang score cut-off increased from 3 to 7, the specificity increased from 40% to 98% and the PPV increased from 75% to 82% for any OSA (AHI ≥5). Similarly, for moderate-to-severe OSA (AHI ≥15) the specificity increased from 11% to 95%, but the PPV decreased from 39% to 37%. For severe OSA (AHI ≥30) the specificity increased from 28% to 97% and PPV increased from 22% to 33% (Table 4).
Association between STOP-Bang scores and predictive probability. A meta-analysis was carried out in five studies (n = 2792) [18,21,29,30,51]: three studies in sleep clinic patients (n = 1852) [21,29,30] and two studies in surgical patients (n = 957) [18,51]. The relationship between the predictive probabilities of moderate-to-severe or severe OSA and STOP-Bang scores is illustrated in Fig 7. In both sleep clinic (Panel A; n = 1852) and surgical patients (Panel B; n = 957), the probability of moderate-to-severe OSA or severe OSA increased as the STOP-Bang score increased from 3 to 7/8. With higher scores, there is a more profound increase in the probability of severe OSA compared to moderate OSA.

Discussion
This review shows that the STOP-Bang questionnaire with a score ≥3 as the cut-off consistently demonstrates a high sensitivity to detect OSA in different patient populations: 94% (92-95) to detect moderate-to-severe OSA in sleep clinic patients and 91% (87-93) in surgical patients. The specificity at the same cut-off is modest: 34% in the sleep clinic population and 32% in the surgical population. As the STOP-Bang score increases, the probability of moderate and severe OSA increases. When the STOP-Bang score was 7 or 8, the probability of severe OSA was 75% in the sleep clinic population and 65% in the surgical population. Given the relatively high prevalence of undiagnosed and untreated OSA [1,6,7] and its associated cardiovascular, respiratory, and neurocognitive morbidities [52][53][54][55][56][57], a simple and effective OSA screening tool is essential. This approach is important to the perioperative care team, as often there is insufficient time to complete a preoperative assessment of OSA [58] with the standard diagnostic approach. The STOP-Bang questionnaire can fulfill this need given that it is a short, practical and straightforward test. The questionnaire can be completed within 1-2 minutes with very high response rates of 90-100%. [17]
Utilization of the STOP-Bang questionnaire in the sleep clinic population. Since patients are referred to the sleep clinic for a suspicion of sleep related disorders, the prevalence of OSA is high in this population. The high sensitivity and NPV with a STOP-Bang score ≥3 as the cut-off can help sleep clinicians exclude patients with very little chance of moderate-to-severe OSA. On the other hand, a patient with a high score (≥5) on the STOP-Bang questionnaire has a high probability of severe OSA. These patients warrant expedited diagnosis and treatment. With the STOP-Bang questionnaire, sleep clinicians can prioritize their patients and efficiently allocate their limited resources.
Utilization of the STOP-Bang questionnaire in the surgical population. OSA is prevalent in surgical populations and is considered to be an independent risk factor for perioperative complications in non-cardiac surgeries. [52][53][54][55][56][57] Further, OSA is associated with the occurrence of major adverse cardiovascular and cerebrovascular events, repeated revascularization, angina, and atrial fibrillation following coronary artery bypass grafting (CABG). [59] Mutter et al. have shown that surgical patients with a diagnosis of OSA and continuous positive airway pressure (CPAP) prescription had lower rates of cardiovascular complications. [60] Further, patients with OSA who are not treated with CPAP preoperatively are at increased risk for cardiopulmonary complications after general and vascular surgery. [61] Therefore, it is important to identify patients at high-risk of having moderate-to-severe OSA preoperatively. However, the short time interval between the preoperative clinic visit and scheduled surgery date, lack of willingness from patients to undergo preoperative PSG and potentially long wait times for a sleep clinic appointment may hinder diagnosing OSA prior to surgery. By incorporating the STOP-Bang questionnaire into preoperative clinic practice, surgical patients can be risk stratified for OSA severity using the score. A STOP-Bang score of 0-2 has a high negative predictive value for assessing the likelihood of moderate or severe OSA, which can be used to mitigate the need for PSG. Patients with a high score on the STOP-Bang questionnaire (≥5) have a high probability of having moderate-to-severe OSA. Depending on the co-morbidities and type of surgery, they may need referral to a sleep clinic for further investigation before surgery or be treated as an OSA patient perioperatively.
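The score bands discussed above can be written as a simple triage rule. In the sketch below, the low (0-2) and high (5 or more) bands follow the text, whereas the label "intermediate" for scores of 3-4 is an assumption introduced for illustration; the function name is hypothetical and the sketch is not a clinical decision rule.

```python
def osa_risk_band(score: int) -> str:
    """Map a STOP-Bang score (0-8) to a coarse risk band."""
    if not 0 <= score <= 8:
        raise ValueError("STOP-Bang score must be between 0 and 8")
    if score <= 2:
        return "low"           # high NPV for moderate or severe OSA
    if score >= 5:
        return "high"          # high probability of moderate-to-severe OSA
    return "intermediate"      # assumed label for scores 3-4


for s in range(9):
    print(s, osa_risk_band(s))
```

How patients in each band are managed would still depend on co-morbidities and the type of surgery, as noted above.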
Being able to predict moderate-to-severe or severe OSA in the perioperative setting is clinically relevant so that clinicians can take the appropriate steps in mitigating the risk of perioperative complications associated with OSA (e.g. changes in anesthetic care, careful titration of opioids, CPAP administration and postoperative monitoring).
In a retrospective study, Proczko et al. [48] compared the outcomes of patients undergoing bariatric surgery who had undergone preoperative PSG and were on CPAP therapy to those considered high risk for OSA based on a STOP-Bang score ≥3 without preoperative PSG. Patients with a STOP-Bang score ≥3 had more postoperative complications and an increased length of stay (LOS) compared to patients with OSA using CPAP therapy perioperatively and compared to patients with a STOP-Bang score of 0-2. This study was in line with others that found that patients with a STOP-Bang score ≥3 versus 0-2 had more postoperative complications and longer LOS. [36][42] In a preoperative setting, a high STOP-Bang score may help in risk stratification and obviate the need for a PSG [62,63]. Moreover, perioperative CPAP therapy may reduce hospital LOS. [64] Therefore, identifying and treating patients at high risk for moderate or severe OSA may help to avoid perioperative complications. Further research is needed in this area.
The variation in the predictive parameters among the different populations may be due to the difference in sample sizes, age and gender discrepancies of the recruited patients, differences in associated co-morbidities, or cultural / racial differences. In a study by Kunisaki et al. [50], a STOP-Bang score ≥3 showed a high sensitivity of 99%, but a low specificity of 5%, which may be due to the predominantly older male population. The specific combination of predictive factors in the STOP-Bang questionnaire may improve its specificity. For patients with a STOP score ≥2, male gender, a BMI >35 kg/m² and a neck circumference >40 cm were more predictive of OSA than age [65]. The specificity of the STOP-Bang questionnaire may be improved by the addition of serum bicarbonate levels. [66] Most of the studies in our meta-analysis were from the sleep clinic population, where the prevalence of OSA is higher. Further studies are required in a variety of medical, surgical, and general populations.
This systematic review and meta-analysis has some limitations. One of the factors contributing to the moderate to high heterogeneity is the variability of the target populations among the different studies. In an effort to unify the subject populations, all studies were divided into major groups: sleep clinic, surgical and general populations. The other reason for the heterogeneity may be variation in the prevalence of OSA in the different populations. Also, there was a paucity of validation studies in surgical patients. Nonetheless, we used the random effects method, which is more suitable when heterogeneity exists. The other limitation is that a non-English study was excluded even though it showed high sensitivity for the STOP-Bang questionnaire. [40] There is significant correlation between sensitivity and specificity with clinical screening tools, and our statistical approach does not account for overestimation of overall diagnostic test accuracy related to interpretation of each measure individually. More advanced methods utilizing bivariate or Bayesian frameworks may be necessary to address this limitation. Although DOR provides a combined measure across both sensitivity and specificity, it may significantly underestimate the confidence intervals. Despite these limitations, our systematic review and meta-analysis provides the interpretation of the available literature on the STOP-Bang questionnaire as a screening tool in OSA patients.
In summary, the STOP-Bang questionnaire has been validated as an excellent screening tool for OSA in the sleep clinic and surgical populations. The probability of moderate and severe OSA steadily increases with higher STOP-Bang scores. The high negative predictive value of the STOP-Bang questionnaire indicates that patients who screen negative are unlikely to have moderate-to-severe OSA. These characteristics make the STOP-Bang questionnaire a useful clinical tool to identify patients at high risk of OSA and can facilitate the diagnosis and treatment of unrecognized OSA.