Accuracy of McMonnies Questionnaire as a Screening Tool for Chinese Ophthalmic Outpatients

Objective To evaluate the accuracy of the McMonnies questionnaire (MQ) as a screening tool for dry eye (DE) among Chinese ophthalmic outpatients. Methods We recruited 27718 cases from 94 hospitals (research centers), randomly selected from 45 cities in 23 provinces, from July to November 2013. Only symptomatic outpatients, who were therefore at high risk of DE, were included. Outpatients meeting the criteria filled out the questionnaire and then underwent clinical examinations by qualified medical practitioners. The accuracy of the questionnaire in diagnosing dry eye was assessed mainly by sensitivity, specificity, diagnostic odds ratio (DOR), and area under the receiver-operating characteristic curve (AUC). Results Across all subjects included in the study, sensitivity, specificity, and DOR were 0.77, 0.86 and 20.6, respectively. AUC was 0.865 (95% CI: 0.861, 0.869). The prevalence of DE among outpatients reporting “constantly” as the frequency of symptoms was over 90%. Scratchiness was a more accurate diagnostic indication than dryness, soreness, grittiness or burning. Different cut points of the McMonnies Index (MI) score can be used to optimize the screening results. Conclusions The MQ can be an effective screening tool for dry eye, and the MI score can be used to full advantage during the screening process.


Introduction
The most accepted definition of Dry Eye Disease (DED) was provided by the International Dry Eye Workshop in 2007, referring to it as a multifactorial disease of the tears and ocular surface that results in symptoms of discomfort, visual disturbance, and tear film instability, with potential damage to the ocular surface. It is accompanied by increased osmolarity of the tear film and inflammation of the ocular surface [1]. This definition includes six sequelae: visual compromise, symptoms of discomfort, ocular surface damage, tear film instability, inflammation and increased osmolarity, making dry eye a disease diagnosed by both clinical symptoms and signs.
The prevalence of dry eye reported in large epidemiological studies varies from 5.5% to 33.7%, and Asians are more susceptible than Caucasians [2][3][4][5][6][7][8]. However, different diagnostic criteria have been applied in these studies. At present, the diagnosis of dry eye is based on clinical tests and questionnaires, but thus far there is no "gold standard". No single clinical test can be used as a standard criterion for diagnosis, nor has a combination of clinical tests been universally accepted to differentiate DE from healthy eyes.
In spite of the subjective nature of self-reported symptoms, they are more reliable and repeatable than objective clinical tests in detecting dry eye [9]. MQ has been found to be a useful screening instrument, providing valid sensitivity and specificity information [10]. This questionnaire is composed of 14 questions focusing on the risk factors for DED. Categories of assessment include demographic information (gender and age), dry eye symptoms, previous and current dry eye treatments, secondary symptoms (associated with environmental stimuli), systemic conditions (Sjögren syndrome, arthritis, thyroid disease), and dryness of the mucous membranes (chest, throat, mouth or vagina) [11].
Although some studies [12][13][14][15][16] have reported correlations among symptoms and clinical tests, and some researchers [16,17] have validated the questionnaire in some populations, the MQ has not previously been formally assessed as a screening instrument for detecting DE in Chinese ophthalmic outpatients.
Moreover, few studies have paid attention to using MI scores to optimize the screening results. As the outpatients in this study were at high risk of DE, we lowered the MI cut-offs to maximize sensitivity and compared the accuracy under different circumstances. We also examined the distributions of the DE and non-DE groups according to MI scores.

Outpatient recruitment
Ninety-four hospitals (research centers) were randomly selected from 45 cities in 23 provinces from July to November 2013. From these hospitals (research centers), we recruited 27718 outpatients from ophthalmic clinics in order of registration. The inclusion criterion was the presence of at least one of six symptoms: dry sensation, foreign body sensation, burning sensation, eyesight fatigue, discomfort, and vision fluctuation. Outpatients with other eye diseases such as conjunctivitis, glaucoma, and ocular trauma were excluded. The remaining outpatients filled in the MQ and underwent clinical examinations, including tear breakup time (TBUT) tests, Schirmer I tests and fluorescein staining, performed by trained medical practitioners. This survey was approved by the Institutional Review Board (IRB) of Fudan University. The investigation was conducted in strict accordance with the principles expressed in the Declaration of Helsinki. Practitioners explained the details and procedures of the study to all patients before the questionnaire and clinical tests. Oral consent was sought from all subjects in advance. Participants who did not agree were excluded; in this way we documented and ensured consent from all patients.
Both the collection and the analysis of the data were anonymous, which is why we used oral consent instead of written consent. In addition, the clinical tests did not cause any physical harm to patients. We believe that nothing in this survey compromised the health, safety or privacy of patients.

McMonnies Index
The full version of the McMonnies questionnaire, with the full set of weighting scores for each question, is available in S1 Appendix. Responses are scored using a weighted-point assignment "based on clinical experience", and the weighted scores are summed to give an overall "Index" [18]. The Index ranges from 0 to 45, where a higher score is regarded as more indicative of DED [19]. A cut-point of greater than 14.5 is recommended for a dry eye diagnosis [19].
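The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the item scores are invented, and the actual per-question weights are those given in S1 Appendix, which are not reproduced here.

```python
from typing import Iterable

MI_CUT_POINT = 14.5  # cut-point recommended in the text [19]


def mcmonnies_index(item_scores: Iterable[float]) -> float:
    """Sum already-weighted item scores into an overall Index."""
    return float(sum(item_scores))


def screens_positive(index: float, cut_point: float = MI_CUT_POINT) -> bool:
    """Return True when the Index is greater than the cut-point."""
    return index > cut_point


# Hypothetical weighted scores for one respondent (NOT taken from S1 Appendix).
example_scores = [1, 2, 0, 4, 3, 0, 2, 1, 0, 1, 1, 0, 2, 0]
example_index = mcmonnies_index(example_scores)
print(example_index, screens_positive(example_index))
```

Keeping the cut-point as a parameter makes it straightforward to re-run the same screening rule at other thresholds.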

Diagnosis of dry eye disease
Diagnosis was established according to a consensus of Chinese dry-eye diagnostic criteria from the Chinese Medical Association as follows: (1) presence of at least one of the six symptoms: dry sensation, foreign body sensation, burning sensation, eyesight fatigue, discomfort and vision fluctuation; (2) TBUT ≤ 5 s or Schirmer I test (without anesthesia) ≤ 5 mm/5 min; (3) positive fluorescein staining accompanied by one of the following results: 5 s < TBUT ≤ 10 s or 5 mm/5 min < Schirmer I test (without anesthesia) ≤ 10 mm/5 min. The presence of (1) was essential for disease diagnosis. Subjects showing a combination of (1) and (2), or of (1) and (3), were diagnosed with DED.
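As a rough sketch, the decision rule can be written as a small Python function. The class and field names are hypothetical, and the comparison operators are an assumption: upper bounds are taken as inclusive (at most 5 and at most 10), which should be checked against the consensus document itself.

```python
from dataclasses import dataclass


@dataclass
class Exam:
    has_symptom: bool        # criterion (1): at least one of the six symptoms
    tbut_s: float            # tear breakup time, in seconds
    schirmer_mm: float       # Schirmer I (without anesthesia), mm per 5 min
    staining_positive: bool  # fluorescein staining result


def is_dry_eye(exam: Exam) -> bool:
    """Apply criteria (1)+(2) or (1)+(3); thresholds assumed inclusive."""
    if not exam.has_symptom:
        return False  # criterion (1) is essential
    # Criterion (2): TBUT <= 5 s or Schirmer I <= 5 mm/5 min.
    criterion_2 = exam.tbut_s <= 5 or exam.schirmer_mm <= 5
    # Criterion (3): positive staining plus a borderline TBUT or Schirmer I.
    borderline = (5 < exam.tbut_s <= 10) or (5 < exam.schirmer_mm <= 10)
    criterion_3 = exam.staining_positive and borderline
    return criterion_2 or criterion_3


print(is_dry_eye(Exam(True, 8.0, 12.0, True)))
```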

Statistical analysis
Data analysis was performed using SPSS 19.0 software. Student's t-tests and ANOVA tests were used for quantitative variables. The Chi-squared test was used for qualitative variables. Trend tests were conducted to verify whether there were ascending or descending trends in quantitative variables. Values of p<0.05 were considered statistically significant. The main measures used to assess accuracy in detecting DED were sensitivity, specificity, DOR, and AUC. The 95% confidence intervals for the AUC were also evaluated.
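For readers who want to reproduce these measures outside SPSS, the following is a minimal Python sketch that derives them from a fourfold table. The function name and the counts are hypothetical; the counts were chosen only to mirror rates of 0.77 and 0.86 and are not the study data. When any cell is zero, the DOR is approximated by adding 0.5 to every cell, the correction cited in this paper [20,21].

```python
from typing import Dict


def accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Screening accuracy from a fourfold table (both groups must be non-empty)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    # Likelihood ratios are undefined when their denominator is zero.
    lr_pos = sens / (1 - spec) if fp else float("inf")
    lr_neg = (1 - sens) / spec if tn else float("inf")
    # DOR = (TP * TN) / (FP * FN); add 0.5 to every cell if any cell is zero.
    a, b, c, d = float(tp), float(fp), float(fn), float(tn)
    if 0 in (tp, fp, fn, tn):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    dor = (a * d) / (b * c)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts per 100 DE and 100 non-DE subjects (illustration only).
example = accuracy_metrics(tp=77, fp=14, fn=23, tn=86)
print({name: round(value, 3) for name, value in example.items()})
```

Note that the DOR equals the ratio of the positive to the negative likelihood ratio, so the three quantities can be cross-checked against each other.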

Results
Overall, the sensitivity, specificity, false negative rate, and false positive rate of MQ were 0.77, 0.86, 0.23, and 0.14, respectively. The positive likelihood ratio, the negative likelihood ratio and DOR were 5.47, 0.27, and 20.6, respectively. AUC was 0.865. The 95% CI of AUC was (0.861, 0.869). A fourfold table of diagnosis results across the study population can be found in S1 Text.
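The AUC can be read as the probability that a randomly chosen DE subject has a higher MI score than a randomly chosen non-DE subject. The sketch below computes it that way (the Mann-Whitney formulation) and attaches an approximate confidence interval using the Hanley and McNeil standard error. This is an assumption for illustration: the paper does not state which interval method SPSS applied, the function names are hypothetical, and the scores are invented.

```python
import math
from typing import Sequence, Tuple


def auc_mann_whitney(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as P(score of a DE subject > score of a non-DE subject); ties count 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int,
                     z: float = 1.96) -> Tuple[float, float]:
    """Approximate CI for the AUC (Hanley-McNeil standard error, clipped to [0, 1])."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (auc * (1 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical MI scores (illustration only, not study data).
de_scores = [18, 22, 15, 12, 25]
non_de_scores = [8, 12, 10, 16, 6]
auc = auc_mann_whitney(de_scores, non_de_scores)
print(auc, hanley_mcneil_ci(auc, len(de_scores), len(non_de_scores)))
```

The pairwise loop is quadratic in the sample size, so for a data set of this study's scale a rank-based implementation would be preferable.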
Demographic data, average MI scores, rates of MI>14.5 (i.e., a positive diagnosis by MQ) and positive rates as diagnosed by the gold standard adopted in this study are summarized in Table 1. A significantly higher average score was observed among females than among males, 16.3 versus 13.7 (p<0.01). A rising trend (p<0.01) in average scores with age was observed. Among the symptom groups, the highest average scores and positive diagnostic rates using both methods (p<0.05) were observed among outpatients who reported dryness. By number of symptoms, the highest average scores and positive diagnostic rates using both methods (p<0.05) were observed among outpatients who reported more than three symptoms. The prevalence of DE among outpatients reporting "constantly" as the frequency of symptoms was 94.5%. Table 2 gives a detailed evaluation of the MQ among different subgroups. Sensitivity rose with age across the <25y, 25-45y and >45y age groups, while the reverse was true for specificity, DOR and AUC.
With respect to the symptom reported, a much higher DOR (32.4) was observed for the group reporting scratchiness than for any other group. The AUC of the scratchiness group (0.855) was smaller than that of the dryness group (0.858), but larger than those of the grittiness (0.835), soreness (0.807) and burning (0.822) groups.
As the number of symptoms increased, sensitivity increased, while specificity generally tended to fall. The greatest DOR (59.9) and largest AUC (0.896) were observed for the group reporting no symptoms. However, we did not find a trend in DOR or AUC as the number of symptoms increased.
As the frequency of symptoms increased, sensitivity increased, while specificity showed the opposite trend. The greatest DOR was observed in the group reporting symptoms "sometimes" (18.2), followed by "often" (14.7), "never" (12.7) and "constantly" (5.07). The highest AUC was found in the "often" group (0.843), followed by "sometimes" (0.822), "never" (0.764) and "constantly" (0.708).
Table 3 shows sensitivity, specificity and DOR at different cut-offs for different subgroups. In general, sensitivity rose as the MI cut-off was lowered in each classification, while specificity and DOR tended to fall. The trend of DOR was, however, uncertain among the subgroups defined by frequency of symptoms.
The ROC curve is plotted in Fig 1. The peak Youden's index (0.625) was found at an MI cut-off of 14.5, with a sensitivity of 0.766 and a specificity of 0.860 (Table 4). AUC was 0.865 with a 95% CI from 0.861 to 0.869.
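Youden's index is sensitivity plus specificity minus one, and the cut-off that maximizes it is a common choice of operating point on the ROC curve. The following Python sketch scans a list of candidate cut-offs; the function names and the ten scores are hypothetical and serve only to show the mechanics, with a score counted as screen-positive when it is greater than the cut-off.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float, float]  # (cut-off, sensitivity, specificity, J)


def youden_by_cutoff(scores: Sequence[float], is_de: Sequence[bool],
                     cutoffs: Sequence[float]) -> List[Row]:
    """Sensitivity, specificity and Youden's J at each candidate cut-off."""
    n_de = sum(1 for d in is_de if d)
    n_non = len(is_de) - n_de
    rows: List[Row] = []
    for c in cutoffs:
        # True positives: DE subjects scoring above the cut-off.
        tp = sum(1 for s, d in zip(scores, is_de) if d and s > c)
        # True negatives: non-DE subjects scoring at or below the cut-off.
        tn = sum(1 for s, d in zip(scores, is_de) if not d and s <= c)
        sens = tp / n_de
        spec = tn / n_non
        rows.append((c, sens, spec, sens + spec - 1))
    return rows


def best_cutoff(rows: Sequence[Row]) -> Row:
    """Return the row with the largest Youden's J."""
    return max(rows, key=lambda r: r[3])


# Hypothetical MI scores: first five non-DE, last five DE (illustration only).
scores = [5, 8, 11, 13, 16, 9, 12, 15, 18, 22]
is_de = [False] * 5 + [True] * 5
rows = youden_by_cutoff(scores, is_de, [6.5, 10.5, 14.5, 18.5])
print(best_cutoff(rows))
```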
An obvious overlap between the DE and non-DE groups was found for MI scores between 10 and 14 in Fig 2. DE subjects were concentrated in the range from 14 to 24, while non-DE subjects were mainly concentrated in the range from 6 to 14.

Discussion
Symptom assessment is a critical component of the diagnosis of dry eye [22][23][24] and can be a very effective screening tool. Supporting the potential utility of this approach, one study reported that screening based on symptoms alone could better discriminate DE from non-DE than screening that combined symptoms and diagnosed signs [25]. For screening or research based on large populations, methods for diagnosing dry eye must be economically viable, noninvasive and brief, making questionnaires a favorable option. Many questionnaires have been used in epidemiological studies, serving as screening tools [3,4,6,8,26,27,28] or to grade the severity of dry eye [29][30][31].
In summary, this study indicated good accuracy of the questionnaire in distinguishing DE from non-DE, supported by the sensitivity, specificity, DOR and AUC results. Reported sensitivity and specificity have varied across studies. Discriminant analyses in one investigation by McMonnies [10] reported a sensitivity of 98% and a specificity of 97%. These results were shown to be biased estimates because they were derived from the same data from which the classification process was developed [32]. Later, in another study by McMonnies [19], sensitivity and specificity were found to be 92% and 93%, respectively. The study by Kelly K. Nichols [18] yielded a sensitivity of 82% and a specificity of 36%, a comparatively low specificity. Another large-scale epidemiological study [27] focusing on US women reported a sensitivity of 77% and a specificity of 86%, which were quite close to the results of our study. There are several possible reasons for the divergence of our results from those of some existing investigations. Firstly, our inclusion criterion was ophthalmic outpatients with at least one of the six symptoms, implying that these subjects may be at high risk for dry eye. Secondly, the prevalence of dry eye differs among races, and we focused on a different race from previous studies. Thirdly, throughout our study, a strict gold standard was employed for prudent diagnosis. The peak Youden's index was found at the MI cut-point of 14.5, in accordance with another study focusing on the white race [19], suggesting that this cut-point is also suitable for Chinese ophthalmic outpatients.
Table notes: "Symptom reported" refers to outpatients reporting one single symptom. If a fourfold table contains 0, the DOR is undefined; in this circumstance an approximation of the DOR is obtained by adding 0.5 to all counts in the table [20,21].
Although the cut-point of 14.5 is used as a diagnostic criterion in general practice, MI scores may carry more diagnostic information and could offer potential advantages. For instance, in one study [16] MI was used to divide people into normal (MI<10), moderate dry eye (10≤MI≤20) and severe dry eye (MI>20) groups, implying that MI could reflect disease severity to some degree. On the other hand, a negative result on a screening test with very high sensitivity can rule the disease out (SnOUT) [33]. Therefore, in order to increase the referral rates of DE patients to clinical assessment, we can lower MI thresholds to maximize sensitivity. In Table 3, sensitivity became extremely high when we lowered the MI cut-off to 6.5, implying that few DE patients would be missed in this situation. The sensitivity even reached 100% at the cut-off of 6.5 for the group reporting the frequency of symptoms as "constantly" in this sample. Fig 2 suggests that the MQ shows unsatisfactory diagnostic capacity when MI scores range from 10 to 14. We tentatively propose that the accuracy of the MQ decreases as MI approaches the cut-point of 14.5. In another investigation by McMonnies [19], 10≤MI≤20 was defined as an equivocal classification group, due to an overlap of DE and non-DE subjects, which differs from Fig 2. The discrepancy is probably a consequence of the different proportions of DE and non-DE subjects in the two studies. McMonnies [19] has even suggested that subjects with MI scores between 10 and 20 should be removed from a study when MQ diagnosis results are involved. However, in this survey, all eligible subjects were included because the objective of this study was to assess the accuracy of the MQ under actual outpatient conditions.
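The idea of lowering the threshold until a target sensitivity is reached can be sketched as follows. This is a hypothetical helper, not a procedure described in the paper: it takes the MI scores of confirmed DE subjects, tries candidate cut-offs from highest to lowest, and returns the first one whose sensitivity meets the target, so that specificity is sacrificed no more than necessary. The scores and the target values are invented.

```python
from typing import Optional, Sequence, Tuple


def highest_cutoff_meeting_sensitivity(
    de_scores: Sequence[float],
    cutoffs: Sequence[float],
    target: float,
) -> Optional[Tuple[float, float]]:
    """Return (cut-off, sensitivity) for the highest cut-off reaching the target."""
    for c in sorted(cutoffs, reverse=True):
        # Sensitivity: share of DE subjects scoring above the cut-off.
        sens = sum(1 for s in de_scores if s > c) / len(de_scores)
        if sens >= target:
            return c, sens
    return None  # no candidate cut-off is sensitive enough


# Hypothetical MI scores of confirmed DE subjects (illustration only).
de_scores = [9, 12, 15, 18, 22, 7, 16, 20, 11, 14]
print(highest_cutoff_meeting_sensitivity(de_scores, [6.5, 10.5, 14.5], 0.9))
```

In practice the chosen cut-off should also be checked against the resulting specificity, since a very low threshold refers many non-DE subjects for clinical assessment.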
The accuracy of the questionnaire decreases with age, as shown by the DORs and AUCs in Table 4. We also found the greatest DOR for the group reporting scratchiness, implying that this symptom may be more reliable than the other four symptoms in detecting DE (Table 4). The MQ performed best in the group reporting no symptoms, with a greater DOR (59.9) than in outpatients with symptom(s).
Based on the discussion above, we put forward some suggestions on the use of the MQ in screening for DE. Firstly, to avoid missing DE patients, the MI score threshold can be lowered when necessary. Secondly, the prevalence of DE in different subgroups can inform the screening process. For instance, as the prevalence of DE among outpatients reporting "constantly" as the frequency of symptoms was over 90% in this study, we suggest referring all such outpatients for clinical assessment. Thirdly, special attention should be paid to the subgroups in which the MQ proved more accurate in diagnosis.
This survey has some limitations. Firstly, the study was not designed to test reliability and validity. We therefore did not recruit the outpatients on two occasions and could not assess test-retest reliability or validity. Secondly, disease severity was not taken into consideration. Finally, the generalizability of our conclusions is restricted by the differing gold standards applied in different studies, which is a universal problem among dry eye surveys.
The major strength of this survey was its large sample size. Compared with other parallel studies [18,34], which recruited fewer than 300 subjects each, this epidemiological study recruited 27718 outpatients. Secondly, the representativeness of the sample was supported by the recruitment of subjects from different provinces nationwide. Finally, the assessment of the MQ in a large sample of Chinese outpatients has not previously been reported, so this study fills an important gap in the relevant research area.
To conclude, the MQ is an effective screening tool for DED in Chinese outpatients. Based on the results obtained from this epidemiological study, ophthalmologists can employ the questionnaire during the process of preliminary diagnosis. The detailed results will further assist them in determining more valuable and accurate diagnostic information. In addition, epidemiologists can apply it in large population screening, dramatically reducing the cost. Further studies of the assessment of MQ are warranted to evaluate the relationship between the disease severity and the MI.
Supporting Information
S1 Appendix. The full version of the McMonnies questionnaire. (PDF)
S1 Data. Minimal data set used to reach the conclusions drawn in the manuscript with related metadata and methods. (XLSX)
S1 Text. A fourfold table of diagnosis results across the study population. (PDF)
S2 Text. The complete table for sensitivity, specificity and Youden's Index for different MI scores. (PDF)