Is Sjögren’s syndrome dry eye similar to dry eye caused by other etiologies? Discriminating different diseases by dry eye tests

Purpose Dry Eye Disease (DED) is part of several conditions, including Sjögren’s syndrome (SS), and there is no single test to diagnose DED. The present study evaluates whether a set of signs and symptoms of DED can distinguish: a) SS from other non-overlapping systemic diseases related to DED; b) primary SS (SS1) from secondary SS (SS2). Methods 182 consecutive patients with DED were evaluated in five groups: SS, graft-versus-host disease (GVHD), Graves' orbitopathy (GO), diabetes mellitus (DM), and glaucoma under treatment with benzalkonium chloride medications (BAK). Twenty-four healthy subjects were included as the control group (CG). The evaluation consisted of the Ocular Surface Disease Index (OSDI), Schirmer test (ST), corneal fluorescein staining (CFS) and tear film break-up time (TFBUT). In addition, a subset of DED patients (n = 130), classified as SS1, SS2 and non-SS (NSS) by the American-European Criteria, was compared. Quadratic discriminant analysis (QDA) classified the individuals based on the variables collected. The area under the Receiver Operating Characteristic (ROC) curve evaluated the classification performance in both comparisons. Results Comparing SS with other diseases, QDA showed that the most important variable for classification was OSDI, followed by TFBUT and CFS. Combined, these variables correctly classified 62.6% of subjects in their actual group. According to the area under the ROC curve, the best-classified group was the control (97.2%), followed by DM (95.5%) and SS (92.5%). DED tests were different among the NSS, SS1 and SS2 groups. The analysis revealed that the combined tests correctly classified 54.6% of the patients in their groups. The area under the ROC curve was highest for NSS (79.5%), followed by SS2 (74.4%) and SS1 (69.4%). Conclusions Diseases that cause DED, as well as SS1, SS2 and NSS, are distinguishable conditions; however, a single ocular tool was not able to detect the differences among the respective groups.

of the major challenging aspects of DED: 1) the discrepancy between symptoms and signs [23], 2) the difficulty of making a diagnosis using a single tool or exam [24][25][26][27], and 3) the lack of tools to assess the impact of a treatment for DED, resulting in weak or absent conclusions in most systematic reviews and clinical trials [28][29][30].
The hypothesis of this study is that different diseases (etiologies) related to DED perform distinguishably in different DED tests, in particular SS compared with other diseases, and SS1 compared with SS2 and non-SS. Each condition should therefore be monitored by the most sensitive/specific parameter(s).
Our aim is therefore to investigate whether a set of signs and symptoms of DED can distinguish the common, non-overlapping systemic diseases related to DED, in particular SS, and to identify the group of tests that best identifies DED in each condition.

Subjects
The study was approved by the Faculty of Medicine Ethics Committee (CAAE: 37688914.2.0000.5440), University of Sao Paulo and was conducted in accordance with the Declaration of Helsinki and current legislation on clinical research. Written informed consent was obtained from all subjects after explanation of the procedures and study requirements.
Comparison of SS and other causes of DED. This cross-sectional study analyzed the performance of the tests currently used for the diagnosis of DED among individuals belonging to five groups of diseases and a control group. The methods of enrollment and the demographic and clinical data of those cases were published elsewhere, in a study that reported the sensitivity, specificity and positive predictive value of DED tests, alone and in combination, for DED individuals regardless of cause and in each of those different diseases [31]. One hundred eighty-two DED patients were recruited among consecutive patients attending the outpatient DED clinic of a referral university hospital. Individuals with the following confirmed and non-overlapping diagnoses associated with DED were included: SS diagnosed following the American-European Criteria [5], GVHD, GO, chronic exposure to BAK-preserved hypotensive drugs for glaucoma for at least one year, or DM without diabetic retinopathy (confirmed by fasting glycaemia and indirect fundoscopy).
The DED diagnostic criteria, supported by different clinical studies, were the following: Ocular Surface Disease Index (OSDI) score > 20 and/or Schirmer test without anesthesia (ST) < 10 mm or tear film break-up time (TFBUT) ≤ 6 seconds and/or any vital staining score > 3. A DED diagnosis was assigned if the patient presented at least one positive test according to these parameters [32][33][34][35]. Patients were separated into five subgroups based on their disease (i.e., SS, GVHD, GO, DM without retinopathy, or chronic glaucoma treatment with BAK-preserved eye drops), which were compared throughout the study.
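The inclusion rule above can be sketched in a few lines of Python. This is only an illustration of the "at least one positive test" logic: the class and function names are hypothetical, and reading the TFBUT cut-off as "less than or equal to 6 seconds" is an assumption.

```python
from dataclasses import dataclass

# Cut-offs taken from the criteria in the text.
OSDI_CUTOFF = 20       # positive if OSDI score > 20
SCHIRMER_CUTOFF = 10   # positive if Schirmer test < 10 mm
TFBUT_CUTOFF = 6       # positive if TFBUT <= 6 s (comparison sign is an assumption)
STAINING_CUTOFF = 3    # positive if vital staining score > 3


@dataclass
class EyeExam:
    """Hypothetical container for one subject's test results."""
    osdi: float
    schirmer_mm: float
    tfbut_s: float
    staining: float


def positive_tests(exam):
    """Return the names of the tests that meet their DED criterion."""
    checks = {
        "OSDI": exam.osdi > OSDI_CUTOFF,
        "Schirmer": exam.schirmer_mm < SCHIRMER_CUTOFF,
        "TFBUT": exam.tfbut_s <= TFBUT_CUTOFF,
        "staining": exam.staining > STAINING_CUTOFF,
    }
    return [name for name, hit in checks.items() if hit]


def has_ded(exam):
    """DED is assigned when at least one test is positive."""
    return len(positive_tests(exam)) >= 1
```

For example, `has_ded(EyeExam(osdi=25, schirmer_mm=15, tfbut_s=12, staining=0))` is true because the OSDI score alone exceeds its cut-off.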
Twenty-four healthy individuals with a similar age range and sex distribution were analyzed as the control group (CG). In this control group, subjects reporting ocular infection or allergy, ocular surgery or contact lens wear, pregnancy or lactation, or conditions that clinically overlap with the aforementioned diseases were excluded.
Comparison among SS1, SS2 and non-SS DED. In another cross-sectional study, involving one hundred thirty individuals with DED classified as SS1, SS2 and non-SS by the 2002 American-European Criteria, the groups were compared to investigate whether their performance in the DED tests was similar or whether a single test or a combination of tests could distinguish them. The evaluation included the above-mentioned ocular tests, and the same criteria were applied to label the individuals as having DED.

Instrumentation and procedures
Evaluation of DED included the OSDI questionnaire [36], tear film break-up time (TFBUT), corneal fluorescein staining score (CFS) and Schirmer test (ST), as described below and according to the suggested sequence [37].
OSDI. The OSDI is a widely used subjective symptom questionnaire that scores the frequency of DED symptoms. A validated Portuguese-language version was used [36,38].
Tear film break-up time. Tear film break-up time (TFBUT) was measured after instilling 5 μl of a 2% sodium fluorescein solution (Allergan, Guarulhos, SP, Brazil) over the ocular surface and allowing it to spread for 30 seconds. The value for each patient was obtained from the average of three consecutive break-up times in seconds.
Corneal fluorescein staining. Corneal fluorescein staining was graded immediately after TFBUT, by observing the 2% sodium fluorescein solution impregnated in the cornea under cobalt blue illumination and following the 15-point NEI/Industry scale, which assigns grades of 0-3 to five regions of the cornea.
Schirmer test. Tear flow was measured in both eyes with a filter paper Schirmer test strip (Ophthalmos Ltd., São Paulo, SP, Brazil) for 5 minutes without anesthetic.
Two of the three investigators (i.e., MA, JF and EMR) performed all measurements. For each test, the more abnormal result of the two eyes was used in the discriminant analysis.

Statistical analysis
Descriptive statistics were reported as mean ± standard deviation (SD). Differences were considered significant at p < 0.05. A multivariate analysis was performed using quadratic discriminant analysis (QDA) to classify the subjects into the pre-existing groups based on the ocular variables collected. The QDA classifier produced a group of five discriminant functions, and each subject was classified according to the cut-off point. The assumptions required to apply QDA were tested as described by Hair [39]. To detect potential problems with multicollinearity, the pooled within-group correlation was tested between all variables. As recommended, all the correlation coefficients were lower than 0.8 [39].
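As a rough illustration of how a quadratic discriminant classifier assigns a subject to a group, the sketch below fits a separate mean and covariance per group and picks the group with the highest quadratic score. It is not the JMP procedure used in the study: it handles only two variables, takes priors proportional to group size (an assumption), and uses invented toy values.

```python
import math


def fit_group(rows):
    """Mean vector and 2x2 sample covariance for one group of (x, y) rows."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def qda_score(point, mean, cov, prior):
    """Quadratic discriminant score; a higher score means a better fit."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * maha + math.log(prior)


def classify(point, groups):
    """Assign the point to the group with the largest quadratic score."""
    total = sum(len(rows) for rows in groups.values())
    scores = {}
    for name, rows in groups.items():
        mean, cov = fit_group(rows)
        scores[name] = qda_score(point, mean, cov, len(rows) / total)
    return max(scores, key=scores.get)


# Hypothetical (TFBUT in seconds, CFS score) pairs, invented for illustration.
toy_groups = {
    "control": [(12.0, 0.0), (10.0, 1.0), (11.0, 0.0), (13.0, 1.0), (9.0, 2.0)],
    "dry_eye": [(4.0, 6.0), (3.0, 9.0), (5.0, 5.0), (2.0, 8.0), (6.0, 7.0)],
}
```

Because each group keeps its own covariance matrix, the decision boundary between groups is curved rather than linear, which is what distinguishes QDA from linear discriminant analysis.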
Additionally, a receiver operating characteristic curve (ROC) was built to determine the area under the curve (AUC). The AUC was used to evaluate the classification performance of individuals in different diseases [40]. All analyses were performed using JMP Pro 10.0 (SAS Institute, Cary, North Carolina).
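With more than two groups, an AUC is usually computed for each group against all the others. The sketch below shows one common way to do this, using the rank-based (Mann-Whitney) form of the AUC; whether JMP computes it exactly this way is not stated in the text, and the function name and toy data are hypothetical.

```python
def auc_one_vs_rest(scores, labels, target):
    """AUC for separating `target` from all other groups.

    `scores` holds each subject's membership score for `target`
    (for example a posterior probability); `labels` holds the true group.
    """
    pos = [s for s, g in zip(scores, labels) if g == target]
    neg = [s for s, g in zip(scores, labels) if g != target]
    if not pos or not neg:
        raise ValueError("need subjects both inside and outside the group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # target subject ranked above non-target subject
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for membership in the "control" group.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.2]
toy_labels = ["control", "control", "SS", "control", "SS", "GVHD"]
toy_auc = auc_one_vs_rest(toy_scores, toy_labels, "control")
```

Read this way, the AUC is the probability that a randomly chosen member of the group receives a higher membership score than a randomly chosen non-member.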

Comparison of SS and other causes of DED
Dry eye tests from two hundred and six individuals were analyzed. One hundred eighty-two DED subjects, with a male ratio of 18.1%, and mean age of 51.8±14.2 years, were included in this study. Based on the criteria for DED stated above, the following subgroups were formed: SS (n = 98), GVHD (n = 28), GO (n = 28), DM (n = 11), BAK (n = 17). Twenty-four healthy and non-dry eye volunteers composed the CG, with a male ratio of 29.2%, and mean age of 45.7±12.7 years.
The mean values of the ocular evaluation for DED in SS individuals, in the other groups, and in the CG varied within and among the different groups (Table 1).
In all groups the number of subjects was greater than the number of variables, and only two groups (DM and BAK) had fewer than 20 subjects. Pooled within-group correlation coefficients for all pairs of variables revealed that the highest correlations were between ST and TFBUT (0.47) and between CFS and TFBUT (-0.39) (Table 2).
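A pooled within-group correlation removes the differences between group means before correlating two variables. A minimal sketch of that calculation follows; the function name and toy data are hypothetical, and the 0.8 limit is the threshold cited in the statistical analysis section.

```python
import math


def pooled_within_correlation(groups):
    """Pooled within-group correlation between two variables.

    `groups` maps a group name to a list of (x, y) pairs. Deviations are
    taken from each group's own mean, then summed across groups.
    """
    sxx = syy = sxy = 0.0
    for rows in groups.values():
        n = len(rows)
        mx = sum(x for x, _ in rows) / n
        my = sum(y for _, y in rows) / n
        sxx += sum((x - mx) ** 2 for x, _ in rows)
        syy += sum((y - my) ** 2 for _, y in rows)
        sxy += sum((x - mx) * (y - my) for x, y in rows)
    # The common divisor (N - number of groups) cancels in the ratio.
    return sxy / math.sqrt(sxx * syy)


def is_collinear(r, limit=0.8):
    """Flag a pair of variables whose absolute correlation reaches the limit."""
    return abs(r) >= limit


# Invented data: perfectly related within each group, with shifted group means.
toy = {
    "a": [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)],
    "b": [(10.0, 0.0), (11.0, 1.0), (12.0, 2.0)],
}
toy_r = pooled_within_correlation(toy)
```

By this check, the reported coefficients of 0.47 and -0.39 fall well below the 0.8 limit.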
The QDA produced four discriminant functions (DF) to classify each subject into six groups. The discriminant functions were constructed using unstandardized canonical coefficients. The eigenvalue is the ratio of the between-groups sum of squares to the within-groups (error) sum of squares, and it reflects the effectiveness of the discriminant function. Larger eigenvalues are better, and they are sorted in descending order of importance (Table 3). The first two functions explain 96.9% of the variance, and the remaining functions account for the other 3.1%. The canonical correlation provides the correlation coefficient between the discriminant scores on a function and the groups, and it was used to compare the importance of each discriminant function. The Wilks' Lambda test showed that the discriminant functions significantly explained group membership, except for the last two.
The standardized canonical discriminant function coefficients show the contribution of each variable to the discriminant function and allow variables on different scales to be compared (Table 4). Large standardized coefficients reflect greater discriminant ability of the corresponding variables. Considering all the groups and tests included in the quadratic discriminant analysis (QDA), the most valuable test for classification was TFBUT, followed by CFS, OSDI, and ST.
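For reference, the quantities named in this paragraph are related by the standard formulas of canonical discriminant analysis. The notation below is an assumption introduced here, with \lambda_i denoting the eigenvalue of the i-th discriminant function:

```latex
\lambda_i = \frac{SS_{\mathrm{between},\,i}}{SS_{\mathrm{within},\,i}}, \qquad
\text{proportion of variance}_i = \frac{\lambda_i}{\sum_{j} \lambda_j}, \qquad
R_{c,\,i} = \sqrt{\frac{\lambda_i}{1 + \lambda_i}}, \qquad
\Lambda = \prod_{i} \frac{1}{1 + \lambda_i}
```

Here R_{c,i} is the canonical correlation of the i-th function and \Lambda is Wilks' Lambda, so larger eigenvalues give a canonical correlation closer to 1 and a Wilks' Lambda closer to 0.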
Combined, those variables were able to fit 49% of individuals in their actual group (Table 5). Only two groups had most of their individuals classified into their actual group. The CG had 91.7% correct discrimination, followed by BAK, 72.7%, SS 46.9%, DM 36.4%, GVHD 39.3% and GO 35.7% (Table 5).
The most overlapping conditions were GVHD and SS: 39.3% of patients with GVHD were not discriminated from SS patients. A substantial part of the SS group (20.4%) was misclassified as BAK. GO and BAK also overlapped, and both had cases not properly discriminated from DM using the four tests to diagnose DED (Table 5 and Fig 1).
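The per-group and overall percentages in a classification table of this kind can be derived from the counts of actual versus assigned groups. The sketch below uses invented counts, not the study's data, and a hypothetical function name.

```python
def classification_rates(confusion):
    """Per-group and overall correct-classification rates.

    `confusion[actual][predicted]` is the number of subjects from group
    `actual` that the classifier assigned to group `predicted`.
    """
    per_group = {}
    correct = 0
    total = 0
    for actual, row in confusion.items():
        n = sum(row.values())
        hit = row.get(actual, 0)   # count on the diagonal of the table
        per_group[actual] = hit / n
        correct += hit
        total += n
    return per_group, correct / total


# Invented counts for three groups of unequal size.
toy_confusion = {
    "A": {"A": 8, "B": 2},
    "B": {"A": 5, "B": 5},
    "C": {"A": 1, "B": 1, "C": 2},
}
toy_per_group, toy_overall = classification_rates(toy_confusion)
```

The overall rate is weighted by group size, so a large group influences it more than a small one; this is why per-group rates are reported alongside the overall percentage.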
According to the ROC curve, when the four DED tests were applied to discriminate the six groups, the group with the best discrimination was again the CG with 97.6%, followed by DM with 84.9%, GVHD with 79.4%, BAK with 78.7%, SS with 77.5% and GO with 73.9%. The groups with the highest frequency of overlapping individuals were DM, GO and BAK. Excluding those three groups, the percentage of individuals correctly fitted in their actual groups was 65% (Fig 2).

Comparison among SS1, SS2 and non-SS DED
One hundred thirty individuals were analyzed to investigate whether DED tests can distinguish individuals from the SS1, SS2 and NSS groups. The female ratio was 93.5%, and the mean age was 53.7 ± 14.9 years. Based on the criteria for DED and the American-European Consensus for SS, the following groups were formed: SS1 (n = 57), SS2 (n = 41) and NSS (n = 32).
The mean values of the DED tests in the three groups were significantly different, except for OSDI, which was lower in the NSS group but presented large variability in all the groups (p > 0.05). NSS differed from SS1 in CFS score, from SS2 in TFBUT mean values, and from both SS1 and SS2 in ST (Table 6).
The QDA produced two discriminant functions to classify each subject into three groups. The first discriminant function (DF1) explained 79.4% of the variance.
The quadratic discriminant analysis revealed that the combined tests correctly classified 54.6% of the patients in their groups (Wilks' Lambda = 0.82; p = 0.0015). The most relevant variables were, in order, ST, CFS, TFBUT and OSDI (Fig 3). When the four DED tests were applied to discriminate the three groups, the area under the ROC curve was highest for NSS (79.5%), followed by SS2 (74.4%) and SS1 (69.4%). The highest frequency of overlap was found between SS1 and SS2. The NSS group was correctly classified in 59.4% of cases, while SS1 and SS2 were correctly classified in 49.1% and 58.5% of cases, respectively. SS1 and SS2 were misclassified as NSS in 24.5% and 26.3% of cases, respectively (Fig 4).

Table 3. Eigenvalues of each discriminant function in descending order, from DF1 to DF4. The eigenvalue is the ratio of the between-groups sum of squares to the within-groups (error) sum of squares, and it is a relative measure of how different the groups are on the discriminant function.

Table 5. Distribution and discrimination of diseases related to DED. The bold numbers in each row show the individuals classified in their original group, and the numbers in the other cells of the same row show their misclassification based on the four tests for DED applied to those individuals.

Discussion
The present work shows that commonly used diagnostic tests for DED are capable of discriminating different etiologies associated with this condition. Although the exact relationship between those DED tests and the mechanisms of the disease is not clear [37], this supports the hypothesis that different diseases cause DED through different mechanisms, which may be reflected in the performance of combined tests. DED tests have a long history and some modern achievements, but they in fact measure parameters related to tear film secretion, ocular surface changes and sensation [27,37]. Those tests are known to present variability and overlap between DED and healthy individuals, even when groups of diseases are observed [24,41]. The present work explored discriminant analysis to differentiate distinct groups of DED and the controls, considering the variability of test results and the difficulty of diagnosing DED using a single sign or test. Discriminant analysis is an appropriate multivariate technique that relates multiple independent variables to a categorical dependent variable, and it has been applied in health sciences and DED [41][42][43]. In a prior study this analysis revealed that tear osmolarity alone was able to distinguish dry eye from healthy individuals with an accuracy of 89% [41]. In the first part of the present work, five discriminant functions were constructed to classify the subjects into the six groups based on ocular signs. The Wilks' lambda revealed that the first three functions were significant and explained a total variance of 96.9%.
A higher standardized coefficient, in absolute value, reflects greater discriminant ability of the corresponding variable. Considering the tests used here, it is reasonable that OSDI was the most useful test to discriminate the diseases. DED patients are diagnosed when seeking help for their symptoms. As shown in a previous publication, OSDI had higher mean scores in inflammatory diseases such as SS, GVHD and GO, and lower scores in DM [24]. The cornea is the most innervated surface of the body and provides feedback for the lacrimal secretory system [44,45]. Those diseases trigger changes in the ocular surface environment in distinct ways based on their mechanisms, whether inflammatory (e.g., SS and GVHD), neural or hormonal (e.g., DM and BAK), and are better detected by certain groups of exams.
It is understandable that SS is the best-discriminated disease. SS has a very complex system of diagnosis with six items and several exclusion criteria [5]. The diagnostic process eliminates any conditions with confounding factors. GVHD was called Sjögren's-like syndrome in the past and showed some overlap here, but it has a clear etiology and distinct epidemiological aspects, not explored in this work [46,47].
The DM, GO and BAK groups also presented a substantial overlap, probably due to the multiple factors involved in their DED-related changes, which include neuropathic damage, hormone impairment and environmental factors [11,48-50].
The present analysis also identified redundant variables in certain groups of diseases. CFS and LGCS showed stronger correlation coefficients. Previous authors investigated the combination of markers, but also pointed out the discrepancies among them [51,52].
The higher AUC values revealed in the ROC analysis agreed with the classification distribution. They showed that the CG was the group best identified by the tests and that the three groups with the most overlapping classification were GVHD, GO and BAK. A possible explanation is the broad spectrum of disease among the individuals recruited for this study. Disease duration and therapeutic interventions may have contributed to this finding.
Few studies have used discriminant analysis to address hypotheses on DED [41,53]. In those studies, comparisons were performed between two groups, DED patients and non-DED subjects. Khanal et al. [41] reported an AUC of 0.95, indicating that the discriminant function was a reliable diagnostic tool for dry eye. Kaido et al. [53] applied the discriminant function to distinguish DED patients from non-DED subjects using functional visual acuity and obtained an AUC of 0.735. Although the variables analyzed were different, these results can be compared with ours and considered similar (AUC = 0.98 for controls). It is important to note that in the present study the number of groups is higher (more than two), with different etiologies for DED. Thus, the AUC value should be interpreted here as the ability of the combined signs and symptoms to distinguish a particular group from all others. Therefore, this study does not intend to use dry eye tests to identify which etiology a subject belongs to. On the contrary, we propose a new way of visualizing the distribution of the set of signs and symptoms of dry eye in relation to its etiologies.
Quadratic discriminant analysis has not previously been attempted in the diagnosis of SS. Considering the extensive panel of tests needed to define the diagnosis and the long-term implications of carrying this condition with a proper diagnosis, we considered it useful to identify the ocular tests most sensitive in differentiating SS (both SS1 and SS2) from NSS individuals. Despite all the limitations of ST, its values were significantly different between SS and NSS. Further efforts to identify useful tests may help to improve the diagnostic and follow-up panel.
Some potential limitations of this study must be pointed out. First, the discriminant analysis was based on signs that may vary from day to day and also within the same day. This could compromise the results; however, data were randomly collected and therefore reproduce the real conditions of patients presenting to the outpatient clinic. Indeed, all the groups included both treated and untreated patients. Treatment for the underlying condition may also influence the tests, but such treatment was present in all groups. The second issue is associated with the DED inclusion criteria. The lack of a gold standard and the arbitrary selection of tests and thresholds may induce bias. The authors adopted the most used tests and cut-off values, widely validated in the literature, to avoid constraining the selection of DED patients for each group [31-33,35,37,54].
The intention of this study was to investigate the performance of the set of ocular signs in the different known diseases; therefore no cross-validation was applied.
In conclusion, individuals from five different groups of diseases associated with DED can be fitted into their actual group with moderate accuracy based on traditional DED diagnostic tests. Together, the combined DED tests classified the individuals in their correct group in 41.2 to 69.6% of cases. These findings show that the current DED tests, even when combined, have a limited capacity to discriminate different causes of DED. In the same way, the DED tests distinguished SS1 and SS2 from NSS individuals with DED. Taken together, our results indicate the need for further studies to better identify tests as biomarkers of the physiopathology of DED and, therefore, to evaluate more specific therapies and anticipate the prognosis for DED.