Rasch analysis of the hospital anxiety and depression scale among Chinese cataract patients

Purpose To analyze the validity of the Hospital Anxiety and Depression Scale (HADS) in a Chinese cataract population. Methods A total of 275 participants with unilateral or bilateral cataract were recruited to complete the Chinese version of the HADS. The patients' demographic and ophthalmic characteristics were documented. Rasch analysis was conducted to examine model fit statistics, threshold ordering of the polytomous items, targeting, person separation index and reliability, local dependency, unidimensionality, differential item functioning (DIF) and construct validity of the HADS individual and summary measures. Results Rasch analysis was performed on the anxiety and depression subscales and on the HADS-Total score. The items of the original HADS-Anxiety, HADS-Depression and HADS-Total showed evidence of misfit to the Rasch model. Removing item A7 from the anxiety subscale and rescoring item D14 in the depression subscale significantly improved Rasch model fit. A 12-item higher-order total scale with further removal of D12 was found to fit the Rasch model. The modified items had ordered response thresholds. No uniform DIF was detected, whereas notable non-uniform DIF was found in the high-ability group. Revised cut-off points were derived for the modified anxiety and depression subscales. Conclusion The modified version of the HADS, with HADS-A and HADS-D as subscales and HADS-T as a higher-order measure, is a reliable and valid instrument that may be useful for assessing anxiety and depression states in the Chinese cataract population.

Introduction Cataract is the most common cause of visual impairment in China [1][2][3]. Visual loss and visual disability significantly impact mental health and affect the quality of life in the aging population [4][5][6]. Depression and anxiety are among the major mental health problems in the elderly, especially in visually impaired older adults [5]. It is estimated that 10% of elderly community-dwelling residents and 15% to 25% of hospitalized patients in China experience major depressive disorder [7]. The prevalence of subthreshold depression (32.2%) and subthreshold anxiety (15.6%) among patients is twice as high as the prevalence in general elderly populations [5].
Vision impairment due to cataract has been significantly associated with depression and anxiety in older adults [8][9][10]. In a community-based survey of 4611 Chinese adults aged over 60 years using the 9-item Patient Health Questionnaire (PHQ-9) depression scale, adults with cataract had higher odds of having depressive symptoms than those without cataract [8]. A study of 662 individuals aged over 70 years in Australia using the Goldberg scales (GADS) found that anxiety and depression symptoms were associated with cataract [9]. Palagyi et al. demonstrated a high prevalence of depressive symptoms in older persons with cataract [10]. Several studies have examined the impact of cataract surgery on depression and anxiety [11][12][13]. However, few studies have evaluated the association of anxiety and depression with cataract in the Chinese population.
The Hospital Anxiety and Depression Scale (HADS) is a useful instrument for screening anxiety and depression [14]. The Chinese version of the HADS has been developed and validated previously [15,16]. So far, no study has used the HADS in Chinese cataract patients, and the suitability of the HADS measures in the Chinese cataract population remains an open question. The Rasch model is a psychometric method for assessing the reliability and validity of the scaling properties of an instrument [17][18][19]. Rasch validation of the HADS has proven useful in dry eye patients [20]. In the current study, Rasch analysis was used to validate the Chinese version of the HADS in cataract patients.

Study population
A sample of 275 participants over the age of 40 years with unilateral or bilateral cataract was recruited from the Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, Southern China between April 2016 and April 2017. Participants whose first eye had previously been operated on were excluded. All patients completed the Hospital Anxiety and Depression Scale (HADS) questionnaire and an additional questionnaire on their ophthalmic and demographic characteristics. Clinical information was collected by the examining ophthalmologists. Written informed consent was obtained from all participants. The study adhered to the Declaration of Helsinki and was conducted after obtaining ethical approval from the Zhongshan Ophthalmic Center Institutional Review Board.

HADS questionnaire
The HADS is a self-administered scale with 7 anxiety and 7 depression items, each rated on a scale from 0 to 3 [14]. The Chinese version of the HADS was used in the present study [15,16].

Rasch analysis
The Rasch measurement model was used to assess the construct validity of the HADS [21]. The Rasch model estimates a person's ability in relation to item difficulty, expressed in log odds units (logits) on a single continuum scale. For this analysis, participants with higher ability and items of greater difficulty were located on the negative side of the continuum scale and vice versa [22].
For Rasch analysis, a minimum sample size of 243 provides useful and stable estimates of item and person locations irrespective of scale targeting [23]. Rasch analysis was performed on the anxiety and depression subscales as well as on the HADS-Total score. The Winsteps program (Version 3.92.1, Winsteps, Beaverton, Oregon, USA) was used for Rasch analysis, with the Andrich rating scale model for HADS-Anxiety and the partial credit model for HADS-Depression and the HADS-Total score.
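To make the two polytomous models concrete, the following is a minimal sketch of how category probabilities can be computed for one person and one item under the Andrich rating scale model. It is not the Winsteps implementation. The function name and the numerical values are hypothetical, and the sketch uses the conventional orientation in which larger values of theta make higher scores more likely, which may differ from the orientation of the continuum described above.

```python
import math

def rsm_category_probs(theta, delta, taus):
    """Category probabilities for one person-item pair (rating scale model).

    theta: person measure in logits
    delta: item location in logits
    taus:  thresholds shared by all items, one per step between categories
    Returns len(taus) + 1 probabilities, for scores 0..len(taus).
    """
    # Cumulative sums of (theta - delta - tau_j); the score-0 term is 0.
    exponents = [0.0]
    for tau in taus:
        exponents.append(exponents[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: a 0-3 item with symmetric thresholds and a person
# located exactly at the item location.
probs = rsm_category_probs(theta=0.0, delta=0.0, taus=[-1.0, 0.0, 1.0])
```

Under the partial credit model the same calculation applies, except that each item is given its own list of thresholds instead of sharing one list across items.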
Category threshold order. The category probability curves were used to assess the threshold ordering of polytomous items. An ordered set of response thresholds for each item indicates that responses to the items are consistent with the metric estimate of the underlying construct. When thresholds are disordered, or two response categories on an item are difficult to discriminate, collapsing the categories into one response option can improve scale fit to the Rasch model [17,24].
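Collapsing categories amounts to a simple rescoring of the raw responses. The sketch below illustrates this with the 0-1-1-2 scoring structure that the Results report for item D14; the function name and the example responses are hypothetical.

```python
def collapse_categories(responses, mapping=(0, 1, 1, 2)):
    """Rescore raw responses using a lookup tuple indexed by the raw score."""
    rescored = []
    for r in responses:
        if not 0 <= r < len(mapping):
            raise ValueError(f"response {r} is outside 0..{len(mapping) - 1}")
        rescored.append(mapping[r])
    return rescored

# Hypothetical raw responses to one item scored 0-3.
raw = [0, 1, 2, 3, 2, 1]
rescored = collapse_categories(raw)
```

After rescoring, the item has three categories instead of four, so its maximum contribution to the subscale score drops from 3 to 2.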
Rasch model fit. The item fit statistics are expressed as infit and outfit mean square (MNSQ) statistics, which are based on the chi-square statistic; for infit, each observation is weighted by its statistical information (model variance). A range of 0.7 to 1.3 was used as the criterion of good fit [24,25].
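The two mean squares can be sketched as follows, assuming that the model-expected score and the model variance of each observation are already available from a fitted Rasch model. The function names and input values are hypothetical, and the code follows the standard textbook definitions rather than any specific Winsteps routine.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item.

    observed: observed scores of each person on the item
    expected: model-expected scores for the same persons
    variance: model variance (statistical information) of each observation
    """
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of the squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variance)) / len(sq_resid)
    # Infit: squared residuals weighted by the information of each observation.
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit

def within_fit_range(mnsq, low=0.7, high=1.3):
    """Apply the 0.7 to 1.3 criterion of good fit."""
    return low <= mnsq <= high

# Hypothetical data in which each squared residual equals its model variance,
# so both mean squares are exactly 1.
infit, outfit = fit_mean_squares([1, 0], [0.5, 0.5], [0.25, 0.25])
```

Values near 1 indicate that the observed variation matches what the model predicts; an infit of 1.44, as reported for item A7, falls outside the 0.7 to 1.3 range.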
The likelihood ratio test was used to compare the revised model with the original model. Winsteps reports global fit statistics, including an approximate global log-likelihood chi-squared statistic. The deviance statistic for comparing two models is the difference between the chi-squares of the two analyses, with degrees of freedom equal to the difference between the numbers of free parameters estimated.
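The deviance comparison can be illustrated with a short sketch. The function names and the first set of input values are hypothetical. To keep the example free of external libraries, the p-value uses the closed-form chi-square survival function, which holds only for even degrees of freedom; a statistics library would be needed for the general case.

```python
import math

def deviance_difference(chi2_original, n_params_original,
                        chi2_revised, n_params_revised):
    """Difference in global chi-squares and in free-parameter counts."""
    diff = abs(chi2_original - chi2_revised)
    df = abs(n_params_original - n_params_revised)
    return diff, df

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # sum over i = 0 .. df/2 - 1 of (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Hypothetical global chi-squares and parameter counts for two analyses.
diff, df = deviance_difference(1000.0, 30, 900.0, 28)
p_value = chi2_sf_even_df(diff, df)
```

Applied to the differences reported in the Results, `chi2_sf_even_df(452.4, 10)` and `chi2_sf_even_df(212.0, 2)` are both far below 0.001.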
Targeting. Targeting refers to how well the difficulty of the items matches the abilities of the study sample. The standard error of the person measure was used for the assessment, with cut-off points defined as follows: an error of 1-2 indicated fair targeting, <1 good targeting and <0.5 very good targeting [27].
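The targeting cut-off points can be expressed as a small helper. This is a sketch with a hypothetical function name; the label returned for errors above 2 is an assumption, since the text does not define a category for that range.

```python
def classify_targeting(person_measure_error):
    """Map the standard error of the person measure to a targeting label."""
    if person_measure_error < 0.5:
        return "very good"
    if person_measure_error < 1.0:
        return "good"
    if person_measure_error <= 2.0:
        return "fair"
    # Not defined in the text; assumed label for values above 2.
    return "outside listed ranges"

# Person measure errors reported in the Results for HADS-A and HADS-D.
labels = [classify_targeting(e) for e in (1.22, 0.93)]
```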
Differential item functioning (DIF). Uniform and non-uniform DIF analyses were performed to identify significant differences in item responses between subgroups defined by demographic characteristics. We assessed DIF by age (≤ 70; > 70), gender and education (primary school and lower; junior school and higher). DIF differences are presented. Notable DIF was defined as a difference >1.0 logits [28].
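The DIF criterion can be sketched as a contrast between the difficulty estimates of the same item in two subgroups. The function names are hypothetical, and comparing the absolute value of the contrast with the cut-off is an assumption made here because the Results flag negative contrasts as well as positive ones.

```python
def dif_contrast(difficulty_group_a, difficulty_group_b):
    """Difference between an item's difficulty estimates in two subgroups (logits)."""
    return difficulty_group_a - difficulty_group_b

def is_notable_dif(contrast, cutoff=1.0):
    """Flag a contrast whose magnitude exceeds the cut-off in logits."""
    return abs(contrast) > cutoff

# DIF differences reported in the Results for items A9 and A11.
flags = [is_notable_dif(c) for c in (1.14, -1.20)]
```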
Local dependency. Local dependency was identified when paired standardized residual correlations between items exceeded 0.30. If local dependency occurs, it is recommended that the dependent items be combined into a single item [29].
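A minimal sketch of the local dependency check is shown below, assuming the standardized residuals of each item are available as equal-length lists (one value per person). The function names and the residual values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag_dependent_pairs(residuals, cutoff=0.30):
    """Return (item_a, item_b, r) for item pairs whose residual correlation exceeds the cut-off."""
    flagged = []
    for a, b in combinations(sorted(residuals), 2):
        r = pearson(residuals[a], residuals[b])
        if r > cutoff:
            flagged.append((a, b, r))
    return flagged

# Hypothetical standardized residuals for three items and four persons.
residuals = {
    "A": [1.0, -1.0, 1.0, -1.0],
    "B": [1.0, -1.0, 1.0, -1.0],
    "C": [1.0, 1.0, -1.0, -1.0],
}
flagged = flag_dependent_pairs(residuals)
```

In this toy example only the pair A and B is flagged, because their residuals move together while item C is uncorrelated with both.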

Measurement precision assessed by person separation reliability (PSR) and separation index (PSI).
Person separation is used to classify people. Low person separation with a relevant person sample implies that the instrument may not be sensitive enough to distinguish between high and low performers, and more items may be needed. Reliability means reproducibility of relative measure location. A PSR of ≥0.80 (PSI ≥2.00) indicates that the instrument can separate the study population into 2-3 levels of disability [30].
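The correspondence between a PSR of 0.80 and a PSI of 2.00 follows from the usual relationship between reliability and separation. The sketch below uses hypothetical function names; the strata formula is a common Rasch convention for the number of statistically distinct levels and is included here as an assumption, since the text does not state it.

```python
import math

def separation_from_reliability(reliability):
    """Separation index G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))

def reliability_from_separation(separation):
    """Reliability R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)

def strata(separation):
    """Number of distinct levels, (4G + 1) / 3 (conventional formula, assumed)."""
    return (4.0 * separation + 1.0) / 3.0

psi_at_threshold = separation_from_reliability(0.80)
psr_for_modified_hads_a = reliability_from_separation(1.95)
```

With these formulas a PSI of 1.95, as reported for the modified HADS-A, corresponds to a PSR of about 0.79, in agreement with the Results.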
Unidimensionality. Unidimensionality of the Rasch model was assessed by independent t-tests for each person. To indicate a unidimensional scale, the percentage of such tests falling outside the ± 1.96 range should be less than 5% [24,31].
In principal component analysis (PCA) of the residuals, an explained raw variance of 60% is taken as the criterion for unidimensionality [25,28]. An eigenvalue of the first contrast in the residuals > 2.0 indicates that a second construct is being measured [26].

Results
The demographic characteristics of the study population
A total of 275 people participated in the study (Table 1). The mean age (SD) of participants was 70.5 (11.1) years. Of the participants, 38.9% were male and 48.5% had a low level of education.

Rasch analysis of HADS-Anxiety
Analysis of the initial HADS-Anxiety (HADS-A) items showed that item A7 ("I can sit at ease and feel relaxed") misfit the Rasch model (infit MNSQ = 1.44 and outfit MNSQ = 1.37), falling outside the acceptable MNSQ range of 0.7 to 1.3. Removal of item A7 significantly improved the model fit compared with the initial model by the likelihood ratio test (χ²(10) = 452.4, P<0.001) (Table 2). The modified HADS-A exhibited a PSI of 1.95 and a PSR of 0.79, suggesting good discriminant ability of the questionnaire. The instrument appeared to be on target, although the person measure error of 1.22 was slightly higher than the suggested value. In the PCA of the residuals, 61.0% of the raw variance was explained, and 5.57% of the t-tests were significant, indicating unidimensionality. The unexplained variance in the 1st contrast was 1.57 eigenvalue units, showing no evidence of another latent trait captured by the scale. The category probability curves for HADS-A revealed that all items of the subscale had ordered thresholds (Fig 1). However, person ability, which was expected to be normally distributed, showed a skewed distribution (Fig 2), which may have affected the fit statistics. No local dependency was detected, with all paired standardized residual correlations <0.30. Table 3 shows the individual item fit statistics and scoring structure for the modified HADS-A. All items of HADS-A were free from uniform DIF (S1 Table). Notable non-uniform DIF was detected on items A9 ("Get a sort of frightened feeling like 'butterflies' in the stomach") and A11 ("Feel restless as I have to be on the move") by education subgroups in the high-ability group, with DIF differences of 1.14 and -1.20 respectively. This indicated that, in the high-ability group, item A9 was more difficult for people with an education of primary school or lower than for those with higher education, while item A11 was more difficult for people with higher education (S2 Table).

Rasch analysis of HADS-Depression
Respondents may have had difficulty discriminating between the response categories "Sometimes" and "Not often" on item D14 ("Enjoy book or radio or TV"). We therefore rescored item D14 by collapsing the 2nd and 3rd categories (new scoring structure 0-1-1-2, Table 4). The modified model provided a better fit to the data than the original model (χ²(2) = 212.0, P<0.001) ( of the questionnaire. The instrument appeared to be well targeted, with a person measure error of 0.93. In the PCA of the residuals, 61.9% of the raw variance was explained, and 5.03% of the t-tests were significant, indicating that no multidimensionality appeared in the scale. The unexplained variance in the 1st contrast was 1.55 eigenvalue units. All items of the subscale had ordered thresholds (figure not shown). Table 4 shows the individual item fit statistics and scoring structure for the modified HADS-D. Fig 3 shows a slightly skewed person-item location distribution. No local dependency was detected, with all paired standardized residual correlations <0.30. All items of HADS-D were free from both uniform and non-uniform DIF (S1 and S2 Tables).

Rasch analysis of HADS-Total
The analysis of the initial HADS-Total (HADS-T) fit started with 13 items from the modified HADS-A and HADS-D subscales. Item D14 was rescored in the same way as it was in  30. Table 5 shows the individual item fit statistics and scoring structure for the modified HADS-T version. Fig 4 shows a slightly skewed person-item location distribution for HADS-T. All items of HADS-T were free from uniform DIF (S1 Table). Notable non-uniform DIF was detected on items A1 ("Feel tense or 'wound up'") and D14 ("Enjoy book or radio or TV") by sex and education subgroups in the high-ability group. Item A1 was more difficult for females (DIF contrast = -1.17) and people with higher education (DIF contrast = -1.10) in the high-ability group. Item D14 was more difficult for males (DIF contrast = 1.12) and people with higher education (DIF contrast = -1.64) in the high-ability group (S2 Table).

Modified cut-off points
Table 6 shows the relationship between cut-off points on the original scale and on the revised scale. Equating tests gave new upper and lower cut-off points of 9 and 6 for the modified HADS-A, and of 10 and 6 for HADS-D (Figs 5 and 6). Table 6 shows that the original HADS-A overestimated the prevalence of borderline abnormal anxiety. The equating scores for the HADS-D subscale showed that rescoring D14 had no effect on the prevalence of any of the three levels of depression.

Discussion
Depression and/or anxiety among cataract patients in different countries have been assessed using diverse screening instruments, including the HADS in a study in Japan [32], the PHQ-9 in China [8], the GADS [9] and the Center for Epidemiologic Studies Depression Scale (CESD) [12] in Australia, and the Geriatric Depression Scale (GDS) in Canada [33]. The HADS has also been used in patients with other ocular diseases such as glaucoma, AMD, dry eye and ptosis [34]. All of these instruments are self-administered, and their contents are close to everyday activities and speech, but the HADS is shorter than the GADS, CESD and GDS, and it evaluates anxiety and depression in two separate parts. Our results demonstrate that the HADS is a unidimensional, reliable and valid instrument for assessing anxiety and depression in the Chinese cataract population. As indicated by Rasch analysis, the standard 7-item anxiety and depression subscales should be modified for use in cataract patients. The modified HADS subscales had ordered thresholds and there was no evidence of large DIF.
We found that item A7 ("I can sit at ease and feel relaxed") misfit the Rasch model and that its removal provided a better fit for HADS-A, as previous studies have suggested [17,19]. Previous studies have found that item A7 either loaded on the HADS-D subscale or was complex, with some analyses showing higher loading on the HADS-D subscale [19,35]. This may be because item A7 is positively worded, in line with the positively worded items of the HADS-D subscale [19,36]. For HADS-D, we rescored item D14 ("Enjoy a good book or radio or TV program") (0-1-1-2) to improve the model fit. Traditional factor analysis has found item D12 ("I look forward with enjoyment to things") to load highly on the factor corresponding to the HADS-D subscale [19]. However, in our study, D12 was found to load strongly with the anxiety items, and removal of item D12 may be reasonable from a clinical perspective. The level of reliability of the modified subscales makes them suitable for estimating anxiety and depression states in Chinese cataract patients. We therefore ascertained new cut-off points for the modified version of the HADS, which may help clinicians make clinical decisions. Our results showed that the original cut-off points of the anxiety and depression subscales misestimated anxiety and depression states for cataract patients in China. Using the revised cut-off points, the proportions of anxiety and depression among cataract patients increased. The current study demonstrated no uniform DIF and no non-uniform DIF for the majority of the items, indicating that the measures were not affected by item bias related to age, sex or education.
However, items A9 ("Get a sort of frightened feeling like 'butterflies' in the stomach") and A11 ("Feel restless as I have to be on the move") for HADS-A, and items A1 ("Feel tense or 'wound up'") and D14 ("Enjoy book or radio or TV") for HADS-T, showed notable non-uniform DIF between the lower and higher education subgroups at the high-ability level. It is possible that people with lower education are more likely to experience nervous tension or difficulty with reading activities. In addition, item A1 was more likely to be endorsed by males. One study showed that the reduction in vision-related emotional well-being was significantly greater in men than in women [37]. Item D14 was more likely to be endorsed by females. It is possible that these activities are more commonly performed by females in China.
There were some limitations in this study. First, the study sample was recruited from a single hospital in Southern China and is not completely representative of the general population in China. Second, although the HADS is easy and convenient for study purposes, it is not comparable with a formal psychiatric diagnosis of depression or anxiety.
In conclusion, the modified version of the HADS has been shown to be a reliable and valid instrument and is useful for assessing anxiety and depression in the Chinese cataract population.

Supporting information
S1 Table. Uniform differential item functioning (DIF) assessed by age, sex, education. (DIF differences >1.0 logits are shown in bold to indicate uniform DIF.) (PDF)
S2 Table. Non-uniform differential item functioning (NUDIF) assessed by age, sex, education. (NUDIF differences >1.0 logits are shown in bold to indicate non-uniform DIF.) (PDF)