Psychometric and diagnostic properties of the Taiwan version of the Quick Mild Cognitive Impairment screen

There is a need for a screening tool that can accurately detect early mild cognitive impairment (MCI) and dementia and that is suitable for use in a range of languages and cultural contexts. This research aims to evaluate the psychometric and diagnostic properties of the Taiwan version of the Qmci (Qmci-TW) screen and to explore the ability of the Qmci-TW to differentiate among normal controls (NCs), MCI and dementia. Thirty-one participants with dementia, 36 with MCI, and 35 NCs were recruited from the neurology department of a regional hospital in Taiwan. Their results on the Qmci-TW, the Taiwanese version of the Montreal Cognitive Assessment (MoCA), and the Traditional Chinese version of the Mini–Mental State Examination (MMSE) were compared. For analysis, we used Cronbach’s α, intraclass correlation coefficient, Spearman’s ρ, Kruskal–Wallis test, receiver operating characteristic curve analysis, and multivariate analysis, as appropriate. The Qmci-TW exhibited satisfactory test–retest reliability, internal consistency, and interrater reliability as well as a strong positive correlation with results from the MoCA and MMSE. The optimal cut-off score on the Qmci-TW for differentiating MCI from NC was ≤ 51.5/100, and that for differentiating dementia from MCI was ≤ 31/100. The MoCA exhibited the highest accuracy in differentiating MCI from NC, followed by the Qmci-TW and then the MMSE, whereas the Qmci-TW and MMSE exhibited the same accuracy in differentiating dementia from MCI, followed by the MoCA. The Qmci-TW may be a useful clinical screening tool for a spectrum of cognitive impairments.


Introduction
The number of people aged older than 65 years is increasing worldwide [1]. The population of older people with dementia is expected to increase concurrently with global aging [2,3]. The reported proportion of individuals with mild cognitive impairment (MCI) is expected to be even higher than that of those with dementia [4,5]; however, in clinical practice, most MCI cases in older adults remain unidentified. Although people with MCI are completely capable of self-
PLOS ONE | https://doi.org/10.1371/journal.pone.0207851 December 3, 2018
significant difference between the instruments and their ability to differentiate MCI from NCs, at a significance level of 0.05 and power of 80% [20,21]. Participants were categorized into three groups: MCI, dementia, and NCs. Between May 2017 and December 2017, participants were recruited consecutively from the neurology outpatient department of a regional hospital in New Taipei, Taiwan. NCs without subjective or objective cognitive problems were recruited through convenience sampling. Only individuals aged ≥ 65 years who were able to follow instructions and understand the content of the assessments through verbal communication were eligible for participation. Dementia (Alzheimer's disease or vascular or mixed dementia subtypes) and amnestic-type MCI were diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition [27], the National Institute of Neurological and Communicative Diseases and Stroke and the Alzheimer's Disease and Related Disorders Association [28] criteria, and the National Institute on Aging-Alzheimer's Association workgroup's diagnostic guidelines for Alzheimer's disease [29], as applicable. The participants with MCI and dementia were classified using Clinical Dementia Rating scale global scores of 0.5 and 1-3, respectively [6].
Participants were excluded if they scored > 7 on the Geriatric Depression Scale-Short Form, indicating depression [30], or if they were diagnosed with other MCI or dementia subtypes, including frontotemporal dementia, Parkinson's disease, or Lewy Body dementia, which present infrequently, typically with exaggerated functional deficits and different MCI syndromes.
The study protocol for the protection of human participants and the consent procedure were approved by the Institutional Review Board of the Taipei Hospital, Ministry of Health and Welfare (TH-IRB-0016-0033). Before the study, the purpose and procedure of the research were explained to all participants. NCs signed informed consent forms by themselves, whereas participants with MCI or dementia signed informed consent forms along with their legal guardians. The participants' background information was kept confidential and was used only for research purposes.

Data collection
In this cross-sectional study, demographic data, including age in years, gender, and years of education, were collected. The same trained rater, blinded to the final diagnosis, administered the Qmci-TW, MoCA, MMSE, Barthel Index [31] and Lawton Instrumental Activities of Daily Living scale [32] in a randomly alternated order on the same day. Additionally, after 2 weeks, the Qmci-TW, with alternative versions [33], was readministered to randomly selected participants by two trained raters, blinded to the final diagnosis.
In total, 119 participants were recruited, but 17 participants who scored > 7 on the Geriatric Depression Scale-Short Form were excluded. The remaining 102 participants (47 men and 55 women) were included for further study. Among the final sampled participants, 35 (34.3%) were NCs, 36 (35.3%) had received an MCI diagnosis, and 31 (30.4%) had received a diagnosis of dementia. Participants with dementia, classified using Clinical Dementia Rating scale global scores, were grouped according to mild (n = 12), moderate (n = 13), and severe dementia (n = 6). The NCs (p < 0.001) and participants with MCI (p = 0.005) were significantly younger than those with dementia. NCs had received significantly more education than those with MCI (p = 0.007) or dementia (p < 0.001). The mean Geriatric Depression Scale-Short Form score among NCs was significantly lower than that among participants with dementia (p < 0.001). The demographic characteristics of the participants are presented in Table 1.

Instruments
The Qmci-TW is a performance test, containing six subtests, namely orientation, registration, clock drawing, delayed recall, verbal fluency, and logical memory. The Qmci can be administered and scored with a median time of under 5 minutes, and the alternative word groups and versions of the registration and recall task and the verbal fluency and logical memory subtests in the Qmci have been validated for convenience [33]. We translated the Qmci into the Qmci-TW following the established translation guidelines [34]. The Qmci-TW was administered and scored using the test manual's instructions. The Qmci-TW scores ranged from 0 to 100, with a higher score indicating greater cognitive function.
The MMSE is also a performance test, standardized and validated to measure cognitive functions in orientation, registration, attention, calculation, recall, language, and copying. MMSE scores range from 0 to 30, with a higher score indicating greater cognitive function [13].
The MoCA is a standardized and validated tool designed to measure cognitive functions in visuospatial and executive tasks, naming, memory, attention, language, abstraction, delayed recall, and orientation. In MoCA scoring, 1 point is added for individuals with ≤ 12 years of education; scores range from 0 to 30, with a higher score indicating greater cognitive function [14].
The Barthel Index is a validated tool designed to measure activities of daily living independence, specifically regarding feeding, bathing, grooming, dressing, bowels, bladder, toilet use, transfers, mobility, and stairs. Barthel Index scores range from 0 to 20, with a higher score indicating greater independence [31].
The Lawton Instrumental Activities of Daily Living scale is a validated tool designed to measure instrumental activities of daily living independence, specifically with regard to telephone use, shopping, food preparation, housekeeping, laundry, transport mode, medication responsibility, and finance-handling ability. The scores range from 0 to 8, with a higher score indicating greater independence in complex activities of daily living [32].

Statistical analysis
Statistical analysis was performed using IBM SPSS Statistics software, version 19.0 (IBM Corporation, Somers, NY, U.S.A.) and R 3.4.3 software (R Foundation for Statistical Computing, Vienna, Austria). The Shapiro-Wilk test was used to test for normality, and the results indicated that most data were not normally distributed. Continuous variables are presented as means ± standard deviations and categorical variables as frequencies and percentages. Missing values were scored as zero, following the manual and scoring guidelines of each assessment. For the Qmci-TW, we used Cronbach's α for internal consistency, intraclass correlation coefficients (ICCs) for test-retest and interrater reliability, and Spearman's ρ for concurrent and construct validity [35]. Between-group comparisons were made using the Kruskal-Wallis test with post hoc Dunn's test, and discriminating ability was assessed using receiver operating characteristic (ROC) curves and the area under the curve (AUC) [35].
Floor and ceiling effects. The frequencies of the lowest and highest raw scores for a subtest were calculated as the floor and ceiling, respectively. Floor and ceiling effects were considered significant if they were exhibited in more than 20% of the sample [36]. A ceiling effect indicated that a subtest was too easy for the study population, and a floor effect indicated that it was too difficult. A subtest with ceiling or floor effects is considered non-sensitive and thus unsuitable for discriminating between groups [37].
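The 20% criterion for floor and ceiling effects can be illustrated with a minimal Python sketch. The function name, the hypothetical 0-10 subtest, and the example scores are assumptions made for illustration only; they are not study data, and the original analysis was run in SPSS and R.

```python
def floor_ceiling_effects(scores, min_score, max_score, threshold=0.20):
    """Proportion of participants at the lowest and highest possible score.

    An effect is flagged when that proportion exceeds the threshold
    (more than 20% of the sample, following the criterion in the text).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor_prop = sum(1 for s in scores if s == min_score) / n
    ceiling_prop = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_proportion": floor_prop,
        "ceiling_proportion": ceiling_prop,
        "floor_effect": floor_prop > threshold,
        "ceiling_effect": ceiling_prop > threshold,
    }


# Hypothetical subtest scored 0-10 (illustrative values, not study data).
example_scores = [10, 10, 10, 9, 8, 10, 7, 10, 0, 6]
result = floor_ceiling_effects(example_scores, min_score=0, max_score=10)
print(result)
```

In this made-up sample, half of the participants reach the maximum score, so the subtest would be flagged for a ceiling effect but not for a floor effect.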
Internal consistency. Cronbach's α coefficient was used to examine the internal consistency of the Qmci-TW screen subtests. This coefficient estimates the reliability of an instrument according to the consistency of the subtests, accounting for the number of subtests. Cronbach's α > 0.7 was considered acceptable for internal consistency [37]. Inter-item correlation was also used to examine the correlations between the subtests of the Qmci-TW screen.
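As a rough illustration of how Cronbach's α accounts for both the consistency and the number of subtests, the following Python sketch computes α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the three short score lists are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, one per subtest, each holding the scores
    of the same n participants in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two subtests are required")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all subtests must have the same number of scores")
    # Total score per participant across all subtests.
    totals = [sum(col[i] for col in items) for i in range(n)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for three subtests and five participants.
subtests = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(subtests)
print(round(alpha, 3))
```

Under the criterion stated above, a value of α above 0.7 would be read as acceptable internal consistency.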
Test-retest reliability and interrater reliability. The ICCs were used to examine the test-retest and interrater reliability of the Qmci-TW screen. ICCs of 0.75-1.00 indicated excellent reliability [38].
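The text does not state which ICC form was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)) from the usual ANOVA mean squares. The choice of this form, the function name, and the example ratings are all assumptions; the numbers are not study data.

```python
def icc_2_1(ratings):
    """ratings: n rows (participants), each with k scores (raters or sessions).

    Returns ICC(2,1): two-way random effects, absolute agreement,
    single measurement.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 participants and >= 2 equal-length columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares for participants (rows) and raters/sessions (columns).
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    # Residual mean square from the two-way layout.
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical total scores from two raters for five participants.
two_raters = [[60, 62], [45, 44], [30, 33], [72, 70], [55, 55]]
icc = icc_2_1(two_raters)
print(round(icc, 3))
```

With the 0.75-1.00 band given above, the value for this made-up example would count as excellent reliability. A different ICC form (for example, consistency rather than absolute agreement) would give a different value from the same data.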
Concurrent and construct validity. The Spearman's correlation coefficients were estimated between the Qmci-TW screen and the other validated cognitive screening instruments, MoCA and MMSE, to determine concurrent validity, and between Qmci-TW and the validated activities of daily living and instrumental activities of daily living assessments, Barthel Index and Lawton Instrumental Activities of Daily Living scale, to determine construct validity. The Spearman's ρ of 0.4-0.8 indicated an acceptable validity [39].
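Spearman's ρ is the Pearson correlation of the ranked scores, with tied scores given their average rank. The following Python sketch shows that computation. The helper names and the paired scores are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired total scores on two cognitive screens.
screen_a = [72, 65, 58, 58, 40, 31]
screen_b = [27, 25, 22, 20, 15, 15]
rho = spearman_rho(screen_a, screen_b)
print(round(rho, 3))
```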
Sensitivity, specificity, and predictive values. The ROC curve analysis was used to calculate diagnostic accuracy based on the AUC. AUC values ≥ 0.8 and ≥ 0.9 represented good and excellent discriminating power, respectively. The optimal cut-off scores were derived using Youden's Index [40]. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated based on the optimal cut-off scores.
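The steps above can be sketched in Python for a screen on which lower scores indicate impairment, so that a score at or below the cut-off counts as a positive result. The function names and the two score lists are hypothetical. Candidate cut-offs are restricted to observed scores, which is a simplification: statistical software may instead report midpoints between adjacent scores.

```python
def auc_lower_is_positive(impaired, controls):
    """AUC as the probability that a randomly chosen impaired participant
    scores lower than a randomly chosen control (ties count one half)."""
    wins = 0.0
    for p in impaired:
        for c in controls:
            if p < c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(impaired) * len(controls))


def best_cutoff_youden(impaired, controls):
    """Return the observed score maximising Youden's J = sens + spec - 1,
    with sensitivity, specificity, PPV, and NPV at that cut-off."""
    best = None
    for cutoff in sorted(set(impaired) | set(controls)):
        tp = sum(1 for s in impaired if s <= cutoff)
        fn = len(impaired) - tp
        fp = sum(1 for s in controls if s <= cutoff)
        tn = len(controls) - fp
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if best is None or j > best["youden_j"]:
            best = {
                "cutoff": cutoff,
                "youden_j": j,
                "sensitivity": sens,
                "specificity": spec,
                # Predictive values are undefined if no one tests positive/negative.
                "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
                "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
            }
    return best


# Hypothetical total scores (illustrative values, not study data).
impaired_scores = [38, 42, 45, 47, 50, 53]
control_scores = [52, 55, 58, 61, 63, 66, 68, 70]
auc = auc_lower_is_positive(impaired_scores, control_scores)
best = best_cutoff_youden(impaired_scores, control_scores)
print(round(auc, 3), best)
```

Note that PPV and NPV computed this way depend on the proportion of impaired participants in the sample, so they do not transfer directly to settings with a different prevalence.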
To compare the predictive power among the three key instruments, Qmci-TW, MMSE, and MoCA, multivariate analysis was conducted by fitting two logistic regression models of (1) MCI or dementia versus NC in all subjects (n = 102) and (2) dementia versus MCI in the subjects with MCI or dementia (n = 67). The goal of regression analysis was to find one or a few parsimonious regression models that fitted the observed data well for effect estimation and/or outcome prediction. To ensure a good quality of analysis, the model-fitting techniques for (1) variable selection, (2) goodness-of-fit assessment, and (3) regression diagnostics and remedies were used in our logistic regression analyses. Specifically, the stepwise variable selection procedure (with iterations between the forward and backward steps) was applied to obtain the best candidate final logistic regression model using the My.stepwise package of R [41]. As listed in Tables 1 and 2, all covariates, whether significant or non-significant in univariate analysis, including age, gender, education level, and so on, were placed on the list of variables to be selected. Simple and multiple generalized additive models were fitted to detect nonlinear effects of continuous covariates and identify appropriate cut-off points for discretizing continuous covariates, if necessary, during the stepwise variable selection procedure. The significance levels for entry and for stay were set to 0.15 to be conservative. Then, with the aid of substantive knowledge, the best candidate final logistic regression model was identified manually by dropping the covariates with p value > 0.05 one at a time until all regression coefficients were significantly different from 0. Moreover, the estimated area under the ROC curve (also called the c statistic) and the Hosmer-Lemeshow goodness-of-fit test were examined to assess the goodness-of-fit of the fitted logistic regression model.
Finally, the statistical tools of regression diagnostics for residual analysis, detection of influential cases, and check of multicollinearity were applied to discover any model or data problems.

Results
The total scores and most subtest scores of the Qmci-TW, MoCA, and MMSE were significantly higher in the NCs than in the MCI and dementia groups, except for memory of the MoCA and registration, recall, naming, repetition, reading comprehension, verbal and executive function, and construction of the MMSE. Moreover, most of these scores were significantly higher in the MCI group than in the dementia group, except for repetition, abstraction, and orientation of the MoCA and registration, naming, repetition, and verbal and executive function of the MMSE. The Barthel Index and Lawton Instrumental Activities of Daily Living scale scores of the NCs and participants with MCI were significantly higher than those of participants with dementia (all p's < 0.001). The clinical characteristics of the participants are presented in Table 2.
Although some of the demographic and clinical characteristics, including age, educational level, Geriatric Depression Scale-Short Form score, and so on, differed significantly among the NC, MCI, and dementia groups (Tables 1 and 2), logistic regression analysis of MCI or dementia versus NC in all subjects (n = 102) revealed that, after competing with the other covariates during the stepwise variable selection procedure, the MoCA score, food preparation score of the Lawton IADL scale in the past, and calculation score of the MMSE stayed in the final logistic regression model as the most important statistically significant predictors (Table 3). Both the estimated area under the ROC curve (0.99) and the modified Hosmer and Lemeshow goodness-of-fit F test (p = 0.7479; df = 9, 92) indicated an excellent fit.
Next, logistic regression analysis of dementia versus MCI in the subjects with MCI or dementia (n = 67) revealed that, after competing with the other covariates during the stepwise variable selection procedure, the orientation score of the Clinical Dementia Rating scale stayed in the final logistic regression model as the most important statistically significant predictor (Table 4). Both the estimated area under the ROC curve (0.984) and the modified Hosmer and Lemeshow goodness-of-fit F test (p = 0.7378; df = 9, 57) also indicated an excellent fit.

Psychometric properties
Internal consistency. The internal consistency of the Qmci-TW screen was good, with a Cronbach's α of 0.85, and the item-to-total correlations were questionable to good, with Cronbach's α of 0.67-0.80. Significant, strong positive correlations (r = 0.53-0.76, all p's < 0.001) were found between each pair of Qmci-TW screen subtests. The results for internal consistency and inter-item correlation of the Qmci-TW screen are presented in Table 5.

Validity
Concurrent and construct validity. The correlation of the Qmci-TW with the MoCA and MMSE for concurrent validity was positive and very strong (ρ = 0.93 and 0.91, respectively; both p < 0.001). The correlation of the Qmci-TW was positive and moderate with the Barthel

Discussion
This study demonstrated that the Qmci-TW is a reliable and valid cognitive screening instrument potentially useful for differentiating among NCs and individuals with MCI and dementia. The Qmci-TW exhibited internal consistency, excellent test-retest reliability, and interrater reliability for clinical use. The Qmci-TW also exhibited sound concurrent and construct validity in comparisons with the MoCA and MMSE and with the Barthel Index and Lawton Instrumental Activities of Daily Living scale, respectively. In addition, evaluation criteria for optimal cognitive screening instruments should also be considered, including the bandwidth-fidelity tradeoff, cultural fairness, economic considerations, and scope of application. The reliability of the Qmci-TW, with its slightly narrow band, is still satisfactory, and the Qmci-TW was validated without cultural bias in the Taiwanese population. Moreover, the Qmci-TW may be less subject to the floor and ceiling effects seen in the other tests when differentiating among NCs and individuals with MCI and dementia. Hence, the Qmci-TW may be preferable, particularly in patients with varied levels of cognitive function. The Qmci-TW represents the third Qmci screen translation, after the Dutch (Qmci-D) [22] and Turkish (Qmci-TR) versions [20]; the confirmation of its psychometric validity contributes to the growing evidence supporting clinical use of the Qmci.
Our results indicate that the MoCA is the most accurate test for differentiating MCI cases from NCs, followed by the Qmci-TW and the MMSE. Because the MoCA has high sensitivity, a negative test result helps rule out MCI, whereas because the Qmci-TW and MMSE have high specificity, a positive test result helps confirm a diagnosis of MCI. In addition, the Qmci-TW and MMSE were found to be more accurate than the MoCA in differentiating dementia cases from MCI cases. Because the Qmci-TW has high sensitivity, a negative test result helps rule out dementia, whereas because the MoCA and MMSE have high specificity, a positive test result helps confirm a diagnosis of dementia. According to these results, we recommend the use of the MoCA and Qmci-TW for detecting MCI and dementia, respectively.
As listed in Table 1, our univariate analyses indicated that age, educational level, and the Geriatric Depression Scale-Short Form score differed significantly among the NC, MCI, and dementia groups. Specifically, older age, lower educational level, and higher incidence of depression were more likely to be observed in the dementia group. These findings are similar to those of a previous study, which demonstrated that depressed mood, illiteracy, and older age were associated with dementia [42]. Next, multivariate analysis was conducted to assess the partial effects of all the relevant covariates in Tables 1 and 2. As shown in Tables 3 and 4, among all subjects, the lower the MoCA score, the food preparation score of the Lawton IADL scale in the past, and the calculation score of the MMSE, the more likely a subject was to have MCI or dementia, whereas among subjects with MCI or dementia, the higher the orientation score of the Clinical Dementia Rating scale, the more likely a subject was to have dementia. These findings not only could support prediction in screening but also shed light on the complementary roles of the MoCA, Lawton IADL scale, MMSE, and Clinical Dementia Rating scale. Yet, the Qmci-TW did not add much to them in discriminating among NC, MCI, and dementia.
The current results indicated that the MoCA subtests exhibited the most floor effects, followed by the subtests of the MMSE and Qmci-TW. The MMSE subtests exhibited the most ceiling effects, followed by the subtests of the MoCA and Qmci-TW. The Qmci-TW facilitated accurate evaluation of a wide range of cognitive functions with minimal floor and ceiling effects and thus was superior to the MoCA and MMSE in this respect. In addition, S1 Fig illustrates the ROC curves of the Qmci-TW subtests for differentiating MCI from NC and dementia from MCI. The best Qmci-TW subtest indicators for differentiating participants with MCI from NCs and participants with dementia from those with MCI were delayed recall and orientation, respectively. Our results are similar to those of previous studies demonstrating that orientation is a poorer predictor of MCI, with significant ceiling effects [43], and that logical memory is highly sensitive and specific for differentiating participants with MCI from NCs [33].
In this study, 12.7% of all participants (4 and 9 patients with MCI and dementia, respectively) were illiterate. The optimal cut-off score for differentiating participants with MCI from NCs on the Qmci-TW (≤ 51.5/100) was lower than that on the Qmci (≤ 60/100) [18,19], which is potentially attributable to the low levels of education and high levels of illiteracy in Taiwan's older population. The optimal cut-off score for discriminating participants with MCI from those with dementia on the Qmci-TW (≤ 31/100) was much lower than that on the Qmci-TR (< 43/100) [20] and the Qmci-D (≤ 42/100) [22], which is explained by the higher proportion of patients with severe dementia in the Taiwanese population.
This study revealed that patients' abilities to execute complex instrumental activities of daily living may be a critical factor for differentiating MCI cases from NCs; however, Lawton Instrumental Activities of Daily Living scale scores in the NC group were not significantly higher than those in the MCI group (p = 0.07). In the MCI group, all participants maintained instrumental activities of daily living functions related to telephone use, housekeeping, transport mode, and finances; however, with regard to the instrumental activities of daily living functions of food preparation, medication responsibility, shopping, and laundry, the numbers of participants with MCI scoring 0 were 16 (44.4%), 12 (33.3%), 11 (30.6%), and 8 (22.2%), respectively. Maintenance of activities of daily living is a critical factor for distinguishing between individuals with dementia and those with MCI and for differentiating between mild and severe dementia. The Barthel Index scores were significantly higher in the MCI group than in the dementia group (p < 0.001). In addition, the Barthel Index scores exhibited a significant, moderate negative correlation with Clinical Dementia Rating scale global scores of 1-3 (ρ = −0.64, p < 0.001).
The key contribution of this study is showing that the Qmci-TW, with satisfactory psychometric and diagnostic properties, is a useful and brief cognitive screening instrument for differentiating among NCs, MCI and dementia. Notably, the logical memory subtest, which is included only in the Qmci-TW and not in the MMSE or MoCA, plays an important role in discriminating MCI from NC. Moreover, the Qmci-TW, with time limits for answering each question, can enhance the ability to detect patients with MCI and dementia by taking response speed into consideration [44] and can save much time for clinical practitioners. For early detection and treatment of patients with MCI and dementia, we recommend the use of the Qmci-TW in busy clinical settings.
This study has several limitations. First, the potential confounders of age and education may have caused the differences in the AUCs. This limitation is associated with the inherent challenges of a cross-sectional study design, which does not allow for adequate matching between groups. Second, the sample size was small and lacked statistical power for evaluating accuracy within each participant group. Although sample size is a limitation, the statistically significant findings still deserve attention, whereas inferences from the statistically non-significant findings should be conservative because of the lack of power. Third, participants with active depression, who may exhibit slower reaction times and processing speeds [45], and patients with other dementia subtypes, including frontotemporal dementia, Parkinson's disease, and Lewy Body dementia, and with different MCI syndromes were excluded from this study. These exclusions may have led to some spectrum bias; consequently, our results may not be generalizable to other types of MCI and dementia. Fourth, because the Qmci-TW, MoCA, and MMSE differ in their conceptual constructs, with non-identical cognitive domains, numbers of items, and scoring criteria, the results of the pure inter-correlation analysis need to be interpreted carefully [46].
In conclusion, the Qmci-TW is a reliable and valid cognitive screening instrument with accurate diagnostic properties for detecting MCI and dementia in Taiwanese individuals. Further research including age- and education-matched NCs, larger sample sizes, younger adults, and other settings, such as psychiatry and general practice clinics, is required. Nevertheless, this study provides evidence that the Qmci and Qmci-TW are useful for cognitive screening in clinical practice.
Supporting information S1 Table. The overview of