Validation of the Rowland Universal Dementia Assessment Scale (RUDAS) to detect major neurocognitive disorder among elderly people in Ethiopia, 2020

Background. The Rowland Universal Dementia Assessment Scale (RUDAS) is widely used for research and clinical purposes in many countries. However, its applicability and validity have not yet been evaluated in the Ethiopian context. We therefore designed this study to assess the reliability and validity of RUDAS for detecting major neurocognitive disorder among older people in Ethiopia.
Methods. An institution-based cross-sectional study was conducted among selected older people residing in the Macedonia institutional care centre, Addis Ababa, Ethiopia. The gold standard diagnosis was determined using the Diagnostic and Statistical Manual of Mental Disorders criteria for major neurocognitive disorder. Stata v16 statistical software was used for data analysis. Receiver operating characteristic (ROC) curve analysis, correlations, linear regression, and independent t-tests were performed, with statistical significance declared at a p-value of <0.05. Inter-rater and internal consistency reliabilities and content, criterion, and construct validities were also determined.
Results. A total of 116 individuals participated in the study, a 100% response rate. Just over half (51.7%) of the participants were male, and the mean age was 69.9 ± 8 years. The Cronbach's alpha for RUDAS was 0.7, with an intra-class correlation coefficient of 0.9. RUDAS had an area under the ROC curve of 0.87, with an optimal cutoff value of ≤ 22. At this cutoff point, RUDAS had a sensitivity of 92.3% and a specificity of 75.3%, with positive and negative likelihood ratios of 3.7 and 0.1 and positive and negative predictive values of 65.5% and 91.5%, respectively. There was also a significant difference in mean RUDAS scores between the two diagnostic groups, indicating good construct validity.
Conclusion. RUDAS was shown to be a valid and reliable tool for detecting major neurocognitive disorder. Policy makers and professionals can incorporate the tool in clinical and research practice in developing countries.


Introduction
As the world population goes through the demographic transition, the proportion of older people is increasing substantially [1][2][3]. According to a 2018 report from the United Nations Department of Economic and Social Affairs, the number of people aged 65 years and above exceeded the number of children under five for the first time in human history [1]. The proportion of older people is increasing at a faster rate in low- and middle-income countries (LMIC) than in high-income countries [1,2,4]. This population aging will have several social, economic, and health consequences [3]. Cognitive disorders are among the problems that most commonly affect this population group [5].
Dementia, or major neurocognitive disorder, can be described as a syndrome in which there is a progressive deterioration in multiple areas of cognitive functioning [5][6][7]. According to the World Alzheimer Report 2018, an estimated 50 million people live with dementia, and this number is projected to reach 152 million in 2050. Current estimates indicate that a new case of dementia occurs every 3 seconds globally [8][9][10].
Even though an accurate diagnosis of cognitive impairment requires a detailed and multidisciplinary assessment of the individual, many brief screening tools have been developed and used over the past several years. Brief and effective screening and cognitive assessment tools are especially necessary in low- and middle-income countries, where there is a recognized shortage of specialists able to diagnose cognitive impairment and provide appropriate interventions [6,9]. These tools contribute to the early identification of people with cognitive impairment, so that available pharmacologic and non-pharmacologic interventions aimed at improving cognitive function and quality of life can be provided before the condition worsens [6,[11][12][13].
Many of the available cognitive assessment tools were developed in western countries for more educated and less culturally diverse populations [13][14][15]. Consequently, the effectiveness and applicability of most of these tools in communities with very high illiteracy rates, low socioeconomic status, and greater ethno-cultural diversity have been questioned [14,[16][17][18][19].
The scarcity of culturally and linguistically adapted and valid cognitive screening tools in Ethiopia has made it difficult for clinicians and researchers to effectively screen for and diagnose cognitive impairment [20,21].
Ethiopia is known for its great linguistic and cultural diversity, and almost half (48.23%) of its adult population and more than 80% of those aged 65 and above are illiterate. Therefore, finding alternative cognitive assessment tools and further assessing those with known educational and linguistic biases is essential [20,22].
Australian researchers developed RUDAS in an effort to produce a simplified tool for identifying dementia that is portable, can be applied in diverse cultures, and can be easily administered by primary health care providers [23][24][25][26][27]. RUDAS has been validated in both high-income countries and LMIC, with demonstrated good validity and reliability, and was shown to be relatively free from linguistic and educational biases [14,[28][29][30][31][32].
Even though many cognitive assessment tools, including RUDAS, have been validated for assessing cognitive impairment worldwide, these tools are sensitive to culture, language, and context and therefore need to be validated before being used in a new setting. This study was therefore designed to determine the psychometric properties and diagnostic accuracy of RUDAS for detecting major neurocognitive disorder among older people in Ethiopia.

Methods
The study was conducted in an institutional care centre for older people in Addis Ababa, the capital city of Ethiopia. Macedonia is an indigenous care centre for the elderly and people with mental disabilities; it is an independent, non-governmental, not-for-profit organization. Accommodation, catering, and other services are provided within the centre. The study was conducted between 10th August and 15th September 2020. An institution-based cross-sectional study design was employed. All individuals who were 60 years of age or older and residing within the centre were included. Individuals with severe life-threatening illnesses were excluded from the study.

Sample size and sampling method
MedCalc Version 19.1.3 was used to calculate the sample size based on the assumptions and statistical methods suggested by Hanley and McNeil for determining the diagnostic accuracy (AUC) of a diagnostic test [33]. Type I error (alpha) was set at 0.05 and type II error (beta, 1 - power) at 0.1, i.e. power was set at 90%; the ratio between the positive and negative groups was set at 1:2 and the null hypothesis value at 0.5. The expected AUC was set at 0.7, considering the optimal AUC value for a good diagnostic test. Based on these assumptions, the minimum required sample size was 105. After allowing for a 10% non-response rate, the final sample size became 116.
After generating a sampling frame from the list of individuals in the centre that fulfill the eligibility criteria, simple random sampling was employed to select study participants using the computer-generated random numbers method.
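The non-response adjustment and the random draw described above can be sketched as follows. This is a minimal illustration, not the procedure the study software performed: the function names, the frame of 300 residents, and the seed are hypothetical, and the minimum size of 105 is taken as given from the MedCalc calculation rather than recomputed.

```python
import math
import random


def inflate_for_non_response(n_required: int, non_response_pct: int) -> int:
    """Inflate a minimum sample size by an anticipated non-response
    percentage, rounding up to the next whole participant."""
    return math.ceil(n_required * (100 + non_response_pct) / 100)


def draw_simple_random_sample(sampling_frame, n, seed=None):
    """Select n distinct units from the frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)


# 105 is the minimum size reported in the text; 10% is the stated allowance.
final_n = inflate_for_non_response(105, 10)  # 115.5 rounded up

# Hypothetical sampling frame: the real list of eligible residents is not given.
frame = [f"resident_{i:03d}" for i in range(1, 301)]
selected = draw_simple_random_sample(frame, final_n, seed=2020)

print(final_n, len(set(selected)))
```

Fixing the seed only makes the illustration repeatable; any computer-generated random number method that gives each eligible resident an equal chance of selection would serve the same purpose.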

Data collection instruments and procedures
A. Socio-demographic variables. A brief questionnaire was prepared to collect information on the socio-demographic characteristics of the participants including age, sex, marital status, religion, ethnicity, area of residence, and educational status.
B. Gold standard reference. The DSM-5 criteria were used to diagnose dementia [7]. The criteria require a decline from a previously attained level of cognitive functioning that interferes with independence in daily living, and that other causes of the impairment be excluded. As part of the gold standard diagnostic evaluation, the Mini-Mental State Examination (MMSE) was used as a standardized cognitive assessment instrument (Criterion A2) [7]. Criterion B requires that the impairment interfere with the performance of activities of daily living, including instrumental activities of daily living. Criterion D requires other conditions causing cognitive impairment to be ruled out; the Geriatric Depression Scale was used to assess the presence of depression and rule out pseudo-dementia.
The MMSE was used as the standardized cognitive assessment tool within the DSM assessment, since DSM Criterion A2 requires the cognitive decline to be evidenced by at least one standardized cognitive measurement. The MMSE is a 30-point cognitive test initially developed in 1975 to assess cognitive function [7,15]. The original version of the MMSE was modified by replacing the serial sevens test for assessing attention with backward naming of the months of the year.
Depression was assessed using the Geriatric Depression Scale (GDS) short form, developed by Yesavage et al. It is a 15-item self-report measure, with Yes/No response options, designed to examine depressive symptoms among older adults [34,35].
C. Rowland Universal Dementia Assessment Scale (RUDAS). RUDAS is a six-item cognitive assessment tool scored out of 30 points. A cutoff value of 23 was recommended in the initial validation study to screen for cognitive impairment. It takes less than 10 minutes to administer [23]. The six items in RUDAS and their respective points are as follows. Memory is assessed with a four-item grocery list recall test, in which respondents are told a list of grocery items at the beginning of the assessment and asked to recall them at the end (8 points, 2 for each item). Visuospatial ability is assessed through body orientation, in which the interviewer asks the respondent to show or locate different parts of his or her own body (5 points). Praxis is assessed with a fist/palm alternation test, in which the interviewer demonstrates alternating movements of fist and palm and asks the respondent to repeat those movements continuously (2 points). Visuo-constructional ability is assessed with a cube copying test (3 points). Judgment is assessed from the respondent's answers to a road-crossing scenario (4 points). The last item, language, is assessed with a one-minute animal generation test, in which respondents are asked to name as many animals as they can within one minute (8 points) [19,23]. A more detailed explanation of the tool and its items can be found elsewhere [23].
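The item structure described above can be sketched as a small scoring routine. This is an illustrative sketch only: the dictionary keys, function names, and example item scores are hypothetical, and treating scores at or below the cutoff as screen-positive is an assumption stated here rather than a rule taken from the scoring manual.

```python
# Maximum points per RUDAS item as listed in the text (they sum to 30).
RUDAS_MAX_POINTS = {
    "memory": 8,
    "visuospatial": 5,
    "praxis": 2,
    "visuoconstruction": 3,
    "judgment": 4,
    "language": 8,
}


def rudas_total(item_scores: dict) -> int:
    """Validate the six item scores against their maxima and return the total."""
    if set(item_scores) != set(RUDAS_MAX_POINTS):
        raise ValueError("scores must be given for exactly the six RUDAS items")
    for item, score in item_scores.items():
        if not 0 <= score <= RUDAS_MAX_POINTS[item]:
            raise ValueError(f"{item} score {score} is out of range")
    return sum(item_scores.values())


def screens_positive(total: int, cutoff: int) -> bool:
    """Assumption: a total at or below the cutoff counts as screen-positive."""
    return total <= cutoff


# Hypothetical respondent, for illustration only.
example = {
    "memory": 4,
    "visuospatial": 5,
    "praxis": 2,
    "visuoconstruction": 2,
    "judgment": 3,
    "language": 6,
}
total = rudas_total(example)
print(total, screens_positive(total, cutoff=22))
```

Keeping the cutoff as an explicit argument makes it easy to compare the originally recommended threshold with one derived in a new setting.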
Data collection procedures. Data were collected by two BSc nurses and two BSc psychiatry professionals. Trained BSc nurses conducted the interviews with RUDAS, and the same cases were interviewed by the BSc psychiatry professionals using the DSM approach. The order of the two tests was interchanged for every case to avoid an order-of-tests effect.
The two groups of data collectors, those applying the gold standard and those applying the tool being validated, were blinded to the participant's performance on the other test.
In addition to the detailed clinical evaluation, the respondent's medical record was reviewed with their permission to assist in diagnosis.
Inter-rater reliability. Inter-rater reliability was assessed by having two data collectors, blinded to each other's findings, administer the questionnaire to 20 individuals. These were the same two data collectors involved in the data collection for criterion-related validity.
All questionnaires were translated into Amharic using a translation and back-translation procedure. The original English version of the tool was forward-translated into Amharic by two bilinguals proficient in both languages. The forward-translated tools were back-translated into English by another two bilingual university instructors with MSc-level training and experience in translating research questionnaires. The original, forward-translated, and back-translated versions of the questionnaires were reviewed and discussed by the team of translators, the principal investigator (PI), and senior psychiatry professionals. Any discrepancies were resolved through discussion with the professionals involved.
Overall, the RUDAS items were translated into the Amharic language without significant problems, and the tool showed good semantic equivalence. Some modifications were made during the translation process and are summarized below.
In all the items of RUDAS, the direct Amharic equivalent of the term "I want you", "Eifeligalehu", did not read as an instruction and was therefore replaced with "Eteyiqwotalehu", meaning "I ask you/ask of you". In addition, in all items, the endings of the questions were changed to "eteyikwotalehu/Yadirgu", which carry a plural meaning when translated directly into English. In the majority of Ethiopian cultures, and in the Amharic language, this form is considered a sign of respect towards older people. Item 4 in RUDAS (judgment): the direct Amharic equivalent of "...busy street...", "...Yetechenaneke godana...", did not clearly convey the sense of the question, as it could also mean "streets where no cars are passing by"; it was therefore replaced by "...Yetechenaneke yemekina menged...", which means "...busy car road/busy road...". In the same question, the phrase "traffic lights" was considered unfamiliar, as such lights are uncommon in the country. The translation therefore used a more descriptive phrase, "Yemenged teqotatari yemebrat milikitoch", meaning "lights used to control the traffic flow".
Further steps were taken to ensure that the RUDAS items were translated into Amharic with good quality and intelligibility. The final translated version of the tool was administered by a well-trained data collector to an independent sample of 20 individuals within the centre, divided into two halves. The interviewer administered each item face-to-face to the first half of the participants. After answering each item, respondents were asked to explain their understanding of the question and their answer. They were also asked to suggest any local word that would fit better. Items were modified when their meaning was not clear, when respondents found it difficult to elaborate, or when the participants' responses revealed a misunderstanding of the question. The items that needed modification were revised before the updated version was administered to the second half of the participants in the same way.
The following points were considered throughout the above procedures: the measurement aim of the questionnaires, the target population, the concepts that the questionnaire is intended to measure, and the interpretability of the items.
RUDAS items were generally well understood, and the instructions were judged to be precise. Seven of the first ten respondents indicated that "Tea" was ambiguous, because what is purchased in a grocery store is the "Tea leaf" rather than tea itself. For this reason, it was replaced by "Buna", meaning "Coffee", which is more popular and common in most parts of Ethiopia. The modified version was then administered to the remaining ten respondents, and no concerns were raised. Before data collection, a day-long training was provided for data collectors and supervisors on the instruments, ethical principles, and how to diagnose cases.
The order of the gold standard evaluation (by psychiatry professionals) and the RUDAS assessment (by BSc nurses) was interchanged every 30 cases to avoid an order-of-tests effect.

Data management and analysis
Every three days, the collected data were assembled, reviewed, and checked for completeness and consistency by the PI. Only complete questionnaires were accepted.
The collected data were coded, entered into EpiData Entry v4.6.1 software, and exported to Stata v16 for analysis. After thorough data exploration, appropriate descriptive statistics were computed, and the results are reported in tables, text, and figures.
Cronbach's alpha coefficient was used to measure internal consistency reliability. The intraclass correlation coefficient (ICC) was used to determine inter-rater reliability.
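For readers unfamiliar with the internal consistency measure named above, a minimal sketch of the standard Cronbach's alpha formula follows. The function name and the small three-item data set are hypothetical and are not study data; the analysis itself was run in Stata.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for five respondents on three items (not study data).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
    [3, 3, 2],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha rises when items vary together, because the variance of the total score then exceeds the sum of the individual item variances.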
The content validity of RUDAS was determined by calculating the Content Validity Index (CVI). A nine-member panel of experts was selected from senior professionals with knowledge and experience in Psychiatry (1 expert with a PhD in Psychiatry, 3 psychiatrists, and 2 with an MSc in mental health), Epidemiology (1 expert with a PhD in Epidemiology), and Neurology (2 neurologists). Each panel member was asked to rate each item of the tool from 1 to 4 based on relevance and clarity: 1 = not relevant, 2 = relevant but needs revision, 3 = relevant with minor revision, and 4 = relevant. The item-level CVI (I-CVI) was calculated by dividing the number of experts who gave a rating of 3 or 4 (relevant) by the total number of experts. The scale-level CVI using the averaging approach (S-CVI/Ave) was computed by dividing the sum of the I-CVIs by the total number of items. The definitions and detailed procedures for calculating content validity indexes are described elsewhere [36].
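The two content validity indexes defined above reduce to simple proportions, sketched below. The function names, item labels, and ratings are hypothetical and are not the panel's actual ratings; only the 1 to 4 scale and the "3 or 4 counts as relevant" rule come from the text.

```python
def item_cvi(ratings):
    """I-CVI: share of experts rating the item 3 or 4 on the 1-4 scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def scale_cvi_ave(ratings_by_item):
    """S-CVI/Ave: mean of the item-level CVIs across all items."""
    i_cvis = [item_cvi(ratings) for ratings in ratings_by_item.values()]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical ratings from a nine-member panel for three items.
panel_ratings = {
    "item_1": [4, 4, 4, 4, 4, 4, 4, 4, 4],
    "item_2": [4, 4, 3, 4, 2, 4, 3, 4, 4],  # one expert rated it 2
    "item_3": [3, 4, 4, 4, 4, 3, 4, 4, 4],
}

for name, ratings in panel_ratings.items():
    print(name, round(item_cvi(ratings), 3))
print("S-CVI/Ave", round(scale_cvi_ave(panel_ratings), 3))
```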
Sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios were calculated at several cutoff scores. Youden's J index (sensitivity + specificity - 1) was used to determine the optimal cutoff score with the best balance of sensitivity and specificity. Receiver operating characteristic (ROC) curve analysis, with the corresponding area under the curve (AUC), was performed against the gold standard to determine accuracy. Pearson's correlation coefficient was calculated to determine the correlation between screening tools (concurrent validity). A correlation coefficient of 0.6 or more was judged as indicating a strong association, considering the attenuation of validity coefficients caused by the unreliability of the scales [37].
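The cutoff-based accuracy measures and the Youden's J selection rule described above can be sketched as follows. The function names and the twelve scores are hypothetical and are not study data; the sketch assumes that a score at or below the cutoff counts as screen-positive and that both diagnostic groups are present.

```python
def confusion_counts(scores, has_disorder, cutoff):
    """Count TP, FP, TN, FN when score <= cutoff is treated as screen-positive."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_disorder):
        positive = score <= cutoff
        if positive and diseased:
            tp += 1
        elif positive:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy_measures(tp, fp, tn, fn):
    """Sensitivity, specificity, Youden's J, likelihood ratios, PPV and NPV."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "youden_j": sens + spec - 1,
        "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


def best_cutoff_by_youden(scores, has_disorder, candidate_cutoffs):
    """Return the candidate cutoff with the largest Youden's J."""
    return max(
        candidate_cutoffs,
        key=lambda c: accuracy_measures(
            *confusion_counts(scores, has_disorder, c)
        )["youden_j"],
    )


# Hypothetical test scores and gold standard diagnoses (not study data).
scores = [12, 15, 18, 20, 22, 21, 23, 24, 25, 26, 27, 28]
has_disorder = [True] * 5 + [False] * 7

cutoff = best_cutoff_by_youden(scores, has_disorder, range(15, 28))
measures = accuracy_measures(*confusion_counts(scores, has_disorder, cutoff))
print(cutoff, {name: round(value, 3) for name, value in measures.items()})
```

Because predictive values depend on how common the disorder is in the sample, they should be read alongside the prevalence, whereas sensitivity, specificity, and likelihood ratios do not depend on it.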
Construct validity was determined using the known-groups validity approach. An independent-samples t-test was used to test for a significant difference in mean test scores between the dementia and no-dementia groups defined by the gold standard assessment. Statistical significance was declared at P < 0.05.
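The known-groups comparison above rests on the two-sample t statistic, sketched here in its pooled-variance (Student) form. Whether the study used the pooled or the unequal-variance version is not stated, so the pooled form is an assumption; the function name and the two small groups are hypothetical and are not study data. The p-value, which requires the t distribution, is left to a statistical package.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic and its degrees of freedom,
    assuming equal variances in the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical test scores for two diagnostic groups (not study data).
with_disorder = [14, 16, 18, 20, 22]
without_disorder = [22, 24, 26, 28, 30]

t_value, df = pooled_t_statistic(with_disorder, without_disorder)
print(round(t_value, 2), df)
```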
Multiple linear regression analysis was employed to determine the association of RUDAS scores with different factors. Statistical significance was declared at P < 0.05. The assumptions of multiple linear regression, namely linearity, normality of residuals, absence of multicollinearity, and homoscedasticity, were checked.

Ethical approval and consent to participate
Ethical approval was obtained from the institutional ethical review board of Jimma University, Institute of Health, with the reference number of IRB000263/2012. Before data collection, written informed consent was obtained from the study participants after a detailed explanation of the purposes of the study. Confidentiality was maintained for the participants. All the necessary precautions were employed during the data collection process to prevent the spread of COVID-19 infection.

Socio-demographic and clinical characteristics
A total of 116 respondents participated in the study, a response rate of 100%. The mean age of the participants was 69.87 ± 7.97 years (range: 60-94), and males comprised 51.72% (n = 60) of the participants. The largest group of respondents, 48.30% (n = 56), had attended no formal education, and the mean duration of formal education was 4.90 ± 5.90 years. Mean RUDAS scores increased steadily with the participants' educational level: those who attended college and above had the highest mean RUDAS score (27 ± 2.5) and those with no formal education the lowest (18.21 ± 5.7). Regarding marital status, 42.24% (n = 49) were widowed. Before their admission to the centre, 55.17% (n = 64) of the respondents lived in urban areas. The mean (SD) score on the Geriatric Depression Scale short form was 6.01 (3.59). Those with a diagnosis of dementia had a mean (SD) MMSE score of 15.36 (1.54), while those without dementia had a mean (SD) MMSE score of 23.83 (1.31). Other sociodemographic and clinical characteristics of the respondents are presented in Table 1.

Reliability of RUDAS
RUDAS had a Cronbach's alpha value of 0.73. Deleting any single item did not increase the alpha value. The inter-rater reliability findings indicate that RUDAS had an ICC value of 0.94 (95% CI: 0.82-0.98).

Validity of RUDAS
Content validity. RUDAS had excellent content validity index results. Only one item, visuo-constructional drawing, had an item-level content validity index (I-CVI) of 80%; the rest scored 100%. The scale-level content validity index (S-CVI) of the tool was 96.7%.
Criterion-related validity. For all respondents who participated in the study, the mean RUDAS score was 21.5 ± 5.7. On average, the administration of RUDAS took 10 minutes. The area under the receiver operating characteristic (ROC) curve for the identification of major neurocognitive disorder was 0.87 (95% CI: 0.81-0.93) (Fig 1).
At the recommended RUDAS cutoff score of ≤23, the tool had excellent sensitivity (97.4%) but lower specificity (62.3%). The optimal cutoff score for detecting major neurocognitive disorder in our study, based on the maximum Youden's J index, was ≤22.
At this cut point, the tool had a sensitivity of 92.31% and a specificity of 75.32%, and it correctly classified the cases 81.03% of the time (Table 2). The LR+, LR-, PPV and NPV values of the tool at the optimal cutoff value were 3.74, 0.10, 65.5%, and 91.5%, respectively.
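As a quick arithmetic check, the likelihood ratios follow directly from the reported sensitivity and specificity, since LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below uses only those two reported values; the variable names are illustrative.

```python
# Sensitivity and specificity at the optimal cutoff, as reported in the text.
sensitivity = 0.9231
specificity = 0.7532

lr_pos = sensitivity / (1 - specificity)  # how much a positive screen raises the odds
lr_neg = (1 - sensitivity) / specificity  # how much a negative screen lowers the odds

print(round(lr_pos, 2), round(lr_neg, 2))
```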
Construct validity and concurrent validity. Independent sample t-test analysis revealed a statistically significant difference between the two diagnostic groups in their mean RUDAS scores [...] (Table 3). The variables in the model explained 52% of the variation observed in the RUDAS scores (R² = 0.56, adjusted R² = 0.52, F = 13.27, p < 0.001).

Discussion
In this study, we validated and reported the psychometric properties and diagnostic accuracy of RUDAS for detecting major neurocognitive disorder in Ethiopia. RUDAS showed very good internal consistency reliability, with a Cronbach's alpha value of 0.73. At the optimal cut point of ≤ 22, RUDAS demonstrated an excellent ability to detect major neurocognitive disorder, with an AUROC value of 0.87 and sensitivity and specificity values of 92% and 75%, respectively. The participants' years of formal education showed a statistically significant positive association with RUDAS scores, whereas having dementia and the geriatric depression score showed significant negative associations with total RUDAS scores. Many validity studies in different settings and languages have also reported very good reliability measures for RUDAS. The initial development study by Storey et al. reported inter-rater and test-retest reliabilities of 0.99 and 0.98, respectively [23]. The reliability measures for RUDAS in the current study are also consistent with reports from studies conducted in the Netherlands, Taiwan, Peru, and Nepal [32,[38][39][40]. These consistent reports demonstrate the tool's ability to reliably and consistently measure the cognitive status of individuals.
The initial validation study of RUDAS by Storey et al. in 2004 reported an area under the ROC curve of 0.95. At an optimal cutoff score of 23, the sensitivity was 89% and the specificity was 98% [23]. Among studies assessing the performance of RUDAS in communities of low and middle socioeconomic status, one conducted in Taiwan reported that RUDAS had an AUC of 0.87, a sensitivity of 76%, a specificity of 81%, a PPV of 83%, and an NPV of 91% at a cut point of 22 [32].
The optimal cutoff value in our study was one point lower than the one recommended by the tool developers and some other studies conducted in high-income countries [23,30]. The observed variation may be due to the high socioeconomic status and a better level of education of the participants in those studies. This was further supported as the tool has a similar cutoff point and AUC value as another study conducted in Taiwan, with a sample of low-education and low-income background [32]. The similarity in the findings indicates that RUDAS can be successfully applied in low and middle-income countries to screen for cognitive impairment.
RUDAS has been reported to be well suited to communities with diverse backgrounds. The initial validation study reported that gender, years of education, cultural background, and preferred language were not associated with RUDAS scores, whereas age was [23]. Several other validation studies reported similar findings [12,28,39]. However, findings on the association between education and test performance are conflicting: some studies report that the tool is relatively free from the effect of education [38,40,41], while others indicate an association [30,42]. In our study, consistent with the latter, years of education had a significant association with RUDAS scores. However, the excellent performance of the test in the current study population, with a low mean of about five years of education, indicates that it can be applied effectively in such communities.
Overall, the findings of this study indicate that the tool is a practical, valid, and reliable instrument for screening for major neurocognitive disorder and assessing individuals' cognitive status. In resource-limited settings like Ethiopia, having validated brief screening tools will help in the early identification of cognitive impairment, which will, in turn, lead to early interventions aimed at stopping or slowing the progression of the disorder.
The excellent performance of the tool in a sample with a low mean level of education and high cultural diversity indicates its applicability in such populations. The short administration time, and the fact that the tool can be administered by non-psychiatry professionals, make it suitable for busy outpatient settings and for professionals outside mental health practice. To the best of the researchers' knowledge, this study is the first to validate RUDAS on the African continent.

Limitations
Because Amharic wording and cultural practices vary across regions of the country, this version of the tool should be used with caution in other areas, and further validation studies are needed in other languages. The current study also did not assess the test-retest reliability of RUDAS, which is an essential measure of the tool's stability and its ability to consistently screen for major neurocognitive disorder over time.

Conclusions and recommendations
The Rowland Universal Dementia Assessment Scale has been shown to be a valid and reliable cognitive assessment instrument and can be incorporated in clinical and research practice in developing countries. Researchers in the area can also conduct further validation studies to assess the applicability of the tool in other languages and communities. Future studies should also assess the test-retest reliability of the tool.