Clinical use of SAND battery to evaluate language in patients with Progressive Supranuclear Palsy

Background
Patients with Progressive Supranuclear Palsy (PSP) present language disturbances in tasks such as naming, repetition, reading, word comprehension and semantic association compared to patients with Parkinson’s disease (PD) and healthy controls (HC).

Objective
In the present study we sought to validate a version of the Screening for Aphasia in NeuroDegeneration (SAND) battery specifically tailored to PSP patients and to describe language impairment in relation to PSP disease phenotype and cognitive status.

Methods and results
Fifty-one PSP patients [23 with Richardson’s syndrome (PSP-RS), 10 with predominant parkinsonism (PSP-P) and 18 with the other variant syndromes of PSP (vPSP)], 28 PD patients and 30 HC were enrolled in the present study. By excluding the tasks with poor acceptability (i.e., writing and picture description tasks) and increasing the items related to the remaining tasks, we showed that the PSP-tailored SAND Global Score is an acceptable, consistent and reliable tool to screen language disturbances in PSP. However, we failed to detect major differences in language involvement according to disease phenotype. In contrast, we showed that patients with dementia present worse language performance.

Conclusions
Taking into account specific disease features, the combination of the SAND subscores included in the PSP-tailored SAND better represents language abilities in PSP. Furthermore, we showed that language disturbances are a feature of PSP irrespective of disease phenotype, but parallel the deterioration of global cognitive function.



Introduction
Progressive Supranuclear Palsy (PSP) is a rare, rapidly progressive neurodegenerative disease characterized by postural instability and supranuclear vertical gaze palsy as well as by cognitive and behavioral symptoms [1]. According to the clinical diagnostic criteria proposed by the Movement Disorder Society (MDS) [2], language impairment is part of the complex spectrum of disturbances affecting patients with PSP. As such, PSP with predominant speech-language disorder (PSP-SL) is recognized as an independent clinical phenotype reaching the diagnostic level of possibility associated with a probable 4R-tauopathy pathology (i.e., either PSP or Cortico-basal Degeneration) [2]. However, evidence suggests that a wide spectrum of language deficits also characterizes the remaining PSP phenotypes, including Richardson's syndrome (PSP-RS). Recently, Burrell et al. reported that patients with PSP-RS present specific language deficits similar to those of patients affected by Primary Progressive Aphasia (PPA) [3]. To date, there is scant evidence on the language profile of PSP patients diagnosed according to MDS clinical criteria [4]. The Screening for Aphasia in NeuroDegeneration (SAND) battery is a brief validated tool to detect language impairment in patients affected by neurodegenerative diseases through the assessment of different components of language [5,6]. The SAND battery has been shown to detect subtle language impairment in PSP phenotypes other than PSP-SL, such as lexical-semantic level disturbances in comparison with Parkinson's disease (PD) and healthy controls (HC) [4]. However, peculiar PSP clinical features may prevent a proper application of specific language tasks included in the SAND battery.
In the present study we aimed to validate a version of the SAND battery specifically tailored for PSP and to use it to describe language performances in PSP according to disease phenotype and cognitive status.

Patients
Between November 2015 and December 2018, consecutive patients with suspected PSP referred to the Center for Neurodegenerative Diseases of the University of Salerno were offered a dedicated set of assessments including a clinical interview, a motor evaluation, extensive cognitive and behavioral testing, language evaluation and brain MRI.
For each enrolled patient, two movement disorder specialists applied the diagnostic flowchart proposed by the MDS, defined the PSP phenotype [23 PSP-RS, 10 PSP with predominant parkinsonism (PSP-P), 9 PSP with predominant corticobasal syndrome (PSP-CBS), 4 PSP with progressive gait freezing (PSP-PGF) and 5 PSP with predominant frontal presentation (PSP-F)] according to the predominant clinical features, and expressed the degree of diagnostic certainty [2,7,8]. Diagnosis as well as phenotypic attribution was verified for all patients during at least one subsequent visit. As PSP-CBS, PSP-PGF and PSP-F included a limited number of patients, those subtypes were grouped together as the other variant syndromes of PSP (vPSP, N = 18).
In addition, two age-matched groups of HC (N = 30) and PD patients (N = 28) were enrolled for the present study. Exclusion criteria for PD patients were a diagnosis of dementia in accordance with MDS criteria and a Hoehn and Yahr (H&Y) stage > 3 in the on state. The exclusion criterion for HC was the presence of any neurological or psychiatric condition.
The project was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki. As such, the study was approved by the local Ethics Committee (Campania Sud) and each subject was included upon signature of the informed consent form.

Clinical and cognitive evaluations
Severity of the disease was evaluated with the PSP rating scale (PSP-rs) [9]. Cognitive abilities were screened with the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA). The memory domain was investigated with the immediate and delayed recall scores of the Rey auditory verbal learning test (15-RAWLT) and the Rey figure recall test. The attention-executive domain was explored through the Trail Making Test (TMT), the short version of the Stroop Interference Test, the Clock Drawing Test (CDT) and the Rey figure copying test (RCF). Visuo-spatial functions were tested with the constructional apraxia test and the Benton Judgment of Line Orientation test (BJLO) [10]. Language was explored with two sub-tests from the Neuropsychological Examination of Aphasia battery (ENPA): the non-word repetition test and the auditory comprehension test of sentences.
Functional autonomy was evaluated with the Instrumental Activities of Daily Living (IADL) scale, while depression and apathy were assessed with the Beck Depression Inventory II (BDI-II) and the Apathy Evaluation Scale (AES), respectively [11,12].
Using the z scores computed with the scores from the HC group, each PSP patient was classified as having PSP with normal cognition (PSP-NC = 4), PSP with mild cognitive impairment (MCI) single domain (PSP-MCIsd = 9), PSP with MCI-multiple domain (PSP-MCImd = 24) and PSP with dementia (PSP-D = 12) [13,14].
Due to the lack of specific MCI criteria for PSP, the MDS MCI criteria for Parkinson's disease were applied [14]. Patients presenting any type of cognitive/behavioral decline associated with impairment of IADL were considered as affected by dementia (PSP-D), according to the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5).
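The HC-referenced z scores mentioned above can be illustrated with a minimal sketch. The impairment threshold of -1.5 SD, the function names and the sample values are illustrative assumptions and are not taken from the study; the sketch also assumes that higher raw scores indicate better performance.

```python
from statistics import mean, stdev

def z_score(raw, hc_scores):
    """Standardize a raw test score against the healthy-control (HC) distribution."""
    return (raw - mean(hc_scores)) / stdev(hc_scores)

def is_impaired(raw, hc_scores, threshold=-1.5):
    # threshold is a hypothetical cut-off chosen for illustration only
    return z_score(raw, hc_scores) < threshold

hc = [28, 30, 27, 29, 26]      # made-up HC scores on one test
patient_z = z_score(23, hc)    # patient scoring 23 on the same test
```

Counting the number of impaired domains per patient would then feed the single-domain versus multiple-domain MCI distinction.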

Language testing
Language was evaluated with the SAND battery [5,6]. The SAND Global Score, including the 23 task-related scores, was computed according to a previously described process [5]. In brief, the SAND Global Score is a frequency count of the pathologically impaired sub-scores, with higher scores indicating more severe impairment. However, the acceptability and consistency of the SAND Global Score in PSP patients were suboptimal due to a high proportion of missing data in the writing and connected speech tasks (S1 Appendix). Therefore, following the three-step process described in S1 Appendix, a PSP-tailored SAND Global Score was created, reducing the impact of the writing and picture description subscores and expanding the relevance of the subscores of the remaining tasks (Table 1). The PSP-tailored SAND Global Score ranges from 0 to 19, with higher scores indicating greater impairment (S1 Appendix).
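As a sketch of what a frequency count of impaired sub-scores means in practice, the snippet below counts how many sub-scores fall below a task-specific cut-off. The task names, the cut-off values and the "below cut-off means impaired" direction are hypothetical; the actual SAND cut-offs and scoring rules are those of the original validation work [5].

```python
def global_score(subscores, cutoffs):
    """Count the sub-scores that fall below their impairment cut-off."""
    return sum(1 for task, value in subscores.items() if value < cutoffs[task])

# Hypothetical sub-scores and cut-offs, for illustration only
subscores = {"naming": 10, "repetition": 5, "reading": 14}
cutoffs = {"naming": 12, "repetition": 5, "reading": 15}
score = global_score(subscores, cutoffs)   # naming and reading are impaired
```

Because each sub-score contributes at most one point, the maximum of such a score equals the number of sub-scores included.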

Statistical analysis
After checking for normal distribution with the Kolmogorov-Smirnov test, differences in variables between groups were computed with the χ² or the Kruskal-Wallis test as appropriate. Pairwise comparisons were performed with the Mann-Whitney U test.
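For readers unfamiliar with the pairwise test, the U statistic can be sketched as a simple count over all cross-group pairs. This is an illustration on made-up data, not the SPSS procedure used in the study, and it deliberately omits the p-value computation.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: each pair with x > y counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

group_a = [3, 5, 7]   # made-up scores
group_b = [1, 2, 5]
u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
```

The two statistics always sum to the number of cross-group pairs, which offers a quick sanity check.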
Acceptability and internal consistency were explored for both the SAND Global Score and the PSP-tailored SAND Global Score. Acceptability was considered appropriate for each Global Score if ≤15% of the respondents obtained the lowest or the highest possible score (floor and ceiling effect) and for each Global Score item if there were ≤5% missing values. Moreover, skewness of the Global Scores (limits, -1 to +1) was determined [15].
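The acceptability checks can be sketched as follows. The sample scores are made up, the thresholds are read as inclusive, and the skewness formula is the bias-adjusted sample skewness, which is an assumption about the estimator rather than a detail reported in the study.

```python
from statistics import mean, stdev

def floor_ceiling(scores, lowest, highest):
    """Percentage of respondents at the lowest and at the highest possible score."""
    n = len(scores)
    return 100 * scores.count(lowest) / n, 100 * scores.count(highest) / n

def missing_pct(values):
    """Percentage of missing (None) values for one item."""
    return 100 * sum(v is None for v in values) / len(values)

def skewness(scores):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson coefficient)."""
    n, m, s = len(scores), mean(scores), stdev(scores)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in scores)

scores = [0, 1, 1, 2, 2, 3, 4, 6, 9, 19]   # made-up global scores, range 0-19
floor_pct, ceiling_pct = floor_ceiling(scores, 0, 19)
acceptable = (floor_pct <= 15 and ceiling_pct <= 15
              and -1 <= skewness(scores) <= 1)
```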
Internal consistency was evaluated by means of Cronbach's alpha [16]. A value ≥0.70 was considered acceptable [17]. Since the SAND Global Score showed suboptimal acceptability and consistency in PSP patients (see Results and S1 Appendix), subsequent analyses were performed only for the PSP-tailored SAND Global Score.
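Cronbach's alpha compares the sum of the item variances with the variance of the total score: alpha = k/(k-1) × (1 - Σ item variances / total-score variance), where k is the number of items. A minimal sketch on made-up binary item data (1 = impaired, 0 = not impaired) follows; the data layout and names are illustrative assumptions.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, all measured on the same respondents."""
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]          # total score per respondent
    item_var = sum(variance(col) for col in items)        # sum of item variances
    return k / (k - 1) * (1 - item_var / variance(totals))

# Three made-up items scored on five respondents
items = [[1, 0, 1, 1, 0],
         [1, 0, 1, 0, 0],
         [1, 0, 1, 1, 0]]
alpha = cronbach_alpha(items)   # about 0.913 for this toy data
```

Items that add mostly noise inflate the summed item variance relative to the total-score variance and pull alpha down, which is the mechanism behind dropping poorly performing tasks.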
Scaling assumptions referring to the correct grouping of items and the appropriateness of their summed score were checked using the corrected item-total correlation for the PSP-tailored SAND Global Score (standard, ≥0.40 [18]).
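The "corrected" item-total correlation correlates each item with the sum of all the other items, so that an item is not correlated with itself. The sketch below uses Pearson's coefficient on made-up data; the choice of Pearson and all names are assumptions for illustration.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def corrected_item_total(items):
    """Correlate each item with the sum of the remaining items."""
    result = []
    for i, item in enumerate(items):
        rest = [sum(resp) - resp[i] for resp in zip(*items)]   # total without item i
        result.append(pearson(item, rest))
    return result

items = [[1, 0, 1, 1, 0],
         [1, 0, 1, 0, 0],
         [1, 0, 1, 1, 0]]          # made-up data: 3 items, 5 respondents
correlations = corrected_item_total(items)
meets_standard = all(r >= 0.40 for r in correlations)
```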
Construct validity was explored with non-parametric Spearman's correlation between the PSP-tailored SAND Global Score and other language testing as well as cognitive and behavioral testing. Correlations were considered strong with a coefficient > 0.70 and moderate with a coefficient between 0.30 and 0.70. SAND scores were not expected to correlate with memory and behavioral testing, whereas they were expected to correlate with other language and cognitive testing.
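Spearman's coefficient is Pearson's correlation computed on ranks, with tied values sharing their average rank. The following sketch applies the strength bands stated above to the absolute value of the coefficient; the "weak or none" label for coefficients below 0.30 and the use of the absolute value are assumptions added for illustration.

```python
def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the rank-transformed data."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def strength(rho):
    size = abs(rho)
    if size > 0.70:
        return "strong"
    if size >= 0.30:
        return "moderate"
    return "weak or none"   # label not defined in the study

rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])   # made-up paired scores
```

Since a higher Global Score indicates worse language while higher scores on tests such as the MoCA indicate better cognition, such correlations would be expected to be negative in sign.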
ROC analysis was performed for the PSP-tailored SAND Global Score to identify the optimal cut off to detect language impairment in PSP patients compared to both PD and HC. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) and diagnostic accuracy in comparison to clinical diagnosis were assessed at the best threshold for classification.
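The criterion used to select the best threshold is not specified here. One common choice, assumed in the sketch below, is to maximize Youden's index (sensitivity + specificity - 1). The data are made up, and the rule "score at or above the cut-off counts as impaired" is likewise an assumption.

```python
def classification_metrics(patients, controls, cutoff):
    """2x2 metrics when a score >= cutoff is classified as impaired."""
    tp = sum(s >= cutoff for s in patients)
    fn = len(patients) - tp
    fp = sum(s >= cutoff for s in controls)
    tn = len(controls) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
        "accuracy": (tp + tn) / (len(patients) + len(controls)),
    }

def best_cutoff(patients, controls):
    """Cut-off maximizing Youden's J among the observed score values."""
    def youden(c):
        m = classification_metrics(patients, controls, c)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(patients) | set(controls)), key=youden)

def auc(patients, controls):
    """Area under the ROC curve: probability that a patient outscores a control."""
    wins = sum(1.0 if p > c else 0.5 if p == c else 0.0
               for p in patients for c in controls)
    return wins / (len(patients) * len(controls))

patients = [1, 3, 4, 5, 6, 8]   # made-up global scores
controls = [0, 1, 1, 2, 2]
cutoff = best_cutoff(patients, controls)
metrics = classification_metrics(patients, controls, cutoff)
```

Note that PPV and NPV depend on the proportion of patients and controls in the sample, so they do not transfer directly to settings with a different case mix.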
Post-hoc Bonferroni test was used to correct for multiple comparisons. Statistical analysis was performed with SPSS (Version 23).
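The Bonferroni adjustment multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below uses made-up p-values to show how a nominally significant result can lose significance after correction.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p times the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

raw = [0.01, 0.04, 0.30]        # made-up p-values from three pairwise comparisons
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```

Here the second comparison (raw p = 0.04) is no longer significant at the 0.05 level once three comparisons are taken into account.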

Results
Sixty-two PSP patients were considered for the present study, but 11 were excluded according to the inclusion/exclusion criteria detailed above. In detail, in six patients the clinical diagnosis of PSP was not confirmed at subsequent visits, four patients were not able to complete the SAND battery and one patient presented PSP-SL. The final cohort thus included 51 PSP patients (Table 2). According to the MDS degrees of diagnostic certainty, all PSP patients had a diagnosis of probability (i.e., presenting either a clear limitation of the range or decreased velocity and amplitude of vertical gaze plus other features) except, by definition, those with PSP-CBS [2]. Although PSP-CBS also features either a clear limitation of the range or decreased velocity and amplitude of vertical gaze, the presence of a corticobasal syndrome still raises the differential diagnosis with corticobasal degeneration. Since no in vivo biomarkers are available to differentiate PSP from corticobasal degeneration, PSP-CBS remains a diagnosis of possibility [2]. PSP patients performed worse on both the SAND Global Score and the PSP-tailored SAND Global Score compared with both PD and HC, reflecting worse language abilities (Table 2).

Validation phase
Although no floor effect was observed, a tendency to a ceiling effect was reported for the SAND Global Score (0.2% of participants obtained the lowest possible score and 15.2% the highest possible score). Skewness was 0.527. However, missing values were 30.7% and 13.3% in the writing and picture description tasks, respectively, compared to 0% in the remaining tasks (S1 Appendix). Cronbach's alpha for the SAND Global Score was 0.405 and, thus, internal consistency was considered suboptimal [16]. Reducing the items related to the tasks with suboptimal acceptability (i.e., writing and picture description tasks) and increasing the items related to the remaining tasks significantly improved Cronbach's alpha from 0.405 to 0.887, indicating high-level internal consistency (S1 Appendix). Removing additional items yielded no further improvement of Cronbach's alpha. Therefore, the 19-item PSP-tailored SAND Global Score was conceived. Neither ceiling nor floor effects were observed for the PSP-tailored SAND Global Score (lowest possible score = 0, 4.8% of the participants; highest possible score = 17, 0.9% of the participants). Skewness of the PSP-tailored SAND Global Score was 0.965. All the PSP-tailored SAND Global Score items presented excellent acceptability as there were no missing data and 100% of data were computable. Scaling assumptions referring to the correct grouping of items and the appropriateness of their summed score were checked using the corrected item-total correlation (standard, ≥0.40).
Spearman's correlation confirmed convergent validity of the single tasks included in the PSP-tailored SAND Global Score, demonstrating significant moderate correlations with other language testing (Table 3). As for the other cognitive tests, a moderate correlation was demonstrated with measures of global cognition such as the MoCA, but not the MMSE, as well as with tests exploring the visuospatial and attention-executive domains. No correlation was shown with tests exploring the memory domain or with behavioral scales, while a moderate correlation emerged with disease severity as assessed with the PSP-rs (S1 Appendix).

Determining the optimal cut off of the PSP-tailored SAND Global Score
ROC analysis was used to assess the discriminatory power of the PSP-tailored SAND Global Score in identifying language impairment in PSP compared to both HC and PD. As for the comparison with HC, the ROC analysis showed an 87.6% discriminatory power [95% confidence interval (CI), 80.1-95.2%]. The determined optimal cut off was 3 showing 74.5% sensitivity, 80% specificity, 86.4% positive predictive value (PPV), 64.9% negative predictive value (NPV) and 76.5% diagnostic accuracy (S1 Appendix).
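As a consistency check, the percentages reported for the comparison with HC can be reproduced from a single 2×2 table. The counts below are inferred from the reported percentages and the group sizes (51 PSP, 30 HC); they are not stated in the text and should be read as a reconstruction.

```python
# Inferred counts (assumption): 38 + 13 = 51 PSP patients, 24 + 6 = 30 HC
tp, fn = 38, 13   # PSP classified as impaired / not impaired at the cut-off
tn, fp = 24, 6    # HC classified as not impaired / impaired at the cut-off

sensitivity = round(100 * tp / (tp + fn), 1)                # 74.5
specificity = round(100 * tn / (tn + fp), 1)                # 80.0
ppv = round(100 * tp / (tp + fp), 1)                        # 86.4
npv = round(100 * tn / (tn + fn), 1)                        # 64.9
accuracy = round(100 * (tp + tn) / (tp + fn + tn + fp), 1)  # 76.5
```

All five values match the reported figures, which suggests that the metrics were computed on the full PSP and HC samples.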

PSP-tailored SAND and disease features
SAND task subscores in PSP, PD and HC are shown in S1 Appendix. PSP patients showed worse performance in all SAND task subscores as well as in the PSP-tailored SAND Global Score compared to both PD and HC.
No differences were detected in the PSP-tailored SAND Global Score among patients with different disease phenotypes (Table 4). PSP-D patients showed a worse PSP-tailored SAND Global Score compared to both PSP-MCIsd and PSP-NC patients. However, results were not significant when correcting for multiple comparisons (Table 5).

Discussion
The present study showed that the PSP-tailored SAND battery is acceptable, reliable, and easily applicable to PSP patients. By removing subscores with a high proportion of missing values and expanding the subscores of the remaining tasks, we used the best combination of SAND tasks to screen language ability in PSP, leading to a significant improvement in consistency and acceptability compared to the original SAND Global Score [5,15]. Indeed, unlike patients with PPA, PSP patients present peculiar clinical features that may affect performance on specific language tasks. Specifically, ocular movement abnormalities may hamper visual exploration in the picture description task and thus affect performance on the connected speech task for non-linguistic reasons. Similarly, the writing task can be affected by both apraxia and bradykinesia. The combination of SAND tasks included in the PSP-tailored SAND Global Score overcomes these limits, showing high acceptability, since 100% of data were computable and the percentage of missing values was 0% for all items. The excellent acceptability by PSP patients is also supported by the absence of both ceiling and floor effects as well as by the optimal skewness. Furthermore, the internal consistency of the PSP-tailored SAND battery is high (Cronbach's alpha = 0.887; item-total score correlation ≥0.40 for all items), suggesting a coherent representation of all the language functions screened. As for convergent construct validity, each task of the PSP-tailored SAND battery showed significant moderate correlations with the corresponding language tests. Furthermore, the PSP-tailored SAND Global Score showed a moderate correlation with measures of global cognition as well as with cognitive tests exploring the attention-executive and visuospatial domains. The positive association with the PSP-rs suggests a correlation between language abilities and severity of disease.
No association was shown with behavioral assessments, suggesting divergent validity between language function and the burden of apathy and depression in PSP patients. As for the discriminatory power of the PSP-tailored SAND Global Score, the optimal cut off of 3 demonstrated an adequate sensitivity and specificity profile in identifying language impairment compared to both PD and HC. This is the first study showing a cut off for a language battery differentiating PSP from PD and HC. Previous evidence showed that the SAND cut off of 5 was able to differentiate PPA from patients affected by movement disorders (PD and PSP) [5].
Confirming previous findings on a smaller cohort of patients [6], PSP patients other than PSP-SL present language disturbances when compared to both PD and HC age-matched groups (S1 Appendix).
As for language evaluation according to disease phenotype, we failed to detect significant differences suggesting language is globally involved in PSP irrespective of the specific phenotype. Confirming previous findings [13], available clinical and cognitive assessments hardly capture clinical differences among MDS PSP phenotypes.
As for the relationship between language and cognitive status, we detected a trend for worse language performances in PSP-D compared to both PSP-MCIsd and PSP-NC suggesting that language deficit may be related to the extent of impairment of the cognitive networks.
Our study has several strengths. First, we enrolled a large sample of early PSP patients (median disease duration = 4 years) representing the different phenotypes of the disease, together with age-matched groups of PD and HC subjects. Second, all included patients underwent a thorough evaluation with an extensive battery of clinical assessments by a movement disorder specialist in a tertiary center and were diagnosed according to recent MDS criteria [2]. Finally, we are the first to propose an evaluation of language abilities in PSP that takes into account specific disease features possibly affecting language evaluation.
On the other hand, we acknowledge that the lack of pathological confirmation, still the gold standard for PSP diagnosis, is a major limitation of our study. Another limitation is the lack of cross-validation procedures for the ROC analysis, which can lead to an under- or overestimation of the PSP-tailored SAND Global Score cut off discriminating PSP from PD and HC subjects. However, as ocular disturbances and postural instability remain the cardinal features of PSP, language tests such as the SAND battery are not meant to serve as diagnostic tests for this condition. Finally, we did not evaluate the motor speech component with appropriate instruments.
In conclusion, the combination of the SAND subscores included in the PSP-tailored SAND Global Score represents an acceptable and reliable tool to screen for language abilities in PSP. Furthermore, we showed that language disturbances are a feature of PSP irrespective of disease phenotype, but may parallel the deterioration of global cognitive function.