Cultural Adaptation of the Portuguese Version of the “Sniffin’ Sticks” Smell Test: Reliability, Validity, and Normative Data

The cross-cultural adaptation and validation of the Sniffin' Sticks test for the Portuguese population is described. Over 270 people participated in four experiments. In Experiment 1, 67 participants rated the familiarity of the presented odors, and seven descriptors of the original test were adapted to a Portuguese context. In Experiment 2, the Portuguese version of the Sniffin' Sticks test was administered to 203 healthy participants. Older age, male gender and active smoking status were confirmed as confounding factors. Experiment 3 showed the validity of the Portuguese version of the Sniffin' Sticks test in discriminating healthy controls from patients with olfactory dysfunction. In Experiment 4, test-retest reliability was established for both the composite score (r71 = 0.86) and the identification test (r71 = 0.62) (p<0.001). Normative data for the Portuguese version of the Sniffin' Sticks test are provided. The test shows good validity and reliability and distinguishes patients from healthy controls with high sensitivity and specificity. The identification subtest of the Portuguese version is a clinically suitable screening tool for routine outpatient settings in Portugal.


Introduction
Olfaction is important for daily living. Impairment of the sense of smell has important consequences for quality of life, health, and safety. Loss of olfaction is considered to be less disabling than other sensory losses, such as blindness and deafness. As a consequence, the medical community has regarded olfaction as a less important sense, and it has been less studied. Recent advances in our understanding of olfaction have changed this. For example, olfactory impairment is an early, frequent and sensitive marker of the preclinical phase of neurodegenerative diseases such as Parkinson's and Alzheimer's disease. It is present in over 90% of early-stage Parkinson's disease cases, so its detection may, in turn, lead to early treatment. [1,2] Current consensus estimates that about 5% of the general population suffers from anosmia and about 20% exhibits hyposmia. [3]
Odor identification is culturally dependent. For example, many smells that are familiar in the USA are not familiar in Europe. Specifically, Portugal has a unique, 800-year-old culture which, of course, comes with many characteristic flavors. Considering this, it is important to establish an olfactory test that distinguishes between normal and pathological situations. In addition, the test should ideally be able to monitor changes in olfactory capacity over time, for example to establish the possible effects of treatment.
No validated olfactory test for the Portuguese population currently exists. Thus, for a population of 10.5 million people, it is impossible to correctly evaluate the sense of smell, make accurate diagnoses, evaluate prognoses, or compare treatment modalities.
The best-validated olfactory tests include the UPSIT (University of Pennsylvania Smell Identification Test), [4] the CCCRC test (Connecticut Chemosensory Clinical Research Center), [5] and the "Sniffin' Sticks" (SnSt). [6] The latter is a European-designed test, while the first two were developed in North America.
The SnSt has been validated in various countries and populations, not only in Europe (e.g. Germany and northern European countries [7], Italy [8], Greece [9], and Holland [10]) but also outside Europe, such as in Australia [11], Sri Lanka [12], Brazil [13] and Taiwan [14]. This work aimed to adapt and validate a reliable Portuguese version of the Sniffin' Sticks (SnSt) test that would be suitable for both clinical and laboratory settings. Such a test would permit health professionals, industry officials, and others to evaluate olfactory function in Portugal, compare it with the standards of other cultures, and diagnose, treat and study multiple related pathologies and conditions.

Ethics statement
All participants provided informed written consent. The study followed the Declaration of Helsinki 2013 on Biomedical Research Involving Human Participants and was approved by the Ethics Committee of the Faculty of Medicine, University of Coimbra, Portugal.

Smell testing
For orthonasal olfactory testing, the SnSt were used (Burghart GmbH, Wedel, Germany). [6,7] This test consists of 3 subtests: odor threshold (T), odor discrimination (D), and odor identification (I). The results of the 3 subtests are typically summed and presented as a composite TDI score. Normative data are available from multicenter European examinations. [7,16]
Odor threshold testing identifies the lowest concentration of an odorant (phenyl ethyl alcohol) that a participant can perceive. It is determined by presenting 16 geometrically increasing dilutions of the odorant in a single-staircase design within a 3-alternative forced-choice procedure. Three pens are presented at a time, starting with the lowest odorant concentration: two pens contain only the solvent and the third contains the odorant at a certain dilution. The participant's task is to identify the odor-containing pen. Each change from weaker to stronger or from stronger to weaker is considered a "reversal". The threshold is defined as the mean of the last four staircase reversal points. Participants' scores range between 1 (the highest concentration is not perceived) and 16 points (the lowest concentration is perceived). [6,7]
The SnSt odor discrimination test assesses the ability to distinguish one odorant from another using a 3-alternative forced-choice technique (16 triplets). In this task, no naming or formal identification of the odorant is necessary.
The SnSt odor identification test involves a multiple forced-choice identification of 16 odors, each from a list of 4 descriptors. It is known to have a strong cultural connotation and needs to be adapted to avoid a situation in which the local population is unfamiliar with the odors presented or with the descriptors used for the multiple-choice task.
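The scoring rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the manufacturer's scoring procedure: the function names and the example staircase are hypothetical, and the code implements only what the text states (a reversal is any change of direction, the threshold is the mean of the last four reversal points, and the TDI score is the sum of the three subtest scores).

```python
from statistics import mean


def staircase_reversals(levels):
    """Return the dilution levels at which the staircase changed direction."""
    reversals = []
    direction = 0  # +1 = moving up, -1 = moving down, 0 = not yet known
    for previous, current in zip(levels, levels[1:]):
        step = (current > previous) - (current < previous)
        if step == 0:
            continue  # same level presented again, no change of direction
        if direction != 0 and step != direction:
            reversals.append(previous)  # the turning point counts as a reversal
        direction = step
    return reversals


def threshold_score(reversals):
    """Threshold (T) = mean of the last four staircase reversal points."""
    if len(reversals) < 4:
        raise ValueError("at least four reversal points are required")
    return mean(reversals[-4:])


def tdi_score(threshold, discrimination, identification):
    """Composite score = sum of the three subtest scores."""
    return threshold + discrimination + identification


# Hypothetical sequence of presented dilution steps
# (1 = highest concentration, 16 = lowest concentration).
levels = [10, 8, 6, 7, 8, 7, 6, 7, 8, 7]
t = threshold_score(staircase_reversals(levels))
print(t, tdi_score(t, 12, 13))
```

The discrimination and identification scores passed to `tdi_score` (12 and 13) are invented values used only to show the summation.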

Our protocol design included 4 experiments:
Experiment 1 - cultural adaptation of the identification subtest. To determine how familiar the odors are to the Portuguese population, the SnSt identification test was translated into Portuguese by a native English speaker and by a bilingual native Portuguese speaker. The translations yielded several different Portuguese descriptors for the same original descriptor. Instead of performing a classical two-investigator reconciliation and back translation, [17] all candidate descriptors were included in a familiarity survey. Participants (n = 67) were asked to rate odor descriptors according to how familiar each odor seemed to them, using a Likert-type scale ranging from 0 to 5 (0 = unknown, 5 = highly familiar). Averaged results were converted to a percentage scale and are presented in Table 1.
The successful identification of individual odorants from a list of four descriptors should be >75% in healthy participants. [6] Accordingly, the original answer sheet was modified.
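The conversion of averaged Likert ratings to a percentage scale can be illustrated as follows. The text does not give the formula, so this sketch assumes a simple linear rescaling in which the scale maximum of 5 corresponds to 100%; the function name and the example ratings are hypothetical.

```python
def familiarity_percent(ratings, scale_max=5):
    """Mean rating on a 0..scale_max Likert-type scale, expressed as a percentage.

    Assumes a linear rescaling (scale_max corresponds to 100%), which the
    source text does not state explicitly.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > scale_max for r in ratings):
        raise ValueError("ratings must lie between 0 and scale_max")
    return 100.0 * sum(ratings) / (len(ratings) * scale_max)


# Hypothetical ratings of one descriptor by four participants.
print(familiarity_percent([5, 4, 3, 0]))
```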
Combined verbal and nonverbal information was provided for all odorants and distractors.
Experiment 3 - validity: differentiating normal olfaction from anosmia. A third experiment included a group of 69 patients previously reported as having olfactory loss (40.7 ± 20.6 years, range 20-81). This group was tested with the SnSt-pt to examine whether the test could discriminate between healthy controls and people reporting olfactory loss. It included people with advanced Parkinson's disease, nasal polyps and severe septal deviations.
Experiment 4 - test-retest reliability. A final experiment re-evaluated 71 healthy participants after a 1-month interval to examine the test-retest reliability of the SnSt-pt.
continuous variables and percentages for categorical data. Data were examined for normality with the Kolmogorov-Smirnov test. SnSt-pt scores were compared using independent-sample t tests and one-way analysis of variance (ANOVA) with post hoc Bonferroni tests. Correlational analyses were performed using Pearson's chi-squared test. To assess the factors that independently influence the SnSt-pt, multiple linear regression analysis was performed using the TDI and T scores as the dependent variables and age, gender and current smoking status as covariates. Multiple logistic regression analysis was performed to assess the usefulness of the SnSt-pt TDI and T scores in differentiating patients from controls. Test-retest reliability was evaluated by means of the concordance correlation coefficient in 71 randomly selected healthy participants who were re-assessed with the SnSt-pt about 1 month after the first evaluation. Cronbach's alpha, Pearson's correlation statistic and the intra-class correlation coefficient (ICC) were calculated. Bland-Altman plots showed the agreement between test and retest measurements. The level of significance was set at 0.05.
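Two of the test-retest statistics named above, Pearson's correlation and the Bland-Altman agreement limits, can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the authors' analysis code: the limits of agreement are taken as the mean difference ± 1.96 standard deviations (the conventional definition, which the text does not spell out), and the scores are invented.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between paired measurements."""
    mean_x, mean_y = mean(x), mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


def bland_altman(test, retest):
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean retest-minus-test difference; the limits are assumed
    to be bias +/- 1.96 sample standard deviations of the differences.
    """
    differences = [b - a for a, b in zip(test, retest)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical TDI scores of four participants at test and retest.
test_scores = [30.0, 32.0, 28.0, 35.0]
retest_scores = [31.0, 33.0, 30.0, 36.0]
print(pearson_r(test_scores, retest_scores))
print(bland_altman(test_scores, retest_scores))
```

A positive bias in this sketch means that retest scores are, on average, higher than initial scores.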

Results
The odor identification test required both translation and the replacement of distractors unfamiliar to the Portuguese population. [6] Familiarity ratings were converted to a percentage scale and are displayed in Table 1.
The original answer sheet was modified to include more familiar descriptors. In particular, seven names of odors and descriptors were replaced according to the familiarity survey (Table 2).
After replacing the names of odors and descriptors with low familiarity indexes as described in Table 2, an overall improvement in familiarity of 24±9.5% was achieved.
The SnSt-pt test was administered to 203 healthy participants to define normative values and the validity of the test in the Portuguese population (Table 3).
After confirming that the SnSt-pt TDI score was normally distributed, a multiple regression analysis was run to predict the TDI score from age, gender and smoking status. Age (r = -0.271, p<0.001), gender (r = -0.399, p<0.001) and smoking status (r = -0.439, p<0.001) were significant independent predictors of the TDI score (F(3,199) = 30.19, p<0.001), and the adjusted R² was 0.30, indicating that these variables explain 30% of the variation in the TDI score. Data are presented in Fig 1 and Table 4.
A group of 69 patients with previously known olfactory loss was tested to examine whether the test can discriminate between healthy controls and patients with an olfactory disorder (Fig 2).
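As a consistency check, the reported F statistic and adjusted R² can be related to each other. The sketch below assumes an ordinary least-squares model with an intercept, k = 3 predictors and n = 203 participants (so that the residual degrees of freedom are n - k - 1 = 199); the symbols k and n are introduced here for illustration.

```latex
R^2 = \frac{kF}{kF + (n - k - 1)}
    = \frac{3 \times 30.19}{3 \times 30.19 + 199} \approx 0.313,
\qquad
R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
    = 1 - 0.687 \times \frac{202}{199} \approx 0.30
```

Under these assumptions the two reported values agree with each other.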

Discussion
To our knowledge, this work presents the first olfactory test validated in the Portuguese population. Normative data for age and gender are presented. The test discriminated healthy participants from persons with impaired olfactory ability and showed good reliability. About 10 million people now have access to an olfactory test adapted to their own habits and culture, which is important not only for research but also for clinical purposes. The ability to accurately and reliably assess olfactory function has very important clinical, safety and medico-legal implications.
Olfactory tests are known to have strong cultural affinities. [18,19] We adapted the SnSt instead of the UPSIT or the CCCRC test because of its European cultural background. Additionally, the SnSt permits the study of the three components of olfaction (threshold, discrimination and identification), rather than only identification, as does the UPSIT. [6] We have attempted to rule out cultural bias, which some have suggested to be one of the contributors to the lower identification rates achieved for certain odors in the original test. Nevertheless, a contextual and linguistic effect remains, although it is reduced by the nonverbal descriptor cues and the translation design we employed. Gummy candy, sauerkraut, fir, grapefruit, liquorice, turpentine and peppermint are not commonly encountered in Portugal, as demonstrated by the results of the familiarity survey we performed. Therefore, the names of each of these substances were replaced by more common Portuguese designations, as shown in Table 2. Importantly, these modifications do not require changes in how the test is manufactured.
We chose to use both verbal and nonverbal ways of identifying odors because of the widely different backgrounds and educational status of the subjects in both our clinics and laboratory. Several papers reported no difference when only word alternatives were used. [6,20] Yet we chose to include nonverbal approaches as well, because we sought to create a test that would be useful for the elderly, for people with dementia and psychiatric problems, and for the deaf-blind. Moreover, including a nonverbal approach helped reduce total testing time and made taking the test more interesting for participants.
In our study, the SnSt-pt test was administered 343 times as part of our efforts to develop normative values in relation to different age and gender groups. The 10th percentile of the 18-35-year-old group was used as the level at which normosmia could be distinguished from hyposmia (TDI = 31.75; T = 7; D = 8 and I = 13). [7] The original SnSt normative data are applicable mainly to the populations of German-speaking countries. The original data established a TDI of 30.3 as the separation point distinguishing normosmic from hyposmic people. [7] The data from the current study suggest that Portuguese SnSt-pt scores are directly comparable to scores obtained in other countries, albeit a little higher. [6,8,9,14,21-25] This threshold is similar to what has been described for the Greek population. [9] Such findings may be partly explained by the Mediterranean weather, as a warm climate may favor higher thresholds.
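The percentile-based cut-off described above can be sketched as follows. This is a hedged illustration: the text does not say which percentile estimator was used, so linear interpolation between order statistics is assumed, and treating a score exactly equal to the cut-off as normosmic is also an assumption. The function names and sample scores are hypothetical; only the default cut-off of 31.75 comes from the text.

```python
def percentile(values, q):
    """q-th percentile (0..100) using linear interpolation between order statistics."""
    if not values:
        raise ValueError("at least one value is required")
    data = sorted(values)
    position = (len(data) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    fraction = position - lower
    return data[lower] + (data[upper] - data[lower]) * fraction


def is_normosmic(tdi, cutoff=31.75):
    """True if the TDI score reaches the cut-off (assumed inclusive)."""
    return tdi >= cutoff


# Hypothetical TDI scores of a young reference group.
reference_scores = list(range(30, 41))
print(percentile(reference_scores, 10))
print(is_normosmic(33.5), is_normosmic(30.0))
```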
Although a complete olfactory workup, including a TDI score, is important for research purposes and for some clinical cases, it is typically too time-consuming for use in a busy clinical setting. Therefore, we have also presented data on the identification score alone. Focusing on this score permits providers to evaluate olfactory capacity in less time than the full SnSt battery requires. The identification test was found to be a clinically suitable screening tool for the Portuguese population. In fact, the current study indicates that the odor identification test is a reliable screening tool with good test-retest reliability (r = 0.62, p<0.001) and an area under the ROC curve of 0.84. These results suggest this approach has value as a way of differentiating between "normosmic" and "hyposmic" people.
The threshold test is usually considered the most sensitive examination of overall olfactory function [26,27] and seems to accurately reflect peripheral olfactory function. [27] Nevertheless, the threshold test is characterized by considerable within- and across-participant variability in relation to age, gender and smoking status, [7,28] as was confirmed in our test-retest study.
In relation to the evaluation of a person's olfactory capabilities, older age, male gender and active smoking status are each well known to diminish olfactory capabilities. [3,16] Nevertheless, these factors explain no more than about 30% of the variation in TDI score in this study. Accordingly, it seems reasonable to conclude that the TDI score is an independent predictor of hyposmia.
SnSt-pt scores were significantly lower in hyposmic patients than in healthy controls matched for age and sex (p<0.001). ROC analyses indicated that the SnSt-pt distinguishes patients from controls with high sensitivity and specificity, supporting the role of the SnSt-pt in detecting hyposmia. In a clinical setting, TDI and T scores may be used to identify a hyposmic person.
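The quantities used in such a ROC analysis can be illustrated with a short sketch. It is not the authors' analysis: the area under the ROC curve is computed here through its equivalence with the probability that a randomly chosen control scores higher than a randomly chosen patient (ties counted as one half), and all scores and the function names are hypothetical. The cut-off of 31.75 is the TDI value given earlier in the Discussion.

```python
def roc_auc(control_scores, patient_scores):
    """AUC as the fraction of control-patient pairs in which the control scores higher.

    Ties contribute one half, which makes this equal to the area under the
    empirical ROC curve.
    """
    wins = 0.0
    for control in control_scores:
        for patient in patient_scores:
            if control > patient:
                wins += 1.0
            elif control == patient:
                wins += 0.5
    return wins / (len(control_scores) * len(patient_scores))


def sensitivity_specificity(control_scores, patient_scores, cutoff):
    """Sensitivity and specificity when scores below the cut-off are called abnormal."""
    true_positives = sum(score < cutoff for score in patient_scores)
    true_negatives = sum(score >= cutoff for score in control_scores)
    return (true_positives / len(patient_scores),
            true_negatives / len(control_scores))


# Hypothetical TDI scores.
controls = [34, 36, 30]
patients = [20, 31, 25]
print(roc_auc(controls, patients))
print(sensitivity_specificity(controls, patients, cutoff=31.75))
```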
The test-retest correlation of the SnSt-pt TDI (r = 0.86) compares well with the reported reliability of both the Sniffin' Sticks test (r = 0.72) [6] and the UPSIT (0.98). [29] Notably, the identification test correlation, r = 0.62, was lower than that observed for the German version of the Sniffin' Sticks (r = 0.73). [6,30] Few published studies have evaluated the reliability of suprathreshold tests other than odor identification tests. [31] The reliability of the discrimination subtest is r = 0.71, which is higher than that of the atypical identification subtest but lower than that of the threshold subtest. These findings are in line with those of other reports. [29,30] Familiarity with the stimuli, a learning effect and the feedback provided after the initial testing session may explain the fact that retest scores tended to be higher than initial scores. Notably, the staircase method we used is known to be associated with elevated levels of false positives. When retesting thresholds, we started the test at the values attained during the previous testing session. We followed this approach to reduce the time required for testing and to lessen the opportunity for false positives to appear in responses. [32]

Conclusion
The Portuguese cross-cultural adaptation of the SnSt test confirms the validity and reliability of the SnSt-pt in the Portuguese population. The study provides distinct and integral normative data for each of 3 age groups and for both genders. The SnSt-pt distinguishes patients from healthy controls with high sensitivity and specificity. The SnSt odor identification test is a tool suited to the routine clinical workup of patients and to a range of additional uses in healthcare and in industry.