Reference data on reaction time and aging using the Nintendo Wii Balance Board: A cross-sectional study of 354 subjects from 20 to 99 years of age

Background Falls among older adults are one of the major public health challenges facing the rapidly changing demography. The valid assessment of reaction time (RT) and other well-documented risk factors for falls is mainly restricted to specialized clinics due to the equipment needed. The Nintendo Wii Balance Board has the potential to be a multi-modal test and intervention instrument for these risk factors; however, reference data are lacking. Objective To provide RT reference data and to characterize the age-related changes in RT measured by the Nintendo Wii Balance Board. Method Healthy participants were recruited at various locations and their RT in hands and feet was tested by six assessors using the Nintendo Wii Balance Board. Reference data were analysed and presented in age-groups, while the age-related change in RT was tested and characterized with linear regression models. Results 354 participants between 20 and 99 years of age were tested. For both hands and feet, mean RT and its variation increased with age. There was a statistically significant non-linear increase in RT with age. The averaged difference between males and females was significant, with males being faster than females for both hands and feet. The averaged difference between dominant and non-dominant side was non-significant. Conclusion This study reported reference data with percentiles for a promising new method for reliably testing RT. The RT data were consistent with previously known effects of age and gender on RT.


Introduction
there is a need for valid reference data. So far, only smaller studies investigating the validity of WBB have been published. Therefore, the aims of this study are (1) to provide reference data on RT in a larger heterogeneous study-population and (2) to characterize the age-related changes in RT in both males and females.

Study-design and recruitment
To simulate common fall-risk assessment, we tested RT in conjunction with balance and strength assessments in a cross-sectional study using six different assessors. All measurements were performed during a single home visit. Participants were recruited at various locations (e.g. malls, local communities, university campus, and hospital staff) during the spring and summer of 2016 in Denmark (Aalborg and Odense) and Norway (Oslo and Ålesund). Participants were eligible for inclusion if they were at least 20 years old and reported good self-perceived health. Participants were excluded if they had clear cognitive problems (not being able to name current year or the national capital), could not stand unsupported for 30 seconds, had significant neuromuscular disease (e.g. sequelae after stroke or Parkinson's disease), or musculoskeletal disease (i.e. recent (< 6 months) orthopedic surgery or bone fracture, muscular dystrophy, or polymyositis rheumatica). Lastly, individuals with alloplastic surgery within the last two years were excluded. Participants gave oral consent and the study was approved by the ethics committee of the North Jutland Region, Denmark.

Equipment and software
The WBB is a rectangular platform with four uniaxial vertical strain gauge force transducers, one in each corner. Data were streamed from the WBB via Bluetooth to a personal computer running the FysioMeter software (Bronderslev, Denmark). The software read the transducers as four channels of 16-bit digital data samples at approximately 100 Hz. The data were filtered using a 4th order Butterworth filter with a cut-off frequency of 20 Hz.

Overall experimental procedure
Prior to any measurements, the assessors collected information on age, gender, weight, height, hand and leg dominance (i.e. "with which side do you prefer to throw/kick a ball?"), smoking status (i.e. never, current, or prior), and number of prescription drugs used daily. Furthermore, participants' physical activity level was assessed from 1 (least active) to 4 (most active) for work (if applicable) and leisure hours, as done in the Copenhagen City Heart Study [31].
Before the study was initiated, the experimental procedure was harmonized between the six assessors (AWB and FE (medical students), MTR (medical doctor), KDE and MS (nurses), and MDH (physiotherapist)) at the Department of Geriatrics, Aalborg University Hospital. To minimize systematic bias, each assessor recruited participants for every predefined age category: 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, and 80+ years. The assessment of RT was done after the assessment of postural balance, but before the measurement of muscle strength. The experimental procedures for the assessment of postural balance and muscle strength are fully described in the cited reproducibility studies [25,29,32].
participant. The WBB is 5 cm x 30 cm x 50 cm, meaning that the participant's stepping motion had to exceed the 5 cm height. In preparation for the RT tests, participants were instructed to bear equal weight on each leg (i.e. not to have one leg more ready for stepping). The computer screen visualized a virtual WBB, and was placed on a table 80-100 cm in front of the eyes of the participant. When the test started, a green visual stimulus appeared on a random side (left or right) of the virtual WBB after a random delay (1-4 seconds). Participants were instructed beforehand that the stimulus indicated which side of the board to tap as quickly as possible with the appropriate foot. When the participant tapped the board, the internal timer stopped and the time from stimulus to the registered hit was recorded in milliseconds (ms). The next attempt was initiated immediately after the previous one, for a total of seven attempts. For the first six, the software was designed to include three attempts for each side (left and right) in random order. The seventh and last attempt served as a dummy to prevent the participant from anticipating on which side the stimulus would appear. Importantly, the full seven-attempt session was stopped and repeated if, during one of the attempts, the participant lost focus on the screen, missed the board, or used the non-designated side to hit the board.
For the upper limb RT assessment, the participant was seated in a chair with arms resting and fists clenched 5 cm in front of the WBB. As with the lower limb RT assessment, a green visual stimulus appeared on a random side after a random delay of between 1 and 4 seconds for seven attempts. The participant stopped the internal timer by hitting the board on the designated side as fast as possible. The seven-attempt session was stopped and repeated under the same conditions as for the lower limb RT measurement.
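The structure of one seven-attempt session described above can be sketched in a few lines of Python. This is only an illustration of the protocol as described, not the FysioMeter implementation: the function names, the dictionary layout, the uniform distribution of the delay, and the averaging of the three scored attempts per side are all assumptions.

```python
import random

def make_session(rng, n_per_side=3, min_delay=1.0, max_delay=4.0):
    """Build one session: three scored attempts per side in random order,
    followed by one unscored dummy attempt (hypothetical layout)."""
    sides = ["left", "right"] * n_per_side
    rng.shuffle(sides)
    sides.append(rng.choice(["left", "right"]))  # seventh attempt: dummy
    session = []
    for i, side in enumerate(sides):
        session.append({
            "attempt": i + 1,
            "side": side,
            # Assumption: the delay is drawn uniformly from 1 to 4 seconds.
            "delay_s": rng.uniform(min_delay, max_delay),
            "scored": i < 2 * n_per_side,
        })
    return session

def mean_rt_by_side(session, rts_ms):
    """Average the scored reaction times (ms) separately for each side."""
    by_side = {"left": [], "right": []}
    for trial, rt in zip(session, rts_ms):
        if trial["scored"]:
            by_side[trial["side"]].append(rt)
    return {side: sum(v) / len(v) for side, v in by_side.items()}

session = make_session(random.Random(1))
```

Mapping the left and right means to dominant and non-dominant side would then rely on the self-reported hand and leg dominance.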

Statistical analysis
All statistical analyses were performed using SPSS (version 24). All results are given as mean ± standard deviation (SD) for normally distributed data (Shapiro-Wilk) or median with interquartile range for non-normal distributions. From the FysioMeter software, four variables of mean RT were extracted for each participant: RT hands for dominant and non-dominant side (RTH-D and RTH-ND, respectively), and RT feet for dominant and non-dominant side (RTF-D and RTF-ND, respectively). Variables were divided by gender and age group (20-29, 30-39, 40-49, 50-59, 60-69, 70-79, and 80+) and assessed for outliers using the outlier labelling rule [33]. With the exception of extreme outliers, indicative of measurement errors, outliers were winsorized [33]. Also, for each gender and age-group, the 10th, 25th, 75th, and 90th percentiles were extracted. We used the paired-samples t-test and the independent t-test to test for significant differences in mean RT between sides (dominant and non-dominant) and between genders, respectively. Non-normal data were tested with the corresponding non-parametric tests, i.e. Wilcoxon Signed-Rank test and Mann-Whitney U test.
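As an illustration of the outlier handling, the following Python sketch applies an outlier labelling rule and then winsorizes the flagged values. The multiplier g = 2.2, the quartile method, and the choice to pull outliers in to the most extreme non-outlying observation are assumptions made for the example; the text above does not specify them, and the RT values are invented.

```python
import statistics

def labelling_fences(values, g=2.2):
    """Fences of the outlier labelling rule: Q1 - g*(Q3-Q1) and Q3 + g*(Q3-Q1).
    The multiplier g = 2.2 is an assumption, not taken from the study."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = g * (q3 - q1)
    return q1 - spread, q3 + spread

def winsorize(values, g=2.2):
    """Replace values outside the fences with the nearest value inside them."""
    low, high = labelling_fences(values, g)
    inside = [v for v in values if low <= v <= high]
    floor, ceiling = min(inside), max(inside)
    return [min(max(v, floor), ceiling) for v in values]

# Invented reaction times (ms) for one gender and age group.
rts = [310, 325, 330, 340, 345, 350, 360, 365, 380, 900]
cleaned = winsorize(rts)  # the 900 ms value is pulled in to 380 ms
```

Extreme outliers attributed to measurement error would be handled separately before this step; that is not shown here.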
To investigate the age-related changes in RT, four linear regression models were calculated using the mean RT for hands and feet for each gender, i.e. mean RT as dependent variable and age as independent variable. The assumptions of linearity and homoscedasticity were assessed with the standardized residuals plotted against predicted values, while the assumptions of normally distributed errors and absence of autocorrelation were assessed with the histogram of residuals and the Durbin-Watson test (accepted value between 1.5 and 2.5), respectively [34]. Violations of linearity or homoscedasticity were corrected with transformations. The presence of a non-linear relationship was statistically tested with hierarchical multiple regression using a quadratic model, i.e. adding age squared as an independent variable to our linear regression models.
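The hierarchical step and the Durbin-Watson check can be sketched as follows. The analysis itself was run in SPSS; this standard-library Python version is only a minimal illustration, the data are synthetic, and the helper names are hypothetical. Age is centred before squaring to keep the normal equations well conditioned, which changes neither R² nor the F change. Using raw rather than transformed RT here is also an assumption.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(predictors, y):
    """Ordinary least squares with intercept.
    Returns (coefficients, residuals, R squared)."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum(e * e for e in resid)
    return beta, resid, 1.0 - ss_res / ss_tot

def f_change(r2_reduced, r2_full, n, k_full, k_added=1):
    """F statistic for the R squared gained by the added predictors."""
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / (n - k_full - 1))

def durbin_watson(resid):
    """Sum of squared successive residual differences over sum of squares."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)

# Synthetic data: RT (ms) rising faster at higher ages, plus small fixed noise.
ages = list(range(20, 100, 5))
rts = [300 + 0.02 * (a - 20) ** 2 + ((i * 7) % 5 - 2) * 3 for i, a in enumerate(ages)]
mean_age = sum(ages) / len(ages)
age_c = [a - mean_age for a in ages]
age_c2 = [a * a for a in age_c]

_, resid_lin, r2_lin = fit([age_c], rts)             # step 1: age only
_, resid_quad, r2_quad = fit([age_c, age_c2], rts)   # step 2: add age squared
f = f_change(r2_lin, r2_quad, n=len(rts), k_full=2)
dw = durbin_watson(resid_quad)
```

A large F change relative to the F distribution with (1, n - 3) degrees of freedom indicates that the quadratic term explains variation the linear term does not.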

Results
A total of 354 participants, age 20-99 years, were recruited and tested. Participant characteristics and reaction time results (in percentiles) are given in Tables 1 and 2, respectively. There were three and five extreme outliers for RT hands and feet, respectively. As expected, all four variables of reaction time data were positively skewed. Histograms for each of the four variables are available as supplementary material online. In general, the mean or median age for each gender was comparable in every age-group and our study population reported a fairly high level of physical activity during leisure time. Additionally, in the age-groups 60-69, 70-79 and 80+, the proportion reporting never smoking was about 20 percentage points higher among females than among males. For all four variables of RT, there was an increase in RT and variation with higher age group (Table 2). Using the Mann-Whitney U test, the difference between males and females was statistically significant for both hands (Z = -2.71, p = 0.007) and feet (Z = -2.08, p = 0.037), with males being faster than females. The approximated difference, assessed with the independent t-test, was -50 milliseconds (95% CI: -84;-16) and -55 milliseconds (95% CI: -101;-9) for hands and feet, respectively. Turning to the difference between dominant and non-dominant side, the Wilcoxon Signed-Rank test was non-significant for hands (Z = -1.35, p = 0.178) and feet (Z = -1.07, p = 0.283). Results from the linear regression after transformation are presented in Table 3 using the reciprocal RT variables.
Without transformation, there was evidence of non-linearity, especially for the RT feet variables, as well as some indications of heteroscedasticity. As indicated by the negative regression coefficients for reciprocal RT, there was a statistically significant positive relationship between RT and age. On average, age could account for 47% and 59% of the variation in reciprocal RT for hands and feet, respectively. As indicated by the F-values, predictions of reciprocal RT based on age were superior for RT feet compared to RT hands. The hierarchical multiple regression confirmed a statistically significant non-linear relationship between RT and age. For RT hands, the quadratic model gave a significant R² change of 3.2% (F change = 9.1, p < 0.001) and 5.6% (F change = 21.1, p < 0.001) for males and females, respectively. For RT feet, the R² change was 3.6% (F change = 13.0, p < 0.001) and 9.4% (F change = 44.3, p < 0.001) for males and females, respectively. All beta coefficients for age squared were positive, meaning that the rate of increase in RT grows with age. The quadratic models together with the raw data are presented graphically as the curved line in Figs 1 and 2.
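To see why a negative coefficient on the reciprocal scale corresponds to RT increasing with age, consider a short derivation. It assumes the transformed model has the simple form below; the symbols b_0 and b_1 are notation introduced here for illustration and are not taken from Table 3.

```latex
\frac{1}{\mathrm{RT}} = b_0 + b_1\,\mathrm{age}
\quad\Longrightarrow\quad
\mathrm{RT} = \frac{1}{b_0 + b_1\,\mathrm{age}},
\qquad
\frac{d\,\mathrm{RT}}{d\,\mathrm{age}}
  = \frac{-b_1}{\left(b_0 + b_1\,\mathrm{age}\right)^{2}} > 0
  \quad \text{when } b_1 < 0,
\qquad
\frac{d^{2}\,\mathrm{RT}}{d\,\mathrm{age}^{2}}
  = \frac{2\,b_1^{2}}{\left(b_0 + b_1\,\mathrm{age}\right)^{3}} > 0
  \quad \text{when } b_0 + b_1\,\mathrm{age} > 0 .
```

Under this assumed form, a negative b_1 implies that RT rises with age, and the positive second derivative implies that it rises faster at higher ages, in line with the positive age-squared coefficients of the quadratic models.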

Discussion
This study reported reference data on RT and investigated its relationship with age using a standard WBB. The main findings for both feet and hands were: (1) increased mean RT and variation with increasing age, (2) males had faster RT compared to females, and (3) a significant increase in the rate of increase in RT with age. It is well established that RT and its variability increase with age [18,35-37]. Less is known about sex differences. In our study, we found a consistently lower RT for males compared to females in every age group. The difference between males and females was statistically significant for both hands and feet. In the United Kingdom Health and Lifestyle Survey (HALS), using RT data from over 7,000 participants, consistent sex differences for simple and four-choice RT were also reported [37]. The method used in our study was a two-choice RT, meaning that the participant not only had to respond to a particular stimulus (simple RT), but also had to respond in one of two ways depending on the stimulus. This required additional information processing in the central nervous system compared to simple RT. The sex differences for mean choice RT are reported to be the weakest and most variable. Being more extensively studied than simple RT, it could explain why studies with smaller samples and incomplete age coverage could yield ambiguous results [37]. However, in general, the present method and study protocol have identified these differences, and further support the results from larger and more robust studies [38].
The HALS study also reported that mean RT increases more rapidly at older ages for females compared to males. Judging by the greater curvature of the female quadratic models, this phenomenon was also apparent in our study. A possible cause for this accelerated increase in RT, albeit not the only one, is that older men represent a healthier subset compared to women due to sex differences in survival. However, larger cohort studies have found only negligible associations between RT measures and medical or lifestyle factors [18], and no clear trend could be drawn from our anthropometric data either (Table 1). It should be noted, though, that we used a rather crude measure for physical activity (four groups) which placed most of our individuals in group 2 or 3. Also, the age difference in activity level was masked by the fact that the older participants reported their total amount of physical activity as leisure time while young participants reported leisure time and working hours separately. Still, in contrast to lifestyle factors, biological factors such as vision, grip strength, and forced expiratory volume at one second have been shown to be more important for explaining age differences in RT performance [18].
Concerning our third main finding, the HALS study identified a similar quadratic curvilinear relationship between age and choice RT for both genders. In our data, this relationship was statistically tested and confirmed with hierarchical multiple regression. In our linear regression models, age could account for more than half of the RT variation in males and approximately half of the RT variation in females. Interestingly, the models also showed that RT feet were better predicted by age than RT hands. Furthermore, HALS reported that their mean four-choice RT slows throughout adulthood, whereas simple mean RT barely slows until around 50 years of age. It is of particular interest that our findings, using a two-choice RT test, were somewhere in between. We found that the average slowing of RT was evident after around 40 years of age for feet, whereas the slowing of RT was continuous throughout adulthood for hands. Since RT depends on the complexity of the task [39], and although both tests require response selection, the differences may in part be because a two-choice RT is simpler to perform than a four-choice RT. In short, the main findings from our cross-sectional study using WBB were consistent with larger studies on RT and age [38].
Since the RT test used in our study has been shown to have good reproducibility with little or no learning across test sessions [24], the reference data and percentiles provided in this study can be used to identify subjects with increased RT. This may have implications for further research and clinical work. As noted previously, RT is one of the documented risk factors for falling [8-10]. The RT test is essentially a speed test of the cognitive-motor interaction, which is a major component of the motor skills required for successful balance recovery [40]. Additionally, the WBB allows for reliable testing of balance [41] and strength [28,29], both of which are important risk factors for falling [3,5]. The reliable assessment of these risk factors has generally been restricted to specialized fall clinics due to the significant costs of the stationary laboratory equipment needed. However, WBB offers a low-cost, highly mobile, and objective global test for these different components of the cognitive-motor interaction. In this regard, it is interesting to note that a recent systematic review and meta-analysis found reactive and volitional stepping interventions to reduce falls among older adults by approximately 50% [42]. These encouraging results are significantly better than those observed with general exercise interventions [43,44]. The authors attribute this effect to the significant improvements in simple and choice RT, gait, and balance found with stepping interventions [42]. In short, the WBB offers a reliable assessment of the important components of the cognitive-motor interaction needed to prevent falls, and offers a novel opportunity to implement these tests in community-dwelling adults. However, more research is needed to confirm the relevance of these specific tests in fall risk assessments.
There are several strengths with the current study. Firstly, the data and method appear externally valid when compared to other larger studies on RT. Secondly, we used and reported relevant statistics, such as percentiles for further studies, and statistically tested for a significant non-linear relationship between age and RT. We also reported relevant anthropometric data for comparisons with other studies. Thirdly, the RT data were collected in the subjects' homes as part of a battery of three clinically relevant tests, which is similar to a real-life scenario of home-based fall-risk assessment. Finally, the data for all age groups were collected by, and averaged across, six assessors with different educational backgrounds. However, our study also has significant limitations. Due to the non-random selection of participants we cannot rule out the risk of selection bias. From the anthropometric data, we found some evidence of a comparatively younger representation in age groups 30-39 and 70-79. However, with the exception of females 70-79 for RTH-D, all age groups followed a similar trend for all four RT variables. Furthermore, all raters recruited participants from different locations. For these reasons, selection bias may be minimal in the present study. Still, compared to epidemiological studies, our sample size was relatively small. We aimed for a healthy study-population, but did not include a standard clinical test for cognitive deficits. Thus, we can only say that the participants did not present with obvious cognitive deficits during data collection. How the data would look in a high-risk population for falling, or other relevant populations, is not known. Our percentiles are simply an approximation of the normal values, and not cut-offs for any particular risk factor. Further research is needed to investigate any predictive utility of RT using the WBB.
Also, the informative value of this study for fall-related measures is limited by the restricted number of variables presented. Lastly, we made a considerable effort to harmonize our assessors for this study, since RT assessments are sensitive to differences in experimental protocol and no studies have investigated the interrater reproducibility of our protocol. However, other researchers and assessors, even while trying to mimic our protocol, may end up with a slightly different protocol and correspondingly different RT data. More studies are needed to detect, and preferably minimize, such potential protocol bias.

Conclusion
In this study, we reported reference data with percentiles on a promising new method for reliable RT testing in a healthy population of 354 subjects. Our data using WBB showed similar age and gender effects on RT as those previously reported in trials using conventional measurements. These include increased RT mean and variation with increasing age, lower RT among males compared to females, and a significant non-linear relationship between RT and age. These data can be used to identify those with high RT, and have the potential to improve the accuracy of fall risk assessments in community-dwelling older adults. The WBB has the potential to be a multimodal test and intervention instrument for clinically relevant parameters, previously only available at specialized fall clinics.