Reliability of System Identification Techniques to Assess Standing Balance in Healthy Elderly

Objectives
System identification techniques have the potential to assess the contribution of the underlying systems involved in standing balance by applying well-known disturbances. We investigated the reliability of standing balance parameters obtained with multivariate closed loop system identification techniques.

Methods
Twelve healthy elderly performed balance tests twice a day on three days. Body sway was measured during two minutes of standing with eyes closed. The Balance test Room (BalRoom) was used to apply four disturbances simultaneously: two sensory disturbances, to the proprioceptive and the visual system, and two mechanical disturbances, applied at the leg and trunk segment. Using system identification techniques, sensitivity functions of the sensory disturbances and the neuromuscular controller were estimated. Based on the generalizability theory (G theory), systematic errors and sources of variability were assessed using linear mixed models, and reliability was assessed by computing indexes of dependability (ID), standard error of measurement (SEM) and minimal detectable change (MDC).

Results
A systematic error was found between the first and second trial in the sensitivity functions. No systematic error was found in the neuromuscular controller and body sway. The reliability of 15 of 25 parameters and of body sway was moderate to excellent when the results of two trials on three days were averaged. To reach an excellent reliability on one day in 7 out of 25 parameters, it was predicted that at least seven trials must be averaged.

Conclusion
This study shows that system identification techniques are a promising method to assess the underlying systems involved in standing balance in the elderly. However, most of the parameters do not appear to be reliable unless a large number of trials are collected across multiple days. To reach an excellent reliability in one third of the parameters, a training session for participants is needed and at least seven trials of two minutes must be performed on one day.


Introduction
Impaired standing balance is a significant problem in the elderly [1;2] and is one of the main risk factors and causes of falling [3;4]. Falls often result in serious injuries and can even be fatal [5]. In standing balance, several underlying systems (i.e. muscles, neural system and sensory systems) interact, which results in a closed loop system in which cause and effect are interrelated [6]. The underlying systems deteriorate with age and are influenced by diseases and medication use [7][8][9][10]. Due to redundancy, these systems can compensate for each other's deterioration. Therefore, the underlying cause of impaired standing balance is difficult to detect and, hence, difficult to treat with targeted therapies [11].
Current clinical balance tests, such as posturography, do not take the aforementioned cause and effect relations and redundancy of standing balance into account and therefore cannot detect the underlying cause of impaired standing balance [11]. Previous research showed that system identification techniques, in which the response to well-known disturbances is assessed, are useful to assess the underlying systems of standing balance [12][13][14][15][16]. A clear advantage is that this method takes into account the cause and effect relation and separates the contribution of the underlying systems. This gives the opportunity to improve diagnosis of impaired balance and, eventually, to prevent falling by targeted therapies [6]. Before introducing the method into clinical practice for diagnosing or monitoring treatment of impaired balance, it is important to assess the reliability of this technique, which is as yet unknown, and to compare it with posturography.
In this study we investigated the reliability of standing balance parameters obtained with four disturbances applied simultaneously and system identification techniques to assess standing balance in healthy elderly and compared this with a parameter obtained with posturography, namely body sway. We used the generalizability theory (G theory) [17], which takes into account both systematic and random measurement errors. A validity study was performed to assess whether differences in standing balance parameters could be detected as expected by the results of previous studies, in which sensory reweighting was investigated by increasing disturbance amplitudes over trials using the same system identification techniques [12;15;18]. Furthermore, recommendations will be given for study designs to reduce the measurement errors and therefore improve the reliability.

Participants
Twelve healthy elderly aged 70 years or older participated in this study. Participants were recruited from the database of the Center of Human Drug Research, Leiden, the Netherlands.

Participant characteristics
Prior to participation, a screening procedure was performed. Medical history was recorded including general questions about smoking, alcohol use, medication use and information on diseases. Anthropometric data included height and body composition measured with a bioelectrical impedance analysis (BIA, InBody 720, Biospace Co., Ltd, Seoul, Korea). Cognitive function was assessed with the MMSE [19]. Orthostatic hypotension was assessed by measuring blood pressure after at least 5 minutes in supine position and 3 minutes after postural change to standing position. Handgrip strength was measured using the Jamar dynamometer handle (Jamar, Sammons Preston Inc, Bolingbrook, IL, USA). Physical functioning was measured with the Short Physical Performance Battery (SPPB) [20]. Walking speed was determined by a 4 meter walking test at normal pace, as part of the SPPB.

Apparatus
Standing balance was assessed using the Balance test Room (BalRoom), a custom-made device applying specifically designed disturbances during stance (Motekforce Link, Culemborg, the Netherlands, and University of Twente, Enschede, the Netherlands) (Fig 1). The BalRoom consists of three separate modules. The first module consists of two support surfaces (SS), which are independently actuated and rotate around the ankle axis [21]. Rotation of the SS around the ankle axis disturbs the proprioceptive information of the ankle. The second module is a visual scene (VS) in front of the participant, which also rotates around the ankle axis; this rotation disturbs the visual information. The third module consists of two rods applying forces at hip and shoulder level (FH and FS, respectively), resulting in movements around the ankle and hip joint. These disturbances are used to investigate the contribution of the ankles and hips and their coupling to standing balance [14]. Body sway was measured in a single plane using a string potentiometer (Celesco SP2-50, Celesco, Chatsworth, CA, United States), which integrates the amplitude of unidirectional body movement transferred through a string attached to the waist of the participant.

Disturbance signals
All disturbances applied with the BalRoom were multisine signals, each with a unique combination of frequencies (Fig 2). All excited frequencies were multiples of 0.0625 Hz, resulting in a disturbance period of 16 s. The SS rotated following a continuous position disturbance signal with increasing zero-to-peak amplitude over trials, i.e. 0.02, 0.03 and 0.04 radians, and a flat velocity spectrum with frequencies between 0.125 and 6.9375 Hz. The VS rotated following a continuous position disturbance signal with a constant zero-to-peak amplitude of 0.03 radians over trials and a flat velocity spectrum with frequencies between 0.0625 and 1 Hz. The FH and FS disturbances were independent continuous force disturbance signals with a constant zero-to-peak amplitude of 30 N over trials and frequency content between 0.75 and 7 Hz. All disturbances were repeated eight times, resulting in a total duration of 128 seconds.
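The construction of such a periodic multisine can be sketched as follows. This is a minimal illustration, not the published signal design: the list of excited harmonics, the random phases and the equal component amplitudes are assumptions made for the example (the signals described above use specific frequency sets and, for SS and VS, a flat velocity spectrum).

```python
import numpy as np

FS = 1000        # sample frequency in Hz (the recording rate mentioned in the text)
F0 = 0.0625      # base frequency in Hz, so one period lasts 1 / F0 = 16 s
N_PERIODS = 8    # eight repetitions give 128 s in total


def multisine(harmonics, amplitude, seed=0):
    """Return one 16 s period of a multisine with the given zero-to-peak amplitude.

    `harmonics` are integer multiples of F0. Random phases and equal component
    amplitudes are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(round(FS / F0))) / FS
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(harmonics))
    x = np.zeros_like(t)
    for k, phi in zip(harmonics, phases):
        x += np.sin(2.0 * np.pi * k * F0 * t + phi)
    # Scale so that the largest excursion equals the requested amplitude.
    return amplitude * x / np.max(np.abs(x))


# Hypothetical harmonics for a visual-scene-like signal (0.0625 Hz to 1 Hz).
vs_harmonics = [1, 3, 6, 10, 16]
one_period = multisine(vs_harmonics, amplitude=0.03)
signal = np.tile(one_period, N_PERIODS)
```

Because every component completes a whole number of cycles in 16 s, the signal repeats exactly, which is what later allows averaging over the eight data blocks.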

Procedure
During the screening visit for inclusion, up to 21 days before the start of the study, each participant had a training session to become familiar with the BalRoom and with the body sway test. No data were recorded. During the study, the tests were performed during three sessions separated by one week, allowing assessment of intersession variability. Per session the tests were performed twice, separated by one hour, allowing assessment of intrasession variability. During all tests the participant wore comfortable flat shoes. During the BalRoom test, the participant was instructed to stand with the arms resting along the body, with both feet in place on the support surfaces. The two sensory (SS and VS) and two mechanical (FH and FS) disturbances were applied simultaneously. Each test consisted of three conditions with increasing disturbance amplitude of the SS rotation (i.e. 0.02, 0.03 and 0.04 radians), while the amplitudes of the VS, FH and FS disturbances remained constant. The three conditions were presented in random order. Before recording each condition, the participant was allowed about 10 seconds to get accustomed to the disturbances. Between conditions, the participant was offered ample resting time depending on individual needs. The participant wore a safety harness to prevent falling, which neither constrained movement nor provided support or orientation information.
During the body sway test, the participant was asked to stand still and comfortably with eyes closed for a period of 2 minutes, with the feet approximately 10 cm apart and the hands in a relaxed position along the body.

Data recording and processing
The actual angles of SS rotation (i.e. motor angles), applied forces at hip and shoulder level (FH and FS forces) and the applied torques to the SS (i.e. motor torques) were available for measurement. Lower and upper body segmental movements were measured in anterior-posterior direction using two draw wire potentiometers (Celesco SP2-50, Celesco, Chatsworth, CA, United States) at a sample frequency of 1000 Hz. The potentiometers were connected to the hip and the shoulders by magnets and straps. The motor angles, segment angles, motor torques and applied FH and FS forces were recorded using a Matlab interface with a sample frequency of 1000 Hz. Data analysis was performed with Matlab (The MathWorks, Natick, MA, United States). The leg and hip angle were calculated using goniometry and using the segment movement of the lower and upper body [22]. The ankle torque was obtained by subtracting the contribution of the mass and inertia of the support surfaces from the recorded motor torques. The hip torque was obtained using the applied FH and FS forces and leg and hip angles using inverse dynamics [22]. The time series were segmented into eight data blocks of 16 seconds (i.e. the period of the disturbance signal).

Data analysis
To indicate the effect of the disturbances on the ankle torque, hip torque and joint angles, Frequency Response Functions (FRFs) were estimated. The time series of the disturbances, ankle torque, hip torque, leg and hip angle were transformed to the frequency domain. The periodic part of the frequency coefficients was determined by averaging over the data blocks. The Power Spectral Densities (PSD) and Cross Spectral Densities (CSD) were computed to calculate the FRFs [23]. For each disturbance, only the excited frequencies were analyzed.
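The estimation steps described above can be sketched in a few lines. This is a minimal single-input illustration under stated assumptions: the function name `periodic_frf` and its arguments are hypothetical, spectral scaling factors are omitted because they cancel in the ratio, and the authors' Matlab implementation may differ in detail.

```python
import numpy as np


def periodic_frf(d, x, n_blocks, excited_bins):
    """Estimate the FRF from disturbance d to response x at the excited bins.

    Both signals are cut into n_blocks equal blocks (one disturbance period
    each) and transformed to the frequency domain. Averaging the Fourier
    coefficients over blocks keeps the periodic part. The FRF is the cross
    spectral density divided by the power spectral density of the input.
    """
    d_blocks = np.reshape(d, (n_blocks, -1))
    x_blocks = np.reshape(x, (n_blocks, -1))
    D = np.fft.rfft(d_blocks, axis=1).mean(axis=0)
    X = np.fft.rfft(x_blocks, axis=1).mean(axis=0)
    csd = np.conj(D) * X       # cross spectral density (unscaled)
    psd = np.abs(D) ** 2       # power spectral density (unscaled)
    return csd[excited_bins] / psd[excited_bins]
```

With a 16 s block, bin k of the transform corresponds to k × 0.0625 Hz, so `excited_bins` is simply the list of excited harmonic numbers. The magnitude and phase follow from `np.abs` and `np.angle` of the returned complex values.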
Sensitivity functions. The sensitivity function represents the sensitivity of the body reactions (i.e. joint angles and joint torques) to sensory perturbations. FRFs representing the sensitivity functions of the SS rotation and the VS rotation to the ankle torque, hip torque, leg angle and hip angle were estimated with the indirect approach according to Eq 1 [12;23].
In Eq 1, $\Phi_{d,x}$ represents the CSD of the disturbance $d$ (i.e. SS rotation or VS rotation) and $x$, which represents the ankle torque ($T_a$), hip torque ($T_h$), leg angle ($\theta_l$) or hip angle ($\theta_h$), and $\Phi_{d,d}$ represents the PSD of the disturbance. This results in 8 FRFs: 1) SS rotation to ankle torque (${}^{SS}S_{Ta}$), 2) SS rotation to hip torque (${}^{SS}S_{Th}$), 3) SS rotation to leg angle (${}^{SS}S_{\theta l}$), 4) SS rotation to hip angle (${}^{SS}S_{\theta h}$), and 5) to 8) the VS rotation to each torque and angle (${}^{VS}S_{Ta}$, ${}^{VS}S_{Th}$, ${}^{VS}S_{\theta l}$, ${}^{VS}S_{\theta h}$). Each FRF is represented by a magnitude and a phase, representing the ratio between the input and output and their relative timing, both as a function of frequency. The magnitude of the sensitivity function of the ankle and hip torque is normalized to the gravitational stiffness ($mgl_{CoM}$). The parameters of interest are the average magnitude at the low frequencies (<0.375 Hz and <0.1875 Hz, for SS and VS respectively) and the phase at a higher frequency (0.68 Hz and 0.375 Hz, for SS and VS respectively). Different frequencies were used for SS and VS because of differences in frequency content. These parameters represent the sensitivity to the disturbances and the phase lag between the disturbance and the reaction of the body, respectively, resulting in 16 parameters.
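Eq 1 itself is not reproduced in the text above. Based on the definitions given (a CSD between disturbance and response, and the PSD of the disturbance), it presumably takes the following form; the notation is a reconstruction and should be checked against the published equation:

$$
{}^{d}S_{x}(f) = \frac{\Phi_{d,x}(f)}{\Phi_{d,d}(f)}, \qquad d \in \{SS, VS\}, \quad x \in \{T_a, T_h, \theta_l, \theta_h\}
$$

The magnitude $|{}^{d}S_{x}(f)|$ and the phase $\angle\, {}^{d}S_{x}(f)$ are evaluated only at the frequencies excited by disturbance $d$.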
Neuromuscular controller. The neuromuscular controller is the link between the sensory systems and the muscles, where the sensory information is combined and muscle commands are generated to keep the body in upright position. The FRFs representing the ankle and hip controller and their coupling were estimated using the multi-input-multi-output (MIMO) approach according to a previously described method and Eq 2 [14].
In Eq 2, $\Phi_{d,T}$ and $\Phi_{d,\theta}$ are the CSD matrices between the external disturbances $d$ (i.e. FH and FS) and the corrective ankle and hip torques ($T$) and the leg and hip angles ($\theta$), resulting in a two-by-two FRF matrix ($H_c$). This results in 4 FRFs: 1) leg angle to ankle torque ($H_{\theta l 2 Ta}$), 2) leg angle to hip torque ($H_{\theta l 2 Th}$), 3) hip angle to hip torque ($H_{\theta h 2 Th}$), and 4) hip angle to ankle torque ($H_{\theta h 2 Ta}$). The magnitude is normalized to the gravitational stiffness ($mgl_{CoM}$). The parameters of interest are the average magnitude at the low frequencies (<1 Hz) and the phase at a higher frequency (2.3 Hz), which represent the normalized effective stiffness and the phase lag between the torques and angles, resulting in 8 parameters [14].
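Eq 2 is likewise not reproduced in the text above. A common form of the closed loop MIMO estimator that is consistent with the definitions given is sketched below. This is an assumption, not the published equation: in particular, the sign and the index order depend on the convention used for the corrective torques and for the CSD matrices.

$$
H_c(f) = \Phi_{d,T}(f)\, \Phi_{d,\theta}(f)^{-1},
\qquad
H_c = \begin{bmatrix} H_{\theta l 2 Ta} & H_{\theta h 2 Ta} \\ H_{\theta l 2 Th} & H_{\theta h 2 Th} \end{bmatrix}
$$

Because two independent disturbances (FH and FS) are applied, $\Phi_{d,\theta}(f)$ is a two-by-two matrix that can be inverted at each excited frequency, which is what makes the ankle and hip contributions and their coupling separable.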
Body sway. The body sway ($x_{BS}$) was measured over 2 minutes during quiet stance with eyes closed. The movement of the body was expressed as millimeters of sway during 2 minutes.

Statistical analysis
The characteristics of the participants were represented by mean and standard deviation in case of a Gaussian distribution. Otherwise, median and interquartile range or number and percentage were presented. The parameters obtained with system identification techniques (i.e. sensitivity and phase lag of the sensitivity functions, and normalized effective stiffness and phase lag of the neuromuscular controller) and body sway are given as mean and standard deviation.
Reliability of each parameter was assessed using the G theory in three steps [17]. First, systematic errors were identified using linear mixed models with trial (intrasession), day (intersession) and their interaction as fixed effects and participant intercept as random effect. Because of the number of dependent variables tested, a Bonferroni correction was applied to avoid type I errors. P values below 0.006 were considered statistically significant. The various sources of measurement error of each parameter were assessed using a random effects repeated measures analysis of variance (ANOVA) including participant, trial, day and their interactions. This resulted in the variance of the participants ($\sigma_p^2$), the variance of the trials ($\sigma_t^2$), the variance of the days ($\sigma_d^2$), the variance of their interactions ($\sigma_{pt}^2$, $\sigma_{pd}^2$ and $\sigma_{td}^2$) and the variance of the residual ($\sigma_{ptd,e}^2$). All were presented as percentages of the total variance. Negative variance components were set to zero. The actual sources of variance were used to calculate the index of dependability (ID), the standard error of measurement (SEM) and the minimal detectable change (MDC) using Eq 3 [17;24].
In Eq 3, $n_t$ is the number of trials and $n_d$ the number of days. Comparable with an intraclass correlation coefficient (ICC), the ID ranges between 0 and 1 and can be interpreted as follows: ID < 0.40 poor reliability, 0.40 < ID < 0.75 moderate reliability, and ID > 0.75 excellent reliability [25]. In this case, the ID represents the reliability for two trials on three days. The SEM indicates the absolute reliability and is represented by an absolute value and a percentage of the overall mean. The MDC shows which effect (e.g. treatment effect) can be detected with the parameters of interest and therefore indicates the clinical relevance. A low SEM and MDC are indicative of a reliable and clinically relevant parameter.
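Eq 3 is not reproduced in the text above. For a fully crossed participant × trial × day design with all facets random, G theory commonly defines the three measures as sketched below. This is an assumption about the form of Eq 3 (including the 95% confidence factor of 1.96 in the MDC), not a copy of the published equation:

$$
\sigma_{\Delta}^2 = \frac{\sigma_t^2}{n_t} + \frac{\sigma_d^2}{n_d} + \frac{\sigma_{pt}^2}{n_t} + \frac{\sigma_{pd}^2}{n_d} + \frac{\sigma_{td}^2}{n_t n_d} + \frac{\sigma_{ptd,e}^2}{n_t n_d}
$$

$$
\mathrm{ID} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_{\Delta}^2}, \qquad
\mathrm{SEM} = \sqrt{\sigma_{\Delta}^2}, \qquad
\mathrm{MDC} = 1.96 \cdot \sqrt{2} \cdot \mathrm{SEM}
$$

With $n_t = n_d = 1$ the ID reduces to the participant variance as a fraction of the total variance, and every error term shrinks as $n_t$ or $n_d$ grows.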
Second, a decision study was performed in which the effect of different measurement protocols on the reliability was investigated. The aforementioned equations show that increasing the number of trials or the number of days results in an increase of ID and a decrease of SEM and MDC, i.e. an improvement of reliability. In the decision study, the number of trials was varied between 1 and 40 and the number of days between 1 and 3. Per number of days, the number of trials needed to reach an excellent reliability (ID > 0.75) in this group of healthy elderly was determined.
Third, a validity study was performed to assess whether differences in the sensitivity functions, represented by the sensitivity and phase lag, due to increasing disturbance amplitude of the SS rotation could be detected. Previous studies showed an increase in sensitivity to VS rotation [12;26] and a decrease in sensitivity to SS rotation [12;18] due to increasing disturbance amplitude of the SS rotation. Furthermore, no differences in the neuromuscular controller were detected with increasing disturbance amplitude [15;18]. A linear mixed model was constructed with disturbance amplitude as fixed effect and participant intercept as random effect. To correct for multiple testing, a Bonferroni correction was applied to avoid type I errors. P values below 0.006 were considered statistically significant.
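The decision study can be illustrated with a short sketch. It assumes the standard G theory expressions for a crossed participant × trial × day design (absolute error variance, ID, SEM, and MDC at 95% confidence), which may differ from the exact form of Eq 3. The function names and the variance components in `example` are hypothetical and are not study data.

```python
import math


def reliability(var, n_t, n_d):
    """Return (ID, SEM, MDC) when averaging n_t trials on each of n_d days.

    `var` maps component names to variances: p, t, d, pt, pd, td, res.
    """
    abs_error = (
        var["t"] / n_t
        + var["d"] / n_d
        + var["pt"] / n_t
        + var["pd"] / n_d
        + (var["td"] + var["res"]) / (n_t * n_d)
    )
    index = var["p"] / (var["p"] + abs_error)
    sem = math.sqrt(abs_error)
    mdc = 1.96 * math.sqrt(2.0) * sem
    return index, sem, mdc


def trials_needed(var, n_d, target=0.75, max_trials=40):
    """Smallest number of trials per day with ID above target, else None."""
    for n_t in range(1, max_trials + 1):
        if reliability(var, n_t, n_d)[0] > target:
            return n_t
    return None


# Hypothetical variance components (percent of total variance), not study data.
example = {"p": 40.0, "t": 1.0, "d": 0.5, "pt": 0.5, "pd": 9.0, "td": 1.0, "res": 48.0}
for days in (1, 2, 3):
    print(days, trials_needed(example, days))
```

In this made-up example the participant × day component is much larger than the participant × trial component, so adding days reduces the number of trials needed far more quickly than adding trials within a day, which mirrors the pattern reported later in the paper.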
Statistical analysis was performed with SPSS version 20 (SPSS Inc., Chicago, USA) and Matlab (The MathWorks, Natick, MA, United States). Graphs were made with Matlab (The MathWorks, Natick, MA, United States).
The minimal dataset used for statistical analysis is available from the 3TU database (datacentrum.3tu.nl, DOI: 10.5072/uuid:433acf72-2779-4470-a111-d94c415125b8).

Results
Table 1 presents the characteristics of the healthy elderly participants. Figs 3, 4 and 5 display the magnitude of the FRFs of the sensitivity functions of the SS rotations and the VS rotations, and of the neuromuscular controller, respectively.
Table 2 reports the systematic errors obtained with the linear mixed models according to the G theory. No systematic errors were found for the body sway ($x_{BS}$). The sensitivity functions show both a main effect of trial and a main effect of day. Overall, the sensitivity to the SS rotation was lower during the first trial compared with the second trial (for ${}^{SS}S_{\theta l}$ and ${}^{SS}S_{Ta}$) and it was lower during the first day compared with the second and third day (for ${}^{SS}S_{Th}$ and ${}^{SS}S_{Ta}$). The sensitivity function of the VS rotation shows the opposite result: the sensitivity in the first trial was higher compared with the second trial (for ${}^{VS}S_{Th}$ and ${}^{VS}S_{Ta}$). Furthermore, the phase lag of ${}^{VS}S_{\theta l}$ was higher in the first trial compared with the second trial. The phase lags did not differ between days. The normalized effective stiffness estimated using the FS and FH disturbances showed an effect of day: one component of the neuromuscular controller ($H_{\theta h 2 Th}$) was higher during the first day compared with the second day. No effect of trial and day was found for the phase lags of all components of the neuromuscular controller.

Variance components
Table 3 shows the magnitude of the variance components as percentage of the total variance (i.e. the sources of variability) according to the G theory. The variance of the participant in the body sway was 87.3%. The other variance components in the body sway were low, varying from 0% to 7.2%.
For the parameters obtained with system identification techniques, the median of the variance of the participant ($\sigma_p^2$) was 17.8% with an interquartile range from 9.9% to 28.9%. The contribution of the trial variance ($\sigma_t^2$) was 0.9% (median) with an interquartile range from 0.0% to 10.5%. The contribution of the day variance ($\sigma_d^2$) was 0.4% (median) with an interquartile range from 0.0% to 5.7%.

The error variances related to the interactions between participant and trial ($\sigma_{pt}^2$), between participant and day ($\sigma_{pd}^2$) and between trial and day ($\sigma_{td}^2$) were low, with medians of 0.2%, 9.2% and 0.9%, respectively. The largest proportion of measurement variability was due to the participant variability ($\sigma_p^2$) and to the remaining interaction combined with the residual error ($\sigma_{ptd,e}^2$), the latter contributing 48.1% (median), ranging from 7.2% to 76.0%.
Table 4 presents the results of the reliability measures. In this study design, the ID represents the reliability for two trials on three days. The ID of the body sway was 0.97. The ID was higher than 0.75 in 4 out of 25 parameters and between 0.40 and 0.75 in 11 out of 25 parameters. The SEM and SEM% were inversely related to the ID. Furthermore, the MDC was lower with increased ID. For the body sway, one trial was needed to reach an ID of 0.75. For 28% (7/25) of the parameters, at least seven trials had to be averaged on one day to reach an ID of 0.75. Increasing the number of days resulted in fewer trials needed per day to reach an ID higher than 0.75.
Table 5 presents the results of the validity study. The mean and standard deviation of the parameters of the second trial on the first day are given for each condition. All sensitivities to the SS rotation decreased with increasing disturbance amplitude (p < 0.002). The sensitivities to the VS rotation did not significantly change with increasing disturbance amplitude (p > 0.008). No significant differences were found for the phase lag of the sensitivities to the SS rotation and VS rotation. No significant differences were found between the conditions for the parameters describing the neuromuscular controller.

Discussion
In this study, we assessed the reliability of a comprehensive set of parameters describing standing balance in a group of healthy elderly, obtained with four disturbances applied simultaneously and (MIMO closed loop) system identification techniques. Results were obtained by measuring standing balance twice a day on three days. A distinction was made between systematic and random errors. The results showed a systematic error between the first and second trial measured with the BalRoom on one day, which was not found using the body sway measurements. For 15 of the 25 parameters, the reliability ranged from moderate to excellent when averaging the two trials of three days (i.e. averaging six trials). To the best of our knowledge, this is the first study that investigated the reliability of system identification techniques to assess standing balance in healthy elderly.

Systematic errors
In general, the sensitivity to the SS rotation was lower in the first trial compared with the second trial, while the sensitivity to the VS rotation was higher in the first trial compared with the second trial. These results are confirmed by the variance components of trial ($\sigma_t^2$) and day ($\sigma_d^2$); a high variance component of trial or day indicates a systematic error. Previous studies using system identification techniques also showed a systematic error between the first and second trial or between days. These differences were explained by motor learning, changes in posture or stretching of the joints [27][28][29]. In contrast, another previous study found no learning effects, which might be due to the practice session all participants performed prior to participation in that study [30]. In our study, the differences between the first and the second trial (i.e. a lower sensitivity to proprioception and a higher sensitivity to vision during the first trial) could be explained by a difference in the strategy used to maintain standing balance or by familiarization during the test. According to the sensory reweighting hypothesis, sensory information is weighted based on its reliability; here, the weight of the proprioceptive information increased at the cost of the weight of the other sensory information [12]. As the sensitivity to the disturbances represents the contribution of the proprioceptive and visual information, the sensitivity to the SS rotation increases, while the sensitivity to the VS rotation decreases.
The combination of mechanical disturbances with sensory disturbances of the visual and proprioceptive information could have resulted in a longer adaptation time or a redundancy of applied strategies to withstand the disturbances. However, comparable systematic errors within a day were found in healthy elderly (unpublished data) in a previous study using only SS rotation to disturb proprioceptive information [18], which suggests that the longer adaptation time is not due to the combination of multiple disturbances. In contrast, no systematic errors were found in healthy young adults (unpublished data). This is an indication of increased adaptation time in elderly compared with young adults.
When a steady state of standing balance is assessed, a familiarization trial is needed on the same day to overcome the systematic error between trials. Excluding the first trial of each day resulted in fewer systematic errors between days.

Reliability
The variance component of the participant ($\sigma_p^2$) as percentage of the overall variance corresponds to the ICC when both $n_t$ and $n_d$ are equal to one. By this measure, the reliability of the parameters ranged from poor to moderate. To increase the reliability of steady state balance assessment, multiple trials on more than one day have to be performed. The ID values indicate that performing two trials on three days results in a reliability ranging from moderate to excellent, which is needed to discriminate between healthy elderly individuals. A high residual variance component ($\sigma_{ptd,e}^2$) indicates that a majority of the measurement error is random or can be attributed to error sources not identified in the study. In this study, a relatively low SEM% (<20%) was found in 12 out of 25 parameters, which is comparable with other studies using system identification techniques [27]. A low SEM% indicates that the parameter could detect changes over time within the same participant (e.g. effects of intervention or changes in conditions). However, the SEM values depend on the number of trials performed and on the number of days measured. A high SEM% indicates less accurate parameters; in 6 out of 25 parameters a high SEM% (>30%) was found. Therefore, it must be considered whether these parameters are useful in assessing standing balance. The MDC values are of the same order as in a previous study using only SS rotation in healthy elderly (unpublished data) and indicate the smallest change in the parameters that can be detected when comparing groups or within the same participant. It is difficult to interpret the MDC results of new parameters. To gain more experience with this measure and more insight into its clinical relevance, it is recommended to assess standing balance using system identification techniques in several groups of elderly with a large variance in impaired balance severity and clinical phenotypes [24].
The results showed that the ID of 4 out of 25 parameters was excellent and the ID of 11 out of 25 parameters was moderate. To reach an excellent reliability for steady state balance assessment in one third of the parameters, at least 7 trials on one day are needed. Most of the parameters do not appear to be reliable in this population, unless a very large number of trials are collected on multiple days. In this study, averaging trials across days seems to be more effective than averaging more trials per day. These results are consistent with the variance components of the interactions: the variance component of participant × day ($\sigma_{pd}^2$) is much higher than the variance component of participant × trial ($\sigma_{pt}^2$). This means that the parameters for each participant were more affected by between-day than by within-day sources of error, relative to the other participants. These results are in accordance with previous studies; Lariviere et al. (2015) showed that one to ten trials were needed to reach an excellent reliability for parameters obtained with system identification techniques [27]. A lower reliability seems to be a general feature of position stabilization tasks in contrast to tracking tasks [28].

Validity
The validity study showed that differences could be detected within participants by changing the experimental condition. It was possible to detect changes over conditions using one trial. Increasing the disturbance amplitude of the SS rotation resulted in a decreased sensitivity to the SS rotation. This result was expected according to the sensory reweighting hypothesis, as mentioned before. Our findings are therefore also in line with previous studies investigating sensory reweighting during standing balance using system identification techniques [12;15]. However, we also expected to see an increase in sensitivity to the VS rotation as compensation for the decrease in sensitivity to the SS rotation. The absence of this change might be explained by a third sensory system, i.e. the vestibular system. Less use of proprioceptive information could also be accompanied by more use of vestibular information. Whether someone increases the use of visual information or the use of vestibular information could differ per individual. No changes were found in the neuromuscular controller when increasing the disturbance amplitude of the SS rotation. This is in line with our expectations, as changes in sensory information do not influence the stiffness and damping of the neuromuscular controller. These results are also in accordance with previous studies, which showed that the neuromuscular controller did not change with increasing disturbance amplitude of the SS rotation [15;18]. Furthermore, no changes were found in the phase lag of both the sensitivity functions and the neuromuscular controller with increasing disturbance amplitude of the SS rotation. According to Peterka (2002), we expected to see a difference in the phase lag of the sensitivity functions to the SS rotation [12]. The fact that we did not find a difference could be explained by the high MDC values and high SEM% for the phase lags.

System identification techniques compared to posturography
System identification techniques are a new engineering approach to assess standing balance. In contrast with posturography, a general used technique to assess standing balance, it is possible to detect underlying systems and used strategies in standing balance [6;11]. In this study, we assessed standing balance with both system identification techniques and posturography (i.e. body sway). Compared to system identification techniques, no systematic errors and a higher reliability were found for posturography. To reach an excellent reliability in posturography only 1 trial is needed.
In comparison with our results of system identification techniques, studies investigating the reliability of the Sensory Organization Test (SOT) showed a learning effect in healthy young due to changes in postural strategies or through reweighting of sensory information. Remarkably, this learning effect was only present in more demanding test conditions [31]. Studies investigating the reliability of Center of Pressure (CoP) parameters did not find systematic errors [32;33], which is comparable with our results of the body sway, but in contrast with the system identification techniques results. This could be explained by the influence of used strategies to maintain balance on the parameters. CoP parameters only describe objectively standing balance, while system identification techniques also describe the underlying changes. Therefore, changes in strategies between trials will not be detected by CoP parameters and do not influence the reliability of CoP parameters.
The reliability of the SOT was moderate in noninstitutionalized older adults when two sessions of the test were performed one week apart. To improve the reliability of the computer-generated scores of the SOT, a modification of the scoring system was recommended [34]. The reliability of CoP parameters depends on the test condition, study design, study population and therapeutic interventions [35]. To reach an excellent reliability of CoP parameters, the trial must last at least 90 seconds, must be repeated three to five times, and must be measured with eyes closed on a firm surface [35]. Santos et al. (2007) showed that at least 7 repetitions must be performed to reach an excellent reliability for CoP parameters [36]. This is comparable with our study, in which measurements of approximately two minutes were used to assess standing balance with the BalRoom and must be repeated seven times to reach an excellent reliability. The relatively low SEM% values found (<20%) are comparable with other studies using CoP parameters [36].

Clinical recommendations
First, the results indicate that there is a systematic error between the first trial and the second trial. This could be due to changes in the strategies used to maintain standing balance and the time needed to reach a steady state. Therefore, to assess steady state balance we recommend performing one familiarization trial on each day. Second, the results showed that averaging over days is more effective than averaging within days. However, in clinical practice it is often not feasible to measure on more than one day, as it is time-consuming. Furthermore, performing multiple measurements on one day could be hampered by fatigue or boredom of the participant, which has to be taken into account. However, measuring fewer trials on one day will result in lower reliability.
It is recommended to measure at least 7 trials per day to reach an excellent reliability. However, this holds for only some of the parameters. Sixteen out of 25 parameters even require more than 40 trials on one day to reach an excellent reliability. Therefore, it is necessary to select which parameters are most important to assess standing balance and best represent the underlying changes in standing balance. Furthermore, more research is needed to determine whether changes in the measurement protocol (e.g. including a training session, duration of trials, repetitions of the perturbation signal) will improve reliability.
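The kind of prediction behind these trial numbers can be illustrated with a decision-study style projection from G theory. The sketch below is a minimal illustration and not the analysis performed in this study: it assumes a fully crossed participant x day x trial design, and the variance components, the function names and the threshold of 0.75 for "excellent" reliability are hypothetical choices made for the example.

```python
def index_of_dependability(var, n_days, n_trials):
    """Index of dependability for the mean over n_days x n_trials measurements.

    var holds variance components of an assumed crossed participant (p) x
    day (d) x trial (t) design: p, d, t, pd, pt, dt and res (residual).
    """
    # Absolute error variance: every component except the participant
    # component, divided by the number of conditions it is averaged over.
    abs_error = (
        (var["d"] + var["pd"]) / n_days
        + (var["t"] + var["pt"]) / n_trials
        + (var["dt"] + var["res"]) / (n_days * n_trials)
    )
    return var["p"] / (var["p"] + abs_error)


def trials_needed(var, n_days, target=0.75, max_trials=100):
    """Smallest number of trials per day reaching the target, or None."""
    for n_trials in range(1, max_trials + 1):
        if index_of_dependability(var, n_days, n_trials) >= target:
            return n_trials
    return None


# Hypothetical variance components for one parameter (arbitrary units).
components = {"p": 1.0, "d": 0.0, "t": 0.05, "pd": 0.2,
              "pt": 0.05, "dt": 0.0, "res": 1.0}

id_three_days = index_of_dependability(components, n_days=3, n_trials=2)
id_one_day = index_of_dependability(components, n_days=1, n_trials=6)
n_needed = trials_needed(components, n_days=1)

print(f"3 days x 2 trials: {id_three_days:.2f}")
print(f"1 day  x 6 trials: {id_one_day:.2f}")
print("trials needed on one day:", n_needed)
```

In this hypothetical case, six measurements spread over three days give a higher index of dependability than six measurements on one day, because the participant-by-day component is only reduced by averaging over days. When that component is large relative to the participant variance, the target may not be reachable on a single day at all, in which case `trials_needed` returns `None`.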
As mentioned before, systematic errors might be due to the additional time needed to reach steady state balance or to a redundancy of applied strategies. This implies that parameters obtained with system identification techniques are sensitive to adaptation strategies. Besides steady state balance, adaptation strategy and adaptation time may have clinical meaning and need further exploration. In contrast to, for example, CoP measurements, system identification techniques are sensitive tools to assess the duration of adaptation of sensory reweighting [37].

Strengths and limitations
A strength of this study is the selection of healthy old participants, resulting in a well-phenotyped group. However, this also affects ICC and ID. Low variability among the participants (i.e. a homogeneous population) results in lower ICC and ID values and therefore lower relative reliabilities [38;39]. SEM(%) and MDC are measures of absolute reliability and are important when interpreting repeated measurements of the effects of an intervention. In a less healthy population with neurological or balance disorders, the variability is likely to be much higher. This may result in higher reliabilities and therefore lower SEM(%) and MDC values, which indicates more accurate and sensitive parameters. Therefore, before using this technique in another population, it is recommended to first test reliability in that population of interest. Another strength of this study is the setup with exactly one week between sessions. A limitation of this study is the relatively low number of participants. However, a larger number of participants will result in even less variability within the population due to the homogeneity, which might affect the ID as mentioned before. A larger sample size will therefore not automatically result in better reliability. As only two trials were performed per day in this study, it was not possible to assess the number of trials needed to reach an excellent reliability when omitting the first (familiarization) trial from analysis. Therefore, we could not give recommendations on the number of trials needed to reach an excellent reliability after a training session. Furthermore, we could only predict, and not measure, the number of trials needed to reach an excellent reliability.
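The effect of a homogeneous population on relative reliability can be shown with a small numerical sketch. It assumes the index of dependability is the participant variance divided by the sum of the participant variance and the absolute error variance. The function name and the variance values are hypothetical and serve only as an illustration.

```python
def dependability_from_variances(var_participants, var_abs_error):
    """Index of dependability as the share of variance due to participants."""
    return var_participants / (var_participants + var_abs_error)


# Same hypothetical measurement error in both populations (arbitrary units).
var_abs_error = 0.04

# Homogeneous group: participants differ little from each other.
id_homogeneous = dependability_from_variances(0.04, var_abs_error)

# Heterogeneous group: larger differences between participants.
id_heterogeneous = dependability_from_variances(0.36, var_abs_error)

print(f"homogeneous: {id_homogeneous:.2f}, heterogeneous: {id_heterogeneous:.2f}")
```

With identical measurement error, the homogeneous group yields the lower index, which is why relative reliability found in healthy elderly cannot be transferred directly to a more variable patient population.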

Conclusions
This study investigated the reliability of a comprehensive set of parameters obtained with system identification techniques to assess standing balance in a population of healthy elderly. Systematic errors were present between trials, showing that parameters obtained with system identification techniques are sensitive to adaptation strategies. To assess steady state balance, a training session is recommended. As a single trial per day resulted in poor to moderate reliability, it is recommended to perform more trials on separate days. Most of the parameters do not appear reliable unless a very large number of trials are collected across multiple days. Within the present framework, acceptable reliability of steady state balance assessment could be achieved in one third of the parameters by measuring and averaging at least seven trials on the same day.