Gaze and Movement Assessment (GaMA): Inter-site validation of a visuomotor upper limb functional protocol

Background Successful hand-object interactions require precise hand-eye coordination with continual movement adjustments. Quantitative measurement of this visuomotor behaviour could provide valuable insight into upper limb impairments. The Gaze and Movement Assessment (GaMA) was developed to provide protocols for simultaneous motion capture and eye tracking during the administration of two functional tasks, along with data analysis methods to generate standard measures of visuomotor behaviour. The objective of this study was to investigate the reproducibility of the GaMA protocol across two independent groups of non-disabled participants, with different raters using different motion capture and eye tracking technology. Methods Twenty non-disabled adults performed the Pasta Box Task and the Cup Transfer Task. Upper body and eye movements were recorded using motion capture and eye tracking, respectively. Measures of hand movement, angular joint kinematics, and eye gaze were compared to those from a different sample of twenty non-disabled adults who had previously performed the same protocol with different technology, rater and site. Results Participants took longer to perform the tasks versus those from the earlier study, although the relative time of each movement phase was similar. Measures that were dissimilar between the groups included hand distances travelled, hand trajectories, number of movement units, eye latencies, and peak angular velocities. Similarities included all hand velocity and grip aperture measures, eye fixations, and most peak joint angle and range of motion measures. Discussion The reproducibility of GaMA was confirmed by this study, despite a few differences introduced by learning effects, task demonstration variation, and limitations of the kinematic model. 
GaMA accurately quantifies the typical behaviours of a non-disabled population, producing precise quantitative measures of hand function, trunk and angular joint kinematics, and associated visuomotor behaviour. This work advances the consideration for use of GaMA in populations with upper limb sensorimotor impairment.


Introduction
Various sensorimotor impairments including stroke [1], amputation [2], and spinal cord injury [3] lead to deficits in upper limb performance, which can hamper activities of daily living that require precise hand-object interactions [4]. Functional assessments are used to gauge the impact of upper limb impairment and to monitor rehabilitative progress thereafter [5,6]. However, such assessments often do not yield precise and comprehensive measures of joint and trunk movements, along with hand function measures such as grip aperture [7,8]. Furthermore, they do not tend to measure the corresponding hand-eye interaction, which is recognized as an important behaviour during grasp control [9,10]. Quantitative measurement of visuomotor behaviour collected during the execution of functional tasks can enhance the understanding of these movement features. Measurement technologies commonly used for this purpose include eye tracking and motion capture. Existing assessments reliant on such specialized equipment, however, can be criticized as not being generalizable to authentic activities of daily function, as they tend to focus on simple functions of reaching and grasping [11][12][13][14][15][16][17][18][19]. Furthermore, technology-based assessments risk becoming obsolete as newer technologies emerge, hindering the opportunity for robust comparisons over time.
A Gaze and Movement Assessment (GaMA) protocol was developed, based on the foundation of visuomotor research that both acknowledges and demonstrates a means of overcoming these limitations [7][8][9]. GaMA uses motion capture and eye tracking to quantify the movement quality and visual attention exhibited by participants as they interact with and move objects in an environment. GaMA includes: (1) a procedure for the administration of two standardized functional tasks that incorporate common dextrous hand demands of daily living (the 'Pasta Box Task' and 'Cup Transfer Task') [7]; (2) a methodology to obtain synchronized motion capture and eye tracking data during functional task execution [7][8][9]; and (3) analysis software, which requires a standardized data set of synchronized movement and eye data coordinates as input, and outputs measures of hand movement, angular joint kinematics, and eye gaze [7][8][9].
The standardized 'Pasta Box Task' and 'Cup Transfer Task' used in GaMA were designed based on common task requirements of functional clinical assessments, and were shown to have repeatable hand trajectory and hand kinematics among able-bodied participants [7]. Motion capture marker clusters are used to reduce the implementation burden, particularly as such clusters were validated as being equivalent to individually placed anatomical markers [20]. Consistent normative joint movement kinematics for GaMA's functional tasks have been demonstrated, with test-retest reliability in a normative population [8]. Movement and eye data synchronization and analysis have established that participants' visual attention to future actions was similar between participants and across tasks [9], thereby reinforcing the theoretical framework of visual allocation during goal-directed actions [21].
GaMA's analysis software requires a specific input data set that can be obtained by various data collection hardware and software solutions. This renders GaMA amenable to technological evolution, such as markerless motion capture and mobile eye trackers. Furthermore, GaMA's output measures of hand movement, angular joint kinematics, and eye gaze (which can be precisely reported for individual movement phases) remain relevant and equipment-independent for future comparative purposes, both within and across research sites. The ability to compare results across sites would be extremely valuable, as it could facilitate larger subgroup comparisons when smaller populations of individuals with upper limb impairments are studied, such as upper limb prosthesis users.
In order to validate a new protocol such as GaMA, it is essential to determine reproducibility. Reproducibility of a test or method is defined as the closeness of the agreement between independent results obtained by following the same procedures, but under different experimental conditions [22]. Due to the inherent variability found in clinical populations, reproducibility of a test to assess movement behaviour is typically first studied in a non-disabled population. While intra-rater test-retest reliability of GaMA has been demonstrated for hand movement and angular joint kinematic measures for non-disabled individuals [7,8], it has yet to be determined whether these and other measures obtainable by GaMA are reproducible across raters and sites. Furthermore, it is often assumed that the non-disabled population will behave similarly (or identically) across test sites; yet, it is known that deviations from protocols can result in data set disparity amongst the population [23]. If a standardized protocol can be shown to yield measures that are similar across sites, the data sets could be combined for a richer understanding (or more saturated data set) of non-disabled movement behaviour.
The objective of this study, therefore, was to conduct an inter-site validation of GaMA by assessing the reproducibility of the visuomotor measures in non-disabled individuals presented by Valevicius et al. and Lavoie et al. [7][8][9]. More specifically, this study sought to determine whether the same hand movement, angular joint kinematic, and eye gaze measures could be obtained by testing a second independent group of non-disabled participants, at a different site equipped with different motion capture and eye tracking technology, and administered by a different rater. Establishing the reproducibility of GaMA in the non-disabled population advances its consideration as an outcome assessment protocol for populations with sensorimotor impairments of the upper limb.

Methods
For comparative purposes, the research conducted by Valevicius et al. [7,8] and Lavoie et al. [9] is referred to in this paper as 'the original study', and the data set analyzed by these studies is referred to as 'the original data set'. The new research presented in this article is referred to as 'the repeated study' and its data as 'the repeated data set'. Unless otherwise specified, the same procedures were followed in both studies. Further details about such procedures can be found in the GaMA protocol supplementary materials (S1 Text). Ethical approval for these procedures was obtained from the University of Alberta Health Research Ethics Board (Pro00054011), the Department of the Navy Human Research Protection Program, and the SSC-Pacific Human Research Protection Office.

Participants
A total of 22 non-disabled adults were recruited to participate in the repeated study. Data from two participants were removed due to problems arising from software issues. The characteristics of the 20 participants from the original study [7][8][9] and the 20 participants in the repeated study are detailed in Table 1. In both studies, two participants performed the tasks without corrected vision, since they had to remove their glasses to don the eye tracker. These participants, however, reported that their vision was sufficient to allow them to confidently perform the task.

Equipment
Motion capture and eye tracking hardware and software specifications for the original study and the repeated study are indicated in Table 2. The equipment was set up in the repeated study as specified in the original study [7][8][9]. Rigid plates and a headband (each holding four retroreflective markers) were attached to the participant in accordance with Boser et al.'s Clusters Only kinematic model [20]. To improve rigid body motion tracking in the repeated study, the hand plates were redesigned, as shown in Fig 1. For both studies, markers were attached to the index finger (middle phalange) and thumb (distal phalange) [7]; a head-mounted eye tracker was placed on the participant and positioned in accordance with the manufacturer's instructions; and a motion capture calibration pose was collected for each participant, as outlined by Boser et al. [20].

Data collection
In both studies, the two functional tasks introduced by Valevicius et al. (the Pasta Box Task and Cup Transfer Task) [7] were administered. The Pasta Box Task involves the movement of a pasta box between a side table and shelves in front of the participant. The Cup Transfer Task involves the movement of two deformable cups filled with beads over a barrier on a tabletop in front of the participant. Additional details about these tasks can be found in Supplement S1. Each participant completed 20 error-free trials of the two tasks (if an error occurred, data from this erroneous trial were discarded and an additional trial was completed to replace it), while simultaneous motion and eye tracking data were collected. Prior to this, each participant was given verbal instructions, a demonstration, and at least one familiarization trial of each functional task. Task order was randomized for each participant. At least two gaze calibrations (outlined by Lavoie et al. [9]) were collected before participants executed their initial trial of each task, and one after they completed their final trial of the last task; given that there were two functional tasks, a minimum of 5 calibrations were performed per participant.

Table 1. Research participant characteristics.
Characteristic | Original Study | Repeated Study
Male participants | 11 | 13
Female participants | 9 | 7
Self-reported right-handed participants | 18 | 19
Participants with normal or corrected to normal vision | 18 | 18
Participant age (years, mean ± standard deviation) | 25 |
The original data collection protocol differed from the repeated study in one notable way. In the original study, every participant performed a total of 60 trials of each task, 20 of which were under each of the following conditions: (1) only motion capture data were collected, (2) only eye tracking data were collected, and (3) both motion capture and eye tracking data were collected. As the repeated study consisted solely of collecting data during simultaneous motion capture and eye tracking, its data were compared only with the portion of the original data set captured under condition (3) 'both'. In the original study, the order of conditions for each participant was block randomized to one of 4 block orders, with motion (1) and both (3) conditions always sequential. As a consequence of the partial randomization order, three quarters of the original-study participants were afforded at least 20 extra trials executing each functional task prior to testing under the 'both' condition.
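The "three quarters" figure can be checked by enumerating the admissible block orders. The sketch below is illustrative only: the condition labels are hypothetical names, and it assumes that participants were spread evenly across the four block orders, which is not stated explicitly here.

```python
from itertools import permutations

# Hypothetical labels for the three original-study conditions.
CONDITIONS = ("motion", "eye", "both")


def adjacent(order, a, b):
    """True when conditions a and b occupy neighbouring positions."""
    return abs(order.index(a) - order.index(b)) == 1


# Keep only the orders in which 'motion' and 'both' are sequential.
block_orders = [o for o in permutations(CONDITIONS)
                if adjacent(o, "motion", "both")]

# Orders in which at least one other condition (20 trials per task)
# is completed before the 'both' condition.
with_prior_practice = [o for o in block_orders if o.index("both") > 0]

# Assumes equal allocation of participants to each block order.
fraction = len(with_prior_practice) / len(block_orders)
```

Under these assumptions the enumeration yields four block orders, three of which place another condition ahead of 'both'.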

Experimental data analysis
Data analysis in the repeated study was undertaken as outlined by Valevicius et al. and Lavoie et al. [7][8][9], with details provided in Supplement S1. The data analysis was dependent on accurate and combined visuomotor data. This required that the motion capture marker trajectory data and pupil position data first be cleaned, filtered, and synchronized. Motion capture data cleaning was necessary to fill any gaps in the data. Pattern-based interpolation was used to fill all gaps originating from markers that were part of clusters, whereas linear or cubic interpolation was used for individual marker data. Pupil position data were cleaned using linear interpolation of gaps smaller than 100 ms. Next, the visuomotor data were filtered to reduce any noise that was introduced during data collection. Filtering was accomplished using second-order, low-pass Butterworth filters, with a cut-off frequency of 6 Hz for the motion capture data [7,8] and 10 Hz for eye tracking [9]. Finally, the motion capture and eye tracking data were synchronized.
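A minimal sketch of two of these steps (linear interpolation of short gaps, followed by second-order low-pass Butterworth filtering) is given below. It is not the GaMA software. The function names, the use of None to mark missing samples, the definition of gap length as the number of missing samples multiplied by the sample period, and the single forward filtering pass are all assumptions made for illustration; the sampling rate must be supplied by the caller because it depends on the equipment used.

```python
import math

MOCAP_CUTOFF_HZ = 6.0   # cut-off stated for motion capture data
EYE_CUTOFF_HZ = 10.0    # cut-off stated for eye tracking data


def fill_short_gaps(samples, rate_hz, max_gap_ms=100.0):
    """Linearly interpolate runs of None that last less than max_gap_ms."""
    filled = list(samples)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # first valid index after the gap, or n if none exists
        gap_ms = (end - start) * 1000.0 / rate_hz
        # Only interior gaps with valid neighbours on both sides are filled.
        if start > 0 and end < n and gap_ms < max_gap_ms:
            left, right = filled[start - 1], filled[end]
            span = end - start + 1
            for k in range(start, end):
                t = (k - start + 1) / span
                filled[k] = left + t * (right - left)
    return filled


def butter2_lowpass(signal, cutoff_hz, rate_hz):
    """Second-order low-pass Butterworth filter (bilinear transform).

    Single forward pass, so some phase lag remains (an assumption; the
    source does not state whether zero-phase filtering was used).
    """
    k = math.tan(math.pi * cutoff_hz / rate_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1 = 2.0 * b0
    b2 = b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    # Start the filter state at the first sample to limit the start-up transient.
    x1 = x2 = y1 = y2 = signal[0]
    out = []
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

For example, `butter2_lowpass(fill_short_gaps(pupil_x, rate_hz), EYE_CUTOFF_HZ, rate_hz)` would clean and then filter one pupil coordinate, provided no gaps remain unfilled (`pupil_x` and `rate_hz` are hypothetical inputs).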
For each functional task, the repeated data set was divided into distinct movements based on hand velocity, the velocity of the task object(s), and grip aperture values, as per Valevicius et al. [7]. The data from each movement were further segmented into the phases of 'Reach', 'Grasp', 'Transport', 'Release', and 'Home'; the Home phase was not used for data analysis. Due to the short duration of the Grasp and Release phases, combined movement segments of 'Reach-Grasp' and 'Transport-Release' were used in hand movement analysis [7]. Eye latency measures were calculated at instances of phase transition, both at the end of a Grasp phase and at the beginning of a Release phase (referred to as 'Pick-up' and 'Drop-off' by Lavoie et al. [9]). An illustration of how one distinct movement was separated into the abovementioned subsets (phases, movement segments, and phase transitions) can be found in

GaMA measures
Duration (phase and trial), hand movement, angular joint kinematic, and eye gaze measures were calculated for the original and repeated studies, as outlined by Valevicius et al. [7,8] and Lavoie et al. [9], and are listed in Table 3. Lavoie et al.'s 'fixations to future' measure was not considered in this study as these fixations were shown to be unlikely to occur in non-disabled participants for both tasks [9]. In addition to the measures listed in Table 3, the relative duration of each phase was calculated as the percent of time spent in that phase, relative to the given Reach-Grasp-Transport-Release sequence.
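The relative phase duration described above can be sketched as follows. The function name and the example durations are hypothetical and serve only to show the arithmetic: each phase duration is divided by the total duration of its Reach-Grasp-Transport-Release sequence and expressed as a percentage.

```python
def relative_phase_durations(phase_seconds):
    """Return each phase duration as a percent of the whole sequence."""
    total = sum(phase_seconds.values())
    if total <= 0:
        raise ValueError("total sequence duration must be positive")
    return {phase: 100.0 * seconds / total
            for phase, seconds in phase_seconds.items()}


# Hypothetical durations (seconds) for one movement; not study data.
example = {"Reach": 0.6, "Grasp": 0.2, "Transport": 0.9, "Release": 0.3}
relative = relative_phase_durations(example)
```

Because the measure is normalized by the sequence duration, two groups can differ in absolute phase durations while showing similar relative phase durations.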
In the repeated study, the calculation of hand movement measures was altered due to the creation of a virtual rectangular prism, which approximated the participant's hand position at each point in time. Using the centre of this prism, hand position and velocity were subsequently calculated. For comparative purposes, the original study's hand movement results were recalculated via this methodology, rather than the original calculation of Valevicius et al. that used the average position of the three hand plate markers [7].
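As an illustration of how hand movement measures could follow from a series of prism-centre positions, the sketch below computes a path length and a finite-difference speed profile. The construction of the prism itself is not reproduced. The function names, the millimetre units, the sampling rate, and the use of summed point-to-point distances for distance travelled are assumptions rather than the published GaMA calculations.

```python
import math


def path_length(points):
    """Sum of straight-line distances between consecutive 3D positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def speeds(points, rate_hz):
    """Finite-difference speed between consecutive samples (units per second)."""
    return [math.dist(p, q) * rate_hz for p, q in zip(points, points[1:])]


# Hypothetical prism-centre positions in millimetres, sampled at 100 Hz.
centres = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
distance_mm = path_length(centres)
speed_mm_s = speeds(centres, rate_hz=100.0)
peak_speed_mm_s = max(speed_mm_s)
```

Applying the same position definition to both data sets, as was done for the recalculated original results, keeps measures such as these comparable between studies.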

Statistical analysis
The aim of the statistical analysis was to detect significant differences between the original and repeated data sets, and to determine whether such differences were more pronounced for particular movements and/or movement subsets (phase, movement segment, or phase transition). To investigate differences between the two groups of participants, a series of repeated-measures analyses of variance (RMANOVAs) and pairwise comparisons were conducted for each measure and task. RMANOVA group effects or interactions involving group were followed up with either an additional RMANOVA or pairwise comparisons between groups if the Greenhouse-Geisser corrected p value was less than 0.05. Pairwise comparisons were considered to be significant if the Bonferroni corrected p value was less than 0.05. Detailed statistical analysis methods can be found in supplementary materials (S2 Text).
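The Bonferroni step can be sketched as below. This is a generic illustration of the correction, not the study's analysis code: the function names and the example p values are hypothetical, and the number of comparisons in each family is taken to be the length of the list supplied (the actual families are described in S2 Text).

```python
def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant(p_values, alpha=0.05):
    """Flag comparisons whose Bonferroni corrected p value is below alpha."""
    return [p < alpha for p in bonferroni(p_values)]


# Hypothetical uncorrected p values for three pairwise comparisons.
raw = [0.004, 0.03, 0.2]
corrected = bonferroni(raw)
flags = significant(raw)
```

In this example only the first comparison remains significant after correction, although two of the three uncorrected values are below 0.05.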

Results

Duration
For both the Pasta Box Task (or 'Pasta') and the Cup Transfer Task (or 'Cups'), the repeated-study participants took significantly more time to complete the tasks than the original-study participants (Pasta: 11.8 ± 3.4 seconds versus 8.8 ± 1.2 seconds, p < 0.01; Cups: 13.9 ± 2.5 seconds versus 10.5 ± 1.3 seconds, p < 0.0001). The repeated-study participants had longer phase durations than the original-study participants, with all Grasp and Transport phases and the Movement 2 Release phase significantly prolonged in Pasta, and all phases significantly prolonged in Cups (S1 Table). The two participant groups, however, displayed similar relative phase durations throughout both tasks, with no significant differences.

Hand movement
The repeated-study participants had greater hand distances travelled than the original-study participants, with significant increases in Movement 1 & 3 segments of Pasta (S2 Table) and in all Cups movement segments, except for Movement 1 & 4 Transport-Releases (S3 Table). However, Fig 3 (Pasta) and Fig 4 (Cups) show that the average hand trajectories chosen by both participant groups were similar. The repeated-study participants also had larger hand trajectory variability than the original-study participants, with significant increases in all Pasta movement segments except for Movement 3 Transport-Release (S2 Table) and all Cups movement segments (S3 Table). The repeated-study participants had a greater number of movement units (i.e., more hand velocity peaks) than the original-study participants, with significant increases in all movement segments of Pasta and for Movement 1 & 4 Reach-Grasp and Movement 1 to 3 Transport-Release segments of Cups. Participants in the original and repeated studies had similar hand velocity profiles for both tasks, as shown in Fig 5A and 5B. Although the peaks in the repeated study appeared smaller, these differences were non-significant throughout both tasks (S2 and S3 Tables). Significant percent-to-peak hand velocity differences were identified for the Movement 1 Reach-Grasp segment of Pasta and the Movement 2 & 3 Reach-Grasp segments of Cups, but the differences between the mean values of the two participant groups were less than one standard deviation of the original study results. Participants in the original and repeated studies showed similar percent-to-peak hand deceleration values, with no significant differences in Pasta and a significant difference only for the Movement 4 Reach-Grasp segment of Cups. However, the difference between the mean values of the two participant groups in this movement segment was less than one original study standard deviation.
Participants in the original and repeated studies had similar grip aperture profiles for both tasks, as shown in Fig 5C and 5D, with no significant differences in peak grip aperture identified for either task. Also, no significant differences in percent-to-peak grip aperture were identified in Pasta, and a significant difference was only identified in the Movement 4 Reach-Grasp segment of Cups.

Angular joint kinematics
… 7 (Cups). Similar angular kinematic profiles existed between the original- and repeated-study participants, with only a few differences; participants in the repeated study had an increased standard deviation for trunk flexion/extension (both tasks), and an offset was present between the wrist flexion/extension angles (both tasks) and between the wrist ulnar/radial deviation angles (Pasta only) of the two participant groups. Angular kinematic measures are presented in Table 4 (Pasta) and Table 5 (Cups). The original- and repeated-study participants generally had similar peak joint angles in both tasks. Significant peak angle differences were found in wrist flexion/extension for Movements 1 and 2 of Pasta and all movements of Cups, and in wrist ulnar/radial deviation for all movements of Pasta.
The original- and repeated-study participants also had similar ROM values in Pasta, although significant differences were found for the Movement 2 trunk flexion/extension ROM and the Movement 2 & 3 trunk lateral bending ROM. However, these differences were quite small (with the largest being 5.3°). In Cups, differences in ROMs were significant in more movements and degrees of freedom (DOFs), as indicated by the shading in Table 5. However, the significant trunk ROM differences were quite small (both less than 2°), and the significant shoulder ROM differences were less than the respective original study standard deviations for those DOFs.
The repeated-study participants exhibited differences in peak angular velocities in most DOFs in both tasks. The peak angular velocities in the trunk DOFs of repeated-study participants were usually greater than those of original-study participants, with significant trunk flexion/extension differences in Movement 1 and 2 of Pasta and Movement 1 of Cups. The peak angular velocities in the remaining DOFs of the repeated-study participants were usually smaller than for the original-study participants, with most significantly lower.

Eye gaze
The repeated- and original-study participants exhibited similar eye fixations, with no significant differences identified in either task, as shown in Table 6 (Pasta) and Table 7 (Cups). Significant eye arrival latency differences were identified in all Grasp phase transitions and the Movement 3 Release phase transition of Pasta, as well as the Movement 3 phase transitions of Cups. No significant eye leaving latency differences were identified in Pasta, but significant differences were identified in the Movement 3 Release transition in Cups.

Discussion
Measures that were consistent between the original and repeated studies included all hand velocity, grip aperture, and eye fixation results, along with most peak joint angle and ROM results. Although participants in the repeated study took more time to complete each functional task (greater overall duration), similar relative phase durations between the participant groups indicated that the repeated-study participants did not spend a disproportionate amount of time in any one phase.
Participants in the original study may have displayed faster performance due to the prior functional task trials that they completed (that is, during task trials where only motion capture or eye tracking data were captured in the original study). This presumption is likely, given that practice has been shown to decrease functional test completion time [24]. The longer phase durations exhibited by the repeated-study participants directly affected the eye arrival latencies and eye leaving latencies due to the time-dependent nature of these calculations. Furthermore, the longer movement times resulted in decreased joint angular velocities in shoulder, elbow, forearm, and wrist DOFs.
Learning effects may have also contributed to discrepancies in hand movement measures between the original- and repeated-study participants. The repeated-study participants exhibited an increased number of movement units and increased hand trajectory variability, both of which were likely due to the influence of fewer practice opportunities [25,26]. Furthermore, increased hand trajectory variability presumably contributed to the increased average hand distance travelled of the repeated-study participants. Hand trajectory variances would be expected to be away from, or in avoidance of, obstacles present in all task movements (box walls and the partition in the Cup Transfer Task, and the shelf frames in the Pasta Box Task). Future studies that employ GaMA should standardize the amount of functional task practice opportunities that participants receive.
Task demonstration variations by raters may also have contributed to task duration differences between the two participant groups. Although the same script was used to explain the tasks to participants in each study, small variances in task demonstration speed may have been introduced by the raters. Since the timing of demonstrations is known to influence the resulting pace of participants' movements [27], a slower demonstration may have contributed to the repeated study's increase in task duration time. It is recommended that a standard task demonstration video be created and shown to all participants to reduce the possible effects of rater demonstration variation.
The angular kinematic measures revealed offsets in the wrist flexion/extension and ulnar/radial deviation measures of the repeated-study participants, likely due to differences in the kinematic calibration pose across the two studies. Such calibration errors are known to be the main limitation of the Clusters Only model [20]. In addition, a large standard deviation in trunk flexion/extension was observed for repeated-study participants, also likely attributable to errors in the kinematic calibration. That is, the calibration of this DOF depends on how each participant chooses to 'stand upright'. To limit such deviations in joint angles, the rater must ensure that the participant does not have a bent wrist and is standing as upright as possible, when a kinematic calibration pose is captured.
Further angular kinematics variations were observed between the two participant groups, in both the forearm pronation/supination and wrist ulnar/radial deviation ROMs. Such deviations were introduced by the Clusters Only model, which calculates wrist and forearm angles in a manner that is different from other DOFs. This alternative calculation method was chosen because, during the required calibration pose, participants struggled to align their wrist axes of rotation with the global coordinate system, either due to their elbow carrying angle or their inability to supinate their forearm the required amount. As such, the model uses the local coordinate system of the forearm plate to calculate wrist and forearm joint angles. Small misplacements of the forearm marker plate, however, can introduce wrist and forearm joint angle calculation errors. To combat this limitation of the Clusters Only model, the rater must take care to align the forearm marker plate with the long axis of the forearm when it is affixed to the participant. A second option would be to use a calibration model with both marker clusters and anatomical markers, which may increase accuracy specifically for wrist and forearm kinematics [20].
Although little has been done to validate eye tracking and/or motion capture methods in upper limb movement research, many studies have validated motion capture methods for gait measurements [28]. Gait studies commonly revealed that inconsistencies in motion capture marker placement were a large source of anatomical model errors [28]. The Clusters Only model used by GaMA attempts to address this issue as it does not require precise individual marker placement, and has been shown to be as reliable as an anatomical model [20]; it does, however, introduce its own variability caused by calibration pose inconsistencies. Gait reliability research has also identified intrinsic participant-to-participant variation within a given population and trial-to-trial variation for a given participant [28,29]. Such variation could, at least partially, also explain movement behaviour differences between the original and repeated data sets of this study.

Limitations
Given that this study manipulated numerous experimental factors when comparing the visual and movement measures of two groups of non-disabled participants, it had limitations. It was infeasible for this research to determine the degree to which these factors (different participants, sites, equipment, raters, and task experience opportunities) affected movement measure variation. Additional research on the effects of training could shed more light onto whether or not the amount of practice fully explains the difference in results between the two studies. Although assessment of inter-site/inter-rater reliability of GaMA using the same participant group would also provide valuable information by reducing the effects of inter-participant variability, for this study, a new participant group presented an opportunity to analyze a wider range of normative behaviour; an important consideration when designing an assessment tool to be used to characterize functional impairments.

Conclusions
Overall, the results of the repeated study were similar to those obtained by Valevicius et al. and Lavoie et al. [7][8][9]. Most hand movement, angular joint kinematic, and eye gaze results exhibited by participants in the repeated study were consistent with those observed in the original study. Most significant differences between the results could be explained by the amount of practice that participants in the two studies received, demonstration variations introduced by the rater, and the limitations of the Clusters Only kinematics model. Researchers should be aware of such potential variability when collecting data, and endeavor to adhere to the same data collection protocol when intending to compare data across sites. GaMA presents a novel methodology to obtain quantitative metrics on hand function, trunk and angular joint kinematics, along with eye fixation behaviour during functional object manipulation tasks. Due to its demonstrated reproducibility, it is expected that, in the future, GaMA can serve as a reliable and informative functional assessment tool across different sites and for individuals with sensorimotor impairments in the upper limb.

Supporting information
S1 Table. Phase duration values for the Pasta Box Task and Cup Transfer Task (presented as means ± between-participant standard deviations), with significant results of the pairwise comparisons. For the results of the pairwise comparisons (in column p), * indicates a significant p value less than 0.05, ** indicates a p value less than 0.005, and "ns" indicates a p value that is not significant. (DOCX)
S2 Table. Pasta Box Task hand movement values for each movement segment (presented as means ± between-participant standard deviations), with significant results of the pairwise comparisons. For the results of the pairwise comparisons (in column p), * indicates a significant p value less than 0.05, ** indicates a p value less than 0.005, and "ns" indicates a p value that is not significant. (DOCX)
S3 Table. Cup Transfer Task hand movement values for each movement segment (presented as means ± between-participant standard deviations), with significant results of the pairwise comparisons. For the results of the pairwise comparisons (in column p), * indicates a significant p value less than 0.05, ** indicates a p value less than 0.005, and "ns" indicates a p value that is not significant.