Validation of electronic performance and tracking systems (EPTS) under field conditions

The purpose of this study was to assess the measurement accuracy of the most commonly used tracking technologies in professional team sports (i.e., semi-automatic multiple-camera video technology (VID), radar-based local positioning system (LPS), and global positioning system (GPS)). The position, speed, acceleration and distance measures of each technology were compared against simultaneously recorded measures of a reference system (VICON motion capture system) and quantified by means of the root mean square error (RMSE). Fourteen male soccer players (age: 17.4±0.4 years, height: 178.6±4.2 cm, body mass: 70.2±6.2 kg) playing for the U19 Bundesliga team FC Augsburg participated in the study. The test battery comprised a sport-specific course, shuttle runs, and small-sided games on an outdoor soccer field. The validity of fundamental spatiotemporal tracking data differed significantly between all tested technologies. In particular, LPS showed higher validity for measuring an athlete's position (23±7 cm) than both VID (56±16 cm) and GPS (96±49 cm). Considering errors of instantaneous speed measures, GPS (0.28±0.07 m·s⁻¹) and LPS (0.25±0.06 m·s⁻¹) achieved significantly lower error values than VID (0.41±0.08 m·s⁻¹). Equivalent accuracy differences were found for instantaneous acceleration values (GPS: 0.67±0.21 m·s⁻², LPS: 0.68±0.14 m·s⁻², VID: 0.91±0.19 m·s⁻²). During small-sided games, the lowest deviations from reference measures were found in the total distance category, with errors ranging from 2.2% (GPS) to 2.7% (VID) and 4.0% (LPS). All technologies had in common that the magnitude of the error increased as the speed of the tracked object increased. Especially in performance indicators that might have a high impact on practical decisions, such as distance covered at high speed, we found >40% deviations from the reference system for each of the technologies.
Overall, our results revealed significant between-system differences in the validity of tracking data, implying that any comparison of results using different tracking technologies should be done with caution.


Introduction
Electronic performance and tracking systems (EPTS) primarily track player (and ball) positions and have become one of the most important components to monitor a player's overall external (locomotor) load [1]. In particular, semi-automatic multiple-camera video systems (VID), radio-based local positioning systems (LPS) and global positioning systems (GPS) have become indispensable core technologies for assessing physical and tactical behaviour in both training and competition [2,3]. As a matter of fact, it is not uncommon for some players to be tracked by two or three different EPTS during a regular week, considering that GPS and/or LPS systems are often used during training sessions, while most teams obtain positional data from official matches from semi-automatic camera systems [4]. Consequently, validity, interchangeability and agreement between different EPTS are of key importance to allow for a substantiated assessment of a player's overall locomotor load and to integrate the data of different systems in a meaningful way. A review of the literature on the subject of EPTS validity reveals that previous studies differ with regard to the number of tested core technologies (single technology [5-15] vs. multiple technology studies [3,16,17]), the choice of exercises (predefined movement patterns [3,5,7,11,13,14,17,18] vs. complex and free movement scenarios [6,8,19]), and, most importantly, the utilized criterion method. The most commonly used criterion methods include predefined movement circuits with known spatial arrangements (to evaluate distance measurement accuracy) [3,7,11,12,18], timing gates (to evaluate average speed) [3,5,7,9,12,18], and radar/laser-based speed measurements for the evaluation of instantaneous running speed [10,13,17].
However, all these methods have specific drawbacks. First, distance references that are based on predefined running circuits are inevitably susceptible to errors introduced by the participants (e.g. errors introduced by postural sway or the difficulty for participants to follow the marked course as precisely as possible [7]). Second, timing gates are only of limited suitability as a speed reference [20], the reason for this being that this approach only determines average speed based on limited sampling points [21]. Third, while radar/laser guns are capable of measuring the instantaneous speed of an object with high accuracy, they are suitable only when it comes to validating linear running movements without changes in direction [17].
Therefore, the actual positional data obtained by EPTS should ideally be compared with the instantaneous positional and speed data of a two- or three-dimensional reference system with known error estimates [8]. However, to our knowledge, only four validation studies used a kinematic analysis approach to evaluate the validity of EPTS. Specifically, Duffield et al. [6] and Vickery et al. [19] used a VICON motion analysis system to validate GPS systems in field-based team sports, while Ogris et al. [8] and Stevens et al. [22] investigated the accuracy of a radar-based LPS system during soccer-specific movements. Limitations of the aforementioned studies include a lack of instantaneous accuracy measures for both speed and position (rather than merely average differences of the mean aggregated data) [6,19,22], missing information on the specific data processing steps [6,19], a lack of realistic game scenarios [6,19,22], an insufficient size of the test area [6,19,22], as well as a lack of direct comparison between different technologies [6,8,19,22].
A review of the literature further reveals that previous EPTS validation studies can be divided into three categories, according to the examined parameters. The first category contains studies that analyzed position accuracy (spatial coordinates) [8,17]. Others examined the accuracy of instantaneous speed and acceleration data [10,23]. Errors in this category could result from either a poor quality of position data or inadequate processing algorithms [21]. Eventually, an accumulation of errors in the first two categories can lead to errors in the third category: key performance indicators (KPI) that are aggregated from the continuous data (e.g. distance covered, mean or peak speed, peak accelerations, etc.) [6,19,22]. Consequently, aiming at a comprehensive accuracy assessment of EPTS requires comparisons in three different categories, because in each category different problems could occur and different accuracy demands are to be met. Furthermore, up-to-date information on the spatial accuracy of sport-specific GPS technologies is still missing in the current literature. Considering the fact that various studies made use of GPS-based spatial coordinates to answer relevant scientific questions [24,25], as well as the fact that several commercial GPS-systems determine distance via positional differentiation and speed via Doppler shift [21], information on the spatiotemporal accuracy of sport-specific GPS-systems is still scarce.
Therefore, the purpose of the current study was to assess the accuracy of the most commonly used tracking technologies in professional team sports under field conditions (i.e., semi-automatic multiple-camera video technology, radar-based LPS technology, and GPS technology). Measures of each technology were compared to those of a reference system (VICON). This was done for test runs along predefined tracks, shuttle runs, and small-sided games. The results could contribute to an improved understanding of performance parameters provided by EPTS.

Participants
Fourteen male soccer players (age: 17.4±0.4 years, height: 178.6±4.2 cm, body mass: 70.2±6.2 kg) playing for the German Bundesliga team FC Augsburg participated in the study. Prior to participation, all players received comprehensive verbal and written explanations of the study, which was conducted within a period of two consecutive days. On each single day, 10 players participated. On the second day, four players from the first day had to be substituted. Therefore, fourteen different individual players participated in total. Voluntarily signed informed consent to wear GPS/LPS sensors and VICON markers and to participate in the collection of spatiotemporal tracking data was provided by both the players and their parents. Institutional board approval for the study was obtained from the Ethics Commission of the Technical University of Munich. To ensure confidentiality, all performance data were anonymized. This study conformed to the recommendations of the Declaration of Helsinki.
Global Positioning System (GPS). GPSports (GPSports Sports Performance Indicator (SPI) Pro X, Canberra, Australia). This version of the SPI Pro provides raw position, instant speed and distance data at 15 Hz (5 Hz interpolated to 15 Hz). Software: Team AMS firmware: R1 2015.10. All GPS devices were activated 15 min prior to the data collection to allow the acquisition of satellite signals. Unfortunately, horizontal dilution of precision (HDoP) information cannot be retrieved with the provided Team AMS software. After making a request to the manufacturer in this regard, we were informed that the internal code automatically rejects data with HDoP values >4, which is well below the maximum value of 50 [26].
Local Positioning System (LPS). Inmotio (LPM system, 1 kHz, Inmotio Object Tracking BV, Amsterdam, Netherlands). Software: Inmotio Client, firmware: v3.7.1.153. 11 base stations were set up and calibrated under the supervision of an expert of the Inmotio company. During data collection, 22 transponders were activated to simulate a real match situation in terms of the number of transponders that were active at the same time, which resulted in an individual sampling rate of 45.45 Hz (1 kHz/22 transponders). LPM data was filtered with the integrated weighted Gaussian average filter set at 85%, as recommended by the manufacturer.
To ensure optimal device positioning on the body and minimization of crosstalk between GPS and LPS, athletes wore only one device of each system simultaneously. Using the harness provided by the manufacturers, GPS devices were positioned on the upper thoracic spine between the scapulae. LPS devices were worn in a vest containing a transponder located on the back that was connected to two antennas, one on top of each shoulder. The position of the athlete is then calculated as the spatial center of both antennas (manufacturer information).
VICON measurement accuracy. To demonstrate the spatial accuracy of the applied VICON setup, a rigid calibration object was moved through the VICON area, spiraling from the center to the edges of the measurement volume. As the markers on the calibration object remained at accurately known distances to each other at any given time, the distances between the markers that are delivered by the VICON software, which were calculated in retrospect, can be used to describe the crucial aspect of measurement accuracy (see S1 Dataset). Overall, the average error of the calibrated VICON setup was 0.0 mm (SD = 1.0 mm, 95% CI [-1.9 mm, +2.0 mm]), resulting in an RMSE of 1.0 mm at a frequency of 100 Hz.

Reference system
Comparison criteria: Center of mass (COM). Under the assumption that each EPTS endeavors to detect the position of the human body as a whole, the center of mass (COM) (or rather the XY-position of the body's center that is projected on the ground plane) was considered a valid criterion measure. However, in the case of wearable tracking devices, the systems actually detect the position of the sensors that are fixed to the players (usually attached between the shoulder blades or on top of the shoulders). In video-based systems, objects are tracked by image segmentation using different techniques of image recognition [27]. Typically, a rectangle is identified enclosing segmented parts of the player, and a weighted estimate of the body parts locates the body's center.
Ultimately, the choice of the most suitable reference position on the human body should not be prescribed by the technological prerequisites of the respective EPTS, but rather by biomechanical considerations. We, therefore, advocate the idea that the ultimate reference position for each EPTS should be the COM, irrespective of where the respective transponder/receiver is attached to the human body. To estimate COM, five adhesive marker mounts were glued on each participant's skin (right shoulder (RSHO), left shoulder (LSHO), left anterior superior iliac spine (LASI), right anterior superior iliac spine (RASI), and sacrum (SACR)) (see Fig 1). The reflective markers were then fixed to the mounts through a tight-fitting compression shirt. COM was then estimated by means of the reconstructed pelvis method [28], defined as the spatial center of the RASI, LASI, and SACR.
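As a rough illustration of the criterion position described above, the following sketch computes the ground-plane projection of the centroid of the three pelvis markers for a single frame. It is a minimal sketch under stated assumptions, not the processing code used in the study: the function name `pelvis_com_xy`, the tuple layout (x, y, z in metres), and the marker coordinates are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def pelvis_com_xy(rasi: Point3D, lasi: Point3D, sacr: Point3D) -> Tuple[float, float]:
    """Return the XY projection of the centroid of three pelvis markers.

    Assumption: the 'spatial center' of RASI, LASI and SACR is taken to be
    their arithmetic mean; the Z-coordinate is dropped for the projection.
    """
    cx = (rasi[0] + lasi[0] + sacr[0]) / 3.0
    cy = (rasi[1] + lasi[1] + sacr[1]) / 3.0
    return (cx, cy)


# Hypothetical marker positions (metres) for one frame.
com = pelvis_com_xy((1.10, 2.00, 0.95), (0.90, 2.00, 0.95), (1.00, 1.85, 0.98))
print(com)
```

Applied frame by frame to the reconstructed marker trajectories, this would yield an XY time series against which each EPTS position can be compared.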

Venue and satellite reception
Measurements took place at Rosenaustadion (Augsburg, Germany). This particular stadium is characterized by low stands (12.0 m maximum height at 50.0 m distance from the sideline). In addition, the pitch (105.0 m × 67.0 m) is surrounded by a tartan track. To meet the standard requirements for the camera system (sufficient height to obtain the required viewing angle), an additional platform had to be built on top of the stands (see Fig 2). During the entire measurement period, the number of connected GPS satellites was 10.1 ± 0.8, which is in the range of previous validation studies (e.g. 8 ± 1 [19], 9.5 ± 2 [3] and 12.3 ± 0.3 [13]). Thus, for all technologies involved, the minimum requirements were met. Data was recorded after sunset using floodlights. The weather was dry and windless with temperatures around 8 °C.

Exercises
Sport-specific course (SSC). A predefined circuit with prescribed movement intensities (Fig 3) was used to analyze elementary movements under controlled conditions, e.g. curved runs and runs with sharp turns. Within each trial, six distinct elementary movement patterns were performed: (1) 15 m sprint into 5 m deceleration, (2) 20 m sprint into 10 m backward running into 10 m forward running, (3) 505 agility test, (4) two rapid 90° turns, (5&6) curved runs toward and away from the camera (see S2 Video). The beginning and end of each individual section was marked with two flat pylons, which in turn were equipped with reflective VICON markers. This enabled us in hindsight to detect the starting and endpoint of each section by means of the players' XY-position (a player was located within/outside a certain section if his COM crossed the line between the two start/end points).
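The section detection described above (a COM trajectory crossing the line between two pylons) can be illustrated with a standard segment-intersection test. This is only a sketch under assumptions: the function names, the strict-crossing rule (touching the line without passing it does not count), and the example coordinates are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def crosses_gate(prev: Point, curr: Point, pylon_a: Point, pylon_b: Point) -> bool:
    """True if the step prev -> curr strictly crosses the segment pylon_a - pylon_b."""
    d1 = _cross(pylon_a, pylon_b, prev)
    d2 = _cross(pylon_a, pylon_b, curr)
    d3 = _cross(prev, curr, pylon_a)
    d4 = _cross(prev, curr, pylon_b)
    # The two COM samples lie on opposite sides of the gate line, and the
    # two pylons lie on opposite sides of the movement step.
    return d1 * d2 < 0 and d3 * d4 < 0


def first_crossing_index(track: Sequence[Point], pylon_a: Point, pylon_b: Point) -> Optional[int]:
    """Index of the first sample after the gate has been crossed, or None."""
    for i in range(1, len(track)):
        if crosses_gate(track[i - 1], track[i], pylon_a, pylon_b):
            return i
    return None


# Hypothetical gate between two pylons and a COM track running through it.
gate_a, gate_b = (0.0, -1.0), (0.0, 1.0)
com_track = [(-0.5, 0.0), (-0.2, 0.0), (0.1, 0.0), (0.4, 0.0)]
print(first_crossing_index(com_track, gate_a, gate_b))
```

Running the same test against the start gate and the end gate of a section would give the sample range that belongs to that section.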

20 m shuttle run test (SHU). Players repeatedly ran 20 m shuttles with 180° changes of direction at 11 km·h⁻¹ for a period of two minutes. Subjects ran in groups consisting of ten players each. The shuttle run test was performed to obtain controlled test conditions including changes of direction.
Small-sided game (SSG). Finally, exercises with the highest ecological validity are matches that take place on a full-sized pitch. Unfortunately, to the best of our knowledge, a gold standard for full pitch testing does not exist to date. Therefore, the best possible alternative is SSGs with fewer players competing on a smaller field. In our case, 5 vs. 5 small-sided games were played, without goals, as collective possession play (see S2 Video). The format of the game-play comprised repeated 2-min bouts interspersed with 1 min of passive rest. Each drill was performed in a continuous regime, under the supervision, coaching, and motivation of the coaches to maintain a high work-rate. The ball was always available owing to prompt replacement any time it was hit out of the measurement area.

Data analysis
Parameters for analysis. As indicated in the introduction, the validation of EPTS should be implemented through analysis of (i) position data, (ii) instant speed and acceleration data, and (iii) KPIs. It should also be noted here that modern GPS systems derive speed and acceleration data based on the Doppler shift effect, instead of differentiation of position data [21]. We procured fundamental and derived data from the export option of each tracking system (XY-data, instant speed, and acceleration). Instead of using the KPIs as provided by the manufacturers' proprietary software, we deliberately decided to recalculate these metrics, allowing us to use exactly the same algorithms for all tested systems. Manufacturers' proprietary software often uses data-processing algorithms that are subject to intellectual property protection, and the specific algorithms are not disclosed to the end user [21]. Therefore, to achieve a transparent validation procedure, and to facilitate appropriate interpretation and replication by others, it was decided to independently calculate the KPIs based on the provided raw data. Running intensities were divided into the following speed thresholds: standing (<1 km·h⁻¹), low speed (≥1 to <6 km·h⁻¹), moderate speed (≥6 to <15 km·h⁻¹), elevated speed (≥15 to <20 km·h⁻¹), high speed (≥20 to <25 km·h⁻¹), and very high speed (≥25 km·h⁻¹). Peak speed was defined as the highest measured speed value. High acceleration and deceleration thresholds were set at ≥3 m·s⁻² and ≤−3 m·s⁻², respectively.
Data processing. To produce an evenly sampled time series among the systems prior to accuracy analysis, each data set was up-sampled to 100 Hz. The timing offset between the data sets was estimated by means of a cross-correlation procedure. Each coordinate system was then aligned with the VICON coordinate system via a generalized Procrustes analysis (GPA, Euclidean similarity transformation, i.e. translation and rotation).
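The two synchronization steps named above (estimating a timing offset by cross-correlation, then aligning coordinate systems by translation and rotation) can be sketched as follows. This is a simplified illustration under assumptions, not the study's implementation: both signals are assumed to be already resampled to a common rate, the lag search is a brute-force loop over a hypothetical `max_lag` window, and the alignment uses the closed-form least-squares rotation for paired 2D points rather than a full generalized Procrustes routine. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def estimate_lag(ref: Sequence[float], sig: Sequence[float], max_lag: int) -> int:
    """Lag (in samples) by which `sig` is delayed relative to `ref`.

    Picks the lag that maximizes the mean-removed cross-correlation.
    """
    ref_mean = sum(ref) / len(ref)
    sig_mean = sum(sig) / len(sig)
    r = [v - ref_mean for v in ref]
    s = [v - sig_mean for v in sig]
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, rv in enumerate(r):
            j = i + lag
            if 0 <= j < len(s):
                score += rv * s[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def fit_rigid_2d(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[float, Point]:
    """Least-squares rotation angle and translation mapping src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - sx, py - sy, qx - dx, qy - dy
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    return theta, (dx - (c * sx - s * sy), dy - (s * sx + c * sy))


def apply_rigid_2d(points: Sequence[Point], theta: float, t: Point) -> List[Point]:
    """Rotate points by theta and shift them by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


# Hypothetical example: a speed pulse delayed by two samples.
print(estimate_lag([0, 0, 1, 3, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 1, 3, 1, 0, 0, 0], 4))
```

In practice the lag would be estimated on a signal both systems share (for example speed), after which the time-shifted EPTS positions would be passed to `fit_rigid_2d` together with the corresponding reference positions.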
After spatial and temporal synchronization of all systems involved, the VICON time code served as the ultimate reference for detecting the EPTS's start and end points of the respective exercise/section. Data processing of raw VICON data consisted of filtering using a 4th order 10 Hz Butterworth low pass filter. Gaps in the data of 1 to <10 ms were filled using spline interpolation. Gaps that were ≥10 ms were excluded from analysis. XY-positions for spatial accuracy analysis were directly derived from the 100 Hz VICON data. The third dimension (Z-coordinate) was neglected in the calculations.
Raw VICON data needs further adjustments in order to serve as an appropriate reference. When humans walk and run, between the heel-strike and mid-stance, the forward speed of the COM decreases and between the mid-stance and toe-off, it increases within each instance of ground contact of each leg [29]. This results in a "true" horizontal speed curve that looks like a sine wave oscillating around the mean horizontal speed (see Fig 4 left).
Since most EPTS do not have the capability to assess intra-cyclic speed or acceleration fluctuations, a comparison with a gold standard that does have this capability would be "unfair" in the sense that first, there is an increased deviation because intra-cycle speed is not achievable in the case of these systems, and second, EPTS are only meant for assessing the gross movements of players. For this reason, a comparison with the "gait-neutralized" speed and acceleration of the gold standard is advisable. To achieve this goal, we studied typical speed signals of football-specific movements through spectral analysis using Fast Fourier Transform (FFT) (Fig 4 right). The occupied bandwidth, as a measurement of the frequency bandwidth that contains 99.0% of the total power of the speed signal, is located at approximately 2 Hz. A further noticeable peak in the spectrum, at approximately 2.5 Hz, most probably corresponds to the intra-cyclic variations of movement speed. Therefore, the gait-neutralized reference speed was calculated using a 4th order 2 Hz Butterworth low pass filter on the raw VICON speed (change in position divided by change in time). The gait-neutralized acceleration was calculated using finite differentiation of the gait-neutralized speed (change in speed divided by change in time). Further analysis of the VICON data showed that the projection of the COM travels considerably even if there is no perceivable movement of a player. This is due to natural body sway (see S1 Video for a graphical illustration), or postural changes, which do not result in discernible changes of position. Therefore, the gold standard's positional data was additionally processed using a "waypoint method" to account for these microscopic movements, which are detectable only by highly sensitive devices, but not by EPTS, and should be excluded from the assessment of the athlete's gross motion.
The waypoint method assumes that tracking points can be considered for distance calculation only after the distance traveled between any two of them exceeds a certain threshold, typically one step length. With these remaining points (support points) a new trajectory was calculated using cubic spline interpolation (see Fig 5). We used a threshold of 60.0 cm, as our investigations showed that this is a good estimate for the COM displacement during a walk cycle. COM displacements smaller than a single step are thereby excluded from the measurement of the players' gross motion. It should be stressed here that the waypoint method was only used to obtain the aggregated distance references, whereas the spatial accuracy (XY-position in space) of each system was validated against the raw VICON positions (4th order 10 Hz Butterworth low pass filter applied to the raw positions).
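The idea behind the waypoint method can be illustrated with a short sketch. It assumes one plausible reading of the description above: a sample becomes a new support point once it lies more than the threshold away from the previously accepted support point. Unlike the procedure described above, the sketch sums straight-line segments between support points instead of fitting a cubic spline, so it is a simplification; the function names and example tracks are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def select_waypoints(track: Sequence[Point], threshold_m: float = 0.60) -> List[Point]:
    """Keep a sample only once it is more than `threshold_m` from the last kept one."""
    if not track:
        return []
    waypoints = [track[0]]
    for x, y in track[1:]:
        last_x, last_y = waypoints[-1]
        if math.hypot(x - last_x, y - last_y) > threshold_m:
            waypoints.append((x, y))
    return waypoints


def path_length(points: Sequence[Point]) -> float:
    """Sum of straight-line distances between consecutive points."""
    return sum(
        (math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(points, points[1:])),
        0.0,
    )


# Hypothetical standing player whose COM sways by +/- 5 cm around one spot.
sway = [(0.0, 0.05 if i % 2 else -0.05) for i in range(100)]
print(path_length(sway))                    # raw distance accumulates sway
print(path_length(select_waypoints(sway)))  # waypoint distance ignores it
```

The example shows why the raw reference would otherwise report several metres of travel for a player who is standing still.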

Statistical analysis
Accuracy of fundamental XY-position data was estimated by means of the root mean square error (RMSE). Since we also analyzed the errors pertaining to speed and acceleration, the respective measures are denoted dRMSE (position), vRMSE (speed), and aRMSE (acceleration). To analyze the accuracy of fundamental (XY-position) and derived (instant speed and acceleration) measures, single-sample t-tests were conducted to determine if the mean of the resulting RMSEs of an individual EPTS was statistically significantly different from zero. Two-tailed paired t-tests were used to compare the aggregated (numerical) metrics derived by the respective EPTS with those derived from the reference system. Inter-system differences in accuracy levels were tested using repeated-measures one-way analysis of variance (ANOVA). Bonferroni's post hoc analyses were used when significant differences were found. A Shapiro-Wilk test was applied for testing the normality of the residuals and Levene's test was used to test the homoscedasticity. In cases where data failed the normality test, non-parametric test procedures were used to analyze the data (Wilcoxon Signed-Ranks test and Kruskal-Wallis test by ranks). Effect sizes (ES) were quantified to indicate the meaningfulness of the differences in the mean values. Cohen's d effect sizes for the t-tests were classified as trivial (0-0.19), small (0.20-0.49), medium (0.50-0.79) and large (>0.80) [30]. Eta squared (η²) ES for the analysis of variance were classified as small (0.02-0.12), medium (0.13-0.25) and large (>0.26) [30]. Since pre-screening of results revealed skewed error distributions and frequent outliers, descriptive statistics are presented as the median (Med) and standard deviation (SD). Statistical significance for all calculations was set at p <0.05. Table 1 summarizes the number of single observations included for analysis.
The number of observations for each system varied for organizational reasons, mainly incomplete data sets, time restrictions, and the fact that only one wearable technology could be analyzed at a time (whereas the VID system recorded all trials, irrespective of which wearable system was measured at the same time). The total number of exercises comprised 26 SSCs, 4 SHUs, and 14 SSGs. The total number of trials included for analysis results from the sum of participating players per exercise (see Table 1). For GPS, LPS, and VID, four, three, and 13 data files contained data gaps, respectively. Accordingly, the relative loss of data sets due to measurement errors was 6.3%, 4.2%, and 4.6%, respectively. Table 2 and Fig 6 report the measurement error of EPTS in the respective category. Overall, the smallest errors of fundamental spatial accuracy (dRMSE) were achieved by the radar-based LPS system (SSC: 27±5 cm; SHU: 22±13 cm; SSG: 23±5 cm; pooled: 23±7 cm), followed by the image-based VID system (SSC: 57±9 cm; SHU: 59±28 cm; SSG: 56±12 cm; pooled: 56±16 cm), and the GPS system (SSC: 88±22 cm; SHU: 133±54 cm; SSG: 81±51 cm; pooled: 96±49 cm). GPS showed noticeable exercise-dependent fluctuations in spatial accuracy. In particular, GPS demonstrated lower spatial accuracy during the shuttle runs (133±54 cm). For VID, we found significant differences in the X and Y dRMSE accuracy (X: 28±13 cm, Y: 50±15 cm) [F(1, 392) = 247.40, p<.001]. Post hoc analysis of the ANOVA revealed no homogeneous subsets, implying that the spatial error (dRMSE) differs significantly between all tested systems.
Post hoc analysis of the ANOVA revealed notably frequent homogeneous subsets of GPS and LPS, implying that the vRMSE and aRMSE of GPS and LPS did not differ significantly (with the only exception of aRMSE for SSCs and vRMSE for standing (see Table 2)).
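For readers who want to reproduce error measures of the kind reported in this section, the following sketch shows one way to compute an RMSE for position and for a scalar signal such as speed. It rests on assumptions that are not spelled out in the text: the position error is taken as the Euclidean XY distance between time-aligned samples, and both series are assumed to have equal length. The function names and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rmse(errors: Sequence[float]) -> float:
    """Root mean square of a sequence of signed or unsigned errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def position_rmse(track: Sequence[Point], reference: Sequence[Point]) -> float:
    """RMSE of the Euclidean XY distance between paired samples (assumed dRMSE)."""
    distances = [
        math.hypot(px - rx, py - ry)
        for (px, py), (rx, ry) in zip(track, reference)
    ]
    return rmse(distances)


def signal_rmse(measured: Sequence[float], reference: Sequence[float]) -> float:
    """RMSE of a scalar signal such as speed or acceleration."""
    return rmse([m - r for m, r in zip(measured, reference)])


# Hypothetical, already synchronized samples (metres and metres per second).
reference_xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
tracked_xy = [(0.3, 0.4), (1.0, 0.5), (2.5, 0.0)]
print(position_rmse(tracked_xy, reference_xy))
print(signal_rmse([2.1, 3.9, 6.2], [2.0, 4.0, 6.0]))
```

One RMSE per trial would then feed into the summary statistics and the between-system comparisons.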

Sport-specific course
Results of the specific categorization into fundamental movement patterns during the SSC trials are presented in Table 3. All systems have in common that dRMSE, vRMSE, and aRMSE were lowest during low-speed location changes. Compared to GPS and LPS, VID showed significantly lower speed accuracy during linear sprint exercises (15 m sprint into 5 m deceleration and backward into forward sprints), which were both aligned at a 90° angle (perpendicular to the camera view). However, in the opposite direction (505 agility test, movements parallel to the camera view), VID showed smaller errors (0.32±0.23 m·s⁻¹) than both LPS (0.51±0.07 m·s⁻¹) and GPS (0.53±0.11 m·s⁻¹).

Results of key performance indicators (KPI).
The percentage difference in KPIs between the respective EPTS and the criterion measure are presented in Table 4.
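To make the KPI comparison concrete, the sketch below accumulates distance per speed zone using the thresholds given in the data analysis section and then computes a percentage difference against a reference value. It is a hedged illustration only: the fixed sampling interval `dt`, the rule of adding speed times `dt` to the zone of each sample, and the definition of the percentage difference as (measured − reference) / reference are assumptions, and all names are hypothetical.

```python
from typing import Dict, Sequence

# Zone limits in km/h: lower bound inclusive, upper bound exclusive.
ZONES = [
    ("standing", 0.0, 1.0),
    ("low", 1.0, 6.0),
    ("moderate", 6.0, 15.0),
    ("elevated", 15.0, 20.0),
    ("high", 20.0, 25.0),
    ("very_high", 25.0, float("inf")),
]


def distance_per_zone(speeds_ms: Sequence[float], dt: float) -> Dict[str, float]:
    """Distance (m) covered in each zone for speeds (m/s) sampled every dt seconds."""
    totals = {name: 0.0 for name, _, _ in ZONES}
    for v in speeds_ms:
        kmh = v * 3.6
        for name, lower, upper in ZONES:
            if lower <= kmh < upper:
                totals[name] += v * dt
                break
    return totals


def percent_difference(measured: float, reference: float) -> float:
    """Signed difference of a KPI relative to the reference, in percent."""
    return 100.0 * (measured - reference) / reference


# Hypothetical speed samples (m/s), one per second.
zones = distance_per_zone([0.0, 1.0, 3.0, 5.0, 6.0, 7.0], dt=1.0)
print(zones)
print(percent_difference(110.0, 100.0))
```

Because the same function is applied to every system and to the reference, any difference in the resulting KPI reflects the input data rather than differing zone definitions.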


Discussion
Results showed that largest accuracy differences between EPTS were present in the first data category (fundamental XY-position in space). In particular, LPS had higher accuracy than both VID and GPS for measuring an athlete's position in space. However, our results also revealed that in the second category (instant speed and acceleration) errors of GPS are comparable to those of LPS, most likely related to the fact that GPS uses two fundamentally different measurement principles to determine an athlete's position and speed. In the third data category (KPIs), differences between technologies were not as pronounced as in the first and second data category, yet all technologies had in common that the magnitude of the error increased as the speed of the tracking object increased.

Position accuracy
The radar-based LPS system demonstrated the highest spatial accuracy with a dRMSE ranging from 22 cm (SHU) to 27 cm (SSC) (see Table 2). These findings are in accordance with previous research by Ogris et al. [8] (23 cm) and Siegle et al. [17] (24 cm). The sport-specific course has, however, also revealed that the spatial accuracy of the LPS system is dependent on instantaneous dynamics. In particular, fast changes of direction can lead to a significant increase of the spatial error (e.g. 0.45 m during 90° turns, see Table 3). Rapid speed and direction changes seem to be a challenge for the underlying Kalman filter, which is generally based on linear dynamical systems, thus suppressing rapid movement changes. The only previous study that analyzed the spatial accuracy of the VID system reported an error of 73 cm dRMSE [17] (vs. 56-59 cm in the present study, see Table 2). Such a difference could be caused by technological advancements in camera gear, different viewing angles, or the criterion reference used (LAVEG vs. VICON). For VID, we found significantly higher spatial errors in the 505 agility test of the sport-specific course (Table 3). Since players tend to lower their upper body to counteract accelerations occurring at the turning point, we assume that the visual tracking algorithm detects the center of the athlete's body at a lower height, thus leading to a spatial position shift in the vertical Y-axis.
To the best of our knowledge, information about the spatial accuracy of sport-specific GPS systems has not been reported prior to this study. This could be due to the fact that GPS systems are predominantly used to evaluate physical performance metrics (rather than spatial/tactical behavior). Nevertheless, it is incomprehensible why only limited information is available on the spatial accuracy of GPS systems, especially against the background that various studies made use of GPS coordinates to analyse spatial motion behaviour (e.g. position-specific centroids, team centroids and distance between centroids [24,25]), as well as the fact that several commercial GPS systems determine distance metrics via differentiation of position data [21]. This study shows that the average spatial measurement error of GPS was 96 cm, almost twice as high as that of its nearest competitor VID (56 cm).

Instantaneous speed and acceleration accuracy
Considering that GPS exhibited the highest spatial errors, one would think that this error pattern should exert its influence on the vRMSE/aRMSE categories. Contrastingly, it is found that GPS speed errors were not significantly different from those of LPS (see Table 2). Whereas the vision-based and radar-based technology utilize differentiation of position data over time for speed determination, most commercially available GPS systems circumvent the problem of error propagation from fundamental to derived data by using two completely different measurement principles. Modern GPS systems can determine speed by measuring the rate of change in the satellites' electromagnetic signal frequency, also known as the Doppler effect [31,32]. Doppler measurements are immune to cycle slips (temporary signal anomalies or low signal-to-noise ratio caused by obstructions such as buildings, trees, etc.) [33]. Thus, research works dealing with GPS speed measurement reveal that using GPS Doppler measurements can provide greater speed accuracy than indirect measurement, which is based on error-prone position data [34]. As a consequence, despite comparably inferior spatial accuracy values, GPS systems are capable of measuring instantaneous speed, and consequently acceleration, with comparatively higher accuracy.
As depicted in the results, errors of the VID system were lower in movements in the X-axis when compared to movements in the Y-axis. Thus, the lowest vRMSE errors for section 3 (505) were achieved by VID. This specific test was carried out in parallel alignment to the camera view (X-axis). Apart from that, it is apparent that instant speed and acceleration errors of the LPS and GPS technology are fairly consistent whereas the errors of the video technology have proven to be considerably higher. These results demonstrate the importance of the most accurate possible detection of position in space. Any inaccuracy in the fundamental data (XY-positions) will otherwise lead to increased error propagation in the derived data category (instantaneous speed and acceleration).

KPI accuracy
Overall, the lowest deviations can be observed in the total distance category. RMSE% ranged from 1.2% (VID during SSC) to 4.9% (GPS during SSGs). These differences are in line with previous literature on GPS (1.9%) [35] and LPS (1.6-2.0%) [7,22]. Given a total distance of approximately 11.4 ± 1.0 km in professional soccer matches [36], an error of 4.9% would correspond to a discrepancy of 560 m, which in turn is more than half a standard deviation (1.0 km). It is therefore questionable to what extent EPTS with an apparently small error of e.g. 4.9% for total distance can sufficiently describe the performance hierarchy between players. In agreement with previous studies [6,15], we found evidence that GPS units are capable of accurately measuring distances with low and moderate speed (see Table 4), whereas they still have problems with regard to tracking movements involving high-speed direction changes (e.g. 90-180° turns, see Table 3). GPS had the lowest sampling rate in this study (GPSports 15 Hz units are actually 5 Hz with interpolated data). Our results again confirm that a 5 Hz sampling rate only partially captures high-intensity movements involving frequent changes of direction. Similar findings have been identified by previous GPS validation studies [10,15]. The generally high deviations in the lowest speed category of all systems (standing, <1 km·h⁻¹) can be attributed to the fact that standing phases practically never occurred during the exercises, and thus the values for the total standing distance were very low. Minor differences could therefore lead to high deviations. The same applies to high-intensity categories such as high speed distance or high acceleration distance. It should also be noted that the RMSE% increases significantly as the movement intensity increases. This characteristic error pattern is particularly obvious in the case of high-speed categories.
Considering that the percentage deviation increases considerably in these relevant performance categories (e.g. high-speed distance during small-sided games: RMSE% ranged from 43.8% (LPS) to 97.6% (VID)), the present results confirm that EPTS may still not measure high-speed and high-acceleration distances with a reasonable degree of accuracy [13]. The RMSE in the peak speed category ranged between 0.22 m⋅s-1 (GPS during shuttle runs) and 0.71 m⋅s-1 (VID during shuttle runs). These values reveal the technology-dependent accuracy variations of the VID system. As the shuttle runs were performed along the vertical (perpendicular) camera axis (Y-axis), VID tended to overestimate the peak speed during shuttle runs.
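The total distance arithmetic and the effect of small reference values on percentage deviations can be retraced in a short sketch. The match figures (11.4 ± 1.0 km, 4.9%) are those quoted above; the helper `percent_error` and the 20 m, 1000 m, and 40 m values are hypothetical and serve only to show how the same absolute error scales with category size.

```python
def percent_error(absolute_error_m, reference_m):
    """Absolute error expressed as a percentage of the reference value."""
    return 100.0 * absolute_error_m / reference_m


# Figures quoted in the text: 11.4 km mean match distance, 1.0 km SD, 4.9% error.
match_distance_m = 11_400.0
match_sd_m = 1_000.0
error_m = 0.049 * match_distance_m   # discrepancy in metres
error_in_sd = error_m / match_sd_m   # same discrepancy in units of SD

print(f"discrepancy: {error_m:.0f} m ({error_in_sd:.2f} SD)")

# Hypothetical illustration: an identical absolute error of 20 m in a large
# category (1000 m total distance) and a small one (40 m high-speed distance).
total_distance_pct = percent_error(20.0, 1000.0)
high_speed_pct = percent_error(20.0, 40.0)

print(f"total distance error:      {total_distance_pct:.1f}%")
print(f"high-speed distance error: {high_speed_pct:.1f}%")
```

The second part shows why categories with very little accumulated distance, such as standing or high-speed running, are prone to large percentage deviations even when the absolute error is modest.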

Limitations
Regrettably, there is currently no gold standard that covers a full-size pitch in team-based sports. Since the natural field of application of EPTS is official matches, the systems might not be validated in the scenarios that are most relevant to them.
We could provide optimum environmental conditions for LPS and near-optimum conditions for GPS, but could only meet the minimum requirements in the case of the VID system. It can be assumed that the results of the VID system improve under optimum conditions, such as in stadiums with steep stands in close proximity to the pitch.
It is worth noting that our results are based on untreated raw data, as provided by the manufacturer's proprietary software. Therefore, it is to be expected that the validity of the tested EPTS could be further improved by additional data filtering procedures.
Finally, this study did not examine inter-unit agreement, i.e. systematic or random differences between different sensors in GPS and LPS systems. Inter-unit agreement is important, however, whenever valid comparisons between different players and sessions are of interest in the sensor-based systems [15].

Conclusion
Collectively, the results of this study revealed that the largest differences between EPTS occurred in spatial accuracy, whereas speed and acceleration errors of GPS were comparable to those of LPS. One important insight, however, is the noticeably large error margin in the third data category (accuracy of KPIs), which is independent of the respective system or technology and which EPTS in general still face. Especially in KPI categories that might have a high impact on practical decisions, such as high-speed performance indicators, we found significant deviations from the gold standard. Thus, the primary aim of future activities should be to diminish these inherent errors. Until then, it is recommended that practitioners do not make direct comparisons between KPIs collected by different EPTS. Since different systems are typically at work in competition and training, we encourage any development toward a standardization of internal algorithms. If no information is available on the differing operational definitions of filtering techniques or KPIs in different systems, sports practice is led astray. For the time being, one consequence is to conduct comparisons between EPTS at the level of XY data, instantaneous speed, and acceleration data, in addition to merely comparing calculated KPIs.
Supporting information
S1 Video. Body sway visualization. Exemplary 3D animation of the center of mass (COM) displacement. Although the body is static (it does not travel any distance in the conventional sense), the animation demonstrates that the COM is constantly in motion, which leads to an unintended accumulation of travel distance. This example demonstrates the need for a distance calculation method that compensates for this effect.
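The accumulation effect described for S1 Video can be reproduced numerically. The sketch below is not the distance calculation method used in this study: the sway magnitude (0.02 m), the sampling rate (10 Hz), the dead-band threshold (0.5 m), and the function names are assumptions, and the dead band is only one conceivable way of compensating for sway.

```python
import math
import random


def path_length(points):
    """Naive travel distance: sum of all sample-to-sample displacements."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def path_length_dead_band(points, min_step_m):
    """Count displacement only once the position has moved at least
    min_step_m away from the last accepted anchor point (hypothetical
    compensation scheme, assumed for illustration)."""
    total = 0.0
    anchor = points[0]
    for point in points[1:]:
        step = math.dist(anchor, point)
        if step >= min_step_m:
            total += step
            anchor = point
    return total


# Hypothetical standing athlete: 60 s at 10 Hz, COM swaying around the
# origin with a standard deviation of 0.02 m per axis (assumed values).
rng = random.Random(1)
standing = [(rng.gauss(0.0, 0.02), rng.gauss(0.0, 0.02)) for _ in range(600)]

naive_m = path_length(standing)
compensated_m = path_length_dead_band(standing, min_step_m=0.5)

print(f"naive distance while standing:       {naive_m:.1f} m")
print(f"dead-band distance while standing:   {compensated_m:.1f} m")
```

In this simulated case the naive sum attributes travel distance to an athlete who is standing still, whereas the dead band suppresses it. A real implementation would have to choose the threshold so that genuine slow movement is not discarded.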