Extracting spatial-temporal features that describe a team match demands when considering the effects of the quality of opposition in elite football

Spatiotemporal patterns of play can be extracted from competitive environments to design representative training tasks and to understand the underlying processes that sustain performance outcomes. To support this statement, the aims of this study were: (i) to describe the collective behavioural patterns that rely upon player positioning in interaction with teammates, opponents and ball positioning; and (ii) to define the underlying structure among the variables through the application of a factorial analysis. The sample comprised a total of 1,413 ball possession sequences, obtained from twelve elite football matches from one team (the team ended the season in the top-5 position). The dynamic positions of the players (from both competing teams), as well as the ball, were captured and transformed to two-dimensional coordinates. Data included the ball possession sequences from six matches played against top opponents (TOP, the three teams classified in the first 3 places at the end of the season) and six matches against bottom opponents (BOTTOM, the three teams classified in the last 3 places at the end of the season). The variables calculated for each ball possession were the following: ball position; team space in possession; game space (comprising the outfield players of both teams); position and space at the end of ball possession. Statistical comparisons were carried out with magnitude-based decisions and null-hypothesis analysis, and a factor analysis was used to define the underlying structure among variables according to the considered contexts. Results showed that, when playing against TOP opponents, possessions used ~38 meters of game length per ~43 meters of game width, with a coefficient of variation of 12%. Ball possessions lasted for ~28 seconds and tended to end at ~83 m of pitch length. Against BOTTOM opponents, a decrease in the game length with an increase in game width and in the deepest location was observed in comparison with playing against TOP opponents. The duration of ball possession increased considerably (~37 seconds), and the ball speed entropy was higher, suggesting lower levels of regularity in comparison with TOP opponents. The BOTTOM teams revealed a small effective playing space (EPS). The Principal Component Analysis showed a strong association among the ball speed, the entropy of the ball speed and the coefficient of variation (%) of the ball speed. The EPS of the team in possession was well correlated with the game space, especially the game width when facing TOP opponents. Against BOTTOM opponents, there was a strong association among ball possession duration, game width, distance covered by the ball, and length/width ratio of the ball movement. The overall approach carried out in this study may serve as the starting point to elaborate normative models of positioning behaviour measures to support coaches’ operational decisions.


Introduction
One of the biggest challenges in team sports is to validate performance indicators that contribute to optimize the coaching process and competition outcome [1]. Within the traditional methods of performance analysis in team sports, the notational analysis has been used to obtain indicators of discrete actions and/or events by using advanced statistical procedures [2,3]. This approach allows to understand the static complexity of performance, to produce a valid and reliable description of individual and team behaviours and to describe teams' performance by correlating a wide range of variables [4,5]. Also, the players' physical and physiological competitive demands [6] and their comparison with training have been incessantly investigated over the last years, allowing to identify different match profiles according to the distances covered at different speed thresholds [7][8][9]. The identification of such profiles brought relevant aspects to plan the physical loads from short to mid-term planning guidelines and also to minimize the fatigue and the risk of injuries [10,11].
However, as such variables and methods describe the match results, they are limited in underpinning behaviours that lead to understand or even predict the outcomes. In line with these concerns, the players' position data has recently emerged as one of the key determinants of team sports performance [12][13][14][15]. Such analysis allows to capture the players' and teams' spatiotemporal dynamics at different levels of analysis (from individuals to teams) with the integration of the contextual circumstances. Accordingly, several new instruments, procedures, processing techniques, and new visuals may be incorporated into the performance analysis scope to complement the static complexity analysis and attend the new dimension of questions related to the dynamic complexity.
In fact, the training process in modern association football is a multifactorial process that requires high complementarity among physiological, technical and tactical workload prescriptions. To accomplish these goals, contextual information, such as field location, team strategy and opponent behaviour, that sustains players' and teams' tactical behaviour should be integrated in the analysis of physical loads [16][17][18]. That is, players' physical demands over competition are constrained by their positioning dynamics [1,19] that sustain the individual and collective tactical behaviour [20]. The tactical behaviour of players and teams is regulated by spatial-temporal informational constraints within the teams' collaborative principles, which consequently affect match physical demands. In this line of reasoning, ideas and methods from dynamical systems have been used to integrate different dimensions of analysis [21][22][23][24] and to explain how system working parts are connected and how they continuously adapt over time [1,25]. The main idea is that the players' and teams' behaviours should be considered as a whole, where systems with many dynamically interacting elements are capable of wide-ranging patterns of behaviour [22]. In this regard, substantial advances have been made in processing techniques aiming to assess regularities in players' and teams' positioning-derived data [26,27]. Entropy computations serve this purpose by measuring the probability that the configuration of one segment of data in a time series will allow the prediction of the configuration of another segment of the time series a certain distance apart [28]. This technique has been used to identify whether players' positioning dynamics express a predictable pattern, which may provide insights about the local information sources that are being used in their match decisions [26,29][30][31].
Also, recent research has developed metrics to identify the determinants of collective behaviour that may optimise the players' ability to attune their decisions within teammates during mutual tasks [12,15]. For instance, the entropy of players' distance to the centroid was suggested to be an order parameter that represents the use of local information in the decision making and team-related behaviours [26,29]. Other metrics from the match-level of analysis, such as teams' length and width, have been used to provide information regarding space occupation [32,33].
The identification of individual and collective patterns of play according to game environment and teams' collaborative principles over competition allows coaches to improve the process of planning, designing and executing training tasks. In fact, match analysis makes it possible to extract relevant and transferable information from competition to practice. Based on the information derived from competition, the training tasks should ensure a representative design, i.e., training is intended to represent the competitive context so practitioners can experience a similar landscape of perceptual-motor relations and exploit the inherently adaptive nature of their perceptual systems in their interactions with environments [34][35][36]. The main goal is to improve the transfer of technical, physical, and individual and collective tactical behaviours from training to competition by sampling the informational constraints that players use to perform under the competitive environment [37]. Therefore, variables such as space of play, time, numerical relations, distance to the goal or even ball trajectory might play an important role in assisting the fundamentals of coaching.
Therefore, the aim of this study was to extract spatial-temporal patterns of play that characterize a team's match demands when considering the effects of the quality of opposition. To fulfill this main goal, the approach was carried out in two steps: (i) to describe the collective behavioural patterns that rely upon players' positioning in interaction with teammates, opponents and ball position; and (ii) to define the underlying structure of the derived variables through the application of a factorial analysis. It was hypothesized that accounting for the quality of opponents might reveal important variations in team behaviour as a function of ball possession. This information is key for tactical planning and would enhance the transferability of match demands to training tasks. Also, this approach may serve as the basis to elaborate normative models of positioning behaviour measures to support coaches' decisions.

Sample and data collection
The sample of this study comprised a total of 1,413 ball possession sequences, obtained from twelve elite football matches from the English Premier League. The dynamic positions of the players (from both competing teams), as well as the ball, were captured and transformed to two-dimensional coordinates using the TRACAB Optical Image Tracking System at 25 Hz. The system uses super-HD cameras and patented image processing technology to deliver live tracking of all moving objects with a maximum delay of just three frames (https://chyronhego.com/). It is used continuously in several leagues (English Premier League, German Bundesliga and Spanish La Liga) and has also provided data for several studies [38,39]. Data included the ball possession sequences from six matches of one team when playing against top opponents (TOP, the three teams classified in the top 3 at the end of the season) and six matches against bottom opponents (BOTTOM, the three teams classified in the bottom 3 at the end of the season). The team ended the season in the top-5 position. The study protocol was approved by, and followed the guidelines stated by, the Ethics Committee of the University of Trás-os-Montes and Alto Douro, based at Vila Real (Portugal), and conformed to the recommendations of the Declaration of Helsinki.

Processing and variables
Ball possessions were selected, processed and analysed based on the following inclusion criteria: (i) minimum possession duration of 8 s (for the purposes of the nonlinear computations described below); (ii) possessions without set pieces. A ball possession for each team started when a player performed an action with the ball (following an action from an opponent or a game interruption) and ended when the ball was lost or when the ball went out of the pitch after a shot. For each ball possession sequence, several variables were calculated based on: ball position; team space in possession; game space (comprising the outfield players of both teams); position and space at the end of ball possession. The absolute mean values and the coefficient of variation (CV%) were calculated for each variable. The normalized approximate entropy (ApEn), a measure of regularity derived from the original ApEn [27], was calculated for the speed of the ball in each possession sequence. This normalization allows the comparison between ball possession sequences whose time series have different lengths. The ApEn technique was used to assess the regularity or predictability of the ball speed time series, with input values of 2 for the vector length (m) and 0.2 standard deviations for the tolerance factor (r) [40]. The outcome ranges between 0 and 2 (arbitrary units), and lower values represent more repeatable, regular, predictable and less chaotic sequences of data points [28,41]. Also, according to previous calculation recommendations [40], to increase the reliability of the normalized ApEn computation, only ball possessions of at least eight seconds (8 seconds × 25 Hz = 200 frames) were considered for further analysis.
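As an illustration of the processing described above, the following minimal Python sketch shows how the duration criterion, the CV% and the basic space measures could be computed for one frame of positional data. It is not the authors' pipeline: the function names are hypothetical, the x axis is assumed to run along the pitch length and the y axis along the width, and the effective playing space is assumed here to be the area of the convex hull enclosing the players.

```python
from statistics import mean, stdev

FRAME_RATE_HZ = 25   # tracking frequency reported in the text
MIN_DURATION_S = 8   # minimum possession duration reported in the text


def long_enough(n_frames):
    """True when a possession spans at least 8 s x 25 Hz = 200 frames."""
    return n_frames >= FRAME_RATE_HZ * MIN_DURATION_S


def cv_percent(series):
    """Coefficient of variation (%) = 100 * sample SD / mean."""
    return 100.0 * stdev(series) / mean(series)


def convex_hull(points):
    """Convex hull of (x, y) tuples (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def space_measures(points):
    """Length, width and hull area for one frame of (x, y) player positions.

    Assumption: x runs along the pitch length and y along the pitch width.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs), max(ys) - min(ys), polygon_area(convex_hull(points))
```

Applying `space_measures` frame by frame and then summarising each resulting series with its mean and `cv_percent` would give one value pair per possession, which is the form in which the variables are reported.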
From a processing approach, ApEn expresses the probability that the configuration of one segment of data in a time series will allow the prediction of the configuration of another segment of the time series a certain distance apart [42]. It makes it possible to identify whether the ball displacement trajectories express a regular and predictable pattern, which may, in turn, provide information about dynamical behaviours [23,43].
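The regularity measure described above can be sketched as follows. This is a minimal implementation of the standard approximate entropy with the reported inputs (m = 2, r = 0.2 standard deviations); it does not include the normalization step of the variant cited as [27], and the function name and the use of the population standard deviation are assumptions made for illustration.

```python
from math import log
from statistics import pstdev


def approximate_entropy(series, m=2, r_factor=0.2):
    """Standard ApEn of a 1-D series (not the normalized variant).

    m        : vector (template) length
    r_factor : tolerance, as a multiple of the series standard deviation
    """
    x = list(series)
    n = len(x)
    r = r_factor * pstdev(x)

    def phi(dim):
        # All overlapping templates of length `dim`.
        templates = [x[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r (Chebyshev distance),
            # self-match included, as in the original definition.
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)
```

A strictly alternating series yields a value close to zero, whereas an irregular series yields a clearly higher value, in line with the interpretation that lower values represent more regular and predictable sequences. The pairwise comparison is quadratic in the number of frames, which is manageable for possessions of a few hundred frames.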

Data analysis
The variables extracted from each ball possession sequence were analysed according to the quality of the opponent (TOP or BOTTOM). The descriptive outcomes were graphically represented using split-violin plots and absolute values were presented in tables. Statistical comparisons were carried out with both magnitude-based decisions and null-hypothesis analysis. For the magnitude-based inferences (MBI), and prior to the comparisons, all processed variables were log-transformed to reduce the non-uniformity of error. Non-clinical inferences were assessed via differences in group means, expressed in percentage with 95% confidence limits (CL). The threshold for a change to be considered practically important (the smallest worthwhile difference) was 0.2 times the standardisation estimated from the between-subject standard deviation. The following magnitudes of clear effects were considered: <0.5%, most unlikely; 0.5-5%, very unlikely; 5-25%, unlikely; 25-75%, possibly; 75-95%, likely; 95-99%, very likely; and >99%, most likely [44,45]. Also, the comparisons were assessed via standardized mean differences and respective 95% confidence intervals. Thresholds for effect size statistics were 0.2, trivial; 0.6, small; 1.2, moderate; 2.0, large; and >2.0, very large [46]. For the null-hypothesis analysis, and after verifying the assumptions of normality and homogeneity of the data, an independent t-test was conducted to evaluate the differences in variables for each comparison context, and statistical significance was set at p < .05.
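The standardized comparison described above can be illustrated with a short Python sketch. It is not the spreadsheet used in the study [48]: the function names are hypothetical, the pooled standard deviation is assumed as the standardiser, the percent difference is assumed to be obtained by back-transforming the difference in log means, and the reported thresholds are read as upper bounds of each band (for example, values below 0.2 are trivial and values from 0.2 up to 0.6 are small).

```python
from math import exp, log, sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def magnitude(d):
    """Qualitative label for an effect size (assumed reading of the thresholds)."""
    d = abs(d)
    if d < 0.2:
        return "trivial"
    if d < 0.6:
        return "small"
    if d < 1.2:
        return "moderate"
    if d < 2.0:
        return "large"
    return "very large"


def percent_difference_log(group_a, group_b):
    """Difference in means (%) after log-transformation, back-transformed."""
    diff = mean(log(v) for v in group_a) - mean(log(v) for v in group_b)
    return 100.0 * (exp(diff) - 1.0)
```

For example, two groups whose means differ by one pooled standard deviation give |d| = 1.0, which falls in the moderate band under this reading.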
A factor analysis was performed to define the underlying structure among variables according to the considered contexts (playing against TOP and BOTTOM opponents). The principal components analysis (PCA) method, with Varimax rotation, was preferred since it is the most appropriate when data reduction is paramount. Bartlett's test of sphericity was computed to test whether the correlation matrix has significant correlations among at least some of the variables. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was also computed to evaluate the appropriateness of applying factor analysis, considering that values above .50 for the entire matrix or an individual variable indicate appropriateness. The number of factors to be retained was based on eigenvalues (greater than 1.0) and on an explained percentage of variance higher than 60%. Afterward, although factor loadings of ±.30 to ±.40 are minimally acceptable, values greater than ±.60 were considered for practical significance (for all processing decisions please see Hair and colleagues [47]). MBI were carried out using a specific spreadsheet to compare two group means [48]. The independent t-test and the PCA computations were conducted using the Statistical Package for the Social Sciences software (IBM Corp. Released 2016. IBM SPSS Statistics for Macintosh, Version 24.0. Armonk, NY: IBM Corp). The graphical representations were processed using R [49] with open-source R-packages for performing the split-violins [50] and the 3D rotated plots [51]. and Table 2 show the descriptive and inferential analysis between the matches played against BOTTOM and TOP opponents, respectively. As a complement, Fig 5 presents the standardized (Cohen's d) differences.
Table 3 and Table 4 present the component factor loadings, component statistics, Bartlett's test of sphericity and Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy of the factor analysis (principal component methods) for considered variables in the games against TOP and BOTTOM opponents, respectively. Finally, Fig 6 shows the component factors of the factor analysis in 3D space according to the quality of the opponents.
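The retention rules stated for the factor analysis can be sketched in a few lines of Python. This is only one possible reading of those rules, not the SPSS procedure used in the study: components with an eigenvalue above 1.0 are retained, the retained solution is then checked against the 60% explained-variance criterion, and only loadings above ±.60 are kept for interpretation. The function names, the eigenvalues and the loadings in the usage note are hypothetical.

```python
def retain_components(eigenvalues, min_eigenvalue=1.0, min_explained_pct=60.0):
    """Apply the eigenvalue > 1.0 rule and check the explained variance.

    Returns (number retained, cumulative explained variance in %,
    whether the 60% criterion is met).
    """
    total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
    kept = [ev for ev in sorted(eigenvalues, reverse=True) if ev > min_eigenvalue]
    explained = 100.0 * sum(kept) / total
    return len(kept), explained, explained > min_explained_pct


def salient_loadings(loadings, threshold=0.60):
    """Keep only loadings whose absolute value exceeds the threshold."""
    return {name: value for name, value in loadings.items() if abs(value) > threshold}
```

With hypothetical eigenvalues of 3.0, 1.5, 1.2, 0.8, 0.3 and 0.2, three components would be retained and would together explain about 81.4% of the variance, so both criteria would be met.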

Results
The changes in position-derived variables showed different trends according to the quality of the opponents. When facing TOP opponents, trivial to small differences were observed in all the considered variables. The ball speed and ApEn were likely lower (p < .01, small effect) for TOP opponents, while higher values were observed for CV% and length/width ratio (p < .001, small effect). The team space occupation and position showed likely small higher values of EPS (p < .001) when the team was in possession (Team = 984.95±212.22 m² vs TOP = 908.8±242.02 m²) with lower CV% (p < .001, small effect). The team's possessions tended to end closer to the opponents' goal (nearly trivial effect, p = .003, Team = 83.2±13.0 m vs TOP opponents = 80.2±13.9 m).
When analysing matches against BOTTOM opponents, the variables changed substantially (differences from small to nearly large according to Cohen's d effects). The ball possessions of the team were longer in duration (higher 13.9%; ±8.0%, p = .009, small effect) and higher in both distance covered and ball speed (p < .001, moderate effect) compared to the ball possessions of the BOTTOM teams. This trend was accompanied by a moderately lower ball speed variability (lower 20.8%; ±5.6%, p < .001) and a moderately higher ball speed ApEn during the possession (29.6%; ±7.1%, p < .001). The variables related to team space occupation and position presented moderately higher values in the effective playing space of the team (Team = 1058.6±188.3 m² vs BOTTOM = 862.9±223.6 m²) with a moderately lower CV% (lower 31.2%; ±9.1%, p < .001). (See Table 3 and Fig 6, upper panel, for complementary information.)

In the analysis of the team against BOTTOM opponents, the principal component model accounted for 71.9% and 72.7% of the total variance, respectively (see Table 4 and Fig 6, lower panel, for complementary information).

Discussion
The aim of this study was twofold: (i) to describe the collective behavioural patterns that rely upon players' positioning in interaction with teammates, opponents and ball position; and (ii) to define the underlying structure of the derived variables using factorial analysis. As was hypothesized, the quality of opponents promoted substantial variations in team tactical behaviour. The descriptive analysis may provide key information to plan, design and execute representative tasks during the training process. In addition, the PCA revealed the determinants of collective tactical behaviour that coaches should consider when designing representative training tasks. Improving performance requires that coaches use a multifactorial training process that integrates information from physical, technical and tactical game requirements [12,52]. Aligning teammates' performance with the collective strategic plan, as well as considering the specificities of the opponent team, is fundamental to ensure the transfer between practice and competition [34][35][36]. Consequently, the daily challenge for coaches' intervention is to identify, manipulate and monitor the variations in spatial-temporal information constraints within the simulated scenarios [36]. Despite some very interesting approaches that highlight the importance of representativeness in youth players' training [53,54], it is still not clear how to gather the information on the individual and collective responses and the different levels of contextual unpredictability that the training process should embrace to reflect the competitive reality. In this sense, a deeper analysis of match outcomes might help to guide coaches in the planning, designing and execution of more ecological training tasks.
In other words, the practice should promote the emergence of match specific behaviours and favour a high degree of transferability of players' behaviour from the training drills to competition [34].
Previous research explored the effect of teams' performance level on spatial variables, revealing that there is a higher use of the width of the pitch to the detriment of the length, and that first division teams display greater depth than second division teams [55]. Also, results suggested that the playing space is influenced by the ball location. In fact, playing lengths tend to decrease while widths tend to increase as the ball moves from the goal area to the midfield zones [56]. Accordingly, the rectangle comprising all the outfield players has been characterized by lower values of length and higher values of width when the ball is in the central area of the pitch [57]. While these studies have added important information regarding space occupation, further practical information for designing playing area dimensions may be obtained by considering the dynamics between the opposing teams while accounting for the quality of opposition. In the present study, different playing spaces emerged when comparing the players' performance according to the opposition quality. For example, when facing TOP opponents, the possessions used ~38 meters of game length per ~43 meters of game width, with a coefficient of variation of 12%. This information can be directly applied in the design of training tasks. When the scope of the task is working on functional solutions near the scoring area, the playing space constraints can be parametrized by these results. Additionally, the possessions lasted ~28 seconds and, at the end of the ball possession, the deepest location of the offensive player was at ~83 m of the pitch length. These behavioural features reflect the temporal adaptations to new spatial structures and configurations generated by the interactions of players' sub-groups as match situations dynamically change [22,23,58,59].
These results can be used as spatiotemporal guidelines with higher transferability from match to training tasks because of their high level of representativeness. Overall, the final aim is to enhance the development of players' tactical awareness through the creation of training tasks based on contexts that stimulate decision-making under strong time pressure.
The length, width and surface area have been used before to explore the effects of several manipulations on the players' performance during small-sided games [60][61][62]. The same metrics have also been analysed in match context according to the quality of the opposition, and the results showed higher values for offensive situations when playing against weak teams [63]. In this study, we updated this information with complementary insights. Against BOTTOM opponents, the deepest location increased, consequently making the scoring area smaller for the team. The playing game space increased in width and decreased in length. In addition, the duration of ball possession increased considerably (~37 seconds) and the ball speed ApEn was higher, suggesting lower levels of regularity. The combination of these match features is associated with a lower length/width ratio of ball displacement, suggesting passing actions towards the lateral direction. This might be the result of defensive compactness, which requires the use of the pitch width to create space. In fact, previous research has shown that teams of lower quality are likely to defend closer to their target to restrict space and time [63], which may lead to a higher use of the pitch width by the offensive team to break the defensive stability and create space for shooting opportunities. Practical applications of these results can be observed in Fig 7. The spatial references (game area, scoring area, etc.) represent real values and may be used as baseline information to create task structural boundaries. Accordingly, coaches may use complementary constraints (e.g. number of touches allowed to increase/decrease the ball speed, floater players to give higher depth or width to the possession, the goal of the exercise to manipulate the ball possession duration, etc.) to develop and practice team strategies and boost task representativeness.
Previous research has shown that weaker teams are likely to adopt a more direct playing style as a pattern of play [64]. While this behaviour has been attributed to their lower ability to maintain ball possession, the results from this study may provide additional information. From the perspective of the BOTTOM teams in possession, players face higher pressure since their EPS is small, even offensively (864.21±222.94 m²). This pressure affected the deepest location, which decreased considerably (~77 meters), creating an open space in the defensive back (of the TOP opponents). This open space may invite the BOTTOM teams to exploit it with a more direct playing style. In fact, the increased ball length per width ratio (~2 arbitrary units) seems to support this idea. Nevertheless, it is important to highlight some spatiotemporal boundary conditions so that players can adequately reproduce, during training, the interaction dynamics that are mediated by match environmental information.
By integrating association mining with PCA, this study extends the application and understanding of match analysis through a statistical compression mechanism. Previous studies have already explored this method, in which frequent patterns are analyzed for their interrelationships in order to generate associations in players' performances [65][66][67][68][69]. Results related to notational analysis suggested shots, shots on goal, playing time with ball possession and percentage of ball possession as the most important variables to discriminate winning teams from drawing and losing teams [68]. Additionally, winning teams have already been reported to exhibit different and consistent profiles from drawing and losing teams, mainly discriminated by their ability to recover the ball in zone 2 (close to the middle line in the defensive half), and to organize the offense using penetrative passes to the penalty area to increase the number of shots and, consequently, goals [66]. The present study extends the PCA approach to the spatiotemporal variables, aiming to define their underlying structure in terms of emergent collective behaviour, as well as to model how the quality of opposition affects these relations. The outcome can be used as key information to understand how the manipulation of tasks will change the variables' association. Also, it may be used by coaches to verify whether the essence of the tactics or playing style practiced in sessions is performed during competition. The matches played against TOP opponents showed similar trends. The first three components explained more than 50% of the data variance, showing that the increase of ball speed and ApEn of the ball speed were associated with ball speed CV%.
The increase in the deepest location at the end of ball possession generated less available offensive space, as expected, and the EPS of the team in possession was well correlated with the game space, especially the game width. Bringing this information to the training process, coaches can increase/decrease the effective playing space by changing the pitch width, and any consequent changes in ball movement (number of touches allowed, for example) will promote changes in ball speed variability (in magnitude, CV%, and in structure, ApEn). A recent study explored the soft-assembly of tactical patterns and the timescales of positioning-derived variables that define them during a soccer match, allowing an understanding of the multilevel organization of tactical behaviours as defined by the timescales of evolution of collective patterns [65]. In fact, the authors showed how teams behave during a competition and how these behavioural patterns change over the course of the match, influenced by the different constraints. However, there was still a gap to be filled regarding the boundary conditions that could be applied to training tasks so that the players' ability to adapt to different context information would be practiced at a rate of change which reflects competition.
The PCA for the matches played against BOTTOM opponents provided further relevant context information. For the team, the principal components showed a strong association among ball-related variables. The ball possession duration was strongly related to the game width and distance covered by the ball, and negatively related to the length/width ratio of the ball movement. Moreover, these results are connected with the descriptive analysis, where it was shown that the team needs to increase the width of the game in order to create open space. In fact, stronger teams with more highly skilled players seem to be able to sustain ball possession for a longer time and develop longer passing sequences, thus producing more goals per possession than shorter passing sequences [64].
Further research can include all levels of performance because the current study only explored the impact of playing against top and bottom teams. Also, technical and physical performance data can be considered in a more multidimensional and integrated approach, to increase the understanding of how players base their game interactions and, thus, constitute a solid criterion for fine-tuning the training process and performance modeling.

Conclusion
Spatiotemporal patterns of play that sustain the collective tactical behaviour of teams can be extracted from matches to design highly representative training tasks and provide a better understanding of the underlying processes contributing to performance outcomes. In line with this statement, the ball-position-derived variables added novel information to describe the collective behaviour patterns and should be considered in the analysis. As expected, the quality of opponents promotes a great variation in team behaviour as a function of possession and is an important factor to be considered. When playing against TOP opponents, the team possessions lasted for ~28 seconds, in a playing space of ~38 meters of length per ~43 meters of width and with the deepest location of the offensive player at ~83 m of pitch length. Against BOTTOM opponents, the deepest location increased, as well as the game width. These results seem to emerge as the team attempts to explore the lateral spaces of the pitch to break the defensive stability of the BOTTOM teams, leading to a decrease in the length/width ratio of ball displacement and an increase in the duration of ball possession (~37 seconds). From the perspective of BOTTOM teams in possession, players face higher pressure, reflected in a small EPS. Furthermore, the open space created behind the opposing defence may afford the BOTTOM teams a more direct playing style, which can be inferred from the increase in the ball length per width ratio (2 a.u.). The PCA provided relevant complementary information to define the underlying structure among the variables and showed a strong association among ball speed, ApEn and CV% of the ball speed. The EPS of the team in possession was correlated with the game space, especially the game width against TOP opponents. Against BOTTOM opponents, there was a strong association among ball possession duration, game width, distance covered by the ball, and length/width ratio of the ball movement.
The overall approach carried out in this study may serve as a baseline to elaborate normative models of positioning behaviour measurements to support coaches' operational decisions from a holistic, representative and complex point of view that helps to design highly representative training tasks.

Supporting information
S1 Video. Video animation produced from the real calculated variables for better visualization of the underlying dynamics. The video ends when the ball goes out of play; however, for data processing, the ball possession ended at the exact time frame when the player performed the shot (represented in Fig 1). (MP4)