Maximizing ball movement unpredictability in association football: A Rényi entropy-based approach to optimizing event distribution randomness

  • Ishara Bandara ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    isharabnd@gmail.com

    Affiliations School of IT, Deakin University, Melbourne, Victoria, Australia, Research Centre for Fluid and Complex Systems, Coventry University, Coventry, United Kingdom

  • Sergiy Shelyag,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations School of IT, Deakin University, Melbourne, Victoria, Australia, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia

  • Sutharshan Rajasegarar,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation School of IT, Deakin University, Melbourne, Victoria, Australia

  • Dan Dwyer,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation Centre for Sport Research, Deakin University, Melbourne, Victoria, Australia

  • Eun-jin Kim,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation Research Centre for Fluid and Complex Systems, Coventry University, Coventry, United Kingdom

  • Maia Angelova

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Aston Digital Futures Institute, Aston University, Birmingham, United Kingdom, Institute for Biophysics and Bioengineering, Bulgarian Academy of Sciences, Sofia, Bulgaria

Abstract

Modern football prioritizes team play and tactical strategies over individual brilliance. However, its low-scoring nature makes evaluating team performance challenging. Unpredictable ball movement enhances offensive play while complicating defensive setups. To better capture this dynamic nature, the authors’ prior work proposed an entropy-based time-series metric that assesses unpredictable ball movement by quantifying Spatial Event Distribution Randomness (EDRan). However, some teams may prefer to dominate specific areas with unpredictability, while others utilize the entire field. Existing literature has not examined whether emphasizing dominant regions (field regions frequently used for ball movement) or weighting all regions equally, including rarely used areas, is the more effective approach for computing randomness in event distribution. Moreover, existing research has not investigated the underlying patterns of event distribution randomness, particularly how they differ between winning and losing teams in terms of both overall field coverage and concentration within dominant regions. This study addresses these gaps by analyzing event distribution randomness using Rényi entropy with varying alpha values (0 ≤ α ≤ 2). Correlation analysis indicated that assigning equal weight to all field regions, including rarely used areas, with Max entropy (α = 0) was most strongly associated with match-winning performance. In men’s data, machine learning models trained with alpha and 0.5 achieved statistically significant improvements over models trained with the traditionally used Shannon entropy (α = 1). These results suggest that unpredictability distributed across the entire field, maximizing the use of diverse regions, is more strongly associated with success than randomness restricted to dominant areas. The best-performing model, obtained with alpha, significantly outperformed both the baseline and existing models in the literature, achieving an accuracy of 80.61% in predicting match winners.

1 Introduction

Association football is a highly dynamic and strategic sport in which team coordination and tactical execution play a crucial role in a team’s success. Thus, most teams value the role of the manager or tactical decision maker more highly than that of individual star players. Unlike sports that offer frequent scoring opportunities, association football is a low-scoring invasion sport in which a team attacks while defending against the opposition’s attacks, and shooting at goal is the only way of scoring [1,2]. Association football can be viewed as a complex system [3], and its results are often influenced by chance or luck [4–6]. Thus, performance evaluation becomes challenging and often requires advanced analytical approaches.

This unpredictability and randomness in match outcomes can be influenced by controllable factors such as tactics [7] and skill [8], as well as uncontrollable factors like chance or luck [4,5]. Since scoring in association football is only possible through shots on goal, widely used offensive performance metrics, such as expected goals (xG), assess the probability of scoring based on statistical factors influencing shot success, including distance from goal, shooting angle, and goalkeeper positioning [9–11]. However, recent studies have introduced expected goals models that account for the sequence of events leading to a shot, demonstrating that prior events significantly influence shot success probability [12].

Ball possession is another commonly analyzed performance metric in the literature [13–16]. While some studies suggest a positive correlation between possession and match-winning performance [13,14], others report no significant relationship between possession and victory probability [15,16]. These conflicting findings may be due to a lack of time-series analysis, as teams adjust their strategies based on various factors such as game state and opposition strength [17,18]. Guan et al. (2023) revealed that teams leading by a single goal adopt a more cautious approach, particularly in the latter stages of a match or when facing a strong opponent [17,18]. Additionally, some teams prioritize possession-based tactics, emphasizing ball retention, whereas others favor direct play, focusing on long passes rather than prolonged possession. Existing research indicates that “direct-play” strategies and counterattacks are more effective than possession-based tactics [7,19]. For these reasons, possession alone may not provide comprehensive insights into team performance.

Other traditional methods for evaluating offensive performance have often relied on event-based statistics, such as the number of shots, shots on target, and completed passes [20–23]. However, these event-based metrics fail to capture the temporal aspects of the game and often overlook the complexity of ball movements and structured or unstructured tactical plays.

The open space available to the ball-carrier has been evaluated in the literature as an alternative metric of offensive performance, as today’s football tactics focus on creating open space to generate scoring opportunities [24–27]. However, to create open spaces in the field, the opposition defense must be disrupted. Therefore, both offense and defense play a crucial role in invasion sports such as association football. In defense, swift decision making is necessary to stop opposition attacks, and predictable ball movements by the offensive team make it easier for opposition defenders to stop those attacks. Recent research has explored entropy-based models to quantify randomness in ball movements, considering spatial event distribution randomness (EDRan) [28] and player-to-player interactions [29–31], offering a new perspective on offensive unpredictability.

While some previous work has evaluated player-to-player interaction randomness as a performance evaluation metric [29–31], players are subject to substitutions, role changes, and positional adjustments. As a solution, the authors proposed Spatial Event Distribution Randomness (EDRan) as an alternative measure of randomness in team performance, assuming that ball movement is more critical than player movement for the analysis of offensive team performance [28]. EDRan was introduced as a time-series team performance evaluation metric that exhibits a positive correlation with match-winning performance. In previous studies, a simple machine learning model built with a Generalized Linear Model (GLM), utilizing only EDRan data collected across ten time intervals, achieved an accuracy of 79.95%, highlighting its significance as a team performance evaluation metric in association football. To quantify spatial event distribution randomness (EDRan), previous research proposed segmenting the football field into predefined regions and generating probability distributions based on ball movement events within each region. Shannon entropy, which accounts for each region proportionally to its ball movement events, was used as the metric to measure randomness in these distributions.

However, football tactics vary from team to team. Modern football tactics continue to evolve [32], with some teams emphasizing possession-based play to control the ball and create goal-scoring opportunities, while others adopt “direct-play” or defensive strategies, allowing the opposition to attack and capitalizing on counterattacks. Additionally, while some teams prefer certain regions of the field (e.g., some teams favor the flanks while others favor the central channel to move the ball), others utilize the full width of the field to create space. As a result, certain teams may dominate specific areas, while others may try to distribute their actions more randomly across the field [33,34]. Taki et al. defined a player’s dominant region as the area they can reach before any other player; combining the dominant regions of all players on a team forms the team’s dominant region [33]. However, earlier studies have not evaluated whether placing greater or lesser emphasis on dominant or rarely used regions is effective for quantifying EDRan, and have not explored how the temporal variation in these regions differs between men’s and women’s games.

To address these gaps in the literature, this study investigates whether placing greater emphasis on more frequently used regions, assigning equal weight to all regions, including rare ones, or utilizing Shannon entropy, which proportionally weights all regions based on their probabilities, correlates more strongly with match-winning performance. As an alternative to Shannon entropy, this study employs Rényi entropy with varying α values (0 ≤ α ≤ 2) to determine whether an entropy measure that is more sensitive to rare events or one that prioritizes frequently used regions is more suitable for quantifying event distribution randomness (EDRan). This study defines dominant and rare regions according to how frequently they are used for ball movement.

  • Dominant regions are defined as areas of the field that teams frequently utilize for ball movement, where the likelihood of a ball movement event (e.g., a pass) occurring is significantly higher compared to other regions.
  • Rare regions are defined as areas of the pitch that teams infrequently utilize for ball movement, where the likelihood of a ball movement event occurring is significantly lower compared to other regions.

Rényi entropy, introduced by Alfred Rényi in 1961 [35], generalizes classical Shannon entropy [36] by incorporating a parameter α, providing a more flexible framework for quantifying information diversity.

The analysis of this work was conducted separately for men’s and women’s game data, focusing on the temporal dynamics of EDRan computed using Rényi entropy across varying α values. Match-winner prediction models were developed using data across several α values, and based on these results, an advanced model was proposed to investigate the relationship between temporal performance metrics and match outcomes.

Investigating these aspects could contribute to improving the accuracy of match outcome prediction models based on EDRan, facilitating more effective assessments of team performance that are independent of final scores. Moreover, it offers deeper insights into gameplay dynamics, including temporal patterns, the extent to which randomness in dominant versus rarely used regions correlates with winning outcomes, and how such randomness evolves over time.

2 Materials and methods

Ethics approval for this study was obtained from the Coventry University Ethics Approval Committee (Approval Number: P174511). All procedures performed in this study were in accordance with the ethical standards of Coventry University, UK.

This study employed a retrospective observational design based on secondary data from professional football matches. A cross-sectional comparative framework was applied, analyzing event distribution randomness across matches to compare winning versus losing teams. No interventions were introduced, and all analyses were conducted on previously recorded match event data.

Considering that some teams concentrate ball movement in specific regions of the field (dominant regions) while others distribute play across the entire field, this study compares three approaches to quantifying spatial event distribution randomness (EDRan): (i) emphasizing dominant regions (frequently used regions for ball movement), (ii) assigning equal weight to all regions, including rarely used ones, and (iii) using Shannon entropy, which weights regions proportionally to their event distribution probabilities. To facilitate this comparison, Rényi entropy is applied with α values in the range 0 ≤ α ≤ 2, using region-based cumulative possession matrices as proposed in past investigations [28]. The football field is divided into 30 named regions of equal area, each game is divided into ten time intervals, five equal intervals per half (ti; i = 1, …, 10), and region-based cumulative possession matrices are constructed to derive the probability distribution associated with a team’s event distribution during each time period ti. These probability distributions are then used to quantify randomness with different α values in Rényi entropy, allowing the optimal α value for EDRan to be identified through a correlation analysis with match-winning outcomes and the development of match winner prediction models. Fig 1 provides an overview of the framework of this study.

All data processing and analyses in this work were conducted in Python 3.11.13 using scikit-learn 1.6.1, pandas 2.2.2, numpy 2.0.2, matplotlib 3.10.0, and scipy 1.16.1 within the Google Colab environment. The experiments in this study were executed on commodity hardware in a CPU-only environment, using a personal laptop (16 GB RAM); no GPUs were employed.

2.1 Data

This work used the StatsBomb open data set [37], which is a publicly available event-log dataset containing match data from top-tier men’s and women’s international and club competitions in Europe. Given the technical and physiological differences between men’s and women’s football in existing literature [38], men’s and women’s football games were evaluated separately.

The dataset included detailed event-log data, such as event location, event type, ball possession team, event duration, and player involvement. Event locations were mapped onto a pixel grid, oriented in the direction of the possession team’s attack, with the x-coordinate representing the field’s length and the y-coordinate representing its width. In the StatsBomb dataset, teams are labeled as Home team and Away team. However, these labels do not necessarily reflect the actual home or away status of the teams; they are used solely as a naming convention. To avoid confusion, this work instead refers to the two sides as “Team A” and “Team B”, which correspond respectively to the Home team and Away team designations in the StatsBomb dataset. The men’s dataset consisted of games from 116 men’s teams, while the women’s dataset consisted of games from 54 women’s teams.

Games without a match winner were excluded from the dataset, as the study aims to assess the temporal contributions of the considered temporal metrics to match-winning performances. Additionally, to maintain consistency in temporal feature extraction, where each game is divided into ten time intervals, only matches that concluded with a winner within regular playing time (two halves) were considered. The remaining men’s dataset consisted of 608 games, of which “Team A” won 342 (56%) and “Team B” won 266 (44%). The remaining women’s dataset consisted of 374 games, of which “Team A” won 206 (55%) and “Team B” won 168 (45%).

To assess adequacy of sample size, a post hoc power analysis was conducted for a two-sided one-sample proportion test against p0 = 0.50 at . For the men’s dataset (n = 608, 342/608 = 0.5625), the estimated power was , indicating sufficient power to detect the observed deviation from 50–50; for the women’s dataset (n = 374, 206/374 = 0.5508), the power was , indicating the sample is underpowered. This limitation is acknowledged, and while analyses are reported for both datasets, no definitive conclusions are drawn from the underpowered women’s dataset.
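The power calculation described above can be approximated with a short sketch. It assumes a normal-approximation z-test and a significance level of 0.05; both are assumptions made here for illustration, and the function name `proportion_test_power` is hypothetical, so the values it prints may differ slightly from those obtained with other power formulas or software.

```python
from math import sqrt
from statistics import NormalDist


def proportion_test_power(wins: int, n: int, p0: float = 0.5,
                          sig_level: float = 0.05) -> float:
    """Approximate power of a two-sided one-sample proportion z-test.

    Assumption: normal approximation, with the null standard error used for
    the rejection region and the observed proportion as the true effect.
    """
    std_normal = NormalDist()
    p1 = wins / n                                   # observed proportion
    z_crit = std_normal.inv_cdf(1 - sig_level / 2)  # two-sided critical value
    se_null = sqrt(p0 * (1 - p0) / n)               # standard error under H0
    se_alt = sqrt(p1 * (1 - p1) / n)                # standard error under p1
    upper = p0 + z_crit * se_null                   # rejection thresholds
    lower = p0 - z_crit * se_null
    # Probability that the sample proportion falls in either rejection tail.
    return (1 - std_normal.cdf((upper - p1) / se_alt)) + std_normal.cdf((lower - p1) / se_alt)


men_power = proportion_test_power(342, 608)
women_power = proportion_test_power(206, 374)
print(f"men: {men_power:.3f}, women: {women_power:.3f}")
```

Under these assumptions the men’s sample yields clearly higher power than the women’s sample, which is consistent with the qualitative conclusion stated above.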

2.2 Event distribution randomness

A previous study introduced event distribution randomness (EDRan) as a time-series team performance evaluation metric for assessing unpredictable ball movement performance in association football [28]. This approach involves dividing the field into 30 equal segments and constructing region-based cumulative possession matrices for each team over a defined time period to estimate the probability distribution of spatial event distributions.

To generate “Region-based Cumulative Possession Matrices”, the football field is divided into 30 equal-area regions, as proposed in [28]. Liu et al. (2016) proposed dividing the association football field into thirty regions to analyze passing patterns using data mining techniques [39]. In the authors’ previous work [28], this method was refined by dividing the field into 30 equal-area regions, as unequal region sizes could bias the measurement of randomness. The present work adopts the same equal-area division of the field into 30 regions. This is achieved by subdividing each third of the field (attacking third, central third, and defensive third), as defined in the literature [40,41], into 10 equal regions. However, different field partitioning schemes were also evaluated, including increasing the number of regions (e.g., 96 regions (12 columns × 8 rows)) and reducing it (e.g., 15 regions (5 rows × 3 columns)). The analysis indicated that dividing the field into a higher number of regions resulted in a sparse distribution of events, with many regions containing very few or no events, which consequently led to poor model performance. In contrast, reducing the number of regions resulted in all teams utilizing every region during play, thereby limiting the discriminatory power of the randomness analysis. The 30-region configuration offered the best balance, yielding more informative distributions.

This approach enables the computation of event distribution in terms of ball movement, independent of individual players, as players are subject to substitutions, role changes, and positional adjustments. Thus, it is assumed that ball movement is more critical than the player movement for analysis of team performance. Fig 2 shows the division of football field into 30 regions.

In this study, each game is divided into ten time intervals ti (i = 1, …, 10), with t1–t5 covering the first half and t6–t10 the second. Unlike [28], which segmented the full match into ten equal parts without respecting the half-time break and thereby allowed the fifth interval to contain events from both halves because of added (injury) time at the end of each half, the present approach first identifies the half-time boundary and then divides each half (including its injury time) into five equal intervals. This prevents cross-half mixing and yields half-specific, temporally aligned segments that are comparable across matches. In association football, the halftime break follows approximately 45 minutes of play and allows players to rest, switch sides, and revise tactical strategies. The previous approach often placed both the end of the first half and the beginning of the second half within the same interval, typically the fifth, potentially ignoring important temporal patterns. By segmenting each half separately, this study preserves the distinct dynamics at the end of the first half and the beginning of the second half, ensuring more accurate temporal modeling. Both increasing and decreasing the number of time periods were also examined. Reducing the number of periods resulted in a high concentration of events within each time period and a diminished set of temporal features. In contrast, increasing the number of periods produced fewer events per time period, leading to a greater number of sparse regions. To achieve a balanced configuration, 10 time periods (5 for each half) were employed.
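The half-aware segmentation can be illustrated with a minimal sketch. It assumes each event carries its half (1 or 2) and the number of seconds elapsed since that half began, and that the total duration of each half (including injury time) is known; the function name `interval_index` and its arguments are hypothetical.

```python
def interval_index(period: int, seconds_in_period: float,
                   period_durations: dict[int, float], per_half: int = 5) -> int:
    """Return the 1-based interval index (1..10) of an event.

    Assumption: `period` is 1 or 2 and `seconds_in_period` is measured from
    the start of that half, so intervals never mix events from both halves.
    """
    duration = period_durations[period]
    if duration <= 0:
        raise ValueError("period duration must be positive")
    width = duration / per_half                   # equal-length intervals per half
    k = int(seconds_in_period // width)           # 0-based interval within the half
    k = min(max(k, 0), per_half - 1)              # clamp the final whistle into the last interval
    return (period - 1) * per_half + k + 1


# Illustrative durations: 47 and 49 minutes including injury time.
durations = {1: 47 * 60.0, 2: 49 * 60.0}
print(interval_index(1, 0.0, durations), interval_index(2, 0.0, durations))
```

Because each half is divided by its own duration, interval lengths can differ between halves, which is why a duration normalization is needed when interval-level values are compared.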

Within each interval, a region-based cumulative possession matrix is constructed for both teams by analyzing ball-movement events occurring during that period. This matrix, structured as a matrix (30 values), represents 30 distinct field regions. For each ball-movement event, the corresponding team’s matrix is updated by adding the event duration to the respective cell associated with the field region where the event occurred. Off-ball events were not included in this study due to dataset limitations and because their analysis lies beyond the scope of the present work, which focuses on examining randomness in ball movement and its association with match-winning performance. By iterating through all events within a given time interval, the resulting matrix captures the total ball-carrier event duration for each region. To derive the probability distribution of event distributions, the matrix values are normalized by dividing the matrix by the team’s total event duration within that interval. This process is repeated for both teams across all ten time intervals in each match. Fig 3 shows the steps of generating region-based cumulative possession matrix to derive probability distributions of event distributions as proposed in [28].

Fig 3. The steps of generating region-based cumulative possession matrices to compute probability distributions of event distributions, as outlined in [28].

https://doi.org/10.1371/journal.pone.0326800.g003
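The construction of a region-based cumulative possession matrix and its normalization into a probability distribution can be sketched as follows. The sketch assumes a 120 × 80 pitch coordinate system and a hypothetical 6-column × 5-row layout (two columns per third, ten regions per third); the actual region layout is the one shown in Fig 2 and may differ, and the names `region_index` and `possession_distribution` are illustrative only.

```python
PITCH_LENGTH, PITCH_WIDTH = 120.0, 80.0  # assumed pitch coordinate ranges
N_COLS, N_ROWS = 6, 5                    # hypothetical layout: 30 equal-area regions


def region_index(x: float, y: float) -> int:
    """Map a pitch location to one of the 30 regions (0..29)."""
    col = min(max(int(x / PITCH_LENGTH * N_COLS), 0), N_COLS - 1)
    row = min(max(int(y / PITCH_WIDTH * N_ROWS), 0), N_ROWS - 1)
    return row * N_COLS + col


def possession_distribution(events) -> list[float]:
    """Accumulate event durations per region and normalize to probabilities.

    `events` is an iterable of (x, y, duration) tuples for one team within
    one time interval.
    """
    totals = [0.0] * (N_COLS * N_ROWS)
    for x, y, duration in events:
        totals[region_index(x, y)] += duration   # cumulative possession per region
    total = sum(totals)
    if total == 0:
        return totals                            # no ball-movement events in this interval
    return [value / total for value in totals]


demo_events = [(10.0, 10.0, 2.0), (10.0, 10.0, 1.0), (110.0, 70.0, 1.0)]
demo_dist = possession_distribution(demo_events)
print(demo_dist[0], demo_dist[29])
```

In this toy example three quarters of the event duration falls in one corner region and one quarter in the opposite corner, so only two of the 30 probabilities are nonzero.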

Subsequently, an entropy measure is applied to quantify event distribution randomness using the generated probability distributions for both teams in each game across the 10 defined time intervals.

In earlier work, Shannon entropy was proposed as the entropy measure applied to the probability distribution generated from the region-based cumulative possession matrix [28]. Shannon entropy is a measure of uncertainty in information theory, quantifying the randomness of a probability distribution [36]. A higher Shannon entropy value indicates greater randomness, whereas a lower entropy value suggests a more predictable distribution.

However, Shannon entropy assigns proportional importance to all 30 regions based on their event distribution probabilities. This raises the question of its suitability and leaves room for improvement, as some football tactics focus on dominating specific field regions by frequently using them for ball movement (e.g., the flanks), while others emphasize utilizing the entire playing area for ball movement, increasing spatial dominance. To address this, this study evaluates EDRan computed with Rényi entropy for 0 ≤ α ≤ 2 to determine which value of α serves as the more appropriate entropy measure. Rényi entropy generalizes Shannon entropy by allowing flexibility to emphasize either dominant or rare events, making it more suitable for capturing tactical nuances such as reliance on specific passing patterns in football.

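Rényi entropy is a generalized measure of entropy that introduces a parameter α to control the weighting of probabilities, providing a flexible way to quantify randomness. It is defined as:

$$H_\alpha(X) = \frac{1}{1-\alpha} \log\left(\sum_{i=1}^{n} p_i^{\alpha}\right) \tag{1}$$

where,

$p_i$ represents the probability of event i,

n is the number of possible states,

X is the discrete random variable,

α is the order of entropy (where $\alpha \geq 0$ and $\alpha \neq 1$).

When $\alpha = 0$, Rényi entropy simplifies to the Max entropy (or Hartley entropy):

Substituting $\alpha = 0$ into Rényi entropy:

$$H_0(X) = \frac{1}{1-0} \log\left(\sum_{i=1}^{n} p_i^{0}\right)$$

Since any nonzero probability raised to the power of zero equals one, the summation simplifies to the number of nonzero probability terms, denoted as N:

$$\sum_{i:\, p_i > 0} p_i^{0} = N$$

Thus, the expression for $H_0(X)$ simplifies to:

$$H_0(X) = \log N \tag{2}$$

This corresponds to Max entropy, which quantifies the logarithm of the number of nonzero probability states.

When $\alpha \to 1$, Rényi entropy simplifies to the Shannon entropy:

Substituting the definition of Rényi entropy:

$$H_1(X) = \lim_{\alpha \to 1} \frac{\log\left(\sum_{i=1}^{n} p_i^{\alpha}\right)}{1-\alpha}$$

Both the numerator and the denominator tend to zero, so to evaluate this limit, L’Hôpital’s rule is applied:

$$H_1(X) = \lim_{\alpha \to 1} \frac{\frac{d}{d\alpha} \log\left(\sum_{i=1}^{n} p_i^{\alpha}\right)}{\frac{d}{d\alpha}(1-\alpha)}$$

Taking the derivatives:

$$H_1(X) = \lim_{\alpha \to 1} \frac{\left(\sum_{i=1}^{n} p_i^{\alpha} \log p_i\right) \Big/ \left(\sum_{i=1}^{n} p_i^{\alpha}\right)}{-1}$$

Since $\sum_{i=1}^{n} p_i^{\alpha} \to \sum_{i=1}^{n} p_i = 1$ as $\alpha \to 1$, the final expression simplifies to:

$$H_1(X) = -\sum_{i=1}^{n} p_i \log p_i \tag{3}$$

This result corresponds to Shannon entropy.

When $\alpha = 2$, Rényi entropy simplifies to the collision entropy:

Substituting $\alpha = 2$ into Rényi entropy:

$$H_2(X) = \frac{1}{1-2} \log\left(\sum_{i=1}^{n} p_i^{2}\right) = -\log\left(\sum_{i=1}^{n} p_i^{2}\right) \tag{4}$$

  • For $\alpha < 1$, EDRan computed with Rényi entropy is more sensitive to rare events.
  • For $\alpha > 1$, EDRan computed with Rényi entropy becomes less sensitive to rare events and places greater emphasis on dominant events.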

The event distribution randomness was quantified using Rényi entropy with α values at intervals of 0.5 ranging from 0 to 2, specifically α = 0 (Max entropy), 0.5, 1 (Shannon entropy), 1.5, and 2 (collision entropy), for both teams in each game across 10 time periods. Additionally, to evaluate how the distribution behaves as α approaches zero, α = 0.1 was also included in the analysis.
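A minimal sketch of this computation is given below. It handles the Max entropy (α = 0) and Shannon entropy (α = 1) special cases explicitly and uses the general definition otherwise. The natural logarithm is an assumption, since the base is not stated, and the function name `renyi_entropy` is illustrative.

```python
import math


def renyi_entropy(probs, alpha: float) -> float:
    """Rényi entropy of order alpha for a discrete probability distribution.

    Assumptions: natural logarithm; zero-probability regions are ignored.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    p = [v for v in probs if v > 0]
    if not p:
        raise ValueError("distribution has no nonzero probabilities")
    if alpha == 0:
        return math.log(len(p))                        # Max (Hartley) entropy
    if math.isclose(alpha, 1.0):
        return -sum(v * math.log(v) for v in p)        # Shannon entropy (limit case)
    return math.log(sum(v ** alpha for v in p)) / (1 - alpha)


# A team concentrating play in one region versus one spreading play evenly.
concentrated = [0.7, 0.1, 0.1, 0.1]
uniform = [0.25, 0.25, 0.25, 0.25]
for a in (0, 0.1, 0.5, 1, 1.5, 2):
    print(a, round(renyi_entropy(concentrated, a), 4), round(renyi_entropy(uniform, a), 4))
```

For the uniform distribution every order gives the same value, whereas for the concentrated distribution the entropy decreases as α grows, reflecting the increasing emphasis on the dominant region.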

2.3 Data preprocessing

The event distribution randomness (EDRan) of ball movement was extracted for both teams across each time interval ti in every game, for both the men’s and women’s datasets. Event distribution randomness was computed using Rényi entropy with α values of 0 (Max entropy), 0.1, 0.5, 1 (Shannon entropy), 1.5, and 2 (collision entropy), resulting in six different preprocessed EDRan datasets. Each EDRan dataset contains the event distribution randomness of “Team A” and “Team B” across the 10 time periods (t1 to t10), quantified using Rényi entropy with the corresponding α value, along with the game result. Each preprocessed dataset contains one row per game. The game result column is binary, where 1 represents a “Team A” win and 0 represents a “Team B” win.

2.4 Temporal analysis of EDRan with match winning performances

Next, the Rényi entropy α values for which EDRan was most strongly associated with match-winning results were identified.

First, EDRan difference between winners’ EDRan and losers’ EDRan for each time period was computed to evaluate whether winning teams prioritize unpredictability across all regions of the field vs dominant regions for match winning performances.

Subsequently, a correlation analysis between EDRan (for 0 ≤ α ≤ 2) and match outcomes was conducted. In an earlier work, match-winner classifiers were trained on a reduced feature set obtained by representing the EDRan difference between opposing teams as a single variable [28]. The same dimensionality-reduction technique was employed in this study to reduce the number of features from 20 per game to 10 per game.

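Event distribution randomness (EDRan) difference between two teams for time period ti:

$$\Delta E_{\alpha}(t_i) = \frac{E^{A}_{\alpha}(t_i) - E^{B}_{\alpha}(t_i)}{T(t_i)} \tag{5}$$

Here, $\Delta E_{\alpha}(t_i)$ is the EDRan difference between the two teams for time period ti (i = 1, …, 10), with EDRan computed using Rényi entropy of order α; $E^{A}_{\alpha}(t_i)$ is the EDRan of Team A for time period ti; $E^{B}_{\alpha}(t_i)$ is the EDRan of Team B for time period ti; and $T(t_i)$ is the duration of the time period.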

EDRan difference between two teams has been normalized by the duration of time period as the duration of a time period varies from one half to another due to injury time.

This dimensionality reduction resulted in 10 temporal features per game, representing the differences in event distribution randomness (EDRan) between the two teams across ten time intervals. Separate datasets were created for each of the six considered α values (0, 0.1, 0.5, 1, 1.5, and 2) using the corresponding preprocessed EDRan datasets. Each EDRan difference dataset contained 10 features representing the EDRan differences between “Team A” and “Team B” over the 10 time intervals, along with a binary game outcome label, where 1 indicates a win for “Team A” and 0 indicates a win for “Team B”. A positive EDRan difference suggests that “Team A” maintained higher EDRan than “Team B” during the respective time period, while a negative difference indicates that “Team B” had greater EDRan than “Team A.” Therefore, a positive correlation between the EDRan difference and the match result label would indicate that EDRan is positively associated with match-winning performances, whereas a negative correlation would indicate that EDRan is negatively associated with match-winning performances.
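The feature construction described in this subsection can be sketched as below: for each interval, the Team A value minus the Team B value is divided by the interval duration. The function name `edran_difference_features` and the choice of minutes as the duration unit are assumptions made for illustration.

```python
def edran_difference_features(edran_a, edran_b, durations):
    """Duration-normalized EDRan differences (Team A minus Team B) per interval.

    Assumption: all three sequences hold one value per time interval, and
    durations are expressed in a common unit (minutes in the example below).
    """
    if not (len(edran_a) == len(edran_b) == len(durations)):
        raise ValueError("all inputs must have one value per time interval")
    return [(a - b) / d for a, b, d in zip(edran_a, edran_b, durations)]


# Toy example with three intervals instead of ten.
features = edran_difference_features([2.4, 2.1, 2.8], [2.0, 2.3, 2.8], [9.4, 9.4, 9.8])
print(features)
```

A positive feature means Team A had the higher EDRan in that interval, a negative one means Team B did, and zero means the two teams were equal.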

2.5 Machine learning model development

Match-winner classification models were also developed using features computed for 0 ≤ α ≤ 2 to assess whether unpredictability concentrated in dominant regions or distributed across the entire field is more predictive of victory.

For the match winner classification machine learning model development, Random Forest (RF) classifier was considered due to its ability to handle a high number of features while maintaining robust performance with relatively small datasets, its robustness to nonlinearity and ease of interpretation via feature importances. Although SVM, XGBoost, and neural networks were initially considered, Random Forest was chosen for its interpretability through feature importance, as well as its computational efficiency and reduced need for extensive hyperparameter tuning compared to XGBoost and neural networks.

Its built-in feature importance evaluation allows for an assessment of each metric’s contribution to match-winning performances. The Random Forest model is an ensemble learning method that constructs multiple decision trees and combines their outputs to improve predictive accuracy and reduce over-fitting. Additionally, Random Forest models provide flexibility in hyperparameter tuning (e.g., number of trees, maximum depth), enabling performance optimization and fair comparison across varying α values.

The match winner classification model was evaluated using multiple performance metrics, including accuracy, F1-score, precision, recall, and Matthews Correlation Coefficient (MCC). To ensure a robust and comprehensive evaluation, each model’s average performance was estimated via 50 repetitions of 5-fold cross-validation. For each of the 50 repetitions, the dataset was randomly shuffled and partitioned into five folds, with the random state set to the repetition number; in turn, four folds were used to train the model (80% of the data) and the remaining fold served as the test set (20% of the data). This yielded 50 × 5 = 250 evaluations per model (one model per α value) across different data subsets.

Hyper-parameter tuning of the Random Forest model was performed using a grid search (GridSearchCV, scoring = accuracy) over , , , , and ; the random seed of the Random Forest classifier was fixed at 42. To avoid hyperparameter data leakage, tuning was performed separately on the training folds of each evaluation.

The mean and standard deviation of each performance metric were computed over these 250 evaluations. This repeated cross-validation approach provides a more reliable estimate of the model’s generalization capability by averaging performance over multiple training and testing splits.
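
As an illustration of this aggregation step, the sketch below computes MCC from binary confusion-matrix counts and then summarises a list of per-fold scores. The confusion counts are made up, and the normal-approximation 95% interval is an assumption: the source reports confidence intervals but does not state how they were derived.

```python
import math
import statistics

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def summarise(scores, z=1.96):
    """Mean, sample std, and a normal-approximation 95% CI for the mean."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    half = z * std / math.sqrt(len(scores))
    return mean, std, (mean - half, mean + half)

# Hypothetical per-fold counts (tp, tn, fp, fn); not data from the study.
folds = [(8, 7, 3, 2), (9, 6, 4, 1), (7, 8, 2, 3)]
scores = [mcc(*f) for f in folds]
mean, std, ci = summarise(scores)
print(f"MCC = {mean:.3f} ± {std:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because folds from repeated cross-validation share training data, the evaluations are not independent, so an interval computed this way is best read as approximate.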

To assess changes in model performance relative to the baseline model using Shannon entropy (α = 1), proposed in previously published work [28], a t-test was conducted. This statistical test evaluates whether the means of two groups differ significantly. Specifically, model performances for α values of 0, 0.1, 0.5, 1.5, and 2 were each compared against the performance at α = 1. In this context, a p-value less than 0.05 indicates a statistically significant difference between the two model performances. However, conducting multiple pairwise comparisons increases the risk of Type I error. To address this, both the Bonferroni correction and the Holm–Bonferroni correction were used.

3 Results

3.1 Temporal analysis results

The influence of EDRan on match outcomes was evaluated first. Initially, the variation of EDRan across the 10 time periods was analyzed separately for men’s and women’s games. Fig 4 illustrates the temporal trends in mean EDRan for winners and losers in both datasets. Overall, winners consistently demonstrated higher EDRan values compared to losers. However, the gap between the two groups tended to narrow as α increased.

Fig 4. Temporal variations in mean Event Distribution Randomness (EDRan), computed using Rényi entropy for α ∈ {0, 0.1, 0.5, 1, 1.5, 2}, across the ten time periods of the match for both men’s and women’s datasets.

Shaded areas indicate standard deviations.

https://doi.org/10.1371/journal.pone.0326800.g004

To further investigate how this difference between winners and losers evolves over time, the cumulative EDRan difference (Winners’ EDRan – Losers’ EDRan) was calculated across all time periods for all the games in the two datasets. Fig 5 displays the aggregated EDRan differences between winners and losers over the 10 time intervals for all matches in each dataset.

Fig 5. Cumulative EDRan difference between winners and losers across 10 time periods with (a) men’s dataset (b) women’s dataset.

https://doi.org/10.1371/journal.pone.0326800.g005

The cumulative EDRan difference between winners and losers was positive for both men’s and women’s data across all 10 time periods. However, the difference decreased towards the final phases of the game. For the men’s data, the greatest EDRan difference between winners and losers was observed at α = 0, whereas for the women’s data, the maximum difference occurred at . Although differences in EDRan between winners and losers are suggestive, they are insufficient on their own to robustly identify the value of α most strongly associated with match outcomes.

The machine learning models in this work are trained using the EDRan difference between the two teams, “Team A” and “Team B” (quantified using Rényi entropy for α ∈ {0, 0.1, 0.5, 1, 1.5, 2}), as features. Therefore, the correlation between this EDRan difference and match-winning results was also evaluated across the ten time periods ti. Given the non-normality of the distributions (tested with the Shapiro-Wilk test), Spearman correlation was employed. Fig 6 illustrates the Spearman correlation between the EDRan difference over the ten time periods ti and match-winning performances for men’s football (Fig 6(a)) and women’s football (Fig 6(b)).
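
For readers unfamiliar with the statistic, Spearman correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a minimal standard-library version; the function names and the six sample values are hypothetical, and the win/loss coding (1 = “Team A” win, 0 = “Team A” loss) is an assumption about how the outcome was encoded.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for one time period; not data from the study.
edran_diff = [0.30, -0.12, 0.05, -0.40, 0.22, -0.08]  # Team A minus Team B
team_a_win = [1, 0, 1, 0, 1, 0]                       # 1 = win, 0 = loss
rho = spearman(edran_diff, team_a_win)
print(round(rho, 3))
```

In the study this calculation would be repeated for each of the ten time periods and each α value, giving one coefficient per cell of Fig 6.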

Fig 6. Spearman correlation between the EDRan difference between “Team A” and “Team B”, quantified using Rényi entropy with α ∈ {0, 0.1, 0.5, 1, 1.5, 2}, for ten time periods and the game result (“Team A” win/“Team A” loss) with (a) men’s data (b) women’s data.

https://doi.org/10.1371/journal.pone.0326800.g006

Fig 6 shows that the correlation between the EDRan difference and match outcome reaches its highest value at maximum entropy (α = 0) for the men’s dataset, and at for the women’s dataset. The mean Spearman correlation across all ten time periods tends to increase as α decreases. Across all considered values of α, the EDRan difference is positively correlated with the match result, suggesting that higher EDRan values are associated with winning performances. However, the correlation coefficients decline toward the end of the game.

3.2 Match winner prediction model results

In existing literature, Shannon entropy (i.e., Rényi entropy as α → 1) has commonly been used as the standard metric for quantifying event distribution randomness (EDRan). However, as illustrated in Fig 6, the correlation between the EDRan difference and match outcomes tends to decrease as α increases, for both men’s and women’s datasets. This raises the question of whether Shannon entropy is the optimal measure of randomness, as the results suggest that improvement in EDRan at lower α values is more strongly associated with match-winning performances than at α = 1. However, this captures only the monotonic association between the EDRan difference and match result for each period; it does not necessarily indicate predictive strength. The predictive power may also depend on the extent to which EDRan differs between winners and losers across varying α values, offering insights into the distinct playing styles of both groups.

Therefore, to assess the predictive ability of the EDRan difference across α values (0, 0.1, 0.5, 1, 1.5, and 2), match winner prediction models were developed using a simple Random Forest classifier. Tables 1 and 2 present a comparison of the performance of models trained with Rényi entropy for α values of 0, 0.1, 0.5, 1, 1.5, and 2 with the men’s and women’s datasets, respectively.

Table 1. Performance of match winner prediction model of men’s games at different α values (mean ± std and 95% confidence interval).

p-values correspond to accuracy vs. accuracy when α = 1.

https://doi.org/10.1371/journal.pone.0326800.t001

Table 2. Performance of match winner prediction model of women’s games at different α values (mean ± std and 95% confidence interval). p-values correspond to accuracy vs. accuracy when α = 1.

https://doi.org/10.1371/journal.pone.0326800.t002

To evaluate the statistical significance of performance improvements across varying α values, pairwise t-tests were conducted. Given the slight class imbalance in the dataset, the Matthews Correlation Coefficient (MCC) was selected as the evaluation metric, as it is relatively robust to data imbalance. However, performing multiple pairwise t-tests introduces the risk of inflating the family-wise Type I error. To control this Type I error, both Bonferroni and Holm–Bonferroni corrections were applied across the five planned comparisons against the baseline. Adjusted p-values were computed as with K = 5, and statistical significance was assessed at using . Under the Bonferroni correction, corresponds to , whereas under the Holm–Bonferroni correction, corresponds to . This procedure mitigates inflation of false positives due to multiple testing. These t-test results with men’s and women’s data are provided in Table 3.
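
The two corrections named above can be written in a few lines. The sketch below uses their standard textbook definitions and is not taken from the authors' code; the five raw p-values are invented for illustration, and the function names are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: min(1, K * p) for K comparisons."""
    k = len(pvals)
    return [min(1.0, k * p) for p in pvals]

def holm(pvals):
    """Holm-Bonferroni adjusted p-values (step-down, kept monotone)."""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i])
    adjusted = [0.0] * k
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by K, the next by K - 1, and so on.
        candidate = min(1.0, (k - rank) * pvals[i])
        # Adjusted values may not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values for K = 5 comparisons against the baseline.
raw = [0.001, 0.012, 0.020, 0.300, 0.650]
bonf = bonferroni(raw)
holm_adj = holm(raw)
for p, b, h in zip(raw, bonf, holm_adj):
    print(f"raw={p:.3f}  bonferroni={b:.3f}  holm={h:.3f}")
```

With these made-up inputs, the second comparison is significant at the 0.05 level under Holm (4 × 0.012 = 0.048) but not under Bonferroni (5 × 0.012 = 0.06), which shows why Holm is the less conservative of the two while still controlling the family-wise error rate.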

Table 3. Pairwise comparisons of MCC values against baseline for men’s and women’s data (reporting raw and adjusted p-values).

https://doi.org/10.1371/journal.pone.0326800.t003

With men’s data (Table 1), the model developed with max entropy (α = 0) outperformed the model based on Shannon entropy (α = 1) across all evaluation metrics, including accuracy, F1-score, recall, precision, and MCC. This result was validated through a comprehensive evaluation involving fifty rounds of five-fold cross-validation, totaling 250 evaluations. These findings are consistent with the stronger correlation between the EDRan difference at α = 0 and match outcomes (Fig 6), a correlation that tends to decrease as the α value increases with men’s data.

Pairwise comparisons of MCC values against the baseline revealed that α = 0 and α = 0.1 achieved significantly higher MCC scores, with large effect sizes (Hedges’ g = 0.95 and 0.77, respectively; Holm-adjusted p < 10⁻¹⁵). A smaller but statistically significant improvement was also observed at α = 0.5 (g = 0.23, Holm-adjusted p = 0.033), although the effect size was modest. In contrast, α = 1.5 and α = 2 did not differ significantly from the baseline (Holm-adjusted p > 0.2), indicating no performance advantage. These results indicate that lower α values, particularly α = 0, provide the most robust improvements in MCC relative to the Shannon entropy baseline with men’s football data.
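
Hedges’ g, the effect size reported above, is Cohen’s d multiplied by a small-sample bias correction. The sketch below uses the pooled standard deviation and the common approximation of the correction factor; the source does not state which variant was used, so this form is an assumption, and the MCC scores are invented.

```python
import math
import statistics

def hedges_g(a, b):
    """Hedges' g: Cohen's d (pooled SD) with a small-sample correction."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    d = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)
    # Common approximation J = 1 - 3 / (4 * df - 1), with df = na + nb - 2.
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction

# Hypothetical per-fold MCC scores; not values from the study.
mcc_candidate = [0.42, 0.45, 0.40, 0.47, 0.44]
mcc_baseline = [0.38, 0.41, 0.37, 0.40, 0.39]
g = hedges_g(mcc_candidate, mcc_baseline)
print(round(g, 2))
```

With 250 evaluations per model the correction factor is very close to 1, so g and Cohen’s d would be nearly identical in the setting described here.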

In contrast, the results for the women’s dataset (Table 2) did not exhibit a consistent pattern of decreasing predictive performance with increasing α values. The highest performance was observed at , whereas the lowest occurred at . Pairwise t-tests did not reveal any statistically significant differences in performance. It should be noted, however, that the women’s dataset is underpowered, and therefore no definitive conclusions can be drawn from these results.

The best-performing Random Forest models for the men’s and women’s data (α = 0 with men’s data and with women’s data) assigned greater importance to the early phases of the game, suggesting that these periods are more crucial in determining the match outcome. Fig 7 illustrates the feature importance of these Random Forest models across the ten time periods.

Fig 7. Random Forest feature importance of the best-performing models with (a) men’s data (α = 0) (b) women’s data ().

https://doi.org/10.1371/journal.pone.0326800.g007

4 Discussion

This study explores the most effective method for quantifying event distribution randomness (EDRan) in association football by comparing three approaches: emphasizing dominant or frequently used regions for ball movement, assigning equal weight to all used regions for ball movement, and using Shannon entropy, which weights each region based on its event distribution probability. To facilitate this comparison, Rényi entropy is employed with varying α values in the range 0 ≤ α ≤ 2. The α values at which the EDRan difference between two teams shows the highest correlation with match-winning outcomes are analyzed across ten defined time intervals (ti, where i = 1, …, 10). Furthermore, to assess how predictive power varies with different α values, machine learning models are developed and evaluated using 50 rounds of five-fold cross-validation. Identifying the optimal α values that enhance predictive accuracy and correlation with match outcomes not only uncovers deeper insights into the game but also enables the development of a match winner classification model that outperforms existing models.

From Figs 4 and 5, it can be observed that winning teams generally maintain higher EDRan values compared to losing teams throughout the duration of the game. However, Fig 5 also shows that this difference in EDRan tends to diminish towards the later stages of the match. Additionally, as illustrated in Fig 6, the correlation between the EDRan difference and match outcomes decreases significantly in the final phases of the game for all α values. Furthermore, Random Forest models attribute greater importance to features from the early stages of the match, while assigning less importance to later time periods (Fig 7). Collectively, these findings may reflect the adoption of more cautious, risk-averse strategies by winning or leading teams towards the end of matches, a trend that aligns with observations in previous studies [17,18], where teams in the lead often prioritize defensive play to preserve their advantage towards the end of the game.

The correlation analysis between the Rényi entropy-based EDRan difference across teams and match-winning performance (Fig 6) revealed the strongest association at α = 0 (maximum entropy) and the weakest at α = 2 (collision entropy). These findings suggest that higher levels of spatial randomness, particularly involving the use of rare or less frequently occupied regions of the field, are generally associated with more favorable match outcomes. A likely explanation is that exploiting such regions increases unpredictability, disrupts defensive structure, and opens opportunities for sustained passing sequences or shots on goal. In contrast, when randomness is confined primarily to dominant regions (as captured at higher α values), defensive units are better able to anticipate and counter attacking movements, thereby limiting the tactical advantage.
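
The role of α can be made concrete with a small numerical sketch. It assumes that EDRan is the Rényi entropy of the share of events falling in each field region and uses the natural logarithm; the function name, the four-region layout, and the probabilities are hypothetical, and the source may use a different log base or region grid.

```python
import math

def renyi_entropy(probs, alpha):
    """Rényi entropy (natural log) of a discrete probability distribution."""
    p = [x for x in probs if x > 0]
    if alpha == 0:
        # Max (Hartley) entropy: depends only on how many regions are used.
        return math.log(len(p))
    if alpha == 1:
        # Shannon entropy, the limit of Rényi entropy as alpha -> 1.
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** alpha for x in p)) / (1 - alpha)

# Hypothetical event shares over four field regions; not study data.
concentrated = [0.85, 0.05, 0.05, 0.05]  # play dominated by one region
even = [0.25, 0.25, 0.25, 0.25]          # play spread across all regions

for alpha in (0, 0.1, 0.5, 1, 1.5, 2):
    h_c = renyi_entropy(concentrated, alpha)
    h_e = renyi_entropy(even, alpha)
    print(f"alpha={alpha}: concentrated={h_c:.3f}, even={h_e:.3f}")
```

For the evenly spread team the entropy is the same at every α, while for the concentrated team it falls as α grows. At α = 0 the rarely used regions count as much as the dominant one, which is why low α values reward teams that reach more of the field, whereas α = 2 is driven almost entirely by the dominant region.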

For the men’s data, the machine learning models showed statistically significant improvement over the baseline model (α = 1) when α < 1. When the Rényi entropy parameter α is low, the EDRan metric becomes more sensitive to rare or infrequently used spatial regions on the field. The observation that the highest model performance for men’s football data occurs at α = 0, with significant improvements when α < 1, suggests that winning men’s teams tend to utilize a broader range of spatial regions, demonstrating greater event randomness across the entire field. Further analysis of EDRan differences between winning and losing teams (Fig 5) shows that, for men’s data, the greatest disparity also occurs at α = 0, reinforcing the idea that winning teams tend to adopt a more spatially diverse and unpredictable playing style.

It is important to consider how this finding can be explained tactically and if it represents differences in physical performance. This pattern indicates men’s winning teams use a tactic of distributing play more evenly and unpredictably across the entire space in the field. Unpredictability that extends across all regions of the field may provide a greater tactical advantage, as it forces the opposition defense to account for a wider range of potential movements and makes anticipating attacking plays more difficult. Ultimately, this approach may facilitate the creation of open spaces by disrupting the opposition’s defense, enable safer or longer passing sequences, and generate opportunities for opportunistic shots on goal. In contrast, if randomness is restricted to dominant or frequently used regions, defenders can narrow their focus and adapt more easily to the limited set of possible actions, thereby reducing the effectiveness of the unpredictability. Additionally, a team’s ability to play unpredictably across a wider range of field zones than their opposition probably requires higher physical capacity. Higher physical capacity allows a team to move quickly into multiple field zones during an attacking phase, which would require the opposing team to perform more physical work in defense. If they are not trained for this level of physical performance they may experience higher levels of fatigue which could reduce the effectiveness of their defense, providing an advantage to the attacking team.

In contrast, the women’s data did not exhibit a consistent trend in model performance across varying α values. This may be attributable to the limited statistical power of the dataset or to other factors that differ from the men’s game. Consequently, although the results are presented, no definitive conclusions can be drawn. Nonetheless, the generalisability of the proposed approach, as well as potential differences in game-play randomness, could be more rigorously examined in future studies with sufficiently powered datasets.

In summary, for top-tier men’s football data:

  • Match-winning performance is positively associated with more unpredictable event distribution across the field, relative to concentrating unpredictability in a few dominant areas.
  • Toward the end of matches, lower spatial randomness is observed for teams that won compared with earlier phases, whereas higher randomness is typically observed earlier in matches.

These patterns are associative rather than causal and may be influenced by potential confounding variables, including team and opponent strength, home/away status, game state/scoreline, and competition context.

The results from match winner prediction using machine learning models with men’s data showed statistically significant improvements for α < 1 over the Shannon entropy-based baseline model. Moreover, the top-performing model in this study not only surpassed the performance of the baseline model introduced in [28], which utilized EDRan differences between two teams calculated with Shannon entropy, but also outperformed comparable match outcome prediction models reported in recent literature. A summary comparison of the results from this work, the baseline model [28], and other related studies is presented in Table 4.

Table 4. Comparison of match winner prediction model results.

https://doi.org/10.1371/journal.pone.0326800.t004

In comparison with existing studies, the present work adopts a smaller set of features and a simpler machine learning model, which enhances the interpretability of feature importance. Previous approaches [21,22] have primarily sought to maximize match-winner prediction accuracy using black-box recurrent neural networks, often at the expense of interpretability. In contrast, the focus here is on developing a temporal team performance evaluation metric. To this end, a single variable (EDRan) was extracted across 10 time steps, rather than the multiple features employed in prior studies, and a simple random forest model was implemented, enabling interpretation of feature importance with lower computational demands. Furthermore, while [21] incorporated historical data for model development, the current study relied exclusively on in-game data. Despite these simplifications, the approach achieved competitive predictive performance, underscoring the effectiveness of EDRan as a meaningful metric for evaluating team performance.

This study is subject to several limitations. First, the dataset is relatively small due to data availability constraints. While the men’s dataset was sufficiently powered, the women’s dataset was underpowered, preventing a reliable assessment of the approach in the women’s game. Second, the analysis was restricted to top-tier European competitions; therefore, the findings may not generalize to lower divisions, non-European leagues, or teams employing different tactical styles. Given that tactics evolve and can vary across skill levels, regions, age groups, and genders, the external validity of the findings beyond this cohort and period remains uncertain. Future research could expand the dataset to improve robustness, test generalizability across genders, and extend the analysis to different competitive levels, regions, and demographic cohorts to capture potential context-specific differences. Although standard cross-validation was used in this study, the authors acknowledge that temporal or group-wise cross-validation (e.g., by team or season) could further strengthen the assessment of model generalizability and reduce potential data leakage across similar match contexts. However, as the present analysis focuses not on team identity or season-level trends, but on the temporal sequence of in-game events to capture randomness and unpredictability in ball movement, standard cross-validation was deemed appropriate. Future work incorporating season-wise or team-based validation could provide additional insights into the stability of these findings over time and across teams. Moreover, this analysis did not incorporate game-state dependencies (e.g., leading, trailing, or tied situations), which are known to influence team behavior. This study was designed to examine overall temporal trends in event distribution randomness irrespective of current score or game context. 
However, future work could integrate game-state factors to provide deeper tactical insight into how teams adapt their strategies under different match conditions. Finally, drawn matches were excluded from the evaluation, as the primary aim was to identify factors contributing to winning performances. However, since draws are an integral part of football and may even be pursued strategically under certain circumstances, future work could incorporate drawn results to uncover additional patterns and strategic implications.

5 Conclusions

This study investigated the application of Rényi entropy with varying α values to quantify event distribution randomness (EDRan) in association football. The objective was to assess whether emphasizing randomness in ball movement confined to dominant regions of the field or distributing randomness more evenly across all regions better explains match outcomes. The analysis revealed that the strongest association between EDRan and match-winning performance occurred at α = 0 (max entropy), which reflects the breadth of distinct field regions utilized, regardless of frequency. This finding highlights the importance of maximizing spatial unpredictability, whereby successful teams exploit a wider range of field zones to disrupt opposition defenses and create advantageous opportunities.

Machine learning models developed with EDRan at α = 0 for men’s data outperformed both the baseline model [28] and recently published match-winner classification models in the literature. These results underscore the potential of EDRan as a powerful performance evaluation metric and reinforce the tactical value of maintaining unpredictable ball movement across the entire field.

In conclusion, while football strategies may vary, with some teams favoring possession-based play and others focusing on defensive play with low-possession, successful teams tend to prioritize maintaining unpredictability across all regions of the field. These findings highlight the tactical advantage of fostering unpredictable event distributions across all regions of the field and suggest that spatial unpredictability in ball movement should be a key consideration in team strategy and performance evaluation.

Acknowledgments

The authors gratefully acknowledge StatsBomb for providing publicly available event data from association football, which supported this study.

References

  1. Lamas L, Barrera J, Otranto G, Ugrinowitsch C. Invasion team sports: strategy and match modeling. International Journal of Performance Analysis in Sport. 2014;14(1):307–29.
  2. Lemmink KAPM, Frencken W. Tactical performance analysis in invasion games: perspectives from a dynamic systems approach with examples from soccer. 2013. p. 89–100.
  3. Vilar L, Araújo D, Davids K, Bar-Yam Y. Science of winning soccer: emergent pattern-forming dynamics in association football. J Syst Sci Complex. 2013;26(1):73–84.
  4. Reep C, Benjamin B. Skill and chance in association football. Journal of the Royal Statistical Society: Series A. 1968;131(4):581–5.
  5. Reep C, Pollard R, Benjamin B. Skill and chance in ball games. Journal of the Royal Statistical Society Series A (General). 1971;134(4):623.
  6. Wunderlich F, Seck A, Memmert D. Skill or luck? Analysing random influences on goal scoring in football. Advances in Intelligent Systems and Computing. Springer; 2023. p. 126–9. https://doi.org/10.1007/978-3-031-31772-9_27
  7. Plakias S, Armatas V, Mitrotasios M. Influence of tactics and situational variables on goal scoring in European football. Proceedings of the Institution of Mechanical Engineers, Part P: Journal of Sports Engineering and Technology. 2025. https://doi.org/10.1177/17543371241313252
  8. Utama CA, Maksum A, Kristiyandaru A. The influence of physical condition, skill, and mental factors on the ability to play football. Compet J Pendidik Kepelatihan Olahraga. 2024;16(2):466.
  9. Eggels H, Pechenizkiy M, Almeida R, Van Elk R, Van Agt L. Expected goals in soccer: explaining match results using predictive analytics. 2016. https://research.tue.nl/en/studentTheses/expected-goals-in-soccer
  10. Cavus M, Biecek P. Explainable expected goal models for performance analysis in football analytics. arXiv preprint 2022.
  11. Mead J, O’Hare A, McMenemy P. Expected goals in football: improving model performance and demonstrating value. PLoS One. 2023;18(4):e0282295. pmid:37018167
  12. Bandara I, Shelyag S, Rajasegarar S, Dwyer D, Kim E-J, Angelova M. Predicting goal probabilities with improved xG models using event sequences in association football. PLoS One. 2024;19(10):e0312278. pmid:39475977
  13. Lago-Peñas C, Lago-Ballesteros J, Dellal A, Gómez M. Game-related statistics that discriminated winning, drawing and losing teams from the Spanish Soccer League. J Sports Sci Med. 2010;9(2):288–93. pmid:24149698
  14. Göral K. 2014 FIFA Dünya Kupasının Başarılı Takımlarında Pas Başarı Yüzdeleri ve Topa Sahip Olma. ISSCS. 2015;3(9):86–86.
  15. Aquino R, Machado JC, Manuel Clemente F, Praça GM, Gonçalves LGC, Melli-Neto B, et al. Comparisons of ball possession, match running performance, player prominence, team network properties according to match outcome and playing formation during the 2018 FIFA World Cup. International Journal of Performance Analysis in Sport. 2019;19(6):1026–37.
  16. Kubayi A, Toriola A. The influence of situational variables on ball possession in the South African Premier Soccer League. J Hum Kinet. 2019;66:175–81. pmid:30988851
  17. Guan T, Cao J, Swartz T. Should you park the bus? 2021. https://www.sfu.ca/tswartz/papers/bus.pdf
  18. Guan T, Cao J, Swartz TB. Parking the bus. Journal of Quantitative Analysis in Sports. 2023;19(4):263–72.
  19. Hughes M, Franks I. Analysis of passing sequences, shots and goals in soccer. J Sports Sci. 2005;23(5):509–14. pmid:16194998
  20. Rocha-Lima EM, Wallan Tertuliano I, Norberto Fischer C. Determinant football elements for Euro16 match results. R Intelig Compet. 2023;13:e0428.
  21. Danisik N, Lacko P, Farkas M. Football match prediction using players attributes. In: 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA). 2018. p. 201–6. https://doi.org/10.1109/disa.2018.8490613
  22. AlMulla J, Islam MT, Al-Absi HRH, Alam T. SoccerNet: a gated recurrent unit-based model to predict soccer match winners. PLoS One. 2023;18(8):e0288933. pmid:37527260
  23. Šunjić I, Versic S, Modric T, Corluka M, Zaletel P. The comparison of position-specific match performance between the group and knockout stage of the UEFA Champions League. Sport Mont. 2024;22(2):9–17.
  24. Fernández J, Bornn L. Wide open spaces: a statistical technique for measuring space creation in professional soccer. Presented at: Sloan Sports Analytics Conference; 2018.
  25. Martens F, Dick U, Brefeld U. Space and control in soccer. Front Sports Act Living. 2021;3:676179. pmid:34337401
  26. Bandara I, Shelyag S, Rajasegarar S, Dwyer DB, Kim EJ, Angelova M. Time-series analysis of ball carrier open-space in association football. In: Dong JS, Izadi M, Hou Z, editors. Sports analytics. Cham: Springer; 2024. p. 1–17.
  27. Bandara I, Shelyag S, Rajasegarar S, Dwyer DB, Kim E, Angelova M. Time-Series Analysis of Ball Carrier Open-Space (BCOS) in Association Football. SN COMPUT SCI. 2025;6(4).
  28. Bandara I, Shelyag S, Rajasegarar S, Dwyer D, Kim E-J, Angelova M. Winning with chaos in association football: spatiotemporal event distribution randomness metric for team performance evaluation. IEEE Access. 2024;12:83363–76.
  29. Neuman Y, Israeli N, Vilenchik D, Cohen Y. The adaptive behavior of a soccer team: an entropy-based analysis. Entropy (Basel). 2018;20(10):758. pmid:33265847
  30. Kusmakar S, Shelyag S, Zhu Y, Dwyer D, Gastin P, Angelova M. Machine learning enabled team performance analysis in the dynamical environment of soccer. IEEE Access. 2020;8:90266–79.
  31. Berman Y, Mistry S, Mathew J, Krishna A. Temporal match analysis and recommending substitutions in live soccer games. In: 2022 IEEE International Conference on Web Services (ICWS). 2022. p. 397–404. https://doi.org/10.1109/icws55610.2022.00066
  32. González-Rodenas J, Moreno-Pérez V, Campo RL-D, Resta R, Coso JD. Evolution of tactics in professional soccer: an analysis of team formations from 2012 to 2021 in the Spanish LaLiga. J Hum Kinet. 2023;87:207–16. pmid:37559775
  33. Taki T, Hasegawa J. Quantitative measurement of teamwork in ball games using dominant region. In: Proceedings of the International Archives of Photogrammetry and Remote Sensing (IAPRS). Amsterdam, The Netherlands. 2000. p. 125–31. https://www.isprs.org/proceedings/XXXIII/congress/part5/125_XXXIII-part5s.pdf
  34. Caetano FG, Barbon Junior S, Torres R da S, Cunha SA, Ruffino PRC, Martins LEB, et al. Football player dominant region determined by a novel model based on instantaneous kinematics variables. Sci Rep. 2021;11(1):18209. pmid:34521897
  35. Rényi A. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. 1961. p. 547–62.
  36. Shannon C, Weaver W. The mathematical theory of communication. Urbana: University of Illinois Press; 1949.
  37. StatsBomb. StatsBomb Open Data. 2022. https://github.com/statsbomb/open-data.git
  38. Pappalardo L, Rossi A, Natilli M, Cintia P. Explaining the difference between men’s and women’s football. PLoS One. 2021;16(8):e0255407. pmid:34347829
  39. Tianbiao L, Andreas H. Apriori-based diagnostical analysis of passings in the football game. In: 2016 IEEE International Conference on Big Data Analysis (ICBDA). 2016. p. 1–4. https://doi.org/10.1109/icbda.2016.7509795
  40. Lago C. The influence of match location, quality of opposition, and match status on possession strategies in professional association football. J Sports Sci. 2009;27(13):1463–9. pmid:19757296
  41. Guimarães JP, Rochael M, Andrade A, Glória S, Praça G. How reaching the pitch’s final third is related to scoring opportunities in soccer?. Retos. 2021;43:171.