Table 1.
Training workload features used in our study.
Description of the training workload features extracted from GPS data and the players’ personal features collected during the study. We defined four categories of features: kinematic features (blue), metabolic features (red), mechanical features (green) and personal features (white).
Fig 1.
Construction of the training dataset and the forecasting model.
In Step 1 we split the dataset into two parts: TTRAIN (30% of T) and TTEST (70% of T). We then oversample the minority class in TTRAIN using ADASYN, select the most important features and tune the hyperparameters (Step 2). Finally, we split TTEST into two folds in order to perform a stratified cross-validation (Step 3).
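The three steps above can be sketched in Python with scikit-learn. This is a minimal illustration on synthetic data, not the paper's actual pipeline: the paper uses ADASYN (available in the imbalanced-learn package), which is stood in for here by naive duplication of minority-class rows so the sketch stays self-contained, and the feature-selection method and hyperparameter grid are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     cross_val_score, train_test_split)
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced dataset standing in for the workload features T
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Step 1: stratified split into TTRAIN (30% of T) and TTEST (70% of T)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.3,
                                          stratify=y, random_state=0)

# Step 2a: oversample the minority class in TTRAIN
# (the paper uses ADASYN; plain duplication with replacement stands in here)
minority = np.flatnonzero(y_tr == 1)
extra = np.random.default_rng(0).choice(minority,
                                        len(y_tr) - 2 * len(minority))
idx = np.concatenate([np.arange(len(y_tr)), extra])
X_bal, y_bal = X_tr[idx], y_tr[idx]

# Step 2b: select the most important features (ANOVA F-score, k assumed)
selector = SelectKBest(f_classif, k=10).fit(X_bal, y_bal)
X_sel = selector.transform(X_bal)

# Step 2c: tune the hyperparameters (grid is illustrative)
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_depth": [3, 5, None]}, cv=3)
grid.fit(X_sel, y_bal)

# Step 3: split TTEST into two folds for a stratified cross-validation
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
scores = cross_val_score(grid.best_estimator_, selector.transform(X_te),
                         y_te, cv=cv, scoring="f1")
print(round(scores.mean(), 3))
```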
Fig 2.
Distributions of the performances of the classifiers DT, LR and RF, obtained by testing the algorithms 10,000 times. The figure also shows the performance of the baselines and of the ACWR- and MSWR-based injury forecasters.
Table 2.
Performance of DT compared to RF, LR, the four baselines and the ACWR- and MSWR-based forecasters.
For each forecaster we report precision, recall and F1 on the two classes and the overall AUC.
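The per-class precision, recall and F1 together with the overall AUC can be computed as below. This is a generic sketch on synthetic data with an illustrative decision tree, not the paper's fitted forecaster; the class labels in the printout are only placeholders for the No-Injury/Injury classes.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for the injury dataset
X, y = make_classification(n_samples=400, weights=[0.85, 0.15], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

clf = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_tr, y_tr)

# Precision, recall and F1 reported separately for the two classes
prec, rec, f1, _ = precision_recall_fscore_support(
    y_te, clf.predict(X_te), zero_division=0)
# Overall AUC from the predicted probability of the Injury class
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

for cls, label in enumerate(["No-Injury", "Injury"]):
    print(f"{label}: precision={prec[cls]:.2f} "
          f"recall={rec[cls]:.2f} F1={f1[cls]:.2f}")
print(f"AUC={auc:.2f}")
```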
Fig 3.
Performance of forecasters in the evolutive scenario.
As the season goes by, we plot week by week the cumulative F1-score of the forecasters DT, RF, LR, B1, …, B4 trained on the data collected up to that week. Black crosses indicate injuries not detected by DT; red crosses indicate injuries correctly predicted by DT. For every week i we highlight in red the number of injuries detected by DT up to week i.
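The evolutive scenario can be sketched as follows: at each week, retrain on all data collected so far, forecast the next weekly batch, and track the F1-score accumulated over all forecasts made to date. The weekly batch sizes, the 8 features, the injury rate and the decision-tree depth below are all illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
n_weeks = 20

# Hypothetical weekly batches: 25 sessions with 8 workload features
# and a binary injury label (sizes and rates are illustrative)
weeks = []
for _ in range(n_weeks):
    y = rng.random(25) < 0.15
    y[0] = True  # guarantee at least one injury per weekly batch
    weeks.append((rng.normal(size=(25, 8)), y))

preds, truth, cumulative_f1 = [], [], []
for i in range(1, n_weeks):
    # train only on the data collected up to week i
    X_tr = np.vstack([weeks[j][0] for j in range(i)])
    y_tr = np.concatenate([weeks[j][1] for j in range(i)])
    clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

    # forecast week i, then score all forecasts made so far
    preds.append(clf.predict(weeks[i][0]))
    truth.append(weeks[i][1])
    cumulative_f1.append(f1_score(np.concatenate(truth),
                                  np.concatenate(preds), zero_division=0))
```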
Fig 4.
Interpretation of the multi-dimensional injury forecaster.
(a) The six injury rules extracted from DT. For each rule we show the range of values of every feature, its frequency (Freq) and accuracy (Acc). (b) A schematic visualization of the decision tree. Black boxes are decision nodes, green boxes are leaf nodes for class No-Injury, red boxes are leaf nodes for class Injury.