Table 1.
Training workload features used in our study.
Description of the training workload features extracted from GPS data and the players’ personal features collected during the study. We defined four categories of features: kinematic features (blue), metabolic features (red), mechanical features (green) and personal features (white).
Fig 1.
Construction of the training dataset and the forecasting model.
In Step 1 we split the dataset into two parts: TTRAIN (30% of T) and TTEST (70% of T). We then oversample the minority class in TTRAIN using ADASYN, select the most important features and tune the hyperparameters (Step 2). Finally, we split TTEST into two folds in order to perform a stratified cross-validation (Step 3).
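The three steps above can be sketched in Python with scikit-learn. This is a minimal illustration on synthetic data, not the paper's actual pipeline: the paper uses ADASYN (available in the imbalanced-learn package), which is stood in for here by naive duplication of minority-class rows so the sketch stays self-contained, and the feature-selection method and hyperparameter grid are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     cross_val_score, train_test_split)
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced dataset standing in for the workload features T
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Step 1: stratified split into TTRAIN (30% of T) and TTEST (70% of T)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.3,
                                          stratify=y, random_state=0)

# Step 2a: oversample the minority class in TTRAIN
# (the paper uses ADASYN; plain duplication with replacement stands in here)
minority = np.flatnonzero(y_tr == 1)
extra = np.random.default_rng(0).choice(minority,
                                        len(y_tr) - 2 * len(minority))
idx = np.concatenate([np.arange(len(y_tr)), extra])
X_bal, y_bal = X_tr[idx], y_tr[idx]

# Step 2b: select the most important features (ANOVA F-score, k assumed)
selector = SelectKBest(f_classif, k=10).fit(X_bal, y_bal)
X_sel = selector.transform(X_bal)

# Step 2c: tune the hyperparameters (grid is illustrative)
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_depth": [3, 5, None]}, cv=3)
grid.fit(X_sel, y_bal)

# Step 3: split TTEST into two folds for a stratified cross-validation
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
scores = cross_val_score(grid.best_estimator_, selector.transform(X_te),
                         y_te, cv=cv, scoring="f1")
print(round(scores.mean(), 3))
```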
Fig 2.
Distributions of the performances of the classifiers DT, LR and RF, obtained by testing the algorithms 10,000 times. The figure also shows the performance of the baselines and of the ACWR- and MSWR-based injury forecasters.
Table 2.
Performance of DT compared to RF, LR, the four baselines and the ACWR- and MSWR-based forecasters.
For each forecaster we report precision, recall and F1 on the two classes and the overall AUC.
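The per-class precision, recall and F1 together with the overall AUC can be computed as below. This is a generic sketch on synthetic data with an illustrative decision tree, not the paper's fitted forecaster; the class labels in the printout are only placeholders for the No-Injury/Injury classes.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for the injury dataset
X, y = make_classification(n_samples=400, weights=[0.85, 0.15], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

clf = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_tr, y_tr)

# Precision, recall and F1 reported separately for the two classes
prec, rec, f1, _ = precision_recall_fscore_support(
    y_te, clf.predict(X_te), zero_division=0)
# Overall AUC from the predicted probability of the Injury class
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

for cls, label in enumerate(["No-Injury", "Injury"]):
    print(f"{label}: precision={prec[cls]:.2f} "
          f"recall={rec[cls]:.2f} F1={f1[cls]:.2f}")
print(f"AUC={auc:.2f}")
```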
Fig 3.
Performance of forecasters in the evolutive scenario.
As the season goes by, we plot week by week the cumulative F1-score of the forecasters DT, RF, LR, B1, …, B4 trained on the data collected up to that week. Black crosses indicate injuries not detected by DT; red crosses indicate injuries correctly predicted by DT. For every week i we highlight in red the number of injuries detected by DT up to week i.
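The evolutive scenario can be sketched as follows: at each week, retrain on all data collected so far, forecast the next weekly batch, and track the F1-score accumulated over all forecasts made to date. The weekly batch sizes, the 8 features, the injury rate and the decision-tree depth below are all illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
n_weeks = 20

# Hypothetical weekly batches: 25 sessions with 8 workload features
# and a binary injury label (sizes and rates are illustrative)
weeks = []
for _ in range(n_weeks):
    y = rng.random(25) < 0.15
    y[0] = True  # guarantee at least one injury per weekly batch
    weeks.append((rng.normal(size=(25, 8)), y))

preds, truth, cumulative_f1 = [], [], []
for i in range(1, n_weeks):
    # train only on the data collected up to week i
    X_tr = np.vstack([weeks[j][0] for j in range(i)])
    y_tr = np.concatenate([weeks[j][1] for j in range(i)])
    clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

    # forecast week i, then score all forecasts made so far
    preds.append(clf.predict(weeks[i][0]))
    truth.append(weeks[i][1])
    cumulative_f1.append(f1_score(np.concatenate(truth),
                                  np.concatenate(preds), zero_division=0))
```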
Fig 4.
Interpretation of the multi-dimensional injury forecaster.
(a) The six injury rules extracted from DT. For each rule we show the range of values of every feature, its frequency (Freq) and accuracy (Acc). (b) A schematic visualization of the decision tree. Black boxes are decision nodes, green boxes are leaf nodes for class No-Injury, red boxes are leaf nodes for class Injury.