Fig 1.
The distributions of task scores for 15 tasks for 68 teams.
Fig 2.
The value distributions of baseline team features for 68 teams.
Table 1.
An example of a partial text chat log recorded while a non-talking team was working on tasks.
Fig 3.
F2-scores for binary classification of log messages into responses and non-responses for different time windows [t1, t2] (in seconds).
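As a reminder of what the F2-score measures, the following is a minimal sketch of the standard F-beta score computed from confusion counts, with beta = 2 so that recall is weighted more heavily than precision. The function name `f_beta` and the example counts are illustrative assumptions and are not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion counts; beta = 2 gives the F2-score."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Convention chosen here (an assumption): return 0.0 when nothing is defined.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts: 8 responses found, 4 false alarms, 2 responses missed.
example_f2 = f_beta(tp=8, fp=4, fn=2)
```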
Fig 4.
An example of two types of collaboration networks constructed from a team’s chat log.
In the shown sparse network S50, 50% of the lowest-weight edges of W have been dropped.
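The construction of a sparse network from the weighted network W can be sketched as follows. This is a minimal illustration that assumes the edges are stored as a dictionary from (sender, receiver) pairs to weights, that the number of dropped edges is rounded down, and that ties are broken by insertion order; the function name `sparsify` and the toy weights are hypothetical.

```python
def sparsify(weights: dict, drop_fraction: float = 0.5) -> set:
    """Return the unweighted edge set left after dropping the lowest-weight edges."""
    ranked = sorted(weights, key=weights.get)  # edges, lowest weight first
    n_drop = int(len(ranked) * drop_fraction)  # assumption: round down
    return set(ranked[n_drop:])


# Toy weighted network W with four directed edges (made-up weights).
W = {("a", "b"): 5, ("b", "a"): 1, ("a", "c"): 3, ("c", "b"): 2}
S50 = sparsify(W, drop_fraction=0.5)  # keeps the two heaviest edges
```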
Fig 5.
Correlation between baseline team features as well as their correlation with the task scores.
Only statistically significant correlation values are displayed, with a p-value threshold of 0.05. The p-values have been corrected using the Benjamini-Hochberg procedure.
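For readers unfamiliar with the correction, the following is a minimal sketch of the standard Benjamini-Hochberg adjustment. It is not the code used to produce the figure; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four correlations.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]  # threshold of 0.05, as in the caption
```

With these made-up inputs only the first correlation remains significant after correction, even though three of the raw p-values are below 0.05.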
Fig 6.
The team features emphasized in the regression models built with baseline team features.
Each row includes the team feature coefficients of the best elastic net regression model for a given task. In the best models, the regularization is dominated by LASSO, with λ1: λ2 = 75%: 25% in the loss function (1).
Fig 7.
The prediction accuracy of the baseline best regression models for each task.
On the displayed scatter plots, the x-coordinate corresponds to the true task score of a team, the y-coordinate corresponds to the predicted task score, and each point corresponds to a task score prediction for a single team. The root mean square errors (RMSEs) for predictions are also reported.
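The reported error measure can be sketched as follows. This is a minimal illustration of the usual RMSE definition over paired true and predicted scores; the function name `rmse` and the sample values are assumptions made for the example.

```python
import math


def rmse(true_scores, predicted_scores):
    """Root mean square error between paired true and predicted scores."""
    if len(true_scores) != len(predicted_scores) or not true_scores:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_scores, predicted_scores)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up scores for three teams on one task.
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```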
Fig 8.
The series of scores for each of 15 tasks for each of 68 teams.
The teams are arranged from the lowest- (top left) to the highest- (bottom right) performing with respect to the mean task score. Each score series is accompanied by the best fit line.
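A best fit line for one team's score series can be sketched as below. The sketch assumes an ordinary least-squares fit with tasks indexed 1, 2, ..., which the caption does not state explicitly; the function name `fit_line` and the sample series are hypothetical.

```python
def fit_line(scores):
    """Least-squares line through (1, s1), (2, s2), ...; returns (slope, intercept)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up score series for one team over four tasks.
slope, intercept = fit_line([2.0, 4.0, 6.0, 8.0])
```

A positive slope would indicate a team whose scores tend to improve over the task sequence.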
Fig 9.
For each team, its average score over all tasks but the first (to avoid overfitting) is compared to its score on the first task.
The two measures are significantly correlated (ρ = 0.50, p = 1.73 ⋅ 10^−5).
Fig 10.
Correlation between the performance dynamics-based features of Sec. 3.2 and the baseline team features.
Only the statistically significant correlation values are displayed, with a p-value threshold of 0.05. The p-values have been corrected using the Benjamini-Hochberg procedure.
Fig 11.
Correlation between the performance dynamics-based team features and task scores.
Only the statistically significant correlation values are displayed, with a p-value threshold of 0.05. The p-values have been corrected using the Benjamini-Hochberg procedure.
Fig 12.
The features based on the dynamics of historical team performance emphasized in the best regression models.
Fig 13.
The prediction accuracy of the best regression models obtained using the team performance dynamics features.
On the displayed scatter plots, the x-coordinate corresponds to the true task score of a team, the y-coordinate corresponds to the predicted task score, and each point corresponds to a task score prediction for a single team. Since prediction requires information about the teams’ performance on the previous tasks, the performance for the first three tasks is not predicted.
Table 2.
Comparison of RMSEs for team performance prediction on 15 tasks using the general method of Sec. 3.1 with baseline features (Baseline) and the performance dynamics-based features (Dynamic), as well as using several standard time series extrapolation methods.
Fig 14.
Correlation between the collaboration team features of Sec. 3.3 and the task scores.
Only the statistically significant correlation values are displayed, with a p-value threshold of 0.05. The p-values have been corrected using the Benjamini-Hochberg procedure. The feature names of the (dense weighted) collaboration networks have the prefix “wn”, the feature names of the sparse unweighted collaboration networks with X% of the lowest-weight edges dropped have the prefix “snX”, and the general chat log-based features have the prefix “log”.
Table 3.
Top log-based team features significantly correlated with task scores.
Fig 15.
The relationship between the difference of the mean and the standard deviation (STD) of the out-degree of the collaboration network and its algebraic connectivity for each of 34 teams.
Fig 16.
The chat log-based team features emphasized in the best regression models.
Each row includes the team feature coefficients of the best elastic net regression model for a given task. The feature names of the (dense weighted) collaboration networks have the prefix “wn”, the feature names of the sparse unweighted collaboration networks with X% of the lowest-weight edges dropped have the prefix “snX”, and the general chat log-based features have the prefix “log”.
Table 4.
Top 4 chat log-based team features emphasized in the best regression models.
Fig 17.
The accuracy of team performance prediction based on chat log-based and, particularly, network-based features.
The best elastic net regression model is used for score prediction of each team on each task. On the displayed scatter plots, the x-coordinate corresponds to the true task score of a team, the y-coordinate corresponds to the predicted task score, and each point corresponds to a task score prediction for a single team. The root mean square errors (RMSEs) for predictions are also reported.
Table 5.
Comparison of RMSEs for team performance prediction on 15 tasks using the general method of Sec. 3.1 with the baseline features of Sec. 2 (Baseline), the team performance dynamics-based features of Sec. 3.2 (Dynamic), and the features extracted from the text chat logs (Log-based), respectively.
Fig 18.
Relative cumulative payoffs over a sequence of 15 tasks corresponding to different workload distribution policies.
The payoffs are scaled by the payoff of the oracle baseline policy.
Fig 19.
Workload distributions ω(t) chosen by each policy on each of 15 tasks.
The workload distributions of the first two baseline policies are time-invariant. The other policies, which rely on historical data, fall back to Pmean when not enough historical data is available (for example, both least squares and ARMA fitting require at least 2 observed data points). For the first task, Pmean chooses a uniform workload distribution, as in Puniform.