Dynamical networks of influence in small group discussions

In many domains of life, business and management, numerous problems are addressed by small groups of individuals engaged in face-to-face discussions. While research in social psychology has a long history of studying the determinants of small group performances, the internal dynamics that govern a group discussion are not yet well understood. Here, we rely on computational methods based on network analyses and opinion dynamics to describe how individuals influence each other during a group discussion. We consider the situation in which a small group of three individuals engages in a discussion to solve an estimation task. We propose a model describing how group members gradually influence each other and revise their judgments over the course of the discussion. The main component of the model is an influence network—a weighted, directed graph that determines the extent to which individuals influence each other during the discussion. In simulations, we first study the optimal structure of the influence network that yields the best group performances. Then, we implement a social learning process by which individuals adapt to the past performance of their peers, thereby affecting the structure of the influence network in the long run. We explore the mechanisms underlying the emergence of efficient or maladaptive networks and show that the influence network can converge towards the optimal one, but only when individuals exhibit a social discounting bias by downgrading the relative performances of their peers. Finally, we find a late-speaker effect, whereby individuals who speak later in the discussion are perceived more positively in the long run and are thus more influential. The numerous predictions of the model can serve as a basis for future experiments, and this work opens research on small group discussion to computational social sciences.


Introduction
In many domains of life, complex problems can be successfully addressed by pooling the knowledge of several individuals [1,2]. When making decisions, forming judgments, or solving multidimensional problems, groups of people can outperform the best individual in the group, and sometimes even the experts in the problem domain. In everyday life, this collective achievement is commonly accomplished by means of face-to-face group discussions, during which the exchange of information and ideas between people results in the emergence of accurate collective solutions [3]. Whereas research in social psychology has a long history of studying the performance of small group discussions, more recent methods of computational social science are less often used to address this issue [4-7]. In this context, the present article introduces a network approach to study the internal dynamics that operate during a group discussion.
Given the omnipresence of group discussions in many areas of life, the factors impacting the performance of a group discussion have been extensively studied in the past. Classical research on group performance has highlighted numerous detrimental effects that can impair the quality of the discussion [3]. For instance, the hidden profile effect refers to the situation where group members fail to share important private information and tend to focus mostly on the elements of information known by the majority of them [8,9]. Likewise, groupthink and conformity are common issues that arise during discussions and occur when the group members ignore important facts or unwillingly adopt the judgment of others to reach a non-contentious collective consensus [10,11]. Also, group discussions can be subject to polarization effects, in which the judgments of the individuals tend to become more extreme as a result of social interactions [12,13]. Nevertheless, group discussions remain a powerful means to aggregate the ideas and judgments of several people.
In controlled experimental settings, it has been shown many times that groups can outperform single individuals in a wide variety of tasks, such as detecting lies [14], reconstructing noisy signals [15], establishing a medical diagnosis [16], and solving a variety of binary-choice tasks [17].
Yet, the conditions under which a group performs well or poorly remain unclear. In a recent series of experimental studies, Woolley et al. revealed the existence of a 'collective intelligence factor' that is predictive of group performance across a wide variety of tasks [1]. That factor is not associated with the average skills of the individual group members. Rather, it strongly correlates with the social sensitivity of the individuals, that is, their ability to listen to and integrate the arguments of the others, and to balance the speaking turns across all group members. This suggests that one key aspect of group performance lies in the internal dynamics that operate during the discussion, more than in the individual skills of the group members. However, although the collective intelligence factor is a powerful indicator to anticipate the group's performance, it does not explain the underlying causal mechanisms leading to good or bad collective performances. In fact, the dynamics of the group discussion, that is, the pattern of communication that takes place during the discussion and the social influences that operate among group members, are not yet well understood.
This dynamical aspect of collective intelligence has been deeply investigated in a different domain. In the past decade, computational social scientists have begun to understand more precisely the dynamics driving judgment formation and social contagion in large populations composed of hundreds of individuals connected in social networks [18-20]. Numerical models have been proposed to describe how repeated interactions between a large number of individuals can drive a population towards a consensual judgment or, on the contrary, polarize the beliefs of the crowd [13,21-23]. These models generally rely on the assumption that agents tend to revise their judgments by averaging their own and their neighbors' judgments, gradually converging towards a consensus. A similar averaging process has also been used in numerous models of advice-taking in psychology, this time at the scale of a dyad [24,25]. Nevertheless, most existing research on opinion dynamics has dealt with large social networks, often focusing on how the network topology impacts the propagation of judgments. These methods have rarely been applied to the case of face-to-face discussions, where the group size is small (typically three to five individuals) and where all the individuals are interconnected in a full network.
In the present work, we aim at describing the internal dynamics that operate during group discussions, using tools and concepts inherited from network science and the computational social sciences. For this, we describe the group as a small social network in which each group member is represented by a node, and all the nodes are connected to one another by weighted ties representing the extent to which individuals influence each other. In simulations, we show that the structure of this influence network determines the performance of the group during a group discussion. Importantly, we also assume that individuals can adapt the weight they assign to their peers after observing their past performances: Good performers tend to become more influential, and bad performers tend to lose influence in the group. Over time, the influence network evolves and often converges to the optimal structure. Crucially, this only happens when individuals exhibit a social discounting bias, that is, when people systematically downgrade the relative performances of their peers. Finally, we show that the speaking order has significant consequences on the emerging structure of the influence network, thus drawing links to the collective intelligence factor. The surprisingly complex dynamics that emerge from our simple model open numerous experimental perspectives for future research.

Discussion dynamics
Our model describes the process of group discussions, in which $N$ individuals undertake an estimation task collectively. Each individual $i$ in the group has an initial estimate $x_i^0$ drawn from a normal distribution with mean μ and standard deviation σ. The discussion is composed of $N_r$ speaking rounds across which the individuals progressively revise their initial estimate. The estimate of individual $i$ at round $r$ is noted $x_i^r$. In each speaking round $r$, a randomly selected individual speaks up and communicates her current estimate $x_i^r$ to all the others. Every time an individual speaks up, all the other group members revise their current estimate using a weighted average procedure (see, e.g., [13,24,25]). Formally, the revised estimate of individual $j$ after individual $i$ has spoken up is given by
$$x_j^{r+1} = (1 - w_{ij})\, x_j^r + w_{ij}\, x_i^r.$$
In the above equation, the term $w_{ij}$ represents the weight that individual $j$ assigns to the speaker $i$. The weight is defined in the interval [0, 1]. According to the above equation, a weight $w_{ij} = 0$ indicates that $j$ ignores the judgment of $i$, and a weight $w_{ij} = 1$ indicates that $j$ fully adopts the judgment of $i$. The speaker does not revise her estimate in round $r$, leading to
$$x_i^{r+1} = x_i^r.$$
The same process repeats round after round, until the last round $r = N_r$. The weights $w_{ij}$ are not necessarily the same for all pairs of individuals, and the weight $w_{ij}$ is not necessarily identical to $w_{ji}$. Hence, the $N$ individuals are connected by an influence network, that is, a weighted directed graph that determines how group members influence one another during the discussion. Fig 1 illustrates the dynamics of a group discussion for two different influence networks. The above equation of social influence has been experimentally confirmed and used in numerous models of opinion dynamics (see, e.g., [12,13,21,23,25,26]). Note that, in principle, the weight factors $w_{ij}$ do not need to be bounded to the interval [0, 1].
Weights higher than 1 or lower than 0 could represent more extreme social influence phenomena, such as social repulsion ($w_{ij} < 0$) or over-adoption ($w_{ij} > 1$), which have the potential to generate group polarization [27]. Nevertheless, we choose to restrict ourselves to weights varying in the interval [0, 1] in the present study to simplify the traceability of the simulation results. Another simplification of our model is that, in contrast to other formalizations [28], the weights are associated with a given person and not with a given argument that a person formulates. The model, therefore, assumes that some individuals are naturally more influential than others, rather than considering the persuasiveness of each communicated argument separately.
Our approach differs from the simple averaging of the initial estimates that is typically used in the "wisdom-of-crowds" approach, and from the repeated averaging across all group members typically used in a DeGroot updating procedure [29]. Here, individuals only integrate the estimate of the last speaker and do not average across all individuals simultaneously. This creates complex dynamics involving judgment propagation and indirect influence among group members. In the first part of the 'Results' section, we study the optimal structure of the influence network for various group compositions.
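The discussion dynamics described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' simulation code: the function name `simulate_discussion` and the nested-list layout of the weights are assumptions made here, with `weights[i][j]` standing for the weight that listener j assigns to speaker i.

```python
import random


def simulate_discussion(x0, weights, n_rounds, rng):
    """Run one discussion and return the final estimates.

    x0       : list of initial estimates, one per individual
    weights  : weights[i][j] is the weight listener j gives to speaker i
               (assumed layout; diagonal entries are never used)
    n_rounds : number of speaking rounds
    rng      : random.Random instance used to pick the speaker
    """
    x = list(x0)
    n = len(x)
    for _ in range(n_rounds):
        i = rng.randrange(n)      # randomly selected speaker
        spoken = x[i]             # estimate communicated to the group
        for j in range(n):
            if j != i:            # the speaker keeps her own estimate
                w = weights[i][j]
                x[j] = (1.0 - w) * x[j] + w * spoken
    return x
```

With all weights set to 0 the estimates never move, and with all weights set to 1 every listener adopts the estimate of whoever speaks, which are the two limiting cases described above. Because each listener only integrates the last speaker's estimate, the final outcome depends on the random speaking order as well as on the weights.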

Social learning
Our model does not only focus on the outcome of the discussion but also on how individuals adapt to it in the long run. For this, we assume that the same group of individuals undertakes not only one, but a series of $N_T$ estimation tasks from the same problem domain. For each estimation task, a new discussion takes place between the same set of individuals, following the procedure described in the previous section. For the first discussion, the group members are strangers and know nothing about each other's skills. However, as individuals undertake repeated estimation tasks together, they can learn about and adapt to each other's past performances. This social learning aspect is represented by a change in the weights that each individual gives to the others. In other words, the influence network evolves over time, depending on how the individuals perceive their peers.
Formally, we now include a time dependency on the weights $w_{ij}(t)$, where the variable $t$ varies from 1 to $N_T$. The variable $t$ indicates the number of discussions that the pair of individuals {i,j} undertook together. Individuals who had no past interactions with their partner assign a default weight $w_{ij}(0) = w_0$ to him or her.
Previous experimental measurements have shown that individuals update the weight assigned to others based on their relative, not absolute, performances [30]. Furthermore, experimental data have also revealed the existence of a social discounting bias in this process, indicating that people tend to underweight their own error as compared to the errors of their partners [25,30,31]. In our model, we describe these facts by assuming that the weight given by $i$ to $j$ is increased by an offset $w^*$ if $j$ performed sufficiently better than $i$ during the previous discussion, and is decreased by $w^*$ otherwise. Here, $e_i^0 = |x_i^0|$ is the error that the focal individual $i$ made on her initial estimate $x_i^0$ during the previous discussion, and $e_j^* = |x_j^*|$ is the error that individual $j$ committed on the first communicated estimate during the previous discussion. This formalization reflects the fact that the focal individual $i$ does not know the initial estimate $x_j^0$ of individual $j$, and can only consider the first communicated estimate $x_j^*$ of individual $j$ to judge him or her. The parameter α is the social discounting bias. The higher α, the more strongly $i$ downgrades the quality of $j$'s judgments.
In the second part of the 'Results' section, we explore how the weights $w_{ij}(t)$, and thus the structure of the influence network, evolve as $t$ increases, and compare the emerging group structure to the optimal one. The model variables and parameters are summarized in Table 1.
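The weight update at the heart of the social learning process can be sketched as follows. This is only an illustration under explicit assumptions: the text specifies that the weight moves up or down by the offset w* depending on whether the peer performed "sufficiently better", but the exact condition used here (the peer's error plus α must be below the focal individual's own error) is an assumed additive form, and the clipping to [0, 1] is likewise an assumption. The function name `update_weight` is hypothetical.

```python
def update_weight(w, e_self, e_peer, alpha, w_step):
    """Return the updated weight that a focal individual gives to a peer.

    w      : current weight given to the peer
    e_self : focal individual's error on her own initial estimate
    e_peer : peer's error on his or her first communicated estimate
    alpha  : social discounting bias (0 means no bias)
    w_step : offset w* added or subtracted after each discussion

    ASSUMPTION: "sufficiently better" is modeled as e_peer + alpha < e_self.
    ASSUMPTION: the result is clipped so that weights stay within [0, 1].
    """
    if e_peer + alpha < e_self:
        w_new = w + w_step      # peer looked clearly better: trust more
    else:
        w_new = w - w_step      # otherwise: trust less
    return min(1.0, max(0.0, w_new))
```

Under this assumed form, a larger α makes it harder for a peer to gain weight: a peer with error 1.0 facing a focal individual with error 2.0 gains weight when α = 0 but loses weight when α = 1.5.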

Optimal group configuration
Ignoring the social learning aspect of the model for now (i.e., considering $N_T = 1$), we asked which weights $w_{ij}$ each individual should assign to all the others such that the group error is minimized. Is the group better off assigning equal weights to everybody, irrespective of the individual members' skills, or is it more efficient to give more power to the best performers?
To address this question, we varied the group composition by defining two types of individuals: 1) the good performers, for whom the initial estimates are drawn from a normal distribution with mean μ and standard deviation $\sigma_+$; and 2) the bad performers, for whom the initial estimates are drawn from a normal distribution with the same mean μ but a standard deviation $\sigma_- > \sigma_+$ (S1 Fig). We defined the group error $E$ as the average error of the group members at the end of the discussion:
$$E = \frac{1}{N} \sum_i e_i^{N_r},$$
where $e_i^{N_r} = |x_i^{N_r}|$ is the final error of individual $i$. Using an optimization procedure (see the Methods section), we computed the optimal network structure, that is, the weight values $w_{ij}$ for all pairs {i,j}, that minimizes the final error $E$ of the group for different group compositions. The results are presented in Fig 2. Groups composed of equally skilled members (either good or bad performers) reach their best performances when individuals assign an equal weight $w \approx 0.2$ to each other. When individuals do not perform equally, however, the weights need to be adjusted accordingly. For instance, in groups composed of two good and one bad performer, the group performs best when the two good performers assign a weight $w_{ij} \approx 0.2$ to one another while ignoring the bad performer, but at the same time receiving a weight $w_{ij} \approx 0.7$ from her. In the next section, we study whether groups can naturally converge towards these optimal structures via social learning.
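The group error and the search over weight configurations can be sketched as below. This is a hedged illustration, not the authors' optimization code: the function names are hypothetical, the truth is taken to be 0 (as stated for the simulations in the supporting information), and the grid uses six evenly spaced levels per weight, matching the step of 0.2 reported in the Methods. The number of rounds and the number of discussions per configuration are free parameters of the sketch.

```python
import itertools
import random


def discuss(x0, weights, n_rounds, rng):
    """One discussion; weights[i][j] is the weight listener j gives speaker i."""
    x = list(x0)
    for _ in range(n_rounds):
        i = rng.randrange(len(x))
        for j in range(len(x)):
            if j != i:
                x[j] = (1.0 - weights[i][j]) * x[j] + weights[i][j] * x[i]
    return x


def group_error(x_final):
    """Group error E: mean absolute final estimate (truth normalized to 0)."""
    return sum(abs(v) for v in x_final) / len(x_final)


def mean_group_error(weights, sigmas, n_rounds, n_discussions, rng):
    """Average E over repeated discussions; sigmas[i] is the skill of member i."""
    total = 0.0
    for _ in range(n_discussions):
        x0 = [rng.gauss(0.0, s) for s in sigmas]
        total += group_error(discuss(x0, weights, n_rounds, rng))
    return total / n_discussions


def weight_grid(n=3, n_levels=6):
    """Yield every weight matrix with off-diagonal entries on an even grid in [0, 1]."""
    levels = [k / (n_levels - 1) for k in range(n_levels)]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for values in itertools.product(levels, repeat=len(pairs)):
        w = [[0.0] * n for _ in range(n)]
        for (i, j), v in zip(pairs, values):
            w[i][j] = v
        yield w
```

For three members, `weight_grid()` enumerates 6^6 = 46656 matrices. A coarse search could then take the form `min(weight_grid(n_levels=3), key=lambda w: mean_group_error(w, [1.0, 1.0, 5.0], 10, 200, rng))`, where the list of standard deviations encodes two good performers and one bad performer; the sketch returns a single best matrix and does not average over the best configurations.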

Emerging patterns
Next, we addressed the question of whether groups can self-organize to reach the optimal structures described in Fig 2. For this, we conducted another set of simulations, this time allowing for social learning across a series of $N_T = 100$ discussions. For each group composition, we also varied the value of the social discounting bias α in the interval [0, 2]. For all values of α, we measured the average group performance after $N_T = 100$ discussions, for different group compositions. Surprisingly, we found that the best collective performances are found for a social discounting bias α > 0 (Fig 3). That is, individuals do benefit from moderately downgrading the performances of their peers. To better understand this result, we looked at the associated network structures for three values of α (α = 0, α = 0.1, and α = 1). The results are shown in Fig 4. It is visible from this figure that in the absence of bias (i.e., α = 0) the weights that individuals assign to each other are too high. However, increasing the bias tends to reduce the overall weight values. When the social discounting bias is large enough, the weights of the influence networks match the optimal ones presented in Fig 2, and yield the best group performances shown in Fig 3.
Fig 2 (caption, panels B to D). (B) In groups composed of two good performers and one bad performer (here $p_3$ is the bad performer, depicted in red), the group performs best when the two good performers assign a weight $w_{ij} \approx 0.2$ to one another while ignoring the bad performer, but at the same time receiving a weight $w_{ij} \approx 0.7$ from her. (C) In groups composed of two bad performers and one good performer (here $p_1$ is the good performer), the best collective performance is found when $p_1$ receives a strong weight $w_{ij} = 0.9$ from the two others while at the same time ignoring them. The two bad performers give each other a weight of 0.5. (D) When all group members are bad performers, the best collective performance is found when all individuals assign the same weight $w_{ij} \approx 0.2$ to all others, similar to (A). The width of the arrows is proportional to $w_{ij}$, that is, the weight given by $j$ to the judgment of $i$. https://doi.org/10.1371/journal.pone.0190541.g002
Why do people benefit from downgrading the performances of their peers? Social discounting is necessary to counterbalance the fact that individuals tend to overestimate the skills of their peers. The reason is that individuals judge the performance of the others based on the first estimate $x_i^*$ they communicated, which is generally better than their real initial estimate $x_i^0$. For instance, in the illustrative discussion dynamics sketched in Fig 1A, the individual $p_3$ (in yellow) communicates her first estimate $x_3^*$ at round 4. This estimate is very close to the true value, giving the impression that $p_3$ is an excellent performer. However, the actual initial estimate $x_3^0$ of $p_3$ was far off. Because the initial estimate $x_3^0$ was not communicated, the other group members could only judge the performance of $p_3$ based on $x_3^*$ and thus overestimated her skills. Generally speaking, the first communicated estimate $x_i^*$ tends to be more accurate than the initial estimate $x_i^0$, because $x_i^*$ has been revised in light of what the others have communicated before [2,32]. For that reason, the weights are usually too high when α = 0 (Fig 4). Social discounting can correct this overestimation and is therefore beneficial to the group members.
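The claim that the first communicated estimate tends to be more accurate than the initial estimate can be checked with a small simulation. The sketch below is illustrative only: it assumes a single uniform weight `w` shared by all pairs (a simplification introduced here), a truth of 0, and a hypothetical function name. It compares, for every individual who spoke at least once, the error of the initial estimate with the error of the estimate she first communicated.

```python
import random


def first_vs_initial_error(sigmas, w, n_rounds, n_discussions, rng):
    """Return (mean initial error, mean error of first communicated estimate).

    Only individuals who spoke at least once contribute to the averages.
    ASSUMPTION: every listener gives the same weight w to every speaker.
    """
    n = len(sigmas)
    sum_e0 = 0.0
    sum_estar = 0.0
    count = 0
    for _ in range(n_discussions):
        x = [rng.gauss(0.0, s) for s in sigmas]
        e0 = [abs(v) for v in x]
        has_spoken = [False] * n
        for _ in range(n_rounds):
            i = rng.randrange(n)
            if not has_spoken[i]:
                # First time i speaks: the others only ever see this estimate.
                has_spoken[i] = True
                sum_e0 += e0[i]
                sum_estar += abs(x[i])
                count += 1
            for j in range(n):
                if j != i:
                    x[j] = (1.0 - w) * x[j] + w * x[i]
    return sum_e0 / count, sum_estar / count
```

For equally skilled members, the second returned value should come out lower than the first, because everyone except the very first speaker communicates an estimate that has already been averaged with other independent estimates.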

Speaking order effect
One important side effect of the above mechanism is that individuals who tend to speak for the first time later in the discussion are more likely to be positively perceived by their peers. In fact, one can remain silent during the beginning of the discussion, integrate the estimates communicated by the others, and speak up later to communicate a revised and more accurate estimate to the rest of the group. This would have the effect of giving others the impression that the late speaker is a good performer. We evaluated the late-speaker effect in an additional series of simulations, by manipulating the round at which one group member speaks up for the first time. As predicted, the average weight $w_{ij}(N_T)$ that individual $j$ receives from the others after $N_T$ discussions increases significantly as $j$ speaks for the first time later in the discussion (Fig 5A). The late-speaker effect is attenuated for the calibrated value α = 0.1, but does not disappear completely.
This result contrasts with the empirical fact that the individual who speaks first in a group deliberation has a stronger impact on the outcome of the discussion (generally known as the anchoring bias; see, e.g., [33]). Interestingly, this "first-speaker" effect is also visible in our simulations (Fig 5B). In fact, the first-speaker and late-speaker effects are not incompatible: On the one hand, individuals who speak early during a discussion have a stronger influence on that discussion. On the other hand, individuals who speak late during a discussion have a stronger influence in the long run, because they tend to receive greater weights from others.
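The mechanism behind the late-speaker effect can be illustrated by forcing one individual to stay silent for a given number of rounds. The sketch below is an assumption-laden illustration, not the manipulation used for Fig 5: individual 0 is excluded from speaking until round `first_round`, all pairs share one uniform weight `w`, the truth is 0, and the function name is hypothetical. It measures only the error of her first communicated estimate, which is what her peers would use to judge her, and not the resulting weights.

```python
import random


def first_estimate_error(first_round, sigmas, w, n_discussions, rng):
    """Mean error of individual 0's first communicated estimate.

    Individual 0 stays silent for `first_round` rounds (0 means she opens
    the discussion) while the other members speak in random order.
    ASSUMPTION: every listener gives the same weight w to every speaker.
    """
    n = len(sigmas)
    total = 0.0
    for _ in range(n_discussions):
        x = [rng.gauss(0.0, s) for s in sigmas]
        for _ in range(first_round):
            i = rng.randrange(1, n)     # only the other members speak
            for j in range(n):
                if j != i:
                    x[j] = (1.0 - w) * x[j] + w * x[i]
        total += abs(x[0])              # what she communicates when she first speaks
    return total / n_discussions
```

Comparing `first_estimate_error(0, ...)` with `first_estimate_error(5, ...)` for equally skilled members should show a smaller error for the later first speech, consistent with the argument that late speakers appear to be better performers than they initially were.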

Discussion
Based on methods inspired by network science and opinion dynamics, we studied how the internal structure of a group could emerge and shape the group's collective performances. For this, we introduced the influence network, a weighted, directed graph that determines the extent to which each group member influences the others. We showed that the structure and the evolution of that influence network could be a major determinant of group performance: Groups perform well when their internal structure accurately reflects the skills of the group members, but perform poorly otherwise. It is also interesting to compare the performances of face-to-face discussions with those of other methods of collective intelligence such as the wisdom-of-the-crowds approach (WOC). In contrast to face-to-face discussions, the WOC computes the average estimate of the group members in the absence of any social interactions [34]. Our additional simulations (see supplementary S2 Fig) show that the WOC outperforms the group discussion when the skills of the group members are similar. However, for groups composed of a mixture of good and bad performers, the discussion outperforms the WOC in the long run because the group members will eventually find out who the best performers are and follow them while ignoring the judgments of the bad performers. In other words, groups can [...] of individuals with higher social sensitivity tend to perform better than those with lower social sensitivity. An important question would then be whether this correlation between the social sensitivity of the individuals and the group's performance could be explained by the structure of the group's influence network. It is conceivable that individuals with a higher social sensitivity have a better ability to perceive the skills of their peers and to adjust the weights they give them during a discussion and in the long run.
On the contrary, individuals with a lower social sensitivity would fail to adequately balance the weight they give to one another and produce maladaptive influence networks leading to poor collective performances.
Another important component of the collective intelligence factor is the ability of the group members to take conversational turns equally. Experiments have shown that groups where a few people dominate the conversation are outperformed by those with an equal distribution of speaking turns. In our simulations, however, all individuals have an equal probability of speaking up at each discussion round, and the impact of unbalanced speaking turns was not explored. The reason is that the relationship between an individual's skills, social influence, and speaking frequency is unclear. The speaking probability can be affected by the individual's skills, or by the individual's status in the influence network. This aspect of the discussion dynamics needs to be evaluated experimentally.
In sum, our simple model produces a rich set of predictions that could help explain existing findings on group discussion. This work calls for a series of experimental studies that would (1) validate the predictions, (2) test the relationship with the individual's social sensitivity, and (3)

Methods
The optimal influence networks presented in Fig 2 were computed through an exhaustive search optimization procedure. For each of the four group compositions presented in Fig 2, we systematically varied the six weight values of the network in the interval [0, 1], with steps of 0.2. In such a way, we tested a total of $6^6 = 46656$ different configurations for each group composition. For each configuration, we measured the average group error across 5000 discussions. The 30 configurations that produced the smallest group errors were then merged by averaging the weights $w_{ij}$ across them. The six resulting weights are those presented in Fig 2.

Supporting information
S1 Fig. Performance of the agents.
In the simulations, good performers have their initial estimate $x^0$ randomly drawn from a normal distribution with mean 0 and standard deviation 1 (blue distribution). Bad performers draw their initial estimate $x^0$ from a normal distribution with mean 0 and standard deviation 5. Estimates are assumed to be normalized such that the truth always equals 0. In such a way, the error $e$ associated with a given estimate $x$ is simply given by $e = |x|$. (EPS)

S2 Fig. Wisdom-of-the-crowds.
Comparison between the performances of the group discussions and the wisdom-of-the-crowds approach (WOC) for different group compositions and over 100 learning rounds. The WOC is evaluated by measuring the error of the average estimate of the group members before the discussion starts. The WOC does not involve interaction between group members and is therefore identical across all learning rounds for a given group composition. In contrast, the performance of the group discussions depends on the weights that the group members assign to one another and therefore changes over learning rounds. When all group members are equally skilled (either all good or all bad), the discussion is outperformed by the WOC (in A and D). However, when there exist skill differences within the group (in B and C), the discussion eventually outperforms the WOC because group members gradually learn to rely on the judgment of their best performers, whereas the WOC weights the judgments of the good and bad performers equally. Results are averaged over 5000 simulations, with $N = 3$ and α = 0.
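The two quantities compared in S2 Fig can be computed side by side as in the sketch below. It is a minimal illustration under stated assumptions: the truth is 0, the weights are fixed rather than learned over rounds, and the function names and the `weights[i][j]` layout (weight that listener j gives to speaker i) are hypothetical. It therefore shows how the comparison is set up, not the learning curves themselves.

```python
import random


def woc_error(x0):
    """WOC error: absolute value of the average initial estimate (truth is 0)."""
    return abs(sum(x0) / len(x0))


def discussion_error(x0, weights, n_rounds, rng):
    """Group error after one discussion: mean absolute final estimate."""
    x = list(x0)
    n = len(x)
    for _ in range(n_rounds):
        i = rng.randrange(n)
        for j in range(n):
            if j != i:
                x[j] = (1.0 - weights[i][j]) * x[j] + weights[i][j] * x[i]
    return sum(abs(v) for v in x) / n


def compare(sigmas, weights, n_rounds, n_simulations, rng):
    """Return (mean WOC error, mean discussion error) over many simulations."""
    woc = 0.0
    disc = 0.0
    for _ in range(n_simulations):
        x0 = [rng.gauss(0.0, s) for s in sigmas]
        woc += woc_error(x0)
        disc += discussion_error(x0, weights, n_rounds, rng)
    return woc / n_simulations, disc / n_simulations
```

Note that the two measures are not symmetric: the WOC error is the error of a single averaged estimate, whereas the discussion error averages the individual errors of the members. With all weights at 0 the discussion error reduces to the mean individual error, which can never be smaller than the WOC error.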