
Fig 1.

Structure of the ECM.

The ECM consists of two layers, one for the percepts and one for the actions. Percepts and actions are connected by edges whose weight hij determines the transition probability from the given percept to each action (see Sec. 2.2 for details on the model).
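
As a minimal sketch of how edge weights could translate into transition probabilities, the snippet below assumes the standard projective simulation rule in which the probability of an action is its edge weight divided by the sum of the weights leaving the percept. The caption defers the actual rule to Sec. 2.2, so this normalization, the function names, and the example h-values are illustrative assumptions.

```python
import random


def transition_probabilities(h_row):
    """Normalize the edge weights leaving one percept into probabilities.

    Assumed rule (not confirmed by the caption): p_j = h_j / sum_k h_k.
    """
    total = sum(h_row)
    return [h / total for h in h_row]


def choose_action(h_row, rng=random):
    """Sample an action index according to the normalized edge weights."""
    probs = transition_probabilities(h_row)
    return rng.choices(range(len(h_row)), weights=probs, k=1)[0]


# Hypothetical h-values for one percept with two actions ("go", "turn").
h_row = [3.0, 1.0]
probs = transition_probabilities(h_row)
print(probs)  # [0.75, 0.25]
```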


Fig 2.

Graphical representation of the percepts’ meaning.

Only the front visual range (colored region) is considered, which corresponds to the values that category sf can take. The focal agent is drawn with a larger arrow than its frontal neighbors. The agent sees only the neighbors inside its visual range; it can distinguish whether the majority of them are receding (light blue) or approaching (dark blue) and whether they number fewer than three or three or more.


Fig 3.

Structure of the simulation.

Each ensemble of agents is trained for 10^4 trials, where each trial consists of 50 global interaction rounds (g.i.r.). At each g.i.r., the agents interact sequentially (see text for details).
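
The nesting described in this caption can be sketched as three loops: trials, global interaction rounds within a trial, and sequential agent turns within a round. The `CountingAgent` class and its `step` method are hypothetical placeholders for whatever an agent does in its turn; the caption does not specify those details.

```python
def run_training(agents, n_trials, n_rounds=50):
    """Trials -> global interaction rounds -> sequential agent turns."""
    interactions = 0
    for _trial in range(n_trials):
        for _round in range(n_rounds):
            for agent in agents:  # agents act one after another
                agent.step()      # hypothetical: perceive, decide, move
                interactions += 1
    return interactions


class CountingAgent:
    """Placeholder agent that only counts how often it was asked to act."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


agents = [CountingAgent() for _ in range(3)]
total = run_training(agents, n_trials=2)
print(total)  # 300 = 2 trials x 50 rounds x 3 agents
```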


Fig 4.

1D environment (world).

Agents are initialized randomly within the first 2VR positions. Food is located at positions F and F′. dF is the distance from the center of the initial region C to the food positions.


Table 1.

Description of the parameters used in the learning simulations with PS.


Fig 5.

Learning curves for dF = 4, 10, 21 and dF = 21 for non-interacting (n.i.) agents.

Each curve shows the percentage of agents that reach the food source and obtain a reward of R = 1 at each trial. For each task, the average is taken over 20 independent ensembles of 60 agents each, and the shaded area indicates the standard deviation. Zooming into the initial phase of the learning process, the inset shows faster learning in the task with dF = 10 than in the task with dF = 21. In the case of dF = 21, no agent is able to reach the food source in the first trial, and it takes the interacting agents approx. 200 trials to outperform the n.i. agents.


Fig 6.

Learned behavior at the end of the training process.

The final probabilities in the agents’ ECM for the action “go” are shown for each of the 25 percepts (5x5 table). (a) and (b) Final probabilities learned in the scenarios with dF = 21 and dF = 4, respectively. For each learning task, the average is taken over 20 ensembles of 60 agents each. Background colors are given to easily identify the learned behavior, where blue denotes that the preferred action for that percept is “go” and orange denotes that it is “turn”. More specifically, the darker the color is, the higher the probability for that action, ranging from grey (p ≃ 0.5), light (0.5 < p < 0.7) and normal (0.7 ≤ p < 0.9) to dark (p ≥ 0.9). Panels (c) and (d) show what the tables would look like if the behavior were purely based on alignment (agent aligns to its neighbors with probability 1) or cohesion (agent goes towards the region with higher density of neighbors with probability 1), respectively. See text for details.


Fig 7.

Final probability of taking the action “go” depending on the learning task (increasing distance to food source dF) for four significant percepts.

The percepts are (< 3r, < 3a), (< 3r, ≥ 3a), (< 3a, < 3r), (≥ 3a, < 3r), respectively (see legend). The average is taken over the agents’ ECM of 20 independently trained ensembles (1200 agents) at the end of the learning process. Each ensemble performs one task per simulation (dF does not change during the learning process).


Fig 8.

Trajectories (position vs. time) of an ensemble of 60 agents in one trial prior to any learning process.

The vertical axis displays the position of the agent in the world and the horizontal axis the interaction round (note that the trial consists of n = 50 rounds). Each line corresponds to the trajectory of one agent. However, some agents’ trajectories overlap, which is indicated by the color intensity. The trajectory of one particular agent is highlighted for clarity.


Fig 9.

Trajectories of all agents of an ensemble in the last trial of the learning process for (a) dF = 21 and (b) dF = 4.

Ensembles of agents trained to find distant food form aligned swarms (a), whereas agents trained to find nearby food form cohesive, unaligned swarms (b). With the same number of interaction rounds, aligned swarms (a) cover larger distances than cohesive swarms (b). In addition, observe that trajectories in panel (b) spread less than in Fig 8.


Fig 10.

Evolution of the global alignment parameter through the learning processes with dF = 4 and dF = 21.

At each trial, there is one data point that displays the average of the order parameter, first over all the (global) interaction rounds of the trial and then over 20 different ensembles of agents, where each ensemble learns the task independently. Shaded areas represent one standard deviation.


Fig 11.

Evolution of the average number of neighbors around each agent through the learning processes with dF = 4 and dF = 21.

At each trial, there is one data point that displays the average of M, first over all the (global) interaction rounds of the trial and then over 20 different ensembles of agents, where each ensemble learns the task independently. Shaded areas represent one standard deviation.


Fig 12.

Trajectories of an ensemble of 60 agents, in a world of size W = 8000, shown over 5000 interaction rounds.

(a) Agents trained with dF = 21 form a swarm that continuously loses members until it dissolves completely. (b) Agents trained with dF = 4 form a highly cohesive swarm for the entire trial. The centered inset of this plot shows the first 2500 rounds, with a re-scaled vertical axis to observe the movement of the swarm. Insets on the right zoom in to 20 interaction rounds so as to resolve individual trajectories.


Fig 13.

Evolution of the average number of neighbors throughout the trial of 5000 interaction rounds.

The average is taken over 20 ensembles of 60 agents each, where the simulation is performed independently for each ensemble. Shaded areas indicate one standard deviation.


Fig 14.

Average number of neighbors (in percentage), global and local alignment parameter as a function of the distance dF.

Note that dF is the distance to the point where food is placed during the training. Each point is the average of the corresponding parameter over all interaction rounds (50) of one trial, and over 100 trials. 20 already trained ensembles are considered.


Fig 15.

Percentage of agents that visit the positions situated at a distance from C given on the horizontal axis.

Since C is located at world position 6 (see Fig 4), a distance of e.g. 10 on the horizontal axis refers to the world positions 16 and 496. The already trained ensembles walk for one trial of 50 interaction rounds. For each of the four trainings (see legend), the performance of 20 ensembles is considered.
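
The mapping from a distance on the horizontal axis to world positions can be checked with a short sketch. It assumes a circular world of size W = 500 (the value quoted for Fig 17), which is consistent with the positions 16 and 496 given above; the function name is illustrative.

```python
def positions_at_distance(center, distance, world_size):
    """World positions `distance` steps to either side of `center`,
    assuming a circular (periodic) world of `world_size` positions."""
    forward = (center + distance) % world_size
    backward = (center - distance) % world_size
    return forward, backward


print(positions_at_distance(6, 10, 500))  # (16, 496)
```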


Fig 16.

Swarm velocity 〈ξ〉 as a function of the training distance dF.

Each point is the average over the agents of 20 independently trained ensembles that have performed 50 independent trials each.


Fig 17.

Trajectories of one ensemble of 60 agents that were trained with dF = 21.

The world size is W = 500. Color intensity indicates the number of agents following the same trajectory, i.e. moving within the swarm. Some agents leave the swarm and then rejoin it when the swarm completes the cycle and starts a new turn. Only the first 5000 interaction rounds (of a total of 10^5) are shown.


Fig 18.

Hidden Markov model for the CCRW.

There are two modes, the intensive and the extensive, with probability distributions given by pI and pE (see text for details). The probability of transition from the intensive (extensive) to the extensive (intensive) mode is given by 1 − γII (1 − γEE), where γII and γEE are the probabilities of remaining in the intensive and extensive mode respectively. δ is the probability of starting in the intensive mode.
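
The mode dynamics described in this caption can be sketched as a two-state Markov chain. The snippet samples only the hidden sequence of modes, using δ for the initial mode and γII, γEE for the probabilities of remaining in a mode. Drawing step lengths from pI and pE is omitted because their form is given in the main text, not here. The function and argument names are illustrative.

```python
import random


def sample_modes(n_steps, delta, gamma_ii, gamma_ee, rng=None):
    """Sample a sequence of hidden modes: 'I' (intensive) or 'E' (extensive)."""
    rng = rng or random.Random()
    stay = {"I": gamma_ii, "E": gamma_ee}  # probability of remaining in a mode
    other = {"I": "E", "E": "I"}
    # delta is the probability of starting in the intensive mode.
    mode = "I" if rng.random() < delta else "E"
    modes = [mode]
    for _ in range(n_steps - 1):
        # Switch with probability 1 - gamma of the current mode.
        if rng.random() >= stay[mode]:
            mode = other[mode]
        modes.append(mode)
    return modes


print(sample_modes(10, delta=0.5, gamma_ii=0.9, gamma_ee=0.8, rng=random.Random(0)))
```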


Fig 19.

Mean squared displacement (MSD).

Log-log (base 2) plot of the MSD as a function of the time interval for two types of trajectories: trajectories performed by agents trained with dF = 21 (blue curve, circles) and by agents trained with dF = 4 (orange curve, triangles). We observe that the former present ballistic diffusion, whereas the latter exhibit close-to-normal diffusion. 600 individual trajectories (10 ensembles of 60 agents) are considered for each case.
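
To illustrate how the slope of such a log-log plot separates the two regimes, the sketch below computes a time-averaged MSD for a single trajectory and fits the slope of log2(MSD) against log2(time interval). A slope near 2 indicates ballistic motion and a slope near 1 normal diffusion. The time-averaging scheme, the function names, and the requirement that positions are unwrapped (not folded back by the periodic world) are assumptions; the caption does not state how the MSD was averaged.

```python
import math


def msd(positions, lag):
    """Time-averaged mean squared displacement of one trajectory at one lag."""
    diffs = [(positions[t + lag] - positions[t]) ** 2
             for t in range(len(positions) - lag)]
    return sum(diffs) / len(diffs)


def loglog_slope(positions, lags):
    """Least-squares slope of log2(MSD) versus log2(lag)."""
    xs = [math.log2(lag) for lag in lags]
    ys = [math.log2(msd(positions, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# A constant-velocity trajectory is ballistic: MSD grows as lag squared.
ballistic = list(range(200))
print(loglog_slope(ballistic, [1, 2, 4, 8, 16]))  # approximately 2
```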


Fig 20.

Survival probability as a function of the step length.

The survival probability is the percentage of step lengths larger than the corresponding value on the horizontal axis. Each panel depicts the data from the trajectory of one agent picked from (a) aligned swarms and (b) cohesive swarms, so that this figure represents the most frequently observed trajectory for each type of dynamics. The survival distributions of the four candidate models are also plotted. The distributions for each model are obtained considering the maximum likelihood estimation of the corresponding parameters (see Sec. 4.3 for details). The curve for the CCRW model is obtained by an analytic approximation of the probabilities of each step length, given the maximum likelihood estimation of its parameters. Since the order of the sequence of step lengths is not relevant for this plot, we estimate the probabilities of each step length as (see Eq (9)) with .
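
The empirical survival probability defined at the start of this caption can be computed directly from a list of step lengths. The sketch below follows that definition literally (strictly larger, expressed as a percentage); the function name and the example data are illustrative.

```python
def survival_probability(step_lengths, value):
    """Percentage of step lengths strictly larger than `value`."""
    larger = sum(1 for s in step_lengths if s > value)
    return 100.0 * larger / len(step_lengths)


steps = [1, 1, 2, 3, 5, 8]  # hypothetical step lengths
print(survival_probability(steps, 2))  # 50.0
```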


Table 2.

Average values of the MLE parameters for the different models.


Fig 21.

Violin plots that represent the Akaike weights obtained for each model.

(a) Akaike weights of trajectories of agents trained with dF = 21 (aligned swarms). (b) Akaike weights of trajectories of agents trained with dF = 4 (cohesive swarms). 600 individual trajectories per type of swarm were analyzed for each plot. The ‘•’ symbol represents the median and the vertical lines indicate the range of values in the data sample (e.g. the PL model in panel (a) has extreme values of 0 and 1). Shaded regions form a smoothed histogram of the data (e.g. the majority of Akaike weights of the CCRW model in panel (a) have value 1, and there are no values between 0.2 and 0.8). See text for more details.
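
For reference, Akaike weights are conventionally obtained from the AIC values of the candidate models as w_i = exp(-Δ_i / 2) / Σ_k exp(-Δ_k / 2), where Δ_i is the difference between model i's AIC and the lowest AIC. The sketch below implements that textbook formula; the example AIC values are hypothetical and the caption does not confirm any implementation details.

```python
import math


def akaike_weights(aic_values):
    """Akaike weights from a list of AIC values (one per candidate model)."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical AIC values for two candidate models.
print(akaike_weights([10.0, 12.0]))
```

The weights sum to 1, so a value close to 1 for one model means the remaining candidates carry almost no support for that trajectory.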


Fig 22.

Percentage of trajectories that are best fit by each model according to the BIC criterion.

A model is considered to best fit the data of a given trajectory if it has the lowest BIC value and its difference with respect to each of the other models is larger than 10. 600 individual trajectories per type of swarm were analyzed for each histogram.
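
The selection rule in this caption can be sketched as follows: a model wins only if its BIC is the lowest and beats the runner-up by more than 10, otherwise no model is declared the best fit. The function name, the dictionary layout, the placeholder model label "other", and the BIC values are illustrative assumptions.

```python
def best_model_by_bic(bic_by_model, margin=10.0):
    """Return the model with the lowest BIC if it leads every other model
    by more than `margin`; otherwise return None."""
    ranked = sorted(bic_by_model.items(), key=lambda item: item[1])
    (best_name, best_bic), (_, runner_up_bic) = ranked[0], ranked[1]
    # Beating the runner-up by the margin implies beating all other models.
    return best_name if runner_up_bic - best_bic > margin else None


# Hypothetical BIC values for one trajectory.
print(best_model_by_bic({"PL": 120.0, "CCRW": 100.0, "other": 150.0}))  # CCRW
print(best_model_by_bic({"PL": 105.0, "CCRW": 100.0, "other": 150.0}))  # None
```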
