Identifying influential neighbors in animal flocking

Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.


Author summary
Schooling fish exhibit impressive group-level coordination in which multiple individuals move together in a seamless way. This is possible because each individual in the group responds to the movement of other group members. But how many individuals does each fish pay attention to? Which are the influential neighbors? It is necessary to answer these questions in order to understand how directional information propagates across a group. Our research shows that in the rummy-nose tetra species there is a limited number of influential neighbors which are not necessarily the closest ones.

Introduction
Collective motion phenomena such as swarming, flocking and schooling behavior have been observed in a large variety of animal species ranging from bacteria to humans [1]. Several theoretical models have been proposed to explain how such large scale coordination patterns emerge from "microscopic level" interaction rules among individual animals [2][3][4][5][6][7]. These models have been instrumental in improving our understanding of collective motion in real animal groups by providing an indication of which interaction mechanisms are sufficient to reproduce realistic patterns of collective behavior. In particular, most models agree on the fact that two types of interaction are responsible for maintaining group cohesion to achieve coherent collective motion: attraction and alignment. More recent improvements in remote sensing and video-tracking technologies [8][9][10] have made it possible to automate data collection and to test theoretical models directly against highly resolved empirical movement data in various species. Generally, these studies have confirmed the importance that attraction and alignment behavior play in the formation and maintenance of collective movement patterns [11][12][13][14][15]. However, there is less scientific consensus about how these interaction rules are implemented in the sensory-motor responses of individuals. This lack of agreement underscores the importance of answering the following question: how do individuals mediate interactions with multiple neighbors? [16].
Rather than siding with one or more of the proposed neighborhood definitions, we adopt a fully data-driven approach with minimalist modeling assumptions. The simplest hypothesis consists of assuming that fish copy the actions of their neighbors, but not instantaneously: the fish reaction takes time to process sensory information and to trigger the appropriate behavioral response. Those assumptions impose a temporal constraint given by the sequential occurrence of the perception of the neighbors' actions, and the movement response [21,22]. We thus assume that animals following a particular neighbor in a new direction are subject to a time-delay when copying the heading of influential neighbors.
Considerable work has already appeared on the identification of these time-delays. The delays with which individuals align with each other have in fact been exploited to determine social hierarchies in animal groups, as shown, e.g., for pigeon flocks [23], where the leadership network is constructed with link weights given by the delay for which pairwise angle correlation is maximal. Improvements on how to identify such delays from movement data have proposed the use of time-dependence in pairwise angle correlation [24]. A computational analysis, based on similarities between trajectories (Fréchet distance), has also been proposed and implemented in a visual analytic tool [25]. A different approach has made use of a time-ordering procedure on the pairwise angle correlation to determine temporary leader/follower relations in foraging pairs of echolocating bats [26]. The analysis of the bat trajectories was instrumental in identifying transient leadership and coupling it to sensory biases of the species. However, only pairs of individuals were considered and group influence on individual behavior was not investigated.
Since identifying influential neighbors is key to unraveling the mechanisms of interaction, there is a need in collective behavior studies to establish transient leadership from the dynamics of the individual trajectories. One way to bridge this gap consists of determining which influential individuals have their heading copied most closely by others, how many such influential neighbors exist, and where they are located in the group.
Fish have the ability to choose not only when to copy the heading of another individual, but also the extent to which this heading is copied, that is the similarity and the pace at which fish match the trajectory's curvature of another individual [11,27]. The closer two (or more) fish are to this matching, the more aligned they are (even if with some delay), and the more faithfully they are following the movement path of the transient leader.
Here, we introduce a procedure that allows us to identify the influential neighbors of fish moving in a group, and we test it on a series of experiments in groups of two and five individuals of the freshwater tropical fish Hemigrammus rhodostomus swimming in a ring-shaped tank (see details in Materials and methods). In this set-up, fish swim in a highly synchronized and polarized manner, and can only head in two directions, clockwise or anticlockwise, regularly switching from one to the other. We base our procedure for identifying influential neighbors on time-dependent directional correlations between fish, focussing our analysis on the interactions that occur during these collective U-turns. Indeed, during U-turns, fish have to make a substantial change of direction to reverse their heading, which makes it easier to extract the correlation resulting from the direct interactions between individuals rather than other incidental correlations, e.g., their channeled motion in the ring-shaped tank. Moreover, as correlation does not imply causal influence, we need to control for potential spurious correlations. We do so by constructing a null model of collective U-turns to show that the patterns of interaction observed in the experiments are not due to random processes.

Dynamics of collective U-turns
Hemigrammus rhodostomus performs burst-and-coast swimming behavior that consists of sudden heading changes combined with brief accelerations followed by quasi-passive, straight decelerations [15]. Moreover, fish spend most of their time swimming in a single group along the wall of the tank. Fish regularly change their position within the group [28], so that every individual fish can be found at the front of the group.
A typical collective U-turn event starts with the spontaneous turnaround of a single fish (hereafter called the initiator), mostly located at the front of the group [28]. This sudden change of behavior triggers a collective reaction in which all the other individuals in the group make a U-turn themselves, so that, after a short transient, all individuals adopt the same final direction of motion as the initiator. Overall, we analyzed 1586 U-turns of which 1111 were observed in groups of 2 fish and 475 in groups of 5 fish. Fig 1 shows two examples of collective U-turns in groups of N = 2 (left column, panels ABC) and N = 5 fish (right column, panels DEF; see also supplementary S8 Fig and supplementary S1 and S2 Videos in the Supplementary Information). Fig 1A shows a first fish F 1 (red color) swimming close to the upper-left region of the tank, followed by a second fish F 2 (purple color) at a distance d 12 ≈ 8.5 cm, swimming in the same direction. Right before the U-turn starts (Fig 1A), fish F 1 reduces its speed (circles become closer to each other), the distance d 12 decreases (to ≈ 5.1 cm), and F 2 also reduces its speed. Then, both fish perform a change of direction which lasts about 1 second and during which fish F 2 clearly follows fish F 1 (see the corresponding circles at each instant of time in Fig 1B). Once the U-turn is completed (Fig 1C), F 1 accelerates again, and so does F 2 , which also adopts the direction of motion of F 1 . The distance d 12 increases again (≈ 9.5 cm), due to the larger velocities, and remains of the same order along the depicted trajectory.
The situation is less clear when we try to describe collective U-turns in larger groups. Fig 1D, 1E and 1F show a collective U-turn for the case where N = 5. Before the U-turn, fish F 2 (orange) seems to be the fish that the rest of the group follows, the first circle of its trajectory being the most advanced one in the direction of motion. In fact, a position order can be inferred from Fig 1D: F 2 , F 3 , F 5 , F 1 and F 4 . However, it is rather complicated to extract from Panel E precise information about which fish is the initiator of the U-turn, in which order the other fish follow, and therefore, who is influencing whom, especially if time-delays and reaction times are taken into account. The same difficulty applies to the information about the positions of the fish after the U-turn, provided by Panel F.
In order to describe rigorously the individual behavior of the N fish during a U-turn, we introduce the angle ϕ i (t) as an instantaneous measure of the direction of motion of a fish F i ; see Fig 2. We assume that the instantaneous heading of a fish F i can be defined in terms of the velocity vector ṽ i (t), so that ṽ i = (cos ϕ i , sin ϕ i ) ‖ṽ i ‖. The heading of a fish ϕ i allows us to characterize the angle of incidence of the fish relative to the wall, θ wi = ϕ i − ψ i , where ψ i is the angle formed by the position vector of the fish with the horizontal line (see Fig 2). The angle of incidence θ wi is an individual measure that does not depend on the heading of another fish. When a fish F i is swimming along the wall, the value of θ wi is around ±90˚ (we choose, by convention, the positive sign for the anticlockwise angle). In our experiments, most of the time the absolute value of the angle of incidence is close to 90˚; equivalently, |sin(θ wi (t))| ≈ 1. When the motion is perpendicular to the wall, the incidence is zero if the fish points towards the wall (θ wi = 0˚), and maximal if the fish points towards the center of the tank (θ wi = 180˚); in both cases, sin(θ wi (t)) = 0.
The change of sign of angle θ wi can serve as an indicator that a U-turn has taken place. In fact, this allows us to delimit the individual U-turns with precision and, consequently, to determine the start and the end of a collective U-turn.
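As a minimal sketch of how the angle of incidence and its sign changes could be computed from tracked data, the following Python snippet assumes that positions are expressed relative to the center of the tank and that angles are in radians. The function names (`wrap_angle`, `incidence_angle`, `sign_changes`) are illustrative and do not come from the original analysis.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def incidence_angle(x, y, vx, vy):
    """Angle of incidence theta_w = phi - psi, in radians.

    (x, y): position relative to the tank center (assumed origin).
    (vx, vy): velocity; only its direction is used.
    """
    phi = math.atan2(vy, vx)  # heading of the fish
    psi = math.atan2(y, x)    # angular position of the fish
    return wrap_angle(phi - psi)

def sign_changes(theta_series):
    """Frames k at which theta_w has a different sign than at frame k - 1."""
    return [k for k in range(1, len(theta_series))
            if theta_series[k - 1] * theta_series[k] < 0]

# A fish at the right-hand side of the tank swimming anticlockwise.
print(math.sin(incidence_angle(1.0, 0.0, 0.0, 1.0)))
```

With this convention, a fish swimming anticlockwise along the wall gives sin(θ w) close to +1, and a sign change in the series flags a candidate individual U-turn.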
We define the start and end times t s,i and t e,i of the individual U-turn of fish F i in terms of the absolute value of the angle of incidence, |θ wi (t)|. Once a U-turn has been detected, we obtain the time t s,i at which |θ wi (t)| has decreased (from approximately 90˚) below a given threshold θ̄ s , and the time t e,i at which |θ wi (t)| has increased again and is above another given threshold θ̄ e (see Materials and methods for more details). Thus, the start of a collective U-turn is determined by the time t s at which the first individual U-turn starts, while the end of a collective U-turn is given by the time t e at which the last individual U-turn finishes. That is:

$$t_s = \min_{i=1,\dots,N} t_{s,i}, \qquad t_e = \max_{i=1,\dots,N} t_{e,i}.$$

For each collective U-turn, we have made a convenient time shift so that t s = 0. Then, t e denotes not only the end time but also the duration of the collective U-turn.

Fig 2 caption: Angle ψ j denotes the angular position of fish F j with respect to the horizontal (positive values fixed in the anticlockwise direction); angle ϕ i is the heading of fish F i ; θ wi is the angle of incidence of fish F i with respect to the outer wall; d ij is the distance between F i and F j ; θ ij is the viewing angle of F i with respect to F j (not necessarily equal to θ ji ), and ϕ ij = ϕ j − ϕ i is the heading difference of F i with respect to F j . https://doi.org/10.1371/journal.pcbi.1005822.g002

We also introduce an instantaneous measure of how similar the directions of motion of individual fish are across the group. We define the instantaneous group polarization P(t) as the following function of normalized fish velocity vectors:

$$P(t) = \frac{1}{N} \left\| \sum_{i=1}^{N} \tilde{e}_i(t) \right\|,$$

where ẽ i = ṽ i / ‖ṽ i ‖. When all the fish have the same direction, the polarization is maximal and P(t) = 1. The minimum value P(t) = 0 is reached instead when the velocity vectors cancel. Figs 3 and 4 depict the two U-turns introduced in Fig 1, in terms of the polarization P(t) and the sine of the angle of incidence of each fish with respect to the outer wall θ wi (t).
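The two group-level quantities described here can be sketched in a few lines of Python. The polarization below assumes the common definition as the norm of the mean unit heading vector, which matches the stated limits (1 for identical directions, 0 when the vectors cancel). The function names and the example start and end times are illustrative only.

```python
import math

def polarization(headings):
    """Norm of the mean unit heading vector; headings are in radians."""
    n = len(headings)
    sx = sum(math.cos(p) for p in headings)
    sy = sum(math.sin(p) for p in headings)
    return math.hypot(sx, sy) / n

def collective_bounds(individual_bounds):
    """Collective start = earliest individual start; end = latest individual end.

    individual_bounds: list of (t_start, t_end) pairs, one per fish.
    """
    t_s = min(start for start, _ in individual_bounds)
    t_e = max(end for _, end in individual_bounds)
    return t_s, t_e

# Hypothetical individual U-turn times (seconds) for three fish.
print(collective_bounds([(0.2, 0.9), (0.0, 0.7), (0.4, 1.5)]))
print(polarization([0.0, math.pi / 2]))
```

Two fish with perpendicular headings give an intermediate polarization, which illustrates why P(t) dips while the group is partway through a U-turn.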
The duration of the two illustrated collective U-turns is t e = 0.94 s for N = 2 and t e = 1.5 s for N = 5.
For both group sizes, the group polarization (Figs 3B and 4B) before and after the U-turn is quite close to 1, showing that before and after the collective U-turn, all individual fish maintain essentially the same common direction. During the U-turn, the polarization decreases, describing a sharp V-form with a minimum at P(t) ≈ 0.27 for N = 2 and P(t) ≈ 0.60 for N = 5. The minimum is reached at approximately half the duration of the collective U-turn, t m = (t s + t e )/2: t m = 0.47 s for N = 2 and t m = 0.75 s for N = 5.
Figs 3C and 4C show the change of direction individually for each fish in both U-turns: from anticlockwise to clockwise direction for N = 2, and vice versa for N = 5. Fig 3C clearly indicates that at t ≈ 0.3 s, the fish F 1 has almost completed its individual U-turn, while F 2 has just started to change direction: sin(θ w2 (0.3)) ≈ 0.98, while sin(θ w1 (0.3)) ≈ −0.5.
In Fig 4C, a similar ordering can be inferred from the times of departure from the bottom line at ordinate sin(θ wi ) = −1 + δ, where δ > 0 is a small parameter with respect to the range of ordinate values; we used δ = 0.1. Thus, the order is 2-3-1-5-4. However, the order in which individual fish change the sign of their angle of incidence θ wi is different, 2-1-3-5-4, and also different is the arrival order to the top line at ordinate sin(θ wi ) = 1 − δ: 2-5-1-4-3. Moreover, some of these departure and arrival times are almost identical (see, e.g., F 1 and F 4 ), and the behavior of the fish during the U-turn is completely different. These difficulties in establishing a consistent order show that another criterion is necessary to identify the relation of influence between fish.
We have based our criterion to decide if a fish is an influential neighbor of another fish on the average value of the time-dependent directional correlation between the two fish along a time window.
For each pair of fish F i and F j , we define the directional correlation H ij as a function of the heading of F i evaluated at time t and the heading of F j evaluated at a delayed time t − τ, where τ is the time-delay:

$$H_{ij}(t, \tau) = \tilde{e}_i(t) \cdot \tilde{e}_j(t - \tau).$$

The function H ij (t, τ) is in fact the cosine of the angle formed by the headings ẽ i (t) and ẽ j (t − τ), and is a measure of how aligned fish F i at time t is with fish F j at time t − τ. The values of H ij (t, τ) are between −1 (when fish swim in opposite directions) and 1 (when fish have the same direction), and equal zero when fish have perpendicular directions.
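Since H ij (t, τ) is the cosine of the angle between the two headings, it can be sketched directly from two heading time series sampled at the same frame rate. In the snippet below the delay is expressed in frames rather than seconds, and the function name `directional_correlation` is a hypothetical choice.

```python
import math

def directional_correlation(phi_i, phi_j, k, lag):
    """Cosine of the angle between F_i's heading at frame k
    and F_j's heading at frame k - lag (headings in radians)."""
    if lag < 0 or k - lag < 0:
        raise ValueError("delayed frame falls before the start of the recording")
    return math.cos(phi_i[k] - phi_j[k - lag])

# Toy example: F_i reproduces the heading of F_j three frames later.
phi_j = [0.1 * k for k in range(10)]
phi_i = [0.0] * 3 + phi_j[:7]
print(directional_correlation(phi_i, phi_j, k=8, lag=3))
```

In this toy case the correlation at the matching delay equals 1, while other delays give smaller values because the headings differ.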
By averaging H ij (t, τ) along a time-window of length (2w + 1)Δt, we are able to quantify how much the focal fish F i is copying the moving direction of its neighbor with a time-delay τ by means of the following function [26]:

$$C_{ij}(t, \tau, w) = \frac{1}{2w + 1} \sum_{k=-w}^{w} H_{ij}(t + t_k, \tau),$$

where t k = kΔt (the time-step in our experiments is Δt = 0.02 s). The time-window parameter w has been determined by means of a sensitivity analysis (pairwise similarity matrix), finding that w = 2 yields the most satisfactory results; see Section "Parameter selection" in Materials and methods and S5 Fig. The average directional correlation C ij (t, τ, w) allows us to characterize a fish F j as an influential neighbor of a focal fish F i at time t with time-delay τ, if the value of C ij (t, τ, w) is larger than a given threshold C min . Details on how w and C min are obtained are given in Sections "Optimal setting parameters for influential neighbors identification" and "Parameter selection" in Materials and methods.

Fig 5 shows the directional correlation H 12 and its time-average C 12 between fish F 1 and F 2 along the collective U-turn depicted in Fig 3. Left (resp. right) panels indicate the alignment of fish F 1 (resp. F 2 ) at each time t with respect to the alignment of fish F 2 (resp. F 1 ) at an earlier time t − τ. Panels A and C show respectively that for all τ, there is always an interval of time during which H 12 (t, τ) ≈ −1 and C 12 (t, τ) ≈ −1 (dark region), meaning that for all time-delays there is always an interval of time in which fish have opposite directions. Moreover, the larger the time-delay, the wider the black region where the direction of F 1 is opposite to the direction of F 2 at the previous time.
On the other hand, the figures of the directional correlation of F 2 with F 1 , especially Panel D, show a connected region in which the correlation C 21 (t, τ) remains positive and above the threshold (yellow in the figure) around τ ≈ 0.42 s where H 21 ≈ 1 throughout the time interval [−0.5 s, 2 s]. This strongly suggests that, during this time interval, F 2 is copying the behavior of F 1 with a 0.42 s time-delay, denoted τ 2,1 for this specific U-turn. Thus, one can consider that F 1 is influencing F 2 with time-delay τ 2,1 , while F 2 is not influencing F 1 in this specific case. This influence dynamics is illustrated in Fig 3D by drawing an arrow at time t from F j to F i when F j satisfies the condition C ij (t, τ, w) > C min for being an influential neighbor of F i at time t, which in turn receives this influence and responds by copying the exhibited heading with a time-delay τ.
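The detection rule C ij (t, τ, w) > C min can be sketched as a scan over candidate delays. The snippet below assumes a window centered on the current frame, uses w = 2 and Δt = 0.02 s as quoted in the text, and takes a hypothetical threshold `c_min = 0.95` purely for illustration; the value actually used is described in Materials and methods. The smallest lag tested (2 frames) corresponds to the minimum reaction time of 0.04 s mentioned in the text.

```python
import math

DT = 0.02  # time-step between frames in seconds (value quoted in the text)

def avg_correlation(phi_i, phi_j, k, lag, w=2):
    """Mean of cos(phi_i[m] - phi_j[m - lag]) over frames m = k-w .. k+w.

    Returns None when the window, shifted by `lag`, leaves the recording.
    """
    if lag < 0 or k - w - lag < 0 or k + w >= min(len(phi_i), len(phi_j)):
        return None
    total = sum(math.cos(phi_i[m] - phi_j[m - lag])
                for m in range(k - w, k + w + 1))
    return total / (2 * w + 1)

def influential_lags(phi_i, phi_j, k, lags, c_min, w=2):
    """Lags (in frames) at which F_j would count as influential for F_i at frame k."""
    result = []
    for lag in lags:
        c = avg_correlation(phi_i, phi_j, k, lag, w)
        if c is not None and c > c_min:
            result.append(lag)
    return result

# Toy data: F_j turns at 0.2 rad per frame; F_i repeats the same turn 5 frames later.
phi_j = [min(max(0.2 * (k - 10), 0.0), math.pi) for k in range(40)]
phi_i = [phi_j[max(k - 5, 0)] for k in range(40)]
lags = influential_lags(phi_i, phi_j, k=20, lags=range(2, 8), c_min=0.95)
print([round(lag * DT, 2) for lag in lags])
```

The scan returns a small band of delays around the true one, which is consistent with the connected high-correlation region described for Panel D.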
Using the same procedure for the N = 5 case depicted in Fig 4, we obtain Fig 6, which shows F 1 copying F 2 with a time-delay τ 1,2 ≈ 0.5 s (Panels A and E). F 1 also copies F 3 and F 5 with, respectively, τ 1,3 ≈ 0.2 s (Panels B and F) and τ 1,5 ≈ 0.1 s (Panels D and H), but it doesn't copy

Fig 3 caption (fragment): the end of its U-turn, with the middle representing the time when a fish has finally reversed its original direction. (D) Interaction with influential neighbors: arrows point from influential neighbors to the focal fish and with the same color as the focal fish. (E) Fish bursting activity and their influential neighbors. Dots at i = 1, 2 correspond to bursting activity, blank corresponds to coasting. Dots at i − 0.5 represent bursting activity of the neighbor influencing fish i. https://doi.org/10.1371/journal.pcbi.1005822.g003

Effect of bursting on the transmission of information
The specific behavior of H. rhodostomus, namely, the successive alternation of bursts and coasts [15], leads us to ask whether these abrupt changes of acceleration and speed can provide information that other fish could use to adjust their own movement. To address this aspect we study whether there is any correlation between the bursting activity of one fish at time t and the fact that this fish is an influential neighbor of another fish shortly after time t.
A burst corresponds to a brief phase of acceleration during which most changes in fish heading occur [15]. Panels E in Figs 3 and 4 show the bursting activity of each fish F i , i = 1, . . ., N, and that of its influential neighbors. For each fish F i , we draw a dot at time t and ordinate i if fish F i is displaying a burst precisely at time t. Dot color at ordinate i corresponds to fish F i 's color. The absence of a dot at a given time denotes that the fish is in a coasting phase at that time.
A second row of colored dots is drawn at ordinate i − 0.5 for some values of t when two conditions are met: (1) Fish F i is being influenced at those times by one or more fish F j , j ∈ {1, . . ., N}, j ≠ i, whose identity is given by the color of the dots, and (2) the influential fish F j was bursting when it was influencing F i at time t − τ earlier. If F i has more than one influential neighbor at time t, the dot drawn at time t in row i − 0.5 has the color of the F j fish with the highest index j.

Fig 4 caption (fragment): fish and with the same color as the focal fish. (E) Fish bursting activity and their influential neighbors. If there is more than one influential neighbor, F j with largest index value j is shown. Grey lines in Panels BCDE denote the start and end of the collective U-turn. https://doi.org/10.1371/journal.pcbi.1005822.g004
In Fig 3E, red dots at i = 1 mean that fish F 1 is bursting at those time-steps and coasting at the other time-steps, and red dots at i − 0.5 = 1.5 indicate that, first, F 1 is the influential fish of F 2 at those time-steps, and second, F 1 was bursting when it was earlier influencing F 2 . In turn, there are two possible reasons to explain the absence of red dots at i − 0.5 = 1.5 for certain time values: either F 2 has no influential neighbor, or F 1 was coasting. To assess which of the two explanations is valid, one needs to look at Fig 3D. For example, the absence of dots at i − 0.5 = 1.5 between 0.57 s and 0.62 s is due to F 2 having no influential neighbors, while the absence of dots in the same row between 0.75 s and 0.81 s results from the fact that F 1 , which is the influential neighbor of F 2 , is in a coasting phase at time t − τ (in this example the delay was found to be τ = 0.42 s). Fig 3E shows that the bursting activities of both the focal fish and its influential neighbor are not directly correlated, suggesting that the primary source of information for fish to adjust their movements is the distance, orientation and angular position of their neighbors [15]. The same conclusion is obtained for N = 5. By focusing on fish F 2 for example, Fig 4E shows that there is no systematic overlap between the yellow dots at i = 2 and those at i − 0.5 for i ≠ 2, suggesting that the correlation between the bursting activity of a fish and that of its influential neighbors is marginal.

Number of influential neighbors
For all U-turns, we have counted the number of frames in which a fish is an influential neighbor, that is, the number of frames where the above-described condition for identifying influential neighbors is met. When there are only two fish, a fish is found to be the influential neighbor 30% of the time spent in a U-turn. In groups of five fish, this proportion rises to 62%.
We have counted the number of influential neighbors N if a fish F i has during a U-turn in groups of five fish, finding that in most cases, a fish has only one or two influential neighbors (for 58% of the time spent in a U-turn N if = 1 or 2); see Fig 7A. The most frequent case is N if = 1 (43%). Having more than one influential neighbor is frequent (19%), but less so than having no influential neighbors (38%). The cases where there are more than two influential neighbors are negligible (less than 4% of the total time spent in U-turns).
For each fish F i , we have calculated the respective distance d ij (t) at which the other N − 1 fish F j are from F i during the U-turns, thus establishing a rank order among the neighbors influenced by F i . We have then compared the influence of close neighbors with that of distant neighbors, finding no correlation between the distance rank of a neighbor and the influence it exerts on the focal fish. This is shown in Fig 7B, where we have depicted the distribution of the distance rank of influential neighbors with respect to a focal fish. The figure shows that fish spent the same proportion of time (≈ 25%) being an influential neighbor of a focal fish independently of their distance rank. In other words, influential neighbors are not necessarily the closest ones.
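A minimal sketch of the distance-rank bookkeeping is given below. It assumes that positions are available as (x, y) pairs for each frame and that the influential neighbors of the focal fish have already been identified; the function names and the tie-breaking rule (lower fish index first) are assumptions made for illustration.

```python
import math
from collections import Counter

def distance_ranks(positions, i):
    """Rank (1 = closest) of every other fish by distance to focal fish i."""
    xi, yi = positions[i]
    others = sorted((math.hypot(x - xi, y - yi), j)
                    for j, (x, y) in enumerate(positions) if j != i)
    return {j: rank for rank, (_, j) in enumerate(others, start=1)}

def rank_counts(frames):
    """Count how often each distance rank is occupied by an influential neighbor.

    frames: iterable of (positions, focal index, list of influential indices).
    """
    counts = Counter()
    for positions, i, influential in frames:
        ranks = distance_ranks(positions, i)
        counts.update(ranks[j] for j in influential)
    return counts

# Hypothetical positions (mm) of five fish in one frame; fish 0 is the focal fish.
positions = [(0, 0), (10, 0), (0, 30), (-20, 0), (50, 50)]
print(distance_ranks(positions, 0))
print(rank_counts([(positions, 0, [2]), (positions, 0, [1, 4])]))
```

Normalizing such counts over all frames would give a distribution of the kind plotted in Fig 7B; a flat distribution corresponds to the absence of correlation between distance rank and influence.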
When trying to identify events of causal influence by means of correlations, it is crucial to keep in mind that correlation does not imply causation. We thus have controlled the effects of potential chains of influence, where e.g. fish F 1 is highly correlated with F 3 not because F 1 is directly influencing F 3 , but because F 1 is influencing fish F 2 , which in turn is influencing F 3 .
To check the impact of these chains of influence on our results, we have removed from our data all the pairwise influence data that correspond to the following situation: if F 1 is influenced by both F 2 and F 3 and F 2 is simultaneously influenced by F 3 (or F 3 is influenced by F 2 ), then we removed the pairwise correlation (focal fish, influential neighbor) corresponding to (F 1 , F 2 ) (or (F 1 , F 3 )). After removing 7172 out of 69703 data points and recomputing the results with the remaining data, we found that our results remain practically unchanged.
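The filtering rule can be sketched as a set operation on the influence pairs detected at a single frame. The snippet follows the rule literally as stated in the text: when a focal fish has two influential neighbors b and c, and b is itself influenced by c at the same time, the pair (focal, b) is dropped. The data layout (a set of (focal, neighbor) tuples) and the function name are assumptions.

```python
def remove_chained_pairs(pairs):
    """Drop pairs that may reflect a chain of influence, following the stated rule.

    pairs: set of (focal, neighbor) influence pairs detected at one frame.
    """
    influencers = {}
    for focal, neighbor in pairs:
        influencers.setdefault(focal, set()).add(neighbor)

    to_drop = set()
    for focal, neighbors in influencers.items():
        for b in neighbors:
            for c in neighbors:
                # b and c both influence `focal`, and b is itself influenced by c
                if b != c and (b, c) in pairs:
                    to_drop.add((focal, b))
    return pairs - to_drop

# F1 is influenced by F2 and F3, and F2 is simultaneously influenced by F3.
print(sorted(remove_chained_pairs({(1, 2), (1, 3), (2, 3)})))
```

Comparing the statistics computed before and after this filter is one way to check that the results do not depend on such chains.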
We have also calculated the position rank that each fish occupies in the group during a collective U-turn, finding that influential neighbors are mostly located in the front region of the group: 32% in the leading most advanced position, and 20% in the second place; see Fig 7C. Noticeably, influential neighbors can be found in the back of the group (in 29% of the cases in the fourth or fifth position), and even in the last position (a non-negligible 13% of cases).
We also paid attention to the order in which each fish starts its individual U-turn during a collective U-turn, finding that influential neighbors are those that most frequently turn earlier (32% of the cases), and that this relation decreases linearly; see Fig 7D. It is again noticeable that influential fish can be found to be the last turning fish (in 8% of the cases).
The apparently surprising fact that influential fish can be found in the back of the group and that the last fish turning can be an influential fish is due to the anisotropic perception of fish and their relative orientations during U-turns. But these findings have to be understood in the light of our specific time-dependent characterization of influential neighbor. If, for instance, F 1 turns first and influences F 2 , F 2 will turn with some time-delay after F 1 . Then, when F 2 is at half of its individual turning process, F 2 can be rotating in the same direction as F 1 in such a way that F 1 , influenced by F 2 , slightly adjusts its direction. We would then say that F 2 , which is the last turning fish, has influenced F 1 , the first turning fish.
In order to compare different collective U-turns, we define a normalized time t̄ = (t − t s )/(t e − t s ) in terms of the actual time t and the starting and ending times of each U-turn, so that the duration of a U-turn is now t̄ = 1. Thus, t̄ = −1 corresponds to a time as long as the U-turn duration previous to the start of the U-turn, and t̄ = 2 corresponds to a time as long as the U-turn duration after the end of the U-turn. We have calculated the instantaneous value of the average speed 𝒱(t) = ⟨‖ṽ(t)‖⟩, the average group polarization 𝒫(t) = ⟨P(t)⟩ and the average number of influential neighbors 𝒩(t) = ⟨N if (t)⟩. Here, angle brackets refer to the average across all fish in the U-turn along a time-window containing the collective U-turn.

valid for the general case: the speed decreases before the U-turn (from 𝒱(−1) ≈ 150 mm/s to 𝒱(0) ≈ 115 mm/s), it reaches a minimum at half the U-turn duration t̄ = 0.5 (𝒱(0.5) ≈ 70 mm/s), and it then grows to a higher value after the U-turn (𝒱(1.5) ≈ 165 mm/s). A very similar behavior was found in groups of 2, 4, 8 and 10 fish of the same species in [28]. At the same time, the polarization is very high and almost constant outside the U-turn (𝒫(t̄) ≈ 0.95), and exhibits a perfect V-shape during the U-turn, with the high values (𝒫(t̄ = {0, 1}) ≈ 0.93) reached at exactly the instants where the start and end of the U-turn take place, t̄ = 0 and t̄ = 1, and the minimum value (𝒫(0.5) ≈ 0.48) at the middle of the U-turn. As expected, the average group polarization 𝒫(t̄) significantly decreases during the U-turn to almost half the value it has outside the U-turn. Right after reaching this minimum, there is a sharp increase of speed and polarization as more fish adopt the new direction of motion. Fig 8C shows that before the U-turn the average number of influential neighbors 𝒩(t) increases until a maximum value is reached right before the start of the U-turn (𝒩(−0.1) ≈ 1.45).
During more than one half of the U-turn, 𝒩(t) decreases until a minimum (𝒩(0.6) ≈ 0.8), and grows again beyond the end of the U-turn until a second maximum (𝒩(1.2) ≈ 1.6, twice the height of the minimum). After that, all fish have completed their U-turns and 𝒩(t) decreases again.
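The normalized time used to pool U-turns of different durations, t̄ = (t − t s )/(t e − t s ), can be sketched together with a simple binned average. The bin range (−1 to 2) follows the interval discussed in the text, while the number of bins and the function names are hypothetical choices for illustration.

```python
def normalized_time(t, t_s, t_e):
    """Map real time t to normalized time: 0 at the start, 1 at the end of the U-turn."""
    if t_e <= t_s:
        raise ValueError("the U-turn must have a positive duration")
    return (t - t_s) / (t_e - t_s)

def binned_average(samples, t_min=-1.0, t_max=2.0, n_bins=30):
    """Average values in equal-width bins of normalized time.

    samples: iterable of (normalized time, value) pooled over U-turns.
    Returns one mean per bin, or None for bins without samples.
    """
    width = (t_max - t_min) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t_bar, value in samples:
        if t_min <= t_bar < t_max:
            b = min(int((t_bar - t_min) / width), n_bins - 1)
            sums[b] += value
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Midpoint of the N = 5 example U-turn, which lasts 1.5 s and starts at t = 0.
print(normalized_time(0.75, 0.0, 1.5))
```

Pooling speed, polarization or the number of influential neighbors in this way would yield averaged curves comparable to those in Fig 8.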
When the polarization is very high, the time-delay with which influential neighbors are detected is often too small in comparison with biologically realistic reaction times τ R , so that these influential neighbors are not taken into account (we used τ R = 0.04 s; see Section "Optimal setting parameters for influential neighbors identification" in Materials and methods). This is the reason why the average number of influential neighbors 𝒩(t) appears to be smaller in regions outside the U-turn than when the U-turn is just about to start (t̄ ≈ −0.1) or slightly after its end (t̄ ≈ 1.2). Meanwhile, the decrease of 𝒩(t) in the middle of the U-turn has a different origin: once a fish has started to turn around, there is no real need of updating its alignment according to all its neighbors. That fish can safely reverse its motion by keeping the alignment with only one of those neighbors and even not paying attention to them for some period of time.
Another indicator of how fish make decisions while turning is how frequently a focal fish pays attention to other individuals. We define the relative variation η(t) of the number of influential neighbors per fish N if (t) between two successive time-steps, with Δt denoting the time-step between frames (Δt = 0.02 s).
We have depicted the time-evolution of the average ⟨η(t)⟩ in Fig 8D, finding that ⟨η(t)⟩ remains essentially constant before, during and after the U-turn event, the amplitude of its variation being smaller than 10% of the signal (0.007 and 0.08, respectively).
Since the average number of influential neighbors 𝒩(t) is smaller when fish are engaged in the U-turn than right before or right after the U-turn, a constant average ⟨η(t)⟩ suggests that fish adjust their heading more frequently during the U-turn than outside the U-turn. Indeed, in the middle of a U-turn, no real common direction of motion exists (𝒫(t) ≈ 0.5), that is, there is a high diversity of headings, so that fish have to frequently update their direction by paying attention to different neighbors.

Spatial organization of influential neighbors
We are now interested in determining the dynamical spatial organization of the influential neighbors of a focal fish. The relative state of a fish F j with respect to a focal fish F i is characterized by several parameters: the relative position of the neighbor ũ ij = ũ j − ũ i , where ũ i is the position vector of F i in Cartesian coordinates, the distance between them d ij = ‖ũ ij ‖, the viewing angle of F j relative to the direction of F i [26], which is the angle θ ij with which F i perceives F j (note that θ ij is not necessarily equal to θ ji ), the relative velocity ṽ ij = ṽ j − ṽ i , and the relative heading ϕ ij = ϕ j − ϕ i . All these quantities are time-dependent. We have calculated their average value for all the U-turns in a uniform spatial grid of square cells to facilitate the interpretation of the vector field of these continuous variables. Each square cell, of side 20 mm, shows the average of the values falling in that cell, whose number varies from cell to cell. Fig 9A shows the density map of the relative position of the influential neighbor with respect to the focal fish when N = 2. The intensity of color is proportional to the frequency of occupation of the grid cell, showing that the influential neighbor is mostly located in front of the focal fish and at a distance of one to three body lengths from the focal fish. The same information is quantified in Panel B with a heat map in polar coordinates, highlighting the most frequent location of the influential neighbor.
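The relative-state quantities listed above can be sketched as follows. The snippet assumes that the viewing angle is measured from the heading of the focal fish to the line joining it to the neighbor, positive anticlockwise, and that the grid is expressed in the frame of the focal fish with the first axis along its heading. The 20 mm cell size comes from the text; the function names are illustrative.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def relative_state(pos_i, phi_i, pos_j, phi_j):
    """Distance d_ij, viewing angle theta_ij and heading difference phi_ij."""
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    d_ij = math.hypot(dx, dy)
    theta_ij = wrap_angle(math.atan2(dy, dx) - phi_i)
    phi_ij = wrap_angle(phi_j - phi_i)
    return d_ij, theta_ij, phi_ij

def grid_cell(pos_i, phi_i, pos_j, cell=20.0):
    """Indices of the square cell containing F_j in F_i's frame of reference.

    First index: along the heading of F_i; second index: to its left.
    """
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    ahead = dx * math.cos(phi_i) + dy * math.sin(phi_i)
    left = -dx * math.sin(phi_i) + dy * math.cos(phi_i)
    return math.floor(ahead / cell), math.floor(left / cell)

# Focal fish heading "north"; neighbor 50 mm ahead and 10 mm to the right.
print(relative_state((0, 0), math.pi / 2, (10, 50), math.pi / 2))
print(grid_cell((0, 0), math.pi / 2, (10, 50)))
```

Accumulating counts or averages per cell index over all frames would produce density maps and averaged vector fields of the kind shown in Fig 9A.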
The average relative velocity hṽ ij i is shown in Fig 9A (arrows), superimposed to the density map. The vector field shows that when the influential neighbor is in front of or behind the focal fish (sinhθ ij i % 0), both fish move at similar speed although the focal fish is a little bit faster (the small black arrows are pointing in the opposite direction to the red one) and the difference in heading is also small. However, when the influential neighbor is on the sides of the focal fish, relative speed and heading difference tend to vary more as the distance between them increases.
The distributions of distances d ij and exposure angles θ ij between a focal fish and its neighbors are depicted in Panels C and D of Fig 9 respectively. We find, on the one hand, that their most frequent separation is 62.6 mm ± 29.7 mm (mean and standard deviation of histogram in Fig 9C), a value that is consistent with previous results where it was shown that the behavioral reactions of a fish depend on the angular position of its neighbors, as a consequence of the anisotropic perception of the environment [15].
On the other hand, the distribution of the exposure angle of fish F j to the focal fish F i is narrower when F j is influencing F i than when F j is a neighbor of F i , not necessarily influencing F i . As both distributions are centered on θ ij = 0, this shows that F j is more frequently located in front of F i when F j is an influential neighbor of F i than in the case when F j is just a neighbor of F i . Fig 10 shows similar results for groups of N = 5 fish. Influential neighbors are more frequently located in front of the focal fish (although with a slight shift to the right; see Panels A and B) and at a mean distance of 67.5 mm ± 40.6 mm (Panel C).
In turn, the velocity field has a smaller intensity and is much more homogeneous than in the case where N = 2. A slight asymmetry can also be observed (not noticed when N = 2) with fish located in front and slightly to the right of the focal fish having a higher velocity than those located elsewhere. Moreover, the distribution of exposure angles is more dispersed than in the case of two fish, meaning that influential neighbors are exposed to the focal fish with a larger diversity of angles, something that is simply due to the higher number of fish.
The difference in the homogeneity of the velocity field between groups of 5 and 2 individuals is not necessarily the result of averaging over a larger number of individuals. Although averaging over fish data pairs may reduce the uncertainty in the extracted parameter values, it is well-known that the level of homogeneity in the direction of motion of the school increases with group size [29]. But one also ought to consider that specific values of delay and curvature the individuals adopt during the U-turns could help to limit variability in coordinating the group. Some theoretical studies support this idea: simplified models of velocity alignment with additive noise have shown semi-analytically the existence of delay and rate of turn values that minimise the fluctuations in the variance of the individual speed [30], and flocking models of self-propelled particles have also shown that delay can be tuned to increase stability and alignment of the group [31].
Finally, we have analyzed the variation of the time-delay τ as a function of both the distance between the focal fish and its influential neighbors d ij and the difference of heading ϕ ij , finding that in both cases N = 2 and N = 5, the time-delay increases with respect to both the distance d ij and the heading difference ϕ ij (see Fig 11). This result can be understood because during a U-turn the fish speed is decreasing and two fish can display larger reaction times the more separated they are and the less aligned they are.

A null model to detect spurious correlations
As already mentioned in the introduction, establishing causal influence on the basis of correlation measures requires controlling for spurious effects. Although our experimental data correspond to a specific collective behavior in which individuals influence each other, the relatively short time-windows over which cross-correlations are averaged and the use of several parameters through sensitivity analysis can weaken the accuracy of our results. To demonstrate that the particular detections of influential neighbors are not purely due to chance, we generated random artificial U-turn events by bootstrapping the data and applying the same procedure used to analyze collective U-turns in our experiments.
The null model is built for groups of 5 fish, for which our experimental data provide M = 2375 individual trajectories (5 × 475 collective U-turns). For every fish F i , i = 1, . . ., M, the trajectory is rotated so that the individual turning point of the fish (where sin(θ wi ) = 0) is located in the upper part of the tank, by randomly sampling the new angular position ψ i in the interval [π/2 − ξ, π/2 + ξ], where ξ is a small angle (we used ξ = π/12). Similarly, the time scale of each fish is shifted by sampling the instant of turning in the time interval [−z, z], where z is a short time (we have used z = 1 s). Then, five trajectories are randomly sampled, each one from a different randomly sampled collective U-turn, and mirrored if necessary so that the five individual U-turns are done in the same direction, clockwise or anti-clockwise. This way, the five fish of the artificial U-turn make their individual U-turn approximately at the same place and approximately the same time. For more details, see the section "Null model" in Materials and methods.
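The construction just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: the trajectory layout (dictionaries with keys `t`, `rho`, `psi`, `turn_idx`), the function names, the uniform sampling of the time shift, and the mirroring rule ψ → π − ψ are all assumptions made for the example; angles are assumed to be unwrapped and `turn_idx` is assumed to be greater than zero.

```python
import numpy as np

def rotate_to_top(psi, turn_idx, rng, xi_max=np.pi / 12):
    """Rotate angular positions so that the turning point lands near pi/2."""
    xi = rng.uniform(-xi_max, xi_max)
    return psi - psi[turn_idx] + np.pi / 2 + xi

def shifted_times(t, turn_idx, rng, z=1.0):
    """Shift the time axis so that the turning instant falls in [-z, z].
    Uniform sampling is an assumption; the text only gives the interval."""
    return t - t[turn_idx] + rng.uniform(-z, z)

def rotation_sense(fish):
    """Sign of the angular displacement from the first frame to the turning point."""
    return np.sign(fish["psi"][fish["turn_idx"]] - fish["psi"][0])

def build_artificial_uturn(uturns, rng, group_size=5):
    """uturns: list of collective U-turns, each a list of trajectory dicts."""
    # One trajectory from each of `group_size` distinct collective U-turns.
    chosen = rng.choice(len(uturns), size=group_size, replace=False)
    group = []
    for u in chosen:
        traj = uturns[u][rng.integers(len(uturns[u]))]
        k = traj["turn_idx"]
        group.append({
            "t": shifted_times(traj["t"], k, rng),
            "rho": traj["rho"].copy(),
            "psi": rotate_to_top(traj["psi"], k, rng),
            "turn_idx": k,
        })
    # Mirror about the vertical axis (psi -> pi - psi) when the sense of
    # rotation differs from that of a randomly chosen reference fish.
    ref_sense = rotation_sense(group[rng.integers(group_size)])
    for fish in group:
        if rotation_sense(fish) != ref_sense:
            fish["psi"] = np.pi - fish["psi"]
    return group
```

Mirroring about the vertical axis keeps the turning point inside [5π/12, 7π/12] while reversing the direction of travel, which is why it is applied after the rotation in this sketch.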
Fig 7A shows that in artificial U-turns the proportion of time during which a focal fish has no influential neighbor is more than 63%, while in the experiments it was less than 39%. The analysis also reveals that in artificial U-turns a focal fish has one influential neighbor for less than 28% of the time, while in the experiments the proportion rises to 43%. Similarly, Fig 8C shows that the average number of influential neighbors $N(t) = \langle N_{if}(t) \rangle$ is much smaller in artificial U-turns ($\approx 0.4$) than in real U-turns, where $N(t)$ is almost always greater than 1. Note that the increase of $N(t)$ during U-turns in artificial data is the consequence of the channeled motion of fish by the corridor. Moreover, the variation of $N(t)$ along time, including the transients preceding and following the U-turn, decreases in artificial U-turns while it remains constant and with a higher value in experiments. Fig 7B shows that distance rank has no significant effect on which fish is the influential one, both in experiments and in artificial U-turns. The decreasing number of influential neighbors comes from the fact that the tank is circular and from the method we use. If the tunnel had been a straight corridor, we should have detected no decrease in our null model. However, in a circular tank, because of the geometrical constraints imposed by the curvature, even when two fish are both swimming in the same direction (i.e., clockwise or anti-clockwise), as the distance between fish increases, our method will detect a decrease of correlation. While Fig 7C confirms that influential neighbors are slightly more often ranked in the first position of the group, this effect is much more pronounced in the experiments.
In fact, Figs 7B, 7C and 7D and 8A and 8B show that the selected null model satisfactorily reproduces the typical spatiotemporal behavioral patterns of real U-turns: the position and turning ranks are almost identical, as well as the variation of the average speed and the average group polarization, although the V-shape of the average polarization in real U-turns is significantly sharper than in artificial U-turns.
An additional, albeit expected, result of our null model is the homogeneous (isotropic) spatial distribution of "influential neighbors", while in real collective U-turns influential neighbors are mostly located in front of the focal fish; see S10A and S10B Fig, compared with Fig 10A and 10B.

Discussion
By sharing information with other group members, schooling fish and other collectively moving animals can potentially improve their navigational accuracy (e.g. the many wrongs principle [32]), take better decisions (e.g. to avoid a predator [33]), or improve their abilities to sense the environment [34]. However, there are both physical and practical reasons why information is expected to be shared with only a few neighbors. Physical reasons involve material limitations, such as visual occlusions. Practical reasons often refer to trade-offs between sharing information, so that the group collectively selects a direction of motion, and deciding independently [35,36].
Assuming that correlations between fish behavior rely to some extent on a causal influence, our analysis reveals that in groups of H. rhodostomus, during a collective U-turn, at any moment in time each fish only pays attention to a small number of neighbors whose identity regularly changes. We also find that the phases during which a focal fish is affected by one or two influential neighbors are interspersed with other phases during which its movement appears uninfluenced by the movement of neighbors. Moreover, influential fish are mostly located in front of the focal fish. The distance between a focal fish and its influential neighbors is about two body-lengths and the relative exposure angle is smaller than 60 degrees.
Our results bring insights on the way information on the neighborhood is processed by fish. Instead of having a synchronous update based on a fixed number of neighbors (topological neighborhood) or on all neighbors located within a fixed distance (metric neighborhood), our results suggest an asynchronous updating that does not depend on the distance between a focal fish and its influential neighbors. A similar asynchronous updating scheme has been previously introduced by Bode et al. [37] in a flocking model showing that it can give rise to emergent topological interactions consistent with the measurements made on starling flocks [38].
It is however worth noting that our experiments, performed on small group sizes, may have prevented us from detecting any influence of the distance, since each of the four neighbors is located between one and three body lengths away. In larger groups of fish moving in an unconstrained space, we expect the effective neighborhood of fish to result from the interplay between an asynchronous updating on a small number of neighbors and a modulation of the strength of interactions with the distance between fish [15].
Previous studies on the number and the spatial arrangement of influential neighbors led to different results depending on the species and on the procedure used to analyse the data. The work by Ballerini et al. [39] provides evidence that each bird within a starling flock (Sturnus vulgaris) coordinates its motion with a fixed number of closest neighbors, irrespective of their distance, while in mosquitofish (Gambusia holbrooki), one single nearest neighbor was sufficient to account for the large majority of the observed interaction responses [12]. In barred flagtails (Kuhlia mugil), it has been shown that different kinds of neighborhoods (the Voronoi neighborhood and the k nearest neighbors, with k ≈ 6 to 8) were compatible with experimental data in a tank [13]. Our study points to a low number of influential neighbors. There are multiple possible explanations for the differences in the number of interacting neighbors found across the scientific literature. (i) It is possible that different animal groups interact with different numbers of neighbors. (ii) Temporal factors are also important [37], as interactions can be integrated in time to produce effectively larger neighborhoods. Here, we propose a third explanation (iii) based on the consideration that interaction responses such as attraction, alignment and avoidance are qualitatively different mechanisms that rely on different sensory-motor responses and, consequently, on different interacting neighborhoods. In particular, attraction and repulsion require processing information about the position of neighbors, while alignment is intrinsically a response dependent on orientation and velocity.
These different interactions are likely to rely on different neural circuits (motion and form are typically processed by different brain areas in many animal groups [40,41]) and hence might depend on different sets of influential neighbors: for instance, a focal individual could avoid collisions with its Voronoi neighbors, be attracted towards a different neighborhood of visually salient individuals and only process alignment information for one or two selected neighbors.
It is thus natural to suggest that influential neighbors are intrinsically associated with different interaction mechanisms, which might also explain why fish point to different neighborhoods.
Our method for identifying influential neighbors is based on the computation of the time-dependent directional correlation between a focal fish and its neighbors. Of course, correlation does not imply causation, so that inferring causal influence between fish from directional correlation requires an extremely cautious methodology.
The methodology we propose here is based on two solid procedural cornerstones. First, the data used in our study were carefully selected from a clearly recognizable behavior, the collective U-turns, where influence from neighbors undoubtedly exists, and thus should be, to some extent, responsible for a fundamental part of the correlations detected by our method. Time-delay between individuals' direction choices has already been used to measure the interactions between group members in animal flocking. Specifically, Nagy et al. [23] used correlation delay times to reconstruct flight hierarchies in flocks of pigeons. Their approach consisted in integrating delay times over the entire trajectory to obtain a "leadership mark" for each individual. Our assumption is instead that the time-delay results from the individuals' behavior and their environment, which varies in time depending on the information being gathered. To detect the response delay of each individual, we have therefore followed the approach employed in [26], which allows for a change of delay over time. In fact, it is easy to show that the time delay between the same pair of fish is not constant, as revealed by our analysis of pairs of fish (see Materials and methods). Applying the method of Nagy et al. to different subsets of data in the same experiment, we found that the time delays between the same pair of fish vary substantially (see S2 Fig). The second methodological cornerstone is provided by the results of the null model, which clearly show that the correlations we detected come from causal influence between neighbors and not from spurious random coincidences. The results of the null model also confirm that distance rank has no effect.
Identifying the number and position of influential neighbors is an essential step towards reconstructing behavioral cascades of information propagation across a group. Our method provides an accurate basis for mapping interaction networks that does not rely on any assumption about the channel (e.g., vision, sound or hydrodynamic interactions) mediating information transfer. We are confident that by adopting our technique to map interactions in different species and different experimental contexts we will gain a much more detailed understanding of the distributed information processing taking place in fish schools.

Ethics statement
Our experiments have been approved by the Ethics Committee for Animal Experimentation of the Toulouse Research Federation in Biology N°1 and comply with the European legislation for animal welfare.

Experimental procedures and data collection
Hemigrammus rhodostomus (rummy-nose tetras, Fig 12A) were purchased from Amazonie Labège (http://www.amazonie.com) in Toulouse, France. Fish were kept in 150 L aquariums on a 12:12 hour, dark:light photoperiod, at 27.7°C (±0.5°C) and were fed ad libitum with fish flakes. The average body length of the fish used in these experiments was 31 mm (± 2.5 mm). The experimental tank (120 × 120 cm) was made of glass and was set on top of a box to isolate fish from vibrations. The setup was placed in a chamber made by four opaque white curtains surrounded by four LED light panels to provide an isotropic lighting. A ring-shaped corridor was set inside the experimental tank filled with 7 cm of water of controlled quality (50% of water purified by reverse osmosis and 50% of water treated by activated carbon) heated at 28.1°C (±0.7°C) (Fig 12B). The corridor was made of a vertical circular outer wall of radius 35 cm and a circular inner wall with a conic shape of radius 25 cm at the bottom, so that the effective width of the corridor available to fish for swimming ranges from 10 cm at the bottom to 12 cm at the surface. The conic shape was chosen to avoid the occlusion on videos of fish swimming too close to the inner wall. Fish were randomly sampled from their breeding tank for a trial and were used in at most one experiment per day. Groups of 2 or 5 fish were introduced in the experimental tank and acclimatized to their new environment for a period of 10 minutes. Their behavior was then recorded for one hour by a Sony HandyCam HD camera filming from above the setup at 50 images per second in HDTV resolution (1920x1080p). We performed 10 trials for each group size of 2 and 5 fish.

Data extraction and pre-processing
The positions of each fish on each frame were tracked with idTracker 2.1 [10]. Fish were sometimes misidentified by the tracking software, for instance when two fish were swimming too close to each other for a long period of time. In those cases, the missing positions were corrected manually. Sequences of 50 or fewer consecutive missing positions were interpolated. Longer sequences of missing values were checked by eye to determine whether interpolation was reasonable; if it was not (that is, if the trajectory did not look like a straight line), merging positions with those of the closest neighbors was considered. Time series of positions were converted from pixels into meters. The origin of the coordinate system was set to the center of the ring-shaped tank. Body orientations of fish were measured using the first axis of a principal component analysis of the fish shapes detected by idTracker 2.1.

Detection and quantification of collective U-turns
Since the experiments were performed in an annular setup, the direction of rotation can be converted into a binary value: clockwise or anti-clockwise. We choose the anti-clockwise direction as the positive direction for angular position. Before a U-turn event, all fish move in the same direction, say clockwise. Then, one fish, not necessarily the one located at the front of the group, changes its direction of motion to anti-clockwise. After a short transient, the other fish of the group display the same direction change, from clockwise to anti-clockwise. We defined the whole process of changing direction as a collective U-turn (see examples in Fig 1 and in S8 Fig). After data extraction and pre-processing, we found 1111 and 475 collective U-turns in groups of 2 and 5 fish, respectively. The duration distribution of collective U-turns in groups of 2 fish is shown in S3 Fig.

The procedure used to define an individual U-turn for a fish F i is as follows: we first determine the time t m,i at which the angle of incidence of fish F i changes sign (from negative to positive or vice versa). Then, starting from t m,i , we go backward in time step by step until the first time at which the absolute value of the angle of incidence is higher than a start threshold. We denote this time by t s,i . Similarly, we start again from t m,i and go forward step by step until the first time at which the absolute value of the angle of incidence is higher than a second, end threshold.
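The search for the start and end of an individual U-turn can be illustrated with the following sketch. It assumes that the angle of incidence is available as a one-dimensional array sampled at the video frame rate and that the two thresholds are supplied by the caller; the function name and the choice of returning frame indices are illustrative, not taken from the original analysis.

```python
import numpy as np

def individual_uturn(theta_w, thr_start, thr_end):
    """Locate an individual U-turn in a series of angles of incidence.

    Returns frame indices (k_s, k_m, k_e) for the start, the sign change
    and the end of the U-turn, or None if the angle never changes sign.
    """
    theta_w = np.asarray(theta_w, dtype=float)
    # Frames after which the sign of the angle of incidence flips.
    flips = np.nonzero(np.diff(np.signbit(theta_w).astype(int)))[0]
    if flips.size == 0:
        return None
    k_m = int(flips[0]) + 1  # first frame with the new sign
    # Walk backward until |angle| exceeds the start threshold.
    k_s = k_m
    while k_s > 0 and abs(theta_w[k_s]) <= thr_start:
        k_s -= 1
    # Walk forward until |angle| exceeds the end threshold.
    k_e = k_m
    while k_e < len(theta_w) - 1 and abs(theta_w[k_e]) <= thr_end:
        k_e += 1
    return k_s, k_m, k_e
```

If a threshold is never exceeded, this sketch stops at the first or last frame of the series; a real pipeline would probably discard such incomplete U-turns.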

Position rank in a group
The relative position of a fish F i in a group of N fish is calculated by projecting the vector position of the fish $\vec{u}_i$ on the average group velocity vector $\vec{z} = (1/N) \sum_{i=1}^{N} \vec{v}_i$. This allows us to define a group centroid in the direction of $\vec{z}$, with respect to which the fish are ranked: the first fish in the group is the fish whose projection on $\vec{z}$ is the most advanced one in the direction of motion of the group (given by $\vec{z}$), the second fish in the group is the second most advanced, and so on. Relative distances between fish are not taken into account when establishing the rank.
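As an illustration of this ranking, the following sketch projects each position on the average group velocity and orders the fish by decreasing projection. The function name and the array layout (one row per fish, two columns for the Cartesian components) are assumptions made for the example.

```python
import numpy as np

def position_ranks(positions, velocities):
    """Rank fish along the average direction of motion of the group.

    positions, velocities: arrays of shape (N, 2).
    Returns an integer array where 1 marks the most advanced fish.
    """
    z = velocities.mean(axis=0)   # average group velocity
    proj = positions @ z          # unnormalized projection; the ordering
                                  # does not depend on the norm of z
    order = np.argsort(-proj)     # most advanced fish first
    ranks = np.empty(len(proj), dtype=int)
    ranks[order] = np.arange(1, len(proj) + 1)
    return ranks
```

Only the ordering of the projections matters here, which reflects the remark that relative distances between fish do not enter the rank.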

Optimal setting parameters for influential neighbors identification
Four parameters are used to identify influential neighbors: the time-delay τ, the window size w, the correlation threshold C min above which individuals are supposed to be interacting, and the threshold ε for selecting more than one influential fish.
The time delay must be specified along the whole trajectory of the focal fish: it is thus a series of values $\{\tau^*_k\}_{k=0}^{M}$, where M is the number of time-steps or frames in the individual U-turn. The parameters C min , ε and w are in turn given for all time and for all fish by means of a sensitivity analysis described in the next section.
Assume for now that the three values C min , ε and w are known, and denote by F i the focal fish and by F j one of its neighbors. Then, the series of time-delays $\{\tau^*_k\}_{k=0}^{M_i}$ is built recursively as follows (actually only w is required to extract the time delays).
Denote by Γ i (t k ) the highest value of the pairwise directional correlation C ij of the velocity of fish F i at time t k with the velocity of F j at each time-step in the range of the previous $(\tau^*_{k-1} + 1)$ time-steps, $R_k = [0, \tau^*_{k-1} + 1]$. Then, the time-delays $\tau^*_k$, k = 1, . . ., M i , are determined by the smallest value of the time-delay $\tau_r \in R_k$ where Γ i (t k , w) reaches its maximum. For t 1 , the maximum correlation is reached at $C_{ij}(t_1, \tau^*_1, w)$, for some time-delay $\tau^*_1 \in R_1 = [0, \tau^*_0 + 1]$. We set $\tau^*_0 = 50$ for the initial value of the recurrence. For the rest of time-delays $\tau^*_k$, k = 2, . . ., M i , the size of R k is based on the assumption that if, at some time t, F i copies the behavior that F j displayed at a previous time t − τ, then, after time t, F i will not copy the behavior that F j displayed at any time earlier than t − τ.
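The recursion can be sketched as follows. The definition of the windowed directional correlation used here (the mean, over frames k − w to k + w, of the dot product between the unit heading of the focal fish and the unit heading of the neighbor τ frames earlier) is an assumption made for illustration, since the exact expression of C ij is not given in this section; the function names, the use of frame units for the delays, and the handling of frames near the start of the series are also ours.

```python
import numpy as np

def windowed_correlation(e_i, e_j, k, tau, w):
    """Assumed C_ij(t_k, tau, w): mean of e_i(t) . e_j(t - tau) for t in
    [k - w, k + w], restricted to frames where both headings exist."""
    frames = np.arange(k - w, k + w + 1)
    frames = frames[(frames >= 0) & (frames < len(e_i)) & (frames - tau >= 0)]
    if frames.size == 0:
        return -np.inf
    return float(np.mean(np.sum(e_i[frames] * e_j[frames - tau], axis=1)))

def recursive_delays(e_i, e_j, w=2, tau0=50):
    """Time delays (in frames) of focal fish i with respect to neighbor j.

    e_i, e_j: arrays of shape (M, 2) holding unit heading vectors.
    At each frame the candidate delays are R_k = [0, previous delay + 1],
    and the smallest delay that maximizes the correlation is kept.
    """
    delays = np.empty(len(e_i), dtype=int)
    tau_prev = tau0
    for k in range(len(e_i)):
        candidates = np.arange(0, tau_prev + 2)
        corr = [windowed_correlation(e_i, e_j, k, tau, w) for tau in candidates]
        # np.argmax returns the first maximum, i.e. the smallest delay.
        tau_prev = int(candidates[int(np.argmax(corr))])
        delays[k] = tau_prev
    return delays
```

Because the candidate range grows by at most one frame per step, the extracted delay cannot jump back to a behavior of the neighbor that is older than the one already copied, which is the assumption stated at the end of the paragraph above.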
Time-delays obtained with more complicated and time consuming procedures such as the time-ordered technique developed in [26] or through the similarity analysis based on Fréchet distances [25] would in principle produce similar values. Fig 13B shows the distribution of time-delays obtained with this procedure in groups of two fish. The distribution is clearly bimodal with a first peak when τ = 0 and a second one around τ = 0.4 s. Considering a reaction time threshold of 50-100 ms for a fish to integrate information and reach a decision [42], we cannot attribute small values of time-delays to situations where the behavioral decision of the focal fish has been influenced by its neighbors. This is confirmed by the analysis of the spatial distribution of the extracted time-delays (Fig 13A), where we show that the lowest average values of τ are found mostly when the neighbor was behind the focal fish, in a zone with the lowest perception [15], while the highest values of τ > 0.4 s are found when the neighbor is located in front of the focal fish. This has led us to consider in our analyses only situations where τ > τ R = 0.04 s.

Parameter selection
Although the time-delays $\{\tau^*_k\}_{k=0}^{M}$ are determined once w is known, they also strongly depend on C min and ε, as the value of these three parameters must be fixed at the same time. This is done by means of a sensitivity analysis in which we have tested the following 40 combinations of parameter values: $w \in \{0, 1, 2, 3, 4\}$, $\varepsilon \in \{3, 5\}$, and $C_{min} \in \{0.995, 0.99, 0.95, 0.5\}$.
Each combination (C min , ε, w) gives rise to four histograms like those depicted in Fig 7. These histograms constitute the solution of our method of analysis, and can be characterized by a vector $\vec{S}(C_{min}, \varepsilon, w)$ in 19 dimensions: (i) the 5 proportions of the number of influential neighbors in groups of 5 fish, (ii) the 4 proportions of their distance rank, (iii) the 5 proportions of their position rank, and (iv) the 5 proportions of their turning rank. This allows us to determine how similar the results arising from two combinations (C min , ε, w) and $(C'_{min}, \varepsilon', w')$ are, by computing the cosine similarity of the two vectors $\vec{S}(C_{min}, \varepsilon, w)$ and $\vec{S}'(C'_{min}, \varepsilon', w')$. The cosine similarity of two vectors $\vec{a}$ and $\vec{b}$, denoted $\cos_{sim}(\vec{a}, \vec{b})$, is the cosine of the angle between these two vectors. Thus, two collinear vectors are such that $\cos_{sim}(\vec{a}, \vec{b}) = \pm 1$ independently of their magnitude, while two perpendicular vectors are such that $\cos_{sim}(\vec{a}, \vec{b}) = 0$. In our case, the components of the vectors are positive, so $\cos_{sim}(\vec{S}, \vec{S}') \geq 0$ for all (C min , ε, w) and $(C'_{min}, \varepsilon', w')$. Moreover, as the components are proportions, collinearity implies identity, both in direction and magnitude. Thus, $\cos_{sim}(\vec{S}, \vec{S}') = 1$ means that both results are identical, while $\cos_{sim}(\vec{S}, \vec{S}') = 0$ means that they differ as much as possible.
S5 Fig shows the cosine similarity matrix for the 40 combinations we have tested. Note that the matrix is symmetric with respect to the diagonal, where $\cos_{sim}(\vec{S}, \vec{S}) = 1$. Except for C min = 0.5, all similarity values are in the narrow range [0.96, 1], showing that all combinations yield practically the same results. The highest dissimilarity is found in the white-yellow lines, where one of the combinations is (C min , ε, w) = (0.5, 3, 2).
The selection of parameter values is thus done as follows. We choose w = 2, which corresponds to the regions of higher dissimilarity. The selected time window size is sufficiently large so that the jagged nature of the movement data is smoothed out but not so large that the actual turns get washed out from the data. Using ε = 3 or ε = 5 yields very similar results and we have arbitrarily chosen ε = 3.
The selection of C min is done by a specific procedure, which consists in calculating the number of data points that remain available for our analysis for each value of C min . S6 and S7 Figs show that the larger C min is, the fewer data points remain available, and vice versa. We might be tempted to choose a sufficiently small C min in order to get the maximum number of data points. However, according to our definition of influential neighbor, C min should be sufficiently large to select only the real influential neighbors. We have thus chosen the highest value which provides a sufficiently large number of data points, that is, the largest value before the fall of the number of data points in S11 Fig, C min = 0.95. This value preserves 61% (23830) and 76% (69703) of data points for N = 2 and N = 5 respectively.
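The comparison between parameter combinations can be sketched as follows. The parameter values are those listed above; the function names and the dictionary layout of `outcomes` (one 19-component vector of proportions per combination) are assumptions made for the example.

```python
import itertools
import numpy as np

# The 4 x 2 x 5 = 40 tested combinations (C_min, epsilon, w).
C_MIN_VALUES = [0.995, 0.99, 0.95, 0.5]
EPSILON_VALUES = [3, 5]
W_VALUES = [0, 1, 2, 3, 4]
grid = list(itertools.product(C_MIN_VALUES, EPSILON_VALUES, W_VALUES))

def cos_sim(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_matrix(outcomes):
    """outcomes: dict mapping each (C_min, epsilon, w) in `grid` to its
    outcome vector. Returns the 40 x 40 cosine similarity matrix."""
    vectors = [outcomes[combo] for combo in grid]
    return np.array([[cos_sim(a, b) for b in vectors] for a in vectors])
```

With non-negative components, every entry of the resulting matrix lies between 0 and 1, and the diagonal equals 1 up to rounding.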

Null model of collective U-turns
We want to design artificial collective U-turns in groups of 5 fish where all fish perform an individual U-turn at more or less the same place and more or less the same time, and in the same direction (clockwise or anti-clockwise). Fish must coincide in time and space to constitute a "group", but individual U-turns must happen in an absolutely independent way. Correlations at hand in this paper are thus reduced to a minimum, while preserving the general aspect of a group of fish changing direction.
Our experimental data provide us with 5 × 475 = 2375 trajectories of individual fish, which we have conveniently normalized and combined to build 1000 groups of 5 fish changing direction in the same spatiotemporal interval. This is done as follows.
The whole trajectory of a fish F i during a U-turn takes place in an interval of time [t s,i , t e,i ], where t s,i is the instant at which the individual U-turn of fish F i starts, and t e,i is the time at which the individual U-turn ends (see the paragraph above Eq (1)). The trajectory of fish F i in polar coordinates is given by the sequence of positions $\{(\rho_i(t_k), \psi_i(t_k))\}_{k=1}^{N_i}$, where ρ i (t k ) is the radius (distance of the fish from the center of the tank), ψ i (t k ) the already defined angular position (computed anticlockwise as positive), and N i is the number of time-steps t k in the trajectory. Denote by T i the instant at which fish F i effectively turns, i.e., F i is perpendicular to the wall: sin(θ wi (T i )) = 0. In well defined individual U-turns such as the ones we use in our data, this happens only once per U-turn. Accordingly, (ρ i (T i ), ψ i (T i )) denotes the fish position at time T i .
Although we would like to have absolutely uncorrelated fish, it would not make sense to use groups of trajectories that do not reproduce a consistent U-turn, e.g., if one fish makes its U-turn much later than another, or on the other side of the tank. We thus try to decorrelate fish trajectories as much as possible, while preserving at the same time the typical spatiotemporal shape of real collective U-turns.
The decorrelation of all individual U-turns is done with the following two steps:
• Spatial rotation: For all individual fish F i in all U-turns, we rotate its trajectory by an angle −ψ i (T i ) + π/2 + ξ i , where ξ i is a random number sampled uniformly in [−π/12, π/12], so that the new location of fish F i at the time T i when it performs its individual U-turn is in the upper part of the tank around π/2, in [5π/12, 7π/12].
• Time shift: The time scale of each fish is shifted so that its instant of turning T i falls in the time interval [−z, z], where z is a short time (we have used z = 1 s).
The artificial collective U-turn is thus built as follows:
1. Select randomly 5 real collective U-turns, and, from each collective U-turn, select randomly one trajectory. Rotate and time-shift trajectories according to the process described above.
2. Select randomly one of the 5 fish as the fish of reference F ref for building the artificial U-turn. If necessary, mirror the trajectories of other fish so that all fish move in the same direction as F ref with respect to the center of the tank, i.e., clockwise or anti-clockwise.
Then, the fish of reference of the artificial U-turn will make its individual U-turn at time

Figure caption fragment (groups of 2 fish): Three specific values are shown by arrows: 0.6, 0.95 and 0.995. The value highlighted in red corresponds to the value we chose and is denoted by a star instead of a circle. Each vertical line corresponds to the fish that is taken as being the focal fish: F 1 (red) and F 2 (cyan). For instance, selecting C min = 0.6 in the upper-left small panel, 700 data points will be available for both fish. For C min = 0.95, around 450 points will be available for both fish. Leftmost upper panel: total number of data points available from all fish from all the experiments (summary of the 10 small panels, i.e., there is only one pink line). Vertical axis: ratio between the number of data points available for C min and the number of data points available for C min = 0.5.

Figure caption fragment (groups of 5 fish): Three specific values are shown by arrows: 0.6, 0.95 and 0.995. The value highlighted in red corresponds to the value we chose and is denoted by a star instead of a circle. Each vertical line corresponds to the fish that is taken as being the focal fish: F 1 (red), F 2 (yellow), F 3 (green), F 4 (blue) and F 5 (magenta). For instance, selecting C min = 0.6 in the third small panel of the upper row, 55 data points will be available for each one of the 5 fish. For C min = 0.95, around 75 points will be available for each fish. Leftmost upper panel: total number of data points available from all fish from all the experiments (summary of the 10 small panels, i.e., there is only one pink line). Vertical axis: ratio between the number of data points available for C min and the number of data points available for C min = 0.5.