Dynamical modeling of collective behavior from pigeon flight data: flock cohesion and dispersion

Several models of flocking have been promoted based on simulations with qualitatively naturalistic behavior. In this paper we provide the first direct application of computational modeling methods to infer flocking behavior from experimental field data. We show that this approach is able to infer general rules for interaction, or lack of interaction, among members of a flock or, more generally, any community. Using experimental field measurements of homing pigeons in flight we demonstrate the existence of a basic distance dependent attraction/repulsion relationship and show that this rule is sufficient to explain collective behavior observed in nature. Positional data of individuals over time are used as input data to a computational algorithm capable of building complex nonlinear functions that can represent the system behavior. Topological nearest neighbor interactions are considered to characterize the components within this model. The efficacy of this method is demonstrated with simulated noisy data generated from the classical (two dimensional) Vicsek model. When applied to experimental data from homing pigeon flights we show that the more complex three dimensional models are capable of predicting and simulating trajectories, as well as exhibiting realistic collective dynamics. The simulations of the reconstructed models are used to extract properties of the collective behavior in pigeons, and how it is affected by changing the initial conditions of the system. Our results demonstrate that this approach may be applied to construct models capable of simulating trajectories and collective dynamics using experimental field measurements of herd movement. From these models, the behavior of the individual agents (animals) may be inferred.


Introduction
The collective behavior exhibited by interacting individuals in a population has recently attracted interest in scientific and engineering communities. Many different definitions have been used to formally describe this behavior, in order to establish new theories and further advance the contributions to this new field. In simple words, collective behavior can be described as local actions taken by individuals in a socially interacting group, which are directly related to the conditions of the group and somehow affect the overall behavior of the group as a single global entity. Many kinds of systems from different areas of application are known to exhibit such behavior, ranging from areas like sociology, psychology, zoology, and all the way to more technical disciplines like bioengineering, computer science, and robotics. The objective of analyzing the collective behavior of a system can be either to further understand the system in question, or to apply the observed behavioral structures in other systems or circumstances in order to provide innovative solutions to problems.
The movement of groups of animals is a common and well studied example involving the emergence of collective behavior in an interacting population. It is well known that animals tend to work in groups to achieve goals; simple examples that can come to mind are ant colonies, herds, fish schools, and bird flocks. In particular, the collective movement of a group of animals in the same direction is called swarming. Bird flocking in particular, has attracted much recent attention. The ability to gain accurate positional data from GPS devices on pigeons, has opened the door to more advanced and meaningful analysis of flocking [1]. Photographic data has also lead to deeper analysis of the interaction properties of flocking [2]. In general, with accurate 3D positional data, it is now possible to perform statistical analysis which leads to the understanding of the structural and behavioral properties of flocking.
Mathematical models, especially dynamical models, have been used by biologists, physicists, and mathematicians, to illustrate animal movement or interaction, though usually the dynamics have been observed to be not amenable to linear models. The classical Lotka-Volterra equations are a good example of early attempts to model the growth and decay of populations of predators and preys over time, using nonlinear ordinary differential equations (ODEs). Later efforts to characterize social animal movement involved models that use physical laws and diffusion equations to describe the movement of groups of fish, insects, and herds [3]. Discrete-time generic models of collective systems, with no particular application, but which can be used to simulate swarming with complex behavior using very simple mathematical rules, are an initial step towards the full understanding of nonlinear properties in real systems, or even graphic visualization. The Boids model [4] is a well known example, and it has been used in movies and video games to generate 3D animated collective movement of animals. Another such case is the Vicsek model [5], which is a simpler 2D model of particles, but capable of describing behaviors ranging from the swarming of small groups moving in random directions, to a global directed motion of the whole population. Recent approaches using more complicated but realistic dynamics for swarming with parameter tuning, include metric distance models with informed leaders [6], zonal interaction models [7], molecular physics models with geometric and topological interactions [8], and predator-prey models using radial force laws [9].
Most of the typical approaches to modeling collective behavior using dynamical systems, have involved developing models using physical laws or well known mathematical functions that are known to resemble the phenomena in question. Later these are tested or tuned with observations or data, in order to verify whether they resemble the original behavior. The opposite approach is to use time series data from experiments to build a full model that fits the data as well as possible, using the power of computers. This methodology, commonly referred as system identification, is used in areas like control engineering and econometrics, although most of the available literature is linked to linear models. Some new generic techniques use data to identify nonlinear ODE models [10], while others have focused on inferring the natural laws of physical systems [11]. Efficient computer algorithms for structural ODE building have been proposed especially in biological systems literature [12,13], due to the necessity of scaling the automated building of models to large complex systems involving many variables. The usage of generic and flexible modeling paradigms to capture a wider range of complex behaviors from different fields, can also be considered when designing approaches for automated model building. The automated construction of discrete-time models using radial basis functions, has shown the ability to adequately model chaotic dynamics in systems such as infant respiration, vibrating strings, and lasers [14][15][16][17].
Due to the complex dynamics observed in collective systems, we suggest taking this data-driven approach. We use computational methods to process time series data and automatically build nonlinear dynamical models that are able to carry out simulations for further analysis. Compared to previous approaches that use a fixed mathematical structure and fit parameters to swarming data [7,8], we provide the first modeling scheme capable of fully building a model, using only limited prior information. In our previous study of homing pigeon flight data [18], we found that in addition to the discovered hierarchical structure by Nagy et al. [1], there are very important local interaction rules that the pigeons follow to maintain a cohesive and synchronized unit. Our objective here is to infer these essential behavioral rules that characterize collective behavior, i.e. the dynamics involved locally between individuals which create emergent global behavior of the whole flock. The question we wish to address is whether these local interactions can be used to provide a mechanistic explanation of flocking behavior, using a simpler egalitarian modeling structure in concordance with the classical models of swarming [4,5,19]. Consequently, the prediction of individual trajectories, the influence of ''leaders'', and the shapes of the flock are not the focus of our study.
As a first step, we build models from simulated data of the well studied Vicsek model [5], in order to confirm that our approach is adequate for modeling collective behavior. The approach is extended progressively to handle experimental 3D positional data of pigeon flocks [1] and then used to construct realistic models capable of performing simulations that emulate the collective dynamics of the data. By evaluating simulations of the retrieved models, new data is generated and used to perform analysis of the system and quantify hypothesized collective behavioral properties such as the separation, attraction, and speed of the flock. Our main contributions in this paper include the modeling scheme itself, which can be used with any adequate function fitting method, for constructing the models based on positional data from the whole flock; a simulation methodology based on the dynamics of neighbor separations in order to visualize and compare the collective behavior; and, the inference of an averaged rule from our model simulations which summarizes basic attraction and repulsion between neighbors in accordance to previous observations and models of collective animal movement [3,4,19].

Input data
Both simulated and real experimental data were used as input to build flocking models. The former were generated using the well known Vicsek model [5], capable of performing 2D simulations of swarming using simple interaction rules. The latter data set was obtained from pigeon flights using GPS devices attached to the pigeons, and has been previously presented and analyzed [1].
The Vicsek model. The Vicsek model [5], is a simple nonlinear model capable of simulating swarming behavior. The model is essentially a discrete-time system of several particles in a square domain, with their 2D positions updated according to: The velocities have a constant speed v and an orientation defined by an angle h: Author Summary The construction of mathematical models from experimental time-series data has been considered with some success in many areas of science and engineering, using the power of computer algorithms to build model structures and suitably tuning their parameters. When considering complex systems with nonlinear or collective behavior, computational models built from real data are the alternative to emulating the system as best as possible, since classic modeling approaches based on observation could prove difficult for complex dynamics. In this study, we provide a method to build models of collective dynamics from homing pigeon flight data. We show that our models follow the source dynamics well, and from them we are able to infer that significant collective behavior occurs in pigeon flights. Our results are consistent with the basic principles of previous hypotheses and models that have been proposed. Our approach serves as an initial outline towards the usage of experimental data to construct computational models to understand many complex phenomena with hypothesized collective behavior.
In this paper we consider a slightly different update for the angles of the particles: In equation (3), S:T r denotes average over the M nearest neighbors and itself and Dh is a uniformly distributed random number, which is basically the noise of the system. The former differs from the original Vicsek model [5] that considered the average of all neighbors within a fixed radius r of the particle. Our modification is inspired by a recent study on the topological distance in flocks [2], and therefore the model considers a fixed number of nearest neighbors regardless of the separation distance. Other important parameters are N (number of particles), L (the linear size of the cell with the particle), and g (the range of the noise). The periodic boundary conditions of the original model in [5] were removed for this study in order to have more realistic data, and thus the boundaries of the square cell were extended to infinity to allow continuous motion. From the Vicsek model, we are interested in generating data for cases of high and low density initial conditions (with low noise) since these correspond to a global directed motion of the whole population and the formation of small groups moving in random directions respectively. These two behaviors are good examples of dynamics that can be observed in real collective systems. Pigeon flight data from GPS. Relevant research in flocking has provided 3D positional data of pigeon flights, with a very fast sampling rate [1]. These data were obtained from lightweight GPS devices attached to ten homing pigeons. The datasets include eleven free flights of pigeons near their roost and four homing flights which basically involve the flock moving from one position to another. In this study we shall consider the data from the four homing flights, due to the simpler flight patterns that are followed. In addition, our previous study [18] confirmed that the homing flights have higher correlations between individuals, and thus are a better source of information about how pigeons interact, in accordance to our objective defined in the introduction.
The data from the four homing flights was further sampled to provide smaller datasets that are easier to handle computationally, but still with a rate that is fast enough to capture adequate flight dynamics. With this in mind, sampling rates of one and two seconds were considered depending on the particular properties of each of the flights. In addition to this, the flights were cut to remove idle moments with no significant movement of the birds. Stranded pigeons were also removed from the input data in order to have datasets that resemble a fully interacting population as far as possible. All these edits were made based on thorough manual visualization of the flight data. Some specific details of each of the flights will be explained, since their particular properties will be important to interpret some of the results later on: The sampling rates were selected according to the interaction distance between pigeons. That is, for the flights with higher separations (hf2 and hf3), a slower sampling rate of 2 seconds was used, while the flights with closer interactions (hf1 and hf4) were sampled every second. This was done to have a more precise account of the movement when shorter separations are involved in the flight patterns. The different sampling rates also imply that the models built in each case will be specific to that single flight, and this is actually expected, since each flight is different and might contain different terrain, weather, and behavioral properties.

Modeling schemes
Recent investigations have led to conclusions about hierarchical structures present in pigeon flights [1], which points to some ''leader'' birds having a stronger influence in the decisions of the flock. If we take this into account in our modeling scheme, we would need to build a separate model for each pigeon, and this is certainly possible. Nevertheless, in order to reach our established objective, our focus is on capturing the basic local interactions that all the individuals follow, and thus we build a single general model from the data of all the birds. This produces a model akin to classical approaches [4,5,19], where the same mathematical function is used to update the movement of every individual.
To capture a wide range of complex behaviors, black box discrete-time nonlinear dynamical systems can be used to build arbitrary mathematical functions. Their good function fitting properties make such models very useful for performing simulations and getting conclusions and statistics from newly generated data. In our modeling scheme we present a methodology to build models of collective dynamics using an efficient black box modeling method. We emphasize that the method we chose for our paper is not central to our current contribution, and that any other capable fitting algorithm could be used with our approach. Also of importance is the selection of an adequate embedding scheme for the data to model, i.e. selecting the past values from the data that will be used by the model to predict new positions. The values to consider should be inspired by known physical properties of the phenomena in question (in this case pigeon flights). In addition, we use a fixed number of nearest neighbors (M) for interactions between individuals, as prior information to build our models. We use this assumption as our neighborhood strategy because previous analyses of bird flocking from experimental data have lead to this conclusion [1,2]. First the general framework and method used to build the functions of the models will be introduced, and after that, the three different kinds of models to be considered will be outlined.
Radial basis functions. Any competent nonlinear modeling algorithm could be used to fit a dynamical system to the data. We selected the discrete-time radial basis approach originally presented in [14,16,17] as the modeling framework and algorithm to use, because of its proven capability of modeling highly nonlinear systems. In summary, the method receives as an input a scalar time series y(t), and attempts to build the best model of the form: where z(t)~½y(t),y(t{1),:::,y(t{d) is the embedding of the system and e(t) is the model prediction error. The former corresponds to the past values from the time series data y(t) that the model will consider for calculating the prediction for tz1. The samples used for optimization of the model are built from time series y(t) using the embedding z(t). The structure of the function to build follows: where l i represents a time lag of the time series, n is the number of time lags in the model, m is the number of radial basis functions, and r j , l j are scalar constants. The c j parameters denote randomly chosen points, known as centers of the radial basis functions. The first sum (and constant l 0 ) in (5) is the linear part of the system, equivalent to commonly used reduced autoregressive models. The second sum is the nonlinear part of the function, and it is characterized by radial basis functions w j , which can be of different types, as shown in [16]. The algorithm that builds and optimizes the model, described in detail in [4,16], basically consists of generating a new set of random candidate radial basis functions at each iteration, estimating parameters, and keeping the ones that minimize residuals (all the others are discarded). This is continued incrementally until the Minimum Description Length (MDL) [20] of the model is obtained. The algorithm itself is not the focus of this paper, but we can rely on its effectiveness because it has been previously used to model highly nonlinear data from infant respiration, lasers, and vibrating strings [15][16][17].
Relative position modeling for Vicsek data. The first and simplest model to be built is the one based on the data of the Vicsek model. These models will be referred to as type R1, symbolizing the first variant of relative models. As mentioned in the introduction to this section, the idea is to use the same general model for each particle of the system. Since the Vicsek data is twodimensional, we require two different functions for a single model (to predict each coordinate). Therefore the single model that all individuals follow can be defined as: Since the Vicsek model considers relative positioning (the numerical positional value of a particle does not influence its movement), instead of trying to predict the absolute position x i at tz1, we can re-define equation (4) by predicting the relative change in position where e(t) is an array with the model prediction errors for each coordinate. To capture the collective behavior of the source system in our model, the embedding z(t) of a particle i should consider enough information from the nearest neighbors that influence its motion. An adequate embedding would be to consider the average change in position Dx i (t) of i and its neighbors, since it gives enough information about the magnitude and direction of velocity v i (t), which actually is enough to model the Vicsek rules (excluding the noise, see equations (1)- (3)). In contrast to the original temporal embedding of the radial basis method, this new embedding considers data from the nearest neighbors, but it does not affect the modeling algorithm since the method fits a function to emulate the output samples from given embedding instances (inputs), regardless of the embedding form. Taking this into consideration, the embedding would be different for each particle due to the difference in nearest neighbors: Of importance here is that the neighborhood of the average in (8) considers the fixed-number interaction introduced previously for the modified Vicsek model in (3), which is the averaging over particle i and a fixed number of M nearest neighbors. Also of relevance is that positions are two-dimensional, and thus the embedding in (8) consists of two variables. Absolute position modeling for pigeon homing flights. When modeling the real 3D homing flights, we must take into account that the pigeons are following a trajectory, which can be summarized as a flight from point A to point B, with some terrain information on the way that will influence their flight patterns. This means that for adequate modeling of a homing flight, we require absolute positioning in our model. In other words, in contrast to the Vicsek model, the position x i (t) of pigeon i is necessary for an adequate prediction of its value at tz1, due to the terrain information being absolute. Equation (9) shows the absolute model structure to be used.
This model type will be referred to as A, symbolizing the word absolute. Even though we do not explicitly define an update for the velocity of an individual (or change in position), our model function f calculates it implicitly ''within the black box'' from the embedding z i (t), and consequently affects the positional update in equation (9). It is also worth mentioning that since positions are now three-dimensional, the model consists of three functions instead of the two that were required for the Vicsek data (6). The usage of experimental data requires a more complete embedding for adequate modeling. First of all, physical common sense should be taken into consideration to define a model. That is, at least second-order components must be considered, which translates into the necessity of using velocity information (values at time t{1 and t) to calculate a prediction for tz1. In addition to this, we are interested in modeling the collective behavior, and thus neighbor interactions must be included. Taking all of this into consideration, the proposed embedding is displayed in (10).
This embedding scheme for a pigeon i considers its absolute position at the two previous time intervals (t and t{1) and the average position of its M nearest neighbors in the same intervals. Despite the fact that interactions between pigeons are likely not isotropic [2] (i.e. a pigeon might only interact with the neighbors within its sight), for simplicity we compute the M nearest neighbors with no restrictions. Note that each of the four components in (10) is three-dimensional, which translates into a 12 variable embedding, making it much more complex than the two variable embedding used for the Vicsek data. Another distinction from the Vicsek embedding, is that the nearest neighbor averaging in (10) considers only the M nearest neighbors and not itself. This is a reasonable thing to do, since the position of pigeon i is already being directly considered in the embedding, taking into account the more realistic consideration of a significant difference between its own position and that of its neighbors. From this, we will denote ½x i (t),x i (t{1) and ½Sx i (t)T M ,Sx i (t{1)T M as the individual and collective components of the model respectively.
Relative position modeling for general flocking model.
The modeling scheme presented in the previous subsection was designed for navigational flights, where absolute positioning is important due to the influence of the terrain in the flight. In addition to these models, it is of our interest to build a general flocking model using relative positioning similar to the Vicsek model, but based on real experimental data. This final model type shall be referred as R2, which refers to the second variant of relative models. As a first step, we should mention that the desired relative position model structure is the same as in (7), and introduce a new nine-variable embedding: Here the first component represents the positional change of bird i at time t: , the second component is the average positional change of the nearest neighbors, and the third component is the averaged positional difference between i and its nearest neighbors. The first two components resemble the Vicsek embedding introduced in (8), though now separating the change in position of bird i from that of its neighbors. The third component symbolizes a directional separation between i and its neighbors, which should be useful to characterize collective behavior. For example, we can expect that the separation distance to its neighbors in some direction (front, back, left, right) will surely have an effect on the movement of bird i. This component introduces a dependence on the metric distance between neighbors, which makes the model a hybrid inspired by both topological and metric distance approaches (see [2]). Essentially the metric separation distance to its M nearest neighbors will influence an individual's movement, and therefore we can expect the interaction strength of a good model to be weak or near zero at very long distances, when the pigeons no longer interact. We must emphasize that the plain homing flight data is not appropriate for building this general flocking model, due to the navigational bias that it has. To exemplify this, the four homing flights previously introduced consist of pigeon flocks moving from a point A to B in a loosely southwest direction. This means that if data of one or all the flights is used to build the model, it will undoubtedly be biased with southwest movement. In order to attenuate the bias, we performed uniform 2D rotation transformations to produce new embedded data z i (t) and prediction values Dx i (tz1). For simplicity, these rotations were only done for the latitude and longitude coordinates, leaving the altitude component intact. Figure 1 shows a graphical example of a 290 degree rotation of a single embedding and prediction instance. Each rotation angle for an instance of (11) is calculated so that the orientations of the prediction, Dx i (tz1), span a full circle (2p) in the whole dataset. These transformations are done with respect to Dx i (tz1) because this vector is the actual navigational force of the model, which has the previously mentioned bias in the original data. In summary, by spanning a full circle in the navigational direction of the samples, we are attempting to remove the bias.

Measuring flock dynamics
As a valuable tool for the analysis of flocking data, a measure that can characterize and illustrate the dynamics of the collective behavior should be used. Velocity correlation functions have been considered in previous studies [3] to calculate the positional variation of individuals in swarming populations. Since our models and analyses are based on the local interactions between neighbors, the velocities or positions of individuals are not the most straightforward way to see how they are interacting on both a local and global scale. Therefore we propose using dynamic separation measures between individuals to properly characterize how they flock over time. The average separation of individuals from the mean position of the whole population (N) at a time interval t is defined as: This measure should be relevant when analyzing the global dynamical properties of a whole flock, but it could also be important to measure the separations in local neighborhoods instead of the whole population, especially when swarming in small groups is significant. A slight modification to (12) gives us the average separation of individuals from the centroid of their local neighborhood of neighbors: Besides the analysis of each data set, the main idea of introducing these measures is to perform qualitative comparisons between the d 0 s of the input data and data obtained from model simulations, in order to verify the dynamics of a retrieved model. Another important feature of using these time-dependent measures is that transient and steady state properties of the system can be easily visualized, in a way that is very similar to analyses commonly done in control theory. Essentially a stable steady state in the d(t) signal represents ordered synchronized movement of the whole flock (or a small group if considering the local measure). For some cases, it might be more convenient and compact to express the separation of a system as a single numeric quantity, and therefore the average separation over all time intervals T will be defined as d g~S d g (t)T T .

Methodology
The automated modeling process shall now be outlined. As described previously, radial basis functions are used to build three different model types (R1, A, R2), according to the introduced modeling schemes. Essentially the same procedure is followed to build the three types of models, with the notable differences being the input data, number of functions per model, and embedding. The general process to build a single model will be described, with additional special comments for each model type: Input: Time series matrix containing positional data: x(t) N Vicsek model simulations for R1 models N Homing pigeon data from a single flight for A models N Homing pigeon data from multiple flights for R2 models For each positional coordinate j: 1. Build samples for the function f j using as output Dx (j) i (tz1) (for R1 and R2 types) or x (j) i (tz1) for (A types) with their respective embedding z i (t), using data from all individuals/ pigeons. 2. Run the radial basis modeling algorithm. 3. Set the retrieved function as f j .
The randomness in the radial basis modeling algorithm used to construct the models (see [14,16]), makes it necessary to run the algorithm several times to retrieve several models and find the most appropriate one, or average their statistics. Specific details on the number of retrieved models, and model selection criteria will be presented in the results section.

Results
Using the methodology we presented, several models were retrieved for each dataset. To analyze the models, the main philosophy followed in this study was to perform simulations with different sets of initial conditions and verify their behavior using separation measures d g (t) and d l (t), by either comparing with the source input data or simply by analyzing the simulated data itself. Since we are discriminating models based on the separation measures, differing from the MDL that the modeling algorithm uses, we argue that our chosen models do not overfit the dataset, but give a good emulation of the separation dynamics. More specific details for each model type shall be discussed in each subsection.

R1 models (Vicsek model data)
From samples of five different instances of low density simulations of the modified Vicsek model (L~25,g~0:1 see [5], M~4), five different R1 models were obtained. From there, we decided to select a single ''best'' model by comparing the global behaviors of the models under high density conditions (L~5,g~0:1 [5], M~30), which was done by averaging d g~S d g (t)T K,T over ten simulations (K) of high density initial conditions, each with 500 time intervals (T). The model with least absolute error: Dd D was selected with this criterion. We chose to discriminate the models by comparing their extrapolative capabilities to simulate initial conditions with different density than their input data, i.e. the phase transition in the classic Vicsek model. The number of nearest neighbors (M) in the model structures and for calculating d l (t) were chosen to be M~4 for low density initial conditions and M~30 for the high density case, which from simulations resemble the steady state number of neighbors from the original radius interaction in the Vicsek paper [5].
In Figure 2 we can see a comparison between the global separation dynamics of the modified Vicsek model and the R1 model. The retrieved model closely emulates the behavior of the source model. The low density cases show an expected higher separation rate than the high density simulations, which is related to swarming occurring in small groups and moving away. Even for the extrapolating case of high density initial conditions, the model exhibits close following of the dynamics. Figure 3 shows a comparison of the local neighborhood dynamics of the same two models. In (a), the Vicsek data reaches a pseudo-steady state for low density initial conditions, with the R1 model closely following it but reaching a more stable state. The high density case in Figure 3(b) considers closer interactions, and thus both models reach a stable state, with a small transient in the source model, and a small steady state error. The deviations between the Vicsek and R1 models, especially for the local separations in the high density simulations, are likely due to the fact that the former has a noise term (see Dh in equation (3)), which is not being modeled in the latter, i.e. the R1 model emulates the deterministic component of the Vicsek model, and thus the small errors are reasonable.
As inferred with the d g (t) curves, the low density initial conditions of the Vicsek model feature swarming in separate groups moving in different directions. Figure 4 shows snapshots of simulations of both models for the same low density initial conditions (see Video S1). Qualitatively the retrieved model follows the behavior of the source model quite well, and this was verified through several simulations. The only noticeable issue observed through the simulations is a slightly biased default direction followed by some individuals, but this should be expected when constructing a model with imperfect data. Using more data to build the model, this bias is removed, but the trade-off is longer computation time. Nevertheless, the observed bias was small, and many different directions were still seen in the simulations. The same conclusions were observed for the high density simulation comparisons, which involves ordered movement of the swarm in only one or two big groups (see Video S2). Figure 5 shows how a split in the population was emulated well qualitatively by the R1 model, but with an alignment difference.
In order to verify how well the R1 model is capturing the basic local neighbor alignment rule of the Vicsek model, in Figure 6 we show how the orientation of the neighborhood of an individual: Sh i (t)T M,i , affects the orientation of the individual at the next time interval: h i (tz1). The statistics for this relationship were averaged over 50 simulations of 250 time intervals in the R1 model under low density initial conditions (L~25,M~4). In the ideal case of a noiseless Vicsek model (Dh~0), we can infer from equation (3) that these two quantities are equal, and actually represent the rule which produces swarming behavior. Figure 6 shows how the model is mostly capturing this essential relationship, with only some discrepancy in a region between p and 3p 2 . As pointed out previously, the variation likely occurs because of the noise in the source data, and can be reduced by using more input data to build the model. The modeling scenario of the Vicsek data and its corresponding results, are a good introduction to the task of building models from real experimental data, which is noisy and even more imperfect, as shall be considered in the next subsections.

A models (Homing pigeon flight data)
The main purpose of the A models is to emulate their respective flight (input data) through simulations, and to confirm that there is collective behavior influencing the models and not simply an individual navigational force. In order to see the effect of the collective components and the interaction structures, models with different number of nearest neighbors (M) were obtained for comparison. In addition to those, models with no collective components (M~0) and thus a six variable embedding (see  , were also retrieved for the same purpose. To obtain better conclusions of the collective effects, five different 3D models were obtained for each value of M considered, and all statistics averaged over the five. For the model analysis, two different simulation scenarios were considered for each homing flight: same initial conditions as the input data and random initial conditions. The latter were calculated from a normal distribution with the same mean and variance as the absolute initial positions of the input data (for each of the three coordinates), in order to preserve similar flight conditions, but with no initial velocity, (x i (1)~x i (0)). The global separation measure d g (t) calculated in the simulations, was used as the comparison statistic between model structures and input data.
For simulations of the homing flight 1 (hf1) models (from M~0 to M~4) with five pigeons, Figure 7(a) shows that models with low M (0, 1 and 2) on average do not follow the input data for the same initial conditions. This deviation in d g (t) was observed to happen early in the simulations at tv100s and the opposite velocities in the initial conditions of some of the individuals is what likely causes this difficulty. Nevertheless, the models with higher M (3 and 4) follow the input data neatly (Figure 7(a) shows M~3, see Video S3). When considering random initial conditions with no initial velocity, in Figure 7(b) we can see how the individual models (M~0) do not keep cohesion of the flock as good as the collective models. This confirms that a collective force is modeled, and that an interaction structure with M~3 offers both accurate path simulation and more cohesion (The M = 4 curves are not displayed in Figure 7 for better visualization; they follow M~3 closely).
The models of homing flight 2 present probably the most difficult case to analyze. As can be seen in Figure 8(a), the input data considers the separation increasing at around tw500, and this happens because the data has two pigeons moving together away from the main flock at that time. This causes an interesting modeling case, since these particular data contradict the cohesive tendencies found at tv500. All this has repercussions with the retrieved models. Figure 8(a) shows how with the input data initial  conditions, the individual models (M~0) follow this separation better than the others. This can be reasoned with the fact that it is purely using positional information to estimate the trajectory, and the small variations between positions at t&500 are what cause the divergence that mimics the input data. Nevertheless, the collective models are again expressing a significant collective force, since they give preference to the group tendencies and thus do not follow the separation increase (see Video S4). To better visualize how the interaction structure affects the collective component for this flight, d g (t) was averaged over all time intervals for each model structure. Figure 8(b) shows that the simulations with random initial conditions found the most cohesive interaction structure to be at M~4, and a surprising separation for Mw4. This likely stems from the fact that the two-bird deviation in the input data causes models with larger interaction neighborhoods to be more sensitive to the deviations of some individuals. Overall, this particular flight illustrates how the input data can also affect what the models will try to capture, as well as the trade-offs between modeling the navigational trajectory or the collective behavior more closely.
Differing from the previous two homing flights, the third one has separations of up to 1 km between pigeons, and thus it considers the longest interaction range of all the flights. The number of pigeons in this data set is 6, and models with M~0 to M~4 were retrieved. It was found that every single model structure follows the input data closely when using the same initial conditions. Figure 9(a) shows it for M~0,1,3. When using random initial conditions as shown in Figure 9(b), no significant decrease was found for the average global separation d g as a function of adding nearest neighbors, with all values very close to 1.1 km. This implies that the large separations in this flight could be causing no interaction (or a very weak one) between birds, and therefore the models mostly consider the navigational component as the driving force of the individuals. From this, we infer that there is no significant collective behavior for this flight, likely due to the large separations, and an individual model with M~0 should be as good as any to simulate the dynamics (see Video S5).
Finally we arrive with homing flight 4, which considers eight pigeons with the shortest average separations of them all, at around 50 m. This causes d g (t) to appear quite noisy due to the expected sensitivity of separation distances in a higher density flock. With M ranging from 0 to 6, we found that the models with highest value at M~6 produced both the best following of d g (t) for the input data initial conditions and the highest cohesion when simulating random initial conditions (see Video S6). Figure 10 shows a comparison of models with M~0,3,6. As a difference with the previous three flights, for the random initial conditions we considered a normal distribution with two times the standard deviation of the input initial conditions, instead of one. This was done in order to start with larger separations and verify the dynamic attraction properties of the models. From the results, all the collective models clearly had more attraction, and in Figure 10(b) we can observe how two of them show sharp convergence tendencies that the individual models do not have.  In general, from analyzing the results of the four homing flight models, we can confirm that our approach is adequate for simulating pigeon flock trajectories that are qualitatively close to the experimental data. For flights with short interaction ranges (less than 700 m from mean position per bird), the collective models can better capture the flight properties and offer the best flock cohesion when changing initial conditions. For long interaction ranges (around 1 km), simple individual models that consider only own positional information, were enough for a simulation of the flight.
As a final step with our A models, we estimate the ''optimal'' value of M for each homing flight with collective behavior (all except hf3), by calculating the mean absolute error of each individual model and its source data MAE1 T X T t~1 Dd g (t) (model) {d g (t) (data) D, from a simulation with the same initial conditions. After that, we averaged the MAE for each type of model (each value of M) in order to verify which interaction structure follows the separation dynamics of the source data better on average. For hf2 we only considered up to t~400 to exclude the section of the data where the flock splits. Table 1 shows our results, and interestingly for all flights M~4 gave the least average MAE values. This does not mean that M~4 models were always the best for each flight, but that on average they performed better than the others. This tells us that considering an interaction neighborhood of the four nearest neighbors is a reasonable assumption for the modeling of these particular homing pigeon flights.

R2 models (homing pigeon flight data)
The data from hf1, hf2, and hf4 was used to build five models for each structure with M~1 to M~4, and using two second sampling on the data. The third homing flight was excluded since  our results in the previous subsection confirm that there is no strong collective behavior in that dataset, and M was limited to a maximum value of 4 because of the hf1 data that only considers five pigeons. In contrast to the A models, the R2 models were set to be 2D for simplification, with the height component removed from predictions and simulations. To select the ''best'' model, ten different sets of initial conditions that resemble the properties of hf1 (mid-range interaction around 300 m) were used to simulate flights of N~9 individuals, and d g measures calculated. Since the data that we use for our R2 model comes from different flights with different separation, navigational, and environmental properties, it is not adequate to select a ''best'' model based on a direct comparison of separations with the data, and thus we decided to select based on best flock cohesion, due to better flocking capabilities. Not surprisingly, the selected model had a structure of M~4, which is consistent with our calculation of the ''optimal'' M value in the previous subsection. The selected model was used for all the analysis and it shall now be referred to as the R2 model in general terms. Also important to note is the fact that the R2 model has a slightly biased overall direction, which was reduced as much as possible using the rotations in the data that we described in the modeling section. MATLAB files for our five R2 models with M~4 have been made available on the internet for usage [21].
Many different variations of initial conditions or even values of M and N could be considered on the R2 model simulations for analysis. For this study we decided to fix N and M at 300 and 4 respectively. We varied the initial positional density of the individuals according to the formula r~250r c , where r is the radius (in meters) of the circle in which the initial positions of the individuals are distributed (following a uniform random distribution), and r c is the actual coefficient that we change. Additionally. we alter the initial speed of the individuals (the magnitudes of Dx i (1)) using v i~3 0v c , and spanning v c from 0 to 1. All the individuals are initialized with the same speed but with different orientations, the latter obtained from a random uniform distribution U(0,2p). Considering that the R2 model has twosecond updates (input data with two second samples), this limits the initial speed to a maximum of 15 m/s, which is roughly the average of the speeds in the pigeon homing flights. Even though speed and density could be seen as rescalings of each other, the model is built from real data and therefore the actual numerical values for each quantity affect the model considerably and in different ways. From the R2 embedding in equation (11), we can see that separations and velocities will influence the model, and thus both quantities are important for a complete analysis of the dynamics. Seven different r c values and eleven v c values were considered for a total of 77 combinations of initial condition parameters. Each parameter setting was considered for ten different simulations, in order to get better averaged statistics.
In Figure 11, we can see a comparison of d g (t) for four extreme cases of high and low densities and velocities. Figure 11(a) shows that cases of low velocity settle approximately into steady states, while the high velocity cases have increases in global separation; more drastically in the high density case. This tells us that global flocking is highly dependent on the velocities of the individuals, and not so much on the population density. In Figure 11(b), the local separation properties can be observed. Basically for all cases, the individuals tend to converge into local groups with less than 20 m separation, with the low density cases having drastic drops in d l (t) that symbolize strong attraction. This means that larger separations provoke individuals to move strongly toward their neighbors. The high density and high speed case shows an interesting initial increase in d l (t) and then decrease to settle down into its steady state. This is likely due to velocity synchronization: individuals are initially close together with random directions but Figure 10. Homing flight 4: Flock separation for models with different interaction structure. In (a), simulations consider initial conditions from the input data, while (b) averages over ten simulations of random initial conditions. From the plots, M~3 shows the best resemblance to the input data (see (a)) and M~6 the strongest collective component (see (b) later on they separate and align with neighbors with similar orientations and synchronize their velocities. Figure 12 shows how in the low density and low speed simulation, the individuals attract into small groups and globally move together in the main direction (the biased direction: roughly southwest, see Video S7). For the high speed case, the flock has less cohesion, including some stranded individuals moving in other directions, and therefore their alignment is not fully synchronized (see Video S8). From here we can see that when individuals are separated by a longer distance, if they are moving at slower speeds then they have a better chance of finding each other and aligning. The higher speeds make it more difficult, and thus provoke less cohesion and stranded groups of individuals moving on their own. These tendencies are consistent with Figure 11. For the two cases of high density initial conditions we have a drastic difference. Figure 13 shows how for low speeds, the flock stays together, spaces out, and then slightly moves in the main direction (see Video S9). For high speeds, small groups are formed and they move away from the center independently (see Video S10). This shows that the velocities are usually roughly maintained within nearest neighbors, and thus cause the significant difference in system behavior.
To generalize how the initial densities and speeds affect the system behavior, Figure 14 shows a plot of the averaged separation d in shades of gray, for the different initial parameters. In Figure 14(a), we can see how by increasing the speed, the separation increases for every density value, though in a lower rate for the low density cases. For a fixed speed value, when decreasing the density (increasing the radius coefficient r c ), the separation tends to decrease and then increase again, which implies that there is a critical density value with highest cohesion for each different speed value (marked on the figure). As the speed increases, a lower density (higher r c ) value will be required to achieve highest cohesion. Figure 14(b) shows how the local separation follows the Figure 11. Comparison of separation measures between extreme cases of initial conditions for the R2 model. Statistics were averaged over 10 simulations for each case. Plot (a) shows that the initial speed v i plays an important part in the global separation of the system d g (t). In plot (b), the local separations d l (t) tend to similar steady states regardless of the initial conditions. doi:10.1371/journal.pcbi.1002449.g011 expected pattern of higher cohesion at higher densities and lower cohesion at lower densities, with no strong dependency on the speed. It is relevant to notice that the cohesive force does not decrease so drastically until it passes a distance threshold of no interaction between individuals. From these observations, we can conclude that speed has a higher influence on the global dynamics of the system, while density influences the local interactions. In other words, the density seems to have a greater influence on the directional alignment of individuals: lower separations causing almost immediate neighbor alignment, while larger separations require a transient to first converge and then align (see Figure 11(b)).
As final illustrations of the behavior of the R2 model, averaged distributions of how the separation of a particle i from its nearest neighbors affects its attraction and speed at the next time interval, were calculated from all the simulations considered in this subsection (all 77 combinations of parameter values and their ten simulations). The average nearest neighbor separation of a particle i at a time interval is defined as j Figure 15(a) shows that there is a small repulsion force spanning up to slightly less than 20 m of separation and then the strong attraction force with a maximum strength at around 120 m and ending near 500 m, which tells us that the maximum interaction range is approximately 500 m. This limit in the interaction range is consistent with our conclusion of no collective behavior in hf3, which had separations of more than 1 km. The distribution is much smoother for short range interactions, likely because cohesive scenarios dominated the simulations. Figure 15(b) shows how the speed of a particle is highest within 20 and 120 meters of separation, with a maximum near 35 m. This region is strongly correlated with the strong attraction region also roughly covering that span until the minimum (maximum attraction) is reached in Figure 15(a). For higher separations, the speed decreases steadily, which follows that  individuals are attracted less to their neighbors as their interaction decreases. The maximum speed at 35 m presents an interesting interpretation, because it only amounts to a weak attraction force, implying an interaction range where individuals are aligned and moving fast together.

Discussion
The contribution of our work encompasses both the modeling scheme and methodology to build flocking models which can emulate the collective behavior (represented by separation dynamics) of input experimental data, and the inference of essential dynamical properties of the models using simulations. As a first step to verify our approach, we inferred the basic neighborhood alignment principle of the Vicsek model, since our R1 model mostly captured this synchronizing relationship. From the simulations of our more complete R2 model, built from data of three flights of pigeon flocks, we averaged attraction and speed distributions based on the separation of an individual to its nearest neighbors. Our results shown in Figure 15 are consistent with classical swarming models that consider a short range repulsion force followed by attraction at longer ranges within the sphere of interaction of an individual [4,19]. This interaction range between the fixed number of nearest neighbors was found to have a maximum attraction force near 120 m of separation and its limit at around 500 m. We emphasize that our model and results are not a replacement to approaches that consider ''leaders'' as the driving force of the flock [6], but an essential complement to the hierarchical structure [1]. This augmentation is composed of the essential interaction mechanisms that are the foundations of flocking [4]. Our results are consistent with our previous study, where we analyzed the reciprocal relationships of the flock from the same datasets, and found that these basic local dynamics are fundamental for the collective behavior in pigeon flights [18].
The methodology presented in this paper is capable of obtaining models of collective systems exhibiting swarming properties, from which simulations and statistics can be obtained. A related method using Gaussian processes to model pigeon trajectories and identify terrain landmarks has been recently proposed [22], with the main difference being that we build a multi-agent model that emphasizes the collective dynamics of a flock instead of a model for the trajectory followed by a single pigeon. The Vicsek model served as an introductory example of a simulated system with noisy data, to verify the efficiency of our approach on modeling dynamics with collective behavior. The results showed adequate qualitative emulation of the dynamics for two extreme density scenarios, even when the sampled data used to build the model was only based on a single density case, thus showing the capability for extrapolation. When modeling the four homing pigeon flock trajectories with the A models, different interaction structures were retrieved in order to test and verify both the trajectory and the collective dynamics. The best retrieved models showed the capability of qualitatively simulating the collective dynamics of the input data, and we found that on average, models that consider each individual interacting with its four nearest neighbors (M~4) gave the best emulation of the global separation dynamics of the flock. We do not claim that pigeons strictly interact with only their four nearest neighbors, but our results do show that this is a good premise for our models to follow the collective dynamics of the data.
The R2 model showed the ability to represent significant swarming behaviors. In general, it illustrated that global swarming of a population is largely dependent on the speeds of the individuals, with low speeds favoring unified organized movement, and high speeds causing local swarming and separation in different directions. The population density also had an effect on the simulations, by essentially having a critical value of highest cohesion for a fixed value of initial velocities. This implies that for given velocities, a certain density distribution will offer the best global cohesion and synchronization between individuals. From the same observations we concluded that lower speeds facilitate pigeon flocking, and this is consistent with common sense, since the pigeons can easily converge and synchronize with no major effort. Higher speeds will cause only pigeons with similar alignments to synchronize their directions and move together, which is synonymous to flocking in small groups. Nevertheless, certain densities can still cause global flocking with high speeds, as long as there is enough space for these small pigeon groups to converge and synchronize their velocities.
Our approach has the potential of being used to model other kinds of complex systems where similar spatiotemporal measurements are available. Depending on the properties of the systems in question, we consider that continuous improvements to the embedding schemes, are a worthy direction to continue our work and infer other essential relationships between interacting entities of collective systems. In addition, another possible future direction is to construct simplified mathematical models manually which use these extracted rules as a basis for the dynamics, and thus be able to have a wider range of simulations that could help understand new behaviors that can emerge from local interactions of individuals.

Supporting Information
Video S1 Low density simulations of the modified Vicsek model and the ''best'' R1 model. Individuals move away in small groups.