Efficient Multiple Object Tracking Using Mutually Repulsive Active Membranes

Studies of social and group behavior in interacting organisms require high-throughput analysis of the motion of a large number of individual subjects. Computer vision techniques offer solutions to specific tracking problems, and allow automated and efficient tracking with minimal human intervention. In this work, we adopt the open active contour model to track the trajectories of moving objects at high density. We add repulsive interactions between open contours to the original model, treat the trajectories as an extrusion in the temporal dimension, and show applications to two tracking problems. The walking behavior of Drosophila is studied at different population density and gender composition. We demonstrate that individual male flies have distinct walking signatures, and that the social interaction between flies in a mixed gender arena is gender specific. We also apply our model to studies of trajectories of gliding Myxococcus xanthus bacteria at high density. We examine the individual gliding behavioral statistics in terms of the gliding speed distribution. Using these two examples at very distinctive spatial scales, we illustrate the use of our algorithm on tracking both short rigid bodies (Drosophila) and long flexible objects (Myxococcus xanthus). Our repulsive active membrane model reaches error rates better than per fly per second for Drosophila tracking and comparable results for Myxococcus xanthus.


Introduction
A broad range of biological problems on many length scales, from cells to whole animals, require the ability to track moving individuals with a group. Advances in computer vision in the past two decades have enabled computed-aided automatic or semiautomatic tracking programs to greatly boost the capacity to analyze large amounts of data and reduce the involvement of human observers. However, many traditional tracking algorithms struggle when objects come into close physical contact or even overlap within an image. Recent work addressed this problem by separating objects using a Gaussian mixture model with an area prior followed by identity matching and successfully applied this approach to track walking Drosophila with minimal human supervision [1]. Tsai and Huang further extended this approach by refining the segmentation of the Drosophila images into different body parts which enables more detailed measurements to be made [2]. A non-Bayesian framework was used by Chaumont et al. to track multiple mice by modeling the animal body with a series of physical ''primitives'' connected by joints and elastic springs that can interact with each other [3]. All of these algorithms are capable of handling large amounts of images, w10 4 frames, with relatively little tracking error that is then corrected manually. However, this error rate increases with reduced image quality or when the objects move close to one another more frequently.
In most tracking solutions, image segmentation is performed on each frame to identify individual objects. This is then followed by an identity matching scheme between frames based on distance, object birth and death probabilities, and other estimated parameters [1,4]. Active contours (snakes) are a popular image segmentation approach that is widely applied in analyzing biological and medical images. The contour of the compartment boundary is treated as an elastic band that interacts with the image and exhibits a damped relaxation to the minimum energy state [5]. In addition to closed-contour uses for measuring parameters like object area, open active contours can be used to detect filamentous objects such as blood vessels [6][7][8], neurons [9][10][11], actin filaments [12,13], and many biomedical and non-biomedical applications. In these approaches, active contours can be allowed to merge, break, fork and recombine.
In this work, we solve the tracking problem using a deformable membrane model, which is an extension of the active contour model to the temporal dimension. In order to prevent merging of multiple objects, we add a repulsive interaction between neighboring contours. We test this method on two practical tracking problems in animal behavior and microbial ecology: walking Drosophila and gliding bacteria Myxococcus xanthus are tracked at high density with low error rate (v5|10 {6 per fly per second, or 10 {5 per cell per second), and at high efficiency (better than 50 frames per second when tracking 5 flies). Drosophila has become a popular model organism to study neurobiology and animal behavior for its ease of genetic manipulation [14][15][16][17][18][19], and Myxococcus xanthus is a gram-negative bacteria that exhibit gliding motility using molecular motors [20,21], which allow the bacteria to exhibit complex group behaviors [22][23][24][25]. Using these two tracking problems at very distinctive spatial scales, shapes and morphology, we demonstrate the application of our algorithm on a broad range of problems. Finally, we discuss the close connection between Bayesian techniques and the energy minimization approach in our active membrane algorithm.

The Classical Active Contour Model
In the classical active contour model [5], a feature in an image, usually a line, area boundary or edge, is located by relaxing an elastic contour (snake) that interacts with the image to the contour's minimum energy state. The energy of the contour consists of the internal elastic energy and the image energy term E im based on the location of the contour in the image, calculated along the contour x(s) in N-dimensional space, as a function of arc position s: where the a term penalizes the energy when the contour is deviated from a uniform straight line and the b term adds an additional cost to bending. A and B are the termini of an open contour, which we define as 0 and 1, or in case of a closed contour, the integral path is closed. Minimizing E is equivalent to solving which can be written in the discrete form [5]: where each x i is an M-element vector that defines the contour using M discrete points that are distributed along s. A Ã is the M|M circular pentadiagonal discrete equivalent of the operator {ad 2 =ds 2 zbd 4 =ds 4 : ::: 0 ::: ::: 0 ::: can then be solved iteratively via where t is the iteration index, c sets the time step, and + i is the spatial derivatives L=Lx i . Additional forces can be conveniently incorporated into the model by adding a force term F i , which we will discuss in the coming sections: Numerous variations of this active contour model have been developed for specific image-analysis and tracking problems, including the use of variable stretching and bending stiffness a and b, sophisticated image potentials E im , and the inclusion of additional forces for specific purposes. Here, we build our model aiming to solve the multiple object tracking problem in time-lapsed movies.

Open Contours
While a circular pentadiagonal matrix A Ã is suitable for dealing with a closed contour, for open contours, the motion of the two tips needs to be considered separately. We modify the first and last two rows of A Ã in equation (3) to A for open contours such that the internal force acting on the endpoints is equal to half of the internal force on the nearest neighbor, but in the opposite direction: a 1 a 0 a 1 0 ::: 0 ::: ::: 0 ::: 0 a 1 a 0 a 1 0 ::: Here We set b~0 in our two following sample applications because penalizing the contour length in open contour configuration adds an effective energetic cost to path curvature. We find that a non-zero value for beta does not qualitatively change the behavior of the open contours analyses for, but not limited to these two examples.
Without an additional constraint, the tips are left free to interact with the image. Tip forces can be added to elongate or shorten the contour in order to control the contour length. In many multiple object time-lapse tracking problems, the characteristic length scale of the tracked objects is known and does not change, even though the position, orientation, and specific shape may alter over time. In this work, the length of the contours are maintained constant by a harmonic tip-stretching force, F str depending on the target contour length l 0 : wherer r is the tangential direction of the contour at the endpoints pointing outwards, l is the apparent contour length l~Ð 1 0 Ddx=dsDds, and g governs the magnitude of the stretching force.
Another form of applying the tip force is to add attraction forces from the constraint tip position: where x is the apparent position of the tips and x 0 is the anticipated position, which can be predicted based on the current contour length and position as discussed in the following sections. Both forms in (8) and (10) are used in the two examples below.

Track Contour Motion Over Time
The kymograph of a contour that moves in N-dimensional space over time is a continuous surface in Nz1 dimensions. In certain types of problems where the objects' trajectories are less predictive, such as the case of Brownian motion, a global optimization strategy such as the framework to solve the linear assignment problem (LAP) is usually necessary [4]. However, in common cases where the trajectories are smooth relative to the object density, local optimization is sufficient and more efficient than global approaches. Here, we apply the concept of active contours to the temporal dimension as well as the spatial dimension within each frame, and optimize the localization of the two-dimensional active membrane in the (Nz1)-dimensional kymograph. Different from freely moving contours in all spatial dimensions, the object has one unique location at each time point, and no localization information is carried between frames. For these reasons, we allow the control points of the active membrane to move within each time slice but not along the temporal axis. For active membranes, x it is a M|T matrix that determines the location of the contour at all time, where M is the number of control points along the contour, and T is the size of the temporal frame stack. We seek the solution of where the T|T pentadiagonal matrix B effects the derivatives in time and has the same structure as A but transposed. Similar to the strategy for solving equation (5), this equation can be solved iteratively by which is in the form of a Sylvester equation and can be solved following standard routines. In practice, instead of optimizing the active membrane on all frames, we slice the kymograph into overlapping time-blocks in the temporal dimension and then sequentially obtain the optimized solutions in short blocks in accordance with the results from previous blocks in time. This is a practical simplification as long as the block size is comparable to the persistence time of motion of the objects. When the temporal projections of the trajectory of one or multiple objects overlap, incorrect initial condition can cause slow convergence or trap the solution at local minima where the registration of traces to image potential minima is swapped. By solving the problem in a block-wise manner, we avoid incorrect initial placement of the contours, and greatly reduce the computation time using the local initial guess based on previous results. Because our model is used to process time-lapsed movies, we use the term contour and membrane interchangeably in the following text.

Repulsion between Multiple Contours
Active contour and surface models can be applied to an image or time-lapse movie of multiple objects by starting at different initial locations. However, when the objects are in close proximity to one another, the barrier that separates the two objects in the image energy landscape can diminish below a significant level due to noise and other image-based effects ( Figure 1). As a result, two contours (or membranes) with different initial locations can converge to the same image energy minimum. To avoid different contours from collapsing into the same potential well, we optimize the energy of multiple contours simultaneously with the addition of a mutual repulsive force between the contours. Because the repulsive force resembles the physical exclusion, we limit the range of the repulsion to the approximate size of the objects, and set the force magnitude to match the depth of the typical depth of the image energy minima. The goal of adding repulsion between interacting contours is to prevent them from overlapping due to the possible degraded image quality, while keeping the dynamics of the contours well-behaved and easy to implement. Specifically, the force is defined to have a quadratic form of the distance between two contours with a cutoff distance w 0 , which is chosen to be the same as the typical width of the tracked objects: where F repel lm (s l ) is the unit arc length density of the repulsive force acted on the l th contour at arc position s l due to the presence of the m th contour. m is a scalar that sets the magnitude of the repulsion so that the repulsive potential matches the attractive image potential. The discrete version of equation (13) takes the form where x it;l is the coordinate of the i th control point of the l th contour at the t th frame. Here, the repulsive force is not normalized to the actual length of the contour. Because the lengths of the flies and bacteria cells vary within 20% of the averaged length, we did not observe convergence issues in solving the relaxed positions due to the unnormalized forces. A cubic potential is a sufficiently close approximation to the overall shape of the attractive potential generated by Gaussian smoothing of the original image. In our actual algorithm, the external force includes both the terminal force and the repulsive force:

Drosophila Walking Behavior
We first demonstrate the repulsive, active membrane model on movies of walking fruit flies. Drosophila melanogaster has become a popular model system for studying complex behaviors such as courtship, aggression, and learning through the analysis of timelapse movies of fly position. In these experiments, individual flies often come physically close to each other causing their images to merge. The repulsive membrane model is particularly adept at resolving the positions of flies during these events with high chance of contacting and overlapping.
We recorded a movie of five male flies walking in a circular, 2.5cm diameter arena at a density of 1 fly/cm 2 at 30 frames per second and a camera magnification of 0.1 mm/pixel. Our tracking algorithm models each fly as an open contour with 3 control points ( Fig. 2 and Movie S1). The length of each fly is determined from the initial image and remains fixed throughout the tracking task. Our tracker is able to follow five flies correctly in all 20,203 frames (a total time of 673 seconds). The oval shaped fly image potential constraints the orientation of the open contour, so that the head direction is correctly resolved in most frames. Occasionally, a fly can exhibit escape behavior that causes a quick change in location and head direction. For generality, our algorithm does not consider image details specific for fruit flies, and we do not distinguish head from tail based on images. Consequently, jumps may reverse the head-tail orientation ( Figure 2b). However, because flies mostly move forward in the direction of the head with low jumping frequency on the order of 0:1 min {1 , a Hidden Markov method is suitable for detecting jumps. We use this information and adopt the Viterbi algorithm [26] to determine the forward-backward head direction as an addition step of our fly tracking software. Figure 2(c-g) shows the 2D, log-probability distribution function of walking velocity for each of the five flies after the head direction correction. Each fly has an individual signature distribution that is slightly different from other's.
To increase the number of fly-fly encounters and test the ability of the repulsive algorithm to distinguish individual flies, we placed one female and two male flies in the arena. Both male flies spend a large amount of time attempting to court the female fly, resulting in frequent merging of the fly images. 85% of the recording time, at least one of the male flies is within 6 mm of the female fly and 31% of time both male flies are within this distance (Figure 3(a)). Images are analyzed with our tracking algorithm with 100% correct registration of flies and only one orientation reversal caused by a jumping event (Figure 3(b)). To demonstrate the tracking result, we calculated the position of each fly relative to one of the other two flies for all pairs. Similar to previous analyses [1], we find that male flies tend to approach the female fly from the rear (Figure 3(c),(d)), while maintaining a head direction oriented towards the female (Figure 3(e),(g)). The relative position of the two male flies uniformly distribute around each fly (Figure 3(f),(h)), indicating no preferred orientation between males.
We compared the performance of our algorithm to the output of the CTRAX fly tracker [1] using the 5-male assay, the one female, two male (1F2M) assay, and a high fly density movie trial with 16 male flies in a 5 cm 2 arena (Table 1). At high densities, flies frequently walk in contact with each other and jumping is more frequent (Movie S2). In our repulsive contour model, where the number of flies is fixed, identity swapping is the only kind of error. In comparison, because CTRAX allows the number of objects to vary, it has three types of tracking error: identity swap, lost objects, and spurious detection. In addition, we compare the speed of the two tracking packages when run on an Intel i5 processor. In the 5-male movie, where flies rarely come in close proximity to each other (3% of the time), both trackers are able to track all the flies without any error. At a higher fly density of 3 cm {2 , both trackers are able to distinguish individual flies, although CTRAX has a small portion of spurious detections and fly jumping becomes the primary source of identity swap error. When the image flies constantly stay in close proximity, such as with frequent courtship attempts, our model is still able to locate the flies from the boundary contour, whereas CTRAX suffers from overlapping detections. Moreover, we observe that the tracking quality of CTRAX is crucially sensitive to the userspecified input parameters such as the image value threshold and the Gaussian oval shape prior that requires multiple trials to optimize. In comparison, the repulsive contour model only requires the knowledge of the spatial scale, either the fly length or width, and the number of flies. The repulsive contour tracker is also 3-20 times faster than CTRAX.

Myxococcus xanthus Gliding Motility
To highlight the ability of the repulsive contour technique to track densely packed objects, we analyzed movies of 2D swarms of Myxococcus xanthus cells. Myxococcus xanthus is a soil bacterium that forms complex, 3D group structures by gliding along solid surfaces [22]. We confined Myxococcus xanthus (strain DZ2) cells between an agarose gel and a glass surface such that cells form a single layer at the interface. Using bright field microscopy, the bacterial cell body appears dark and is surrounded by a bright halo ( figure 4(a)). Cells are often tightly packed,resulting a poor image contrast between the neighboring cells.
Contrast-enhancement techniques such as Differential Interference Contrast (DIC) and Phase Contrast (PC) allow for the visualization of low-contrast objects but produce images whose intensity patterns are complex, usually containing both bright halos and dark regions for each object in a non-linear representation. For that reason, we did not adopt these microscopy techniques. For the bright field microscopy images, we preprocessed the raw microscopy images to enhance the image contrast before calculating the image-based potential for contour relaxation. We first calculated the eigenvalues and eigenvectors of the Hessian of the images to quantify features such as valleys (the cell bodies) and ridges (the halos around the cells). We observe that the two eigenvalues of the Hessian matrix and the pixel value of all pixels fall in a plane in three-dimensional parameter space. The smallest variance of the principle components is less than 5% of the other two, which are comparable. Hence, the first two principle components of these three parameters are used to specify image features. Qualitatively, the pixels are distributed in two clusters within this two-dimensional projection (the two lobes in the projection in figure 4(f)), with valley pixels residing in one cluster and the ridge pixels in the other, and all background pixels within the junction region of the two clusters. To better distinguish pixels from the background and from the high cell density area where the pixel intensity contrast is low, we consider the alignment of the features in neighbouring areas. Analogous to the quanti- fication of magnetization, we treat the difference of the two eigenvalues as the magnitude of a dipole, whose direction is defined by the eigenvector associated with the larger eigenvalue. The locally averaged dipole moment is the order parameter that describes how ordered nearby features are aligned. The alignment order parameter, together with the two principle components of the eigenvalue-image intensity space, forms a three-dimensional space in which the valley, ridge and background pixels are separated ( figure 4(f)). The orientation of the three sets is determined using an expectation-maximization algorithm and the pixels are clustered into three groups according to the coordinates along the major direction of each group (Figure 4). We undersegment the valley and the background to prevent areas of high image potential, because the repulsive contour model is robust against undersegmented ridges but prone to errors caused by oversegmented valleys. The enhanced image used to calculate the image potential is computed using with where P b is the probability of a pixel being the background, P r being the ridge, and P v being the valley. I is Gaussian blurred with s~1:5 pixels before taking the gradient to smooth the image force field (Figure 4d). Contours of 25 control points are evolved in an image gradient calculated from the preprocessed images (Figure 4d). The repulsive force between control points on nearby cells is taken to be a quadratic function of the distance where x ij is the distance vector between two control points on two different contours. w 0 is a cut-off distance that sets the length scale of the repulsive force. We set w 0 to be 70% of the averaged cell width, about 7 pixels. To prevent contours from starting the relaxation procedure at an initial position that crosses a ridge in the image ( Figure 5), we reduce the length of all contours to 70% of its original length in the first iteration of each time step to eliminate possible crossings between contours and ridges, and let the length grow back to their normal value during iterations along the image potential valleys. The length growth is implemented by adjusting the targeted contour length for each iteration by an increment where l t is the apparent length at iteration t, and l 0 is the normal length of each cell. l is a convergence parameter that governs the number of iterations for the contours to grow back to their original length. In practise, we observe that l~1 is a good value for contours of l 0 *50 pixels. We also adopt the form in equation (10) to implement the length constraint force on the termini. The length increment is split between the two termini of the contours according to the resistant forces exerted on the termini: where F head w0 and F tail v0 are the tangential components of the image and repulsive forces exerted on the termini. The anticipated position of the tips are obtained by extrapolating the termini along the tangential directions by Dl t head and Dl t tail , and then F str is calculated using (10). Figure 5 demonstrates how this inchwormlike motion corrects misplacement of the contours. With the length For all three movies tested (5M: five males; 1F2M: one female and two males; 16M: 16 males), the number of fly pairs that have a distance closer than twice the fly width is counted as a fly-fly contact. In speed comparisons, both trackers ran on the same single CPU core on the same platform. doi:10.1371/journal.pone.0065769.t001 initially shortened, the faulty crossing configuration is eliminated (Figure 5b). The upper tip grows until it hits the ridge and is unable to grow further due to steric constraints (Figure 5c, d). The lower tip keeps growing along a narrower valley until the length reaches the target length (Figure 5e, f). We have also included several non-critical optimizations to the dynamics of the contours for better convergence. The tangential component of the force exerted on each control point is individually calculated and the average value is applied on each control point. The arc distance between control points is also uniformly redistributed every 5 iterations. These optimizations improve the relaxation in the tangential direction without qualitatively changing the behavior of the contours.
We used the repulsive contour model as described above to track Myxococcus xanthus motion within a 41 mm|41 mm field of view consisting of 512|512 pixels (Figure 4a and Movie S3). 205 cells are initially located in the field and we tracked these cells for 400 frames at 12 frames/min. During the time course of the movie, cells typically move a distance about equal to 10 times their cell length and the local adjacency order is completely altered for most of the cells. Our algorithm successfully tracks 204 cells of the total 205 cells before they leave the field of view. Because we do not apply force or constraints out of the field, contours can quickly leave the field of view at the border or leak to other potential wells. Minor human intervention was shown to improve the convergence but were not critical for the results. Figure 6(a) shows the temporal projection of traces of all 205 cells, labeled in different colors and overlaid on the first frame of the movie. These traces indicate that cells are capable of smoothly turning while gliding and reversing direction of motion. The tangential speed of a selected cell with high motility shows the velocity estimation from position data, and indicates a directional reversal approximately every 7 minutes (Figure 6(b)). Figure 6(c) shows a histogram of the tangential speed of all 205 tracked cells while remaining in the field of view. The speed approximately follows an exponential distribution with a mean value at 0:49 mm/min. This value is significantly smaller than previously reported results [21,27], because we included the cells that are not motile at all, and the Gaussian smoothing altered the shape of the speed distribution. The high cell density also reduces the mean speed as shown in [27].

Discussion
Active contour models are widely applied in detecting features with high contrast such as boundaries in an image and is shown to be successful especially for closed contours. Several studies applied the concept of active contour in detecting open contours with finite length, for instance actin filaments. On the other hand, along with other machine vision techniques that segment targets from an image, active contours can be applied on individual images in time-lapsed movies of particle tracking problems. The solution is usually separated into two stages: the objects or particles are first identified in each frame individually, then the correspondence is assigned between objects in different frames. Merging/disappearing or splitting/emerging are allowed with specific statistical properties, usually in a maximum-likelihood fashion based on the pixel intensity level, shape changes or moving distance. In these two-stage approaches, the object assignment is limited by the object detection quality and it is not straight-forward to use the inter-frame object assignment information to assist particle detection. Optimizing assignments across multiple frames is robust agains single frame tracking error but dramatically increases computational cost. In our approach, we treat the temporal dimension as an extrusion of the spatial dimension of the images and the object trajectories as elastic continua that interact with both the image time-series and each other in terms of repulsive forces. Similar to the penalty based on moving distance in the linear assignment problem, the elastic energy in the temporal dimension penalizes trajectories that deviate from linear motion. One major difference between this method and the two-stage solutions is that the object identity correspondence assignment is combined with the spatial localization using image intensities.
As a closely related method to the active contour but with certain advantages, a level set method is insensitive to the topology of the contours, and therefore ideal in dealing with contours with unspecified topological features. It has been shown previously that level set methods can be successfully used in tracking moving bacteria [28]. One drawback that limits the application of active contour and level set methods is the difficulty in applying shape constraints to the contours. This problem can be partly solved by using a shape prior in the level set approaches at the cost of computational time. However in many applications, the characteristic topological and geometrical features remain unchanged through the entire movie, for instance the fly shape and size in the first example, cell length in the second example and the number of objects (flies, cells) in the both examples above. These conserved geometric quantities can be predetermined and used as constraints on the solution that effectively acts as an explicit shape prior but with a simpler form. In addition to resembling the physical shapes with explicit lengths of the contours, our model also captures the mutually exclusive nature of these objects by adding a repulsive force between objects. This effectively prevents trajectories from crossing or collapsing into one image potential minima.
Another drawback of active contour methods is the typical sensitivity to the initial position and local minima in the image potential. In our second example, ambiguous segmentation of cells takes place frequently due to the poor quality of the raw images, causing faulty gaps between cells and bumps in the images along the cell body, especially in area of high cell density. Upon sudden cell acceleration, linear prediction may place the initial position into a faulty local minimum as shown in figure 5(b). Among various segmentation algorithms, the watershed is particularly suitable in dealing with varying boundary values, at the cost of the convenience to control the geometric properties of the segmented area. We adopt the idea of a watershed algorithm in our active contour model in terms of dynamically varying the contour length, where the tip growth is governed by the resistant forces. This tip growth scheme is directly analogous to the case of one-dimensional watershed method, where the expansion of the two boundary points is inversely proportional to the steepness of the potential well. The combination of active contours and tip growth is essentially a watershed algorithm with an explicit shape prior that has the capability to correct mistakes caused by improper initial positions ( figure 5(c-f)).
Image preprocessing can be beneficial, sometimes essential, in order for the active contours to relax into the correct potential minima. Because efficient dynamics of the contours requires a smooth potential landscape with slowly varying gradients, we use Gaussian smoothing before taking the gradient, or equivalently convolved the images with a derivative of gaussian (DoG) kernel. Several factors determine the blurring radius, including the spatial scale of the objects (fly or cell width in the examples shown) and the distance between the predicted and the true position of the objects, usually at the same scale of the moving distance between frames. The blurring kernel can be set variable depending on the motion of the objects, such as in the fly tracking case, where the fly undergoes normal walking behavior and occasional large distance jumps. We use a Gaussian kernel with the radius the same as the fly width for the former case to achieve fast performance, and if a jump is detected, a global linear attractive potential that is only visible to the jumping flies is added on top of the original kernel for these jumping flies. The image preprocessing is more essential and subtle in the second example. Comparing to the raw images, the pixel intensity at the cell body, cell boundary and background is more uniform after we convert the pixel intensity into assignment probability of three pixel categories. The elevated background intensity prevents contours from leaking to the empty background, which is a severe problem if the image is not preprocessed (data not shown). Pixel classification also enhances the barrier between cells and normalizes the pixel values on the cells.
Since the evolution of the contours follows explicit dynamics indicated in equation (3), it is convenient to implement additional factors that affect the tracking results in terms of explicit forces, or to modify the elastic property to fit particular shape requirements. The length constraints and tip growth are applied in the form of explicit forces in the case of tracking flies and Myxococcus xanthus cells. In the first example, we also add energy costs to penalize sharp turns, which is observed to cause identity swap in rare ambiguous frames. In addition to applying explicit forces, the internal degrees of freedom of the contour curvature allows extra flexibility in modeling more complex shapes. For instance, the bending elasticity can be set as a variable along the contour instead of a constant, allowing parts of the contour to act as soft joints and  the rest as hard stems. This is especially suitable in tracking objects with internal motion such as head turning in mice.
Because the image information in adjacent frames is considered as a whole in the contour localization, minor corruptions in image quality can be corrected by the adjacent frames. On the other hand, major corruption can cause propagating errors in the following frames. The relaxation to the right image potential minimum suffers especially when the image potential energy landscape is highly curved with a large number of local minima, such as in the case of Myxococcus xanthus images where cells are bent and intertwined with identical local statistical character of pixel intensities at cell width length scale. Inchworm tip growth has the ability to correct initial misplacement given that the cells only move tangentially by a small amount (20%) relative to the cell length within two frames. Our algorithm also does not consider the case of new objects entering the image. New objects can be detected using other methods such as level set and then treated as a regular repulsive active contour.
Comparing our active contour tracking model that uses the explicit dynamics of the contours with previous Bayesian approaches such as Branson et al. [1], the essential principles of our method are correspondingly similar to models of the physical properties of the objects. Active contour models treat images as a potential energy landscape, and the objects are described by a set of contours with specific lengths and bending properties to define the shape. Contours move according to the image potential with certain damping factor until relaxing at a minimum, and we add repulsion forces between objects to prevent merging. Correspondingly in Bayesian approaches, each image is treated as a spatial distribution of pixel vales, and an object is a parameterized distribution model with particular shape priors, such as the covariance and centroid of a Gaussian distribution. Tracking maximizes the posterior likelihood of the modeled distribution explaining the image, which involves a particular numerical gradient descending scheme similar to the damped dynamics of contours. A Gaussian mixture model is adopted for the case of multiple object tracking, which clusters pixels into groups with small or no overlaps resembling repulsion between Gaussian mixtures.
In conclusion, we developed a multiple-object tracking algorithm for time-lapsed movie based on active contour model that takes into account the physical exclusion of the objects. In our model, multiple oval-shaped or elongated curved objects are represented by open elastic contours with individually fixed length in each frame, and the motion of objects in time is treated as an extrusion in the temporal dimension, thus the spatio-temporal kymograph are modeled as mutually repulsive elastic membranes. We illustrated the application of the repulsive active membrane model on two sets of realistic experimental data, tracking multiple Drosophila walking and chasing, and tracking curved gliding bacteria Myxococcus xanthus at high density. Individual objects are tracked successfully with high efficiency at video frequencies and low error rate (10 {4 ) that can be conveniently corrected in separate steps.

Supporting Information
Movie S1 Tracking results of 5 male flies in a circular arena.