Skip to main content
Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Efficient Multiple Object Tracking Using Mutually Repulsive Active Membranes

  • Yi Deng,

    Affiliation Department of Physics, Princeton University, Princeton, New Jersey, United States of America

  • Philip Coen,

    Affiliation Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America

  • Mingzhai Sun,

    Affiliation Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America

  • Joshua W. Shaevitz

    Affiliation Department of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America


Studies of social and group behavior in interacting organisms require high-throughput analysis of the motion of a large number of individual subjects. Computer vision techniques offer solutions to specific tracking problems, and allow automated and efficient tracking with minimal human intervention. In this work, we adopt the open active contour model to track the trajectories of moving objects at high density. We add repulsive interactions between open contours to the original model, treat the trajectories as an extrusion in the temporal dimension, and show applications to two tracking problems. The walking behavior of Drosophila is studied at different population density and gender composition. We demonstrate that individual male flies have distinct walking signatures, and that the social interaction between flies in a mixed gender arena is gender specific. We also apply our model to studies of trajectories of gliding Myxococcus xanthus bacteria at high density. We examine the individual gliding behavioral statistics in terms of the gliding speed distribution. Using these two examples at very distinctive spatial scales, we illustrate the use of our algorithm on tracking both short rigid bodies (Drosophila) and long flexible objects (Myxococcus xanthus). Our repulsive active membrane model reaches error rates better than per fly per second for Drosophila tracking and comparable results for Myxococcus xanthus.


A broad range of biological problems on many length scales, from cells to whole animals, require the ability to track moving individuals with a group. Advances in computer vision in the past two decades have enabled computed-aided automatic or semi-automatic tracking programs to greatly boost the capacity to analyze large amounts of data and reduce the involvement of human observers. However, many traditional tracking algorithms struggle when objects come into close physical contact or even overlap within an image. Recent work addressed this problem by separating objects using a Gaussian mixture model with an area prior followed by identity matching and successfully applied this approach to track walking Drosophila with minimal human supervision [1]. Tsai and Huang further extended this approach by refining the segmentation of the Drosophila images into different body parts which enables more detailed measurements to be made [2]. A non-Bayesian framework was used by Chaumont et al. to track multiple mice by modeling the animal body with a series of physical “primitives” connected by joints and elastic springs that can interact with each other [3]. All of these algorithms are capable of handling large amounts of images, frames, with relatively little tracking error that is then corrected manually. However, this error rate increases with reduced image quality or when the objects move close to one another more frequently.

In most tracking solutions, image segmentation is performed on each frame to identify individual objects. This is then followed by an identity matching scheme between frames based on distance, object birth and death probabilities, and other estimated parameters [1], [4]. Active contours (snakes) are a popular image segmentation approach that is widely applied in analyzing biological and medical images. The contour of the compartment boundary is treated as an elastic band that interacts with the image and exhibits a damped relaxation to the minimum energy state [5]. In addition to closed-contour uses for measuring parameters like object area, open active contours can be used to detect filamentous objects such as blood vessels[6][8], neurons[9][11], actin filaments [12], [13], and many biomedical and non-biomedical applications. In these approaches, active contours can be allowed to merge, break, fork and recombine.

In this work, we solve the tracking problem using a deformable membrane model, which is an extension of the active contour model to the temporal dimension. In order to prevent merging of multiple objects, we add a repulsive interaction between neighboring contours. We test this method on two practical tracking problems in animal behavior and microbial ecology: walking Drosophila and gliding bacteria Myxococcus xanthus are tracked at high density with low error rate ( per fly per second, or per cell per second), and at high efficiency (better than 50 frames per second when tracking 5 flies). Drosophila has become a popular model organism to study neurobiology and animal behavior for its ease of genetic manipulation[14][19], and Myxococcus xanthus is a gram-negative bacteria that exhibit gliding motility using molecular motors [20], [21], which allow the bacteria to exhibit complex group behaviors[22][25]. Using these two tracking problems at very distinctive spatial scales, shapes and morphology, we demonstrate the application of our algorithm on a broad range of problems. Finally, we discuss the close connection between Bayesian techniques and the energy minimization approach in our active membrane algorithm.


1. The Classical Active Contour Model

In the classical active contour model [5], a feature in an image, usually a line, area boundary or edge, is located by relaxing an elastic contour (snake) that interacts with the image to the contour's minimum energy state. The energy of the contour consists of the internal elastic energy and the image energy term based on the location of the contour in the image, calculated along the contour in -dimensional space, as a function of arc position :(1)where the term penalizes the energy when the contour is deviated from a uniform straight line and the term adds an additional cost to bending. and are the termini of an open contour, which we define as 0 and 1, or in case of a closed contour, the integral path is closed. Minimizing is equivalent to solving(2)which can be written in the discrete form [5]:(3)where each is an -element vector that defines the contour using discrete points that are distributed along . is the circular pentadiagonal discrete equivalent of the operator : (4)with , , , , . Equation (3) can then be solved iteratively via(5)where is the iteration index, sets the time step, and is the spatial derivatives . Additional forces can be conveniently incorporated into the model by adding a force term , which we will discuss in the coming sections:

(6)Numerous variations of this active contour model have been developed for specific image-analysis and tracking problems, including the use of variable stretching and bending stiffness and , sophisticated image potentials , and the inclusion of additional forces for specific purposes. Here, we build our model aiming to solve the multiple object tracking problem in time-lapsed movies.

2. Open Contours

While a circular pentadiagonal matrix is suitable for dealing with a closed contour, for open contours, the motion of the two tips needs to be considered separately. We modify the first and last two rows of in equation (3) to for open contours such that the internal force acting on the endpoints is equal to half of the internal force on the nearest neighbor, but in the opposite direction:(7)

Here We set in our two following sample applications because penalizing the contour length in open contour configuration adds an effective energetic cost to path curvature. We find that a non-zero value for beta does not qualitatively change the behavior of the open contours analyses for, but not limited to these two examples.

Without an additional constraint, the tips are left free to interact with the image. Tip forces can be added to elongate or shorten the contour in order to control the contour length. In many multiple object time-lapse tracking problems, the characteristic length scale of the tracked objects is known and does not change, even though the position, orientation, and specific shape may alter over time. In this work, the length of the contours are maintained constant by a harmonic tip-stretching force, depending on the target contour length :(8)where is the tangential direction of the contour at the endpoints pointing outwards,

(9) is the apparent contour length , and governs the magnitude of the stretching force.

Another form of applying the tip force is to add attraction forces from the constraint tip position:(10)where x is the apparent position of the tips and is the anticipated position, which can be predicted based on the current contour length and position as discussed in the following sections. Both forms in (8) and (10) are used in the two examples below.

3. Track Contour Motion Over Time

The kymograph of a contour that moves in -dimensional space over time is a continuous surface in dimensions. In certain types of problems where the objects' trajectories are less predictive, such as the case of Brownian motion, a global optimization strategy such as the framework to solve the linear assignment problem (LAP) is usually necessary [4]. However, in common cases where the trajectories are smooth relative to the object density, local optimization is sufficient and more efficient than global approaches. Here, we apply the concept of active contours to the temporal dimension as well as the spatial dimension within each frame, and optimize the localization of the two-dimensional active membrane in the ()-dimensional kymograph. Different from freely moving contours in all spatial dimensions, the object has one unique location at each time point, and no localization information is carried between frames. For these reasons, we allow the control points of the active membrane to move within each time slice but not along the temporal axis. For active membranes, is a matrix that determines the location of the contour at all time, where is the number of control points along the contour, and is the size of the temporal frame stack. We seek the solution of(11)where the pentadiagonal matrix effects the derivatives in time and has the same structure as but transposed. Similar to the strategy for solving equation (5), this equation can be solved iteratively by

(12)which is in the form of a Sylvester equation and can be solved following standard routines.

In practice, instead of optimizing the active membrane on all frames, we slice the kymograph into overlapping time-blocks in the temporal dimension and then sequentially obtain the optimized solutions in short blocks in accordance with the results from previous blocks in time. This is a practical simplification as long as the block size is comparable to the persistence time of motion of the objects. When the temporal projections of the trajectory of one or multiple objects overlap, incorrect initial condition can cause slow convergence or trap the solution at local minima where the registration of traces to image potential minima is swapped. By solving the problem in a block-wise manner, we avoid incorrect initial placement of the contours, and greatly reduce the computation time using the local initial guess based on previous results. Because our model is used to process time-lapsed movies, we use the term contour and membrane interchangeably in the following text.

4. Repulsion between Multiple Contours

Active contour and surface models can be applied to an image or time-lapse movie of multiple objects by starting at different initial locations. However, when the objects are in close proximity to one another, the barrier that separates the two objects in the image energy landscape can diminish below a significant level due to noise and other image-based effects (Figure 1). As a result, two contours (or membranes) with different initial locations can converge to the same image energy minimum. To avoid different contours from collapsing into the same potential well, we optimize the energy of multiple contours simultaneously with the addition of a mutual repulsive force between the contours. Because the repulsive force resembles the physical exclusion, we limit the range of the repulsion to the approximate size of the objects, and set the force magnitude to match the depth of the typical depth of the image energy minima. The goal of adding repulsion between interacting contours is to prevent them from overlapping due to the possible degraded image quality, while keeping the dynamics of the contours well-behaved and easy to implement. Specifically, the force is defined to have a quadratic form of the distance between two contours with a cutoff distance , which is chosen to be the same as the typical width of the tracked objects:(13)where is the unit arc length density of the repulsive force acted on the contour at arc position due to the presence of the contour. is a scalar that sets the magnitude of the repulsion so that the repulsive potential matches the attractive image potential. The discrete version of equation (13) takes the form(14)where is the coordinate of the control point of the contour at the frame. Here, the repulsive force is not normalized to the actual length of the contour. Because the lengths of the flies and bacteria cells vary within 20% of the averaged length, we did not observe convergence issues in solving the relaxed positions due to the unnormalized forces. A cubic potential is a sufficiently close approximation to the overall shape of the attractive potential generated by Gaussian smoothing of the original image. In our actual algorithm, the external force includes both the terminal force and the repulsive force:(15)

Figure 1. Schematic illustration of the principle of repelling active contours.

(a) When objects are far away and the attractive image potential fields (blue solid lines) don't interact, two active objects (red and cyan circles) correctly fall into the image energy minima. (b) When objects are close, the potential fields (blue dashed lines) overlap and cause dislocated or merged minima. Two active objects converge into the same minimum. (c) With the repulsive potential added (red solid line), the total field the other object is in is recovered (cyan solid line). (d) The same principle can be applied to two and higher dimensions as shown. The black solid ovals represent two contacting objects. The images are blurred to give smooth attraction potential, and with repulsion, two minima can be resolved (green and red).


1. Drosophila Walking Behavior

We first demonstrate the repulsive, active membrane model on movies of walking fruit flies. Drosophila melanogaster has become a popular model system for studying complex behaviors such as courtship, aggression, and learning through the analysis of time-lapse movies of fly position. In these experiments, individual flies often come physically close to each other causing their images to merge. The repulsive membrane model is particularly adept at resolving the positions of flies during these events with high chance of contacting and overlapping.

We recorded a movie of five male flies walking in a circular, 2.5-cm diameter arena at a density of 1 fly/cm at 30 frames per second and a camera magnification of 0.1 mm/pixel. Our tracking algorithm models each fly as an open contour with 3 control points (Fig. 2 and Movie S1). The length of each fly is determined from the initial image and remains fixed throughout the tracking task. Our tracker is able to follow five flies correctly in all 20,203 frames (a total time of 673 seconds). The oval shaped fly image potential constraints the orientation of the open contour, so that the head direction is correctly resolved in most frames. Occasionally, a fly can exhibit escape behavior that causes a quick change in location and head direction. For generality, our algorithm does not consider image details specific for fruit flies, and we do not distinguish head from tail based on images. Consequently, jumps may reverse the head-tail orientation (Figure 2b). However, because flies mostly move forward in the direction of the head with low jumping frequency on the order of min, a Hidden Markov method is suitable for detecting jumps. We use this information and adopt the Viterbi algorithm [26] to determine the forward-backward head direction as an addition step of our fly tracking software. Figure 2(c–g) shows the 2D, -probability distribution function of walking velocity for each of the five flies after the head direction correction. Each fly has an individual signature distribution that is slightly different from other's.

Figure 2. The trajectories and velocities exacted from the tracking results of five male flies.

(a) A sample image of five flies in the circular arena. Three seconds of walking trajectories of the flies are labeled in different colors. The control points are labeled as the yellow squares. (b) The parallel component of the walking velocity of one fly is plot as a function of time. When the fly jumps (indicated by arrows), the tracker may reverse the orientation and negate the velocity. The reversal can be detected off-line using a Hidden Markov Model (HMM). (c–g) Velocity histograms of five individual flies. The color indicates the the probability density distribution plotted on a logarithmic scale.

To increase the number of fly-fly encounters and test the ability of the repulsive algorithm to distinguish individual flies, we placed one female and two male flies in the arena. Both male flies spend a large amount of time attempting to court the female fly, resulting in frequent merging of the fly images. 85% of the recording time, at least one of the male flies is within 6 mm of the female fly and 31% of time both male flies are within this distance (Figure 3(a)). Images are analyzed with our tracking algorithm with 100% correct registration of flies and only one orientation reversal caused by a jumping event (Figure 3(b)). To demonstrate the tracking result, we calculated the position of each fly relative to one of the other two flies for all pairs. Similar to previous analyses [1], we find that male flies tend to approach the female fly from the rear (Figure 3(c),(d)), while maintaining a head direction oriented towards the female (Figure 3(e),(g)). The relative position of the two male flies uniformly distribute around each fly (Figure 3(f),(h)), indicating no preferred orientation between males.

Figure 3. Tracking results of in-contact and jumping Drosophila as exceptional conditions, and the histograms of the relative positions between flies.

The positional information of flies from the movies are shown in the three-dimensional kymograph, where the boundary of the fly images are drawn in false color depending on time. The positions of the flies are indicated as short open contours in the same color and connected by lines as visual aid. (a) Two flies moved in proximity to each other and then moved apart, causing the mask and contours to merge and split again. (b) A missing fly image was caused by the a jumping event, followed by re-orientation of the head direction. (c)-(h) The distribution of the relative position between two flies is shown as the logarithm of the distribution probability density. The displacement is measured relative to the first fly on the perpendicular (x) and the parallel direction relative to the head direction (y). (c) Male 1 (M1) relative to female (F), (d) male 2 (M2) relative to female (F), (e) F relative to M1, (f) M2 relative to M1, (g) F relative to M2, and (h) M1 relative to M2.

We compared the performance of our algorithm to the output of the CTRAX fly tracker [1] using the 5-male assay, the one female, two male (1F2M) assay, and a high fly density movie trial with 16 male flies in a 5 cm arena (Table 1). At high densities, flies frequently walk in contact with each other and jumping is more frequent (Movie S2). In our repulsive contour model, where the number of flies is fixed, identity swapping is the only kind of error. In comparison, because CTRAX allows the number of objects to vary, it has three types of tracking error: identity swap, lost objects, and spurious detection. In addition, we compare the speed of the two tracking packages when run on an Intel i5 processor. In the 5-male movie, where flies rarely come in close proximity to each other (3% of the time), both trackers are able to track all the flies without any error. At a higher fly density of 3 cm, both trackers are able to distinguish individual flies, although CTRAX has a small portion of spurious detections and fly jumping becomes the primary source of identity swap error. When the image flies constantly stay in close proximity, such as with frequent courtship attempts, our model is still able to locate the flies from the boundary contour, whereas CTRAX suffers from overlapping detections. Moreover, we observe that the tracking quality of CTRAX is crucially sensitive to the user-specified input parameters such as the image value threshold and the Gaussian oval shape prior that requires multiple trials to optimize. In comparison, the repulsive contour model only requires the knowledge of the spatial scale, either the fly length or width, and the number of flies. The repulsive contour tracker is also 3–20 times faster than CTRAX.

Table 1. Comparison between CTRAX and the repulsive contour model in error rate and computation speed.

2. Myxococcus xanthus Gliding Motility

To highlight the ability of the repulsive contour technique to track densely packed objects, we analyzed movies of 2D swarms of Myxococcus xanthus cells. Myxococcus xanthus is a soil bacterium that forms complex, 3D group structures by gliding along solid surfaces [22]. We confined Myxococcus xanthus (strain DZ2) cells between an agarose gel and a glass surface such that cells form a single layer at the interface. Using bright field microscopy, the bacterial cell body appears dark and is surrounded by a bright halo (figure 4(a)). Cells are often tightly packed,resulting a poor image contrast between the neighboring cells.

Figure 4. Bright field images of Myxococcus xanthus cells are taken, and transformed into the probability map that is then used to generate the image potential to interact with the contours.

(a) A field of view of 41 m 41 m (512512 pixels) contains over 200 Myxococcus xanthus cells. The relaxed position of the contours are overlaid on top of the bright field image with false color labeling. (b) Zoomed-in image of a portion in (a). (c,d) The large and the small eigenvalues of the Hessian matrix of (b). (e) The locally averaged eigenvectors indicate the alignment magnitude of features. The sign is chosen such that higher value indicates less order, thus high chance to be the background and vise versa. (f) The distribution of pixels in the classifier coordinate: two principle components from the eigenvalue-intensity space (horizontal axes), and the alignment magnitude (vertical axis). Pixels are categorized into three groups along the three axes, and color-coded in red (ridge), green (valley) and blue (background). The projections along three axis are shown as guides for viewing. (g) After classification, each pixel in the image is color coded in the same way as in (f) according to the probability of being ridge, valley or background. (h) The enhanced image for repulsive active contour model is calculated from the classification probability map shown in (g). (i) Image intensity profile on a line segment (red dashed line in (a)) illustrates the nonuniform contrast at the edge and at the inside of a cell cluster.

Contrast-enhancement techniques such as Differential Interference Contrast (DIC) and Phase Contrast (PC) allow for the visualization of low-contrast objects but produce images whose intensity patterns are complex, usually containing both bright halos and dark regions for each object in a non-linear representation. For that reason, we did not adopt these microscopy techniques. For the bright field microscopy images, we preprocessed the raw microscopy images to enhance the image contrast before calculating the image-based potential for contour relaxation. We first calculated the eigenvalues and eigenvectors of the Hessian of the images to quantify features such as valleys (the cell bodies) and ridges (the halos around the cells). We observe that the two eigenvalues of the Hessian matrix and the pixel value of all pixels fall in a plane in three-dimensional parameter space. The smallest variance of the principle components is less than 5% of the other two, which are comparable. Hence, the first two principle components of these three parameters are used to specify image features. Qualitatively, the pixels are distributed in two clusters within this two-dimensional projection (the two lobes in the projection in figure 4(f)), with valley pixels residing in one cluster and the ridge pixels in the other, and all background pixels within the junction region of the two clusters. To better distinguish pixels from the background and from the high cell density area where the pixel intensity contrast is low, we consider the alignment of the features in neighbouring areas. Analogous to the quantification of magnetization, we treat the difference of the two eigenvalues as the magnitude of a dipole, whose direction is defined by the eigenvector associated with the larger eigenvalue. The locally averaged dipole moment is the order parameter that describes how ordered nearby features are aligned. The alignment order parameter, together with the two principle components of the eigenvalue-image intensity space, forms a three-dimensional space in which the valley, ridge and background pixels are separated (figure 4(f)). The orientation of the three sets is determined using an expectation-maximization algorithm and the pixels are clustered into three groups according to the coordinates along the major direction of each group (Figure 4). We undersegment the valley and the background to prevent areas of high image potential, because the repulsive contour model is robust against undersegmented ridges but prone to errors caused by oversegmented valleys. The enhanced image used to calculate the image potential is computed using(16)with(17)where is the probability of a pixel being the background, being the ridge, and being the valley. is Gaussian blurred with pixels before taking the gradient to smooth the image force field (Figure 4d).

Contours of 25 control points are evolved in an image gradient calculated from the preprocessed images (Figure 4d). The repulsive force between control points on nearby cells is taken to be a quadratic function of the distance(18)where is the distance vector between two control points on two different contours. is a cut-off distance that sets the length scale of the repulsive force. We set to be 70% of the averaged cell width, about 7 pixels.

To prevent contours from starting the relaxation procedure at an initial position that crosses a ridge in the image (Figure 5), we reduce the length of all contours to 70% of its original length in the first iteration of each time step to eliminate possible crossings between contours and ridges, and let the length grow back to their normal value during iterations along the image potential valleys. The length growth is implemented by adjusting the targeted contour length for each iteration by an increment(19)where is the apparent length at iteration , and is the normal length of each cell. is a convergence parameter that governs the number of iterations for the contours to grow back to their original length. In practise, we observe that is a good value for contours of pixels. We also adopt the form in equation (10) to implement the length constraint force on the termini. The length increment is split between the two termini of the contours according to the resistant forces exerted on the termini:(20)(21)where and are the tangential components of the image and repulsive forces exerted on the termini. The anticipated position of the tips are obtained by extrapolating the termini along the tangential directions by and , and then is calculated using (10). Figure 5 demonstrates how this inchworm-like motion corrects misplacement of the contours. With the length initially shortened, the faulty crossing configuration is eliminated (Figure 5b). The upper tip grows until it hits the ridge and is unable to grow further due to steric constraints (Figure 5c, d). The lower tip keeps growing along a narrower valley until the length reaches the target length (Figure 5e, f).

Figure 5. A contour can ride across a ridge and cause a marginally stable configuration. Allowing tips to grow solves this problem.

(a) The total growth of the two ends of a contour is determined by the current length and the normal length of the contour. The growth is distributed unevenly to two termini according to the tangential resistance force. The dashed line indicates a hypothetical contour and the two ends are indicated by the round dots. (b) Without the ends shrinking and regrowing, the contours are trapped in a marginally stable configuration, where the contour on the right (indicated by the red dotted line) leaks to the left potential well and squeezed left contour short due to repulsion. (c–f) The length of the contours are shortened initially and let grow to lead to the correct contour positions. The white arrow indicates the direction of the tip growth, and the white dashed lines are the visual guide to help illustrate the growth.

We have also included several non-critical optimizations to the dynamics of the contours for better convergence. The tangential component of the force exerted on each control point is individually calculated and the average value is applied on each control point. The arc distance between control points is also uniformly redistributed every 5 iterations. These optimizations improve the relaxation in the tangential direction without qualitatively changing the behavior of the contours.

We used the repulsive contour model as described above to track Myxococcus xanthus motion within a 41 m41 m field of view consisting of 512512 pixels (Figure 4a and Movie S3). 205 cells are initially located in the field and we tracked these cells for 400 frames at 12 frames/min. During the time course of the movie, cells typically move a distance about equal to 10 times their cell length and the local adjacency order is completely altered for most of the cells. Our algorithm successfully tracks 204 cells of the total 205 cells before they leave the field of view. Because we do not apply force or constraints out of the field, contours can quickly leave the field of view at the border or leak to other potential wells. Minor human intervention was shown to improve the convergence but were not critical for the results. Figure 6(a) shows the temporal projection of traces of all 205 cells, labeled in different colors and overlaid on the first frame of the movie. These traces indicate that cells are capable of smoothly turning while gliding and reversing direction of motion. The tangential speed of a selected cell with high motility shows the velocity estimation from position data, and indicates a directional reversal approximately every 7 minutes (Figure 6(b)). Figure 6(c) shows a histogram of the tangential speed of all 205 tracked cells while remaining in the field of view. The speed approximately follows an exponential distribution with a mean value at m/min. This value is significantly smaller than previously reported results [21], [27], because we included the cells that are not motile at all, and the Gaussian smoothing altered the shape of the speed distribution. The high cell density also reduces the mean speed as shown in [27].

Figure 6. Tracking results and speed statistics of Myxococcus xanthus cells.

(a) Trajectories of 205 cells over 2000 seconds are indicated in different colors, overlaid on the first frame of the movie. (b) The tangential gliding speed along a selected cell is plot against time. Raw speed trace (light red) is smoothed by Gaussian kernel with seconds (dark red). Six directional reversals are identified by the zero-crossings of the smoothed speed curve (pointed by arrows). (c) The histogram of gliding speed magnitude of 205 cells approximately follows an exponential distribution with the mean value at 0.49 m/min.


Active contour models are widely applied in detecting features with high contrast such as boundaries in an image and is shown to be successful especially for closed contours. Several studies applied the concept of active contour in detecting open contours with finite length, for instance actin filaments. On the other hand, along with other machine vision techniques that segment targets from an image, active contours can be applied on individual images in time-lapsed movies of particle tracking problems. The solution is usually separated into two stages: the objects or particles are first identified in each frame individually, then the correspondence is assigned between objects in different frames. Merging/disappearing or splitting/emerging are allowed with specific statistical properties, usually in a maximum-likelihood fashion based on the pixel intensity level, shape changes or moving distance. In these two-stage approaches, the object assignment is limited by the object detection quality and it is not straight-forward to use the inter-frame object assignment information to assist particle detection. Optimizing assignments across multiple frames is robust agains single frame tracking error but dramatically increases computational cost. In our approach, we treat the temporal dimension as an extrusion of the spatial dimension of the images and the object trajectories as elastic continua that interact with both the image time-series and each other in terms of repulsive forces. Similar to the penalty based on moving distance in the linear assignment problem, the elastic energy in the temporal dimension penalizes trajectories that deviate from linear motion. One major difference between this method and the two-stage solutions is that the object identity correspondence assignment is combined with the spatial localization using image intensities.

As a closely related method to the active contour but with certain advantages, a level set method is insensitive to the topology of the contours, and therefore ideal in dealing with contours with unspecified topological features. It has been shown previously that level set methods can be successfully used in tracking moving bacteria [28]. One drawback that limits the application of active contour and level set methods is the difficulty in applying shape constraints to the contours. This problem can be partly solved by using a shape prior in the level set approaches at the cost of computational time. However in many applications, the characteristic topological and geometrical features remain unchanged through the entire movie, for instance the fly shape and size in the first example, cell length in the second example and the number of objects (flies, cells) in the both examples above. These conserved geometric quantities can be predetermined and used as constraints on the solution that effectively acts as an explicit shape prior but with a simpler form. In addition to resembling the physical shapes with explicit lengths of the contours, our model also captures the mutually exclusive nature of these objects by adding a repulsive force between objects. This effectively prevents trajectories from crossing or collapsing into one image potential minima.

Another drawback of active contour methods is the typical sensitivity to the initial position and local minima in the image potential. In our second example, ambiguous segmentation of cells takes place frequently due to the poor quality of the raw images, causing faulty gaps between cells and bumps in the images along the cell body, especially in area of high cell density. Upon sudden cell acceleration, linear prediction may place the initial position into a faulty local minimum as shown in figure 5(b). Among various segmentation algorithms, the watershed is particularly suitable in dealing with varying boundary values, at the cost of the convenience to control the geometric properties of the segmented area. We adopt the idea of a watershed algorithm in our active contour model in terms of dynamically varying the contour length, where the tip growth is governed by the resistant forces. This tip growth scheme is directly analogous to the case of one-dimensional watershed method, where the expansion of the two boundary points is inversely proportional to the steepness of the potential well. The combination of active contours and tip growth is essentially a watershed algorithm with an explicit shape prior that has the capability to correct mistakes caused by improper initial positions (figure 5(c-f)).

Image preprocessing can be beneficial, sometimes essential, in order for the active contours to relax into the correct potential minima. Because efficient dynamics of the contours requires a smooth potential landscape with slowly varying gradients, we use Gaussian smoothing before taking the gradient, or equivalently convolved the images with a derivative of gaussian (DoG) kernel. Several factors determine the blurring radius, including the spatial scale of the objects (fly or cell width in the examples shown) and the distance between the predicted and the true position of the objects, usually at the same scale of the moving distance between frames. The blurring kernel can be set variable depending on the motion of the objects, such as in the fly tracking case, where the fly undergoes normal walking behavior and occasional large distance jumps. We use a Gaussian kernel with the radius the same as the fly width for the former case to achieve fast performance, and if a jump is detected, a global linear attractive potential that is only visible to the jumping flies is added on top of the original kernel for these jumping flies. The image preprocessing is more essential and subtle in the second example. Comparing to the raw images, the pixel intensity at the cell body, cell boundary and background is more uniform after we convert the pixel intensity into assignment probability of three pixel categories. The elevated background intensity prevents contours from leaking to the empty background, which is a severe problem if the image is not preprocessed (data not shown). Pixel classification also enhances the barrier between cells and normalizes the pixel values on the cells.

Since the evolution of the contours follows explicit dynamics indicated in equation (3), it is convenient to implement additional factors that affect the tracking results in terms of explicit forces, or to modify the elastic property to fit particular shape requirements. The length constraints and tip growth are applied in the form of explicit forces in the case of tracking flies and Myxococcus xanthus cells. In the first example, we also add energy costs to penalize sharp turns, which is observed to cause identity swap in rare ambiguous frames. In addition to applying explicit forces, the internal degrees of freedom of the contour curvature allows extra flexibility in modeling more complex shapes. For instance, the bending elasticity can be set as a variable along the contour instead of a constant, allowing parts of the contour to act as soft joints and the rest as hard stems. This is especially suitable in tracking objects with internal motion such as head turning in mice.

Because the image information in adjacent frames is considered as a whole in the contour localization, minor corruptions in image quality can be corrected by the adjacent frames. On the other hand, major corruption can cause propagating errors in the following frames. The relaxation to the right image potential minimum suffers especially when the image potential energy landscape is highly curved with a large number of local minima, such as in the case of Myxococcus xanthus images where cells are bent and intertwined with identical local statistical character of pixel intensities at cell width length scale. Inchworm tip growth has the ability to correct initial misplacement given that the cells only move tangentially by a small amount (20%) relative to the cell length within two frames. Our algorithm also does not consider the case of new objects entering the image. New objects can be detected using other methods such as level set and then treated as a regular repulsive active contour.

Comparing our active contour tracking model that uses the explicit dynamics of the contours with previous Bayesian approaches such as Branson et al. [1], the essential principles of our method are correspondingly similar to models of the physical properties of the objects. Active contour models treat images as a potential energy landscape, and the objects are described by a set of contours with specific lengths and bending properties to define the shape. Contours move according to the image potential with certain damping factor until relaxing at a minimum, and we add repulsion forces between objects to prevent merging. Correspondingly in Bayesian approaches, each image is treated as a spatial distribution of pixel vales, and an object is a parameterized distribution model with particular shape priors, such as the covariance and centroid of a Gaussian distribution. Tracking maximizes the posterior likelihood of the modeled distribution explaining the image, which involves a particular numerical gradient descending scheme similar to the damped dynamics of contours. A Gaussian mixture model is adopted for the case of multiple object tracking, which clusters pixels into groups with small or no overlaps resembling repulsion between Gaussian mixtures.

In conclusion, we developed a multiple-object tracking algorithm for time-lapsed movie based on active contour model that takes into account the physical exclusion of the objects. In our model, multiple oval-shaped or elongated curved objects are represented by open elastic contours with individually fixed length in each frame, and the motion of objects in time is treated as an extrusion in the temporal dimension, thus the spatio-temporal kymograph are modeled as mutually repulsive elastic membranes. We illustrated the application of the repulsive active membrane model on two sets of realistic experimental data, tracking multiple Drosophila walking and chasing, and tracking curved gliding bacteria Myxococcus xanthus at high density. Individual objects are tracked successfully with high efficiency at video frequencies and low error rate () that can be conveniently corrected in separate steps.

Supporting Information

Movie S1.

Tracking results of 5 male flies in a circular arena. Five male flies are placed in an circular arena and captured in video. The Tracking results are indicated by a short line connecting three control points representing each fly. The waling trajectories are indicated by traces with decaying color saturation, and the frame number is shown on the top left corner.


Movie S2.

Tracking results of 16 male flies in a circular arena. Movie of sixteen male flies walking is taken the same as described in S1. Due to the high density, flies jump more frequently than at lower density. The head direction is not corrected in this case.


Movie S3.

Tracking results of Myxococcus xanthus cells. Individual cells are represented by control points indicated as round dots, and drawn on top of the pre-processed bright field microscopy images. Raw movie was taken at 1 frame per second for 2000 seconds. This movie shows 1 frame of every 5 frames.


Movie S4.

Raw image sequence of the data shown in movie S3.



We thank Mala Murthy and Gordon Berman for providing support in data acquisition and helpful discussion.

Author Contributions

Conceived and designed the experiments: YD PC MS JWS. Performed the experiments: YD PC MS. Analyzed the data: YD MS. Contributed reagents/materials/analysis tools: YD MS JWS. Wrote the paper: YD JWS.


  1. 1. Branson K, Robie AA, Bender J, Perona P, Dickinson MH (2009) High-throughput ethomics in large groups of drosophila. Nat Methods 6: 451–457.
  2. 2. Tsai HY, Huang YW (2012) Image tracking study on courtship behavior of drosophila. PLoS One 7: e34784.
  3. 3. de Chaumont F, Coura RDS, Serreau P, Cressant A, Chabout J, et al. (2012) Computerized video analysis of social interactions in mice. Nat Methods 9: 410–417.
  4. 4. Jaqaman K, Loerke D, Mettlen M, Kuwata H, Grinstein S, et al. (2008) Robust single-particle tracking in live-cell time-lapse sequences. Nat Methods 5: 695–702.
  5. 5. Kass M, Witkin A, Terzopoulos D (1988) Snakes: Active contour models. International Journal of Computer Vision 1: 321–331.
  6. 6. Can A, Shen H, Turner JN, Tanenbaum HL, Roysam B (1999) Rapid automated tracing and feature extraction from retinal fundus images using direct exploratory algorithms. IEEE Trans Inf Technol Biomed 3: 125–138.
  7. 7. Kirbas C, Quek F (2004) A review of vessel extraction techniques and algorithms. ACM Comput Surv 36: 81–121.
  8. 8. Altinok A, Manjunath B, Rose K (2011) Variable length open contour tracking using a deformable trellis. IEEE Transactions on Image Processing 20: 1023–1035.
  9. 9. Vasilkoski Z, Stepanyants A (2009) Detection of the optimal neuron traces in confocal microscopy images. J Neurosci Methods 178: 197–204.
  10. 10. Meijering E (2010) Neuron tracing in perspective. Cytometry A 77: 693–704.
  11. 11. Wang Y, Narayanaswamy A, Tsai CL, Roysam B (2011) A broadly applicable 3-d neuron tracing method based on open-curve snake. Neuroinformatics 9: 193–217.
  12. 12. Li H, Shen T, Vavylonis D, Huang X (2010) Actin filament segmentation using spatiotemporal active-surface and active-contour models. Med Image Comput Comput Assist Interv 13: 86–94.
  13. 13. Xu T, Li H, Shen T, Ojkic N, Vavylonis D, et al. (2011) Extraction and analysis of actin networks based on open active contour models. Proc IEEE Int Symp Biomed Imaging 2011: 1334–1340.
  14. 14. Chan YB, Kravitz EA (2007) Specific subgroups of frum neurons control sexually dimorphic pat-terns of aggression in drosophila melanogaster. Proc Natl Acad Sci U S A 104: 19577–19582.
  15. 15. Ho KS, Sehgal A (2005) Drosophila melanogaster: an insect model for fundamental studies of sleep. Methods Enzymol 393: 772–793.
  16. 16. Frye MA, Dickinson MH (2004) Closing the loop between neurobiology and ight behavior in drosophila. Curr Opin Neurobiol 14: 729–736.
  17. 17. Greenspan RJ, Ferveur JF (2000) Courtship in drosophila. Annu Rev Genet 34: 205–232.
  18. 18. Robinson GE, Fernald RD, Clayton DF (2008) Genes and social behavior. Science 322: 896–900.
  19. 19. Iyengar A, Imoehl J, Ueda A, Nirschl J, Wu CF (2012) Automated quantification of locomotion, social interaction, and mate preference in drosophila mutants. J Neurogenet.
  20. 20. Mignot T, Shaevitz JW, Hartzell PL, Zusman DR (2007) Evidence that focal adhesion complexes power bacterial gliding motility. Science 315: 853–856.
  21. 21. Sun M, Wartel M, Cascales E, Shaevitz JW, Mignot T (2011) Motor-driven intracellular transport powers bacterial gliding motility. Proc Natl Acad Sci U S A 108: 7559–7564.
  22. 22. Kuner JM, Kaiser D (1982) Fruiting body morphogenesis in submerged cultures of myxococcus xanthus. J Bacteriol 151: 458–461.
  23. 23. Monds RD, O'Toole GA (2009) The developmental model of microbial biofilms: ten years of a paradigm up for review. Trends Microbiol 17: 73–87.
  24. 24. Xie C, Zhang H, Shimkets LJ, Igoshin OA (2011) Statistical image analysis reveals features affecting fates of myxococcus xanthus developmental aggregates. Proc Natl Acad Sci U S A 108: 5915–5920.
  25. 25. Zhang Y, Ducret A, Shaevitz J, Mignot T (2012) From individual cell motility to collective behav-iors: insights from a prokaryote, myxococcus xanthus. FEMS Microbiol Rev 36: 149–164.
  26. 26. Cormen TH (2001) Introduction to Algorithms. Cambridge, Massachusetts, USA: MIT Press.
  27. 27. Spormann AM, Kaiser AD (1995) Gliding movements in myxococcus xanthus. J Bacteriol 177: 5846–5852.
  28. 28. Vallotton P, Sun C, Wang D, Turnbull L, Whitchurch C, et al.. (2009) Segmentation and tracking individual pseudomonas aeruginosa bacteria in dense populations of motile cells. In: Image and Vision Computing New Zealand, 2009. IVCNZ '09. 24th International Conference. pp. 221–225. doi:10.1109/IVCNZ.2009.5378409.