Fig 1.
Training data is a mix of straightforward and challenging cases.
(A) DeepTangle’s input consists of clips in which the central frame is annotated. The other models were trained on individual annotated frames. (B) Isolated, non-intersecting worms can be tracked using Tierpsy’s existing algorithm. (C) Simple skeletonization algorithms fail in several categories of more challenging cases, including self-intersecting worms, multiple overlapping worms, and worms on complex, non-uniform backgrounds. Clips for these cases were manually annotated.
Fig 2.
(A) Histograms of the root mean square deviation (RMSD) between held-out skeletons and predicted skeletons for each tested model. The vertical dashed line shows the RMSD between three manual annotators. The skeletons above each histogram are examples that illustrate the corresponding RMSD visually. (B) Bar chart showing the number of cases in the held-out test data where a model fails to make a prediction (e.g., Tierpsy fails on coiled worms, or a neural network model does not identify any worms above a confidence threshold). (C) Examples of the kinds of errors each model makes. PAF and Omnipose often over-segment worms. DTC fails when worms are fully coiled in circles or in tight parallel contact for an extended time. (D) Computation time per input frame for the different models as a function of the number of worms per well. Each camera records 16 wells of a 96-well plate, so 3 and 15 worms per well correspond to 48 and 240 worms per video. Tierpsy uses only the CPU, whereas Omnipose uses both GPU and CPU because we use Tierpsy’s skeletonization algorithm to convert segmented regions to skeletons.
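As an illustration of the metric in panel (A), the sketch below computes an RMSD between a predicted and a reference skeleton. It assumes that both skeletons are sampled at the same number of points and are given as lists of (x, y) coordinates in the same units. The function names are hypothetical, and the head/tail-agnostic variant is an assumption about how orientation ambiguity might be handled, not a description of the evaluation used for the figure.

```python
import math

def skeleton_rmsd(pred, truth):
    """RMSD between two skeletons given as equal-length lists of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("skeletons must be non-empty and have the same number of points")
    # Squared Euclidean distance between corresponding skeleton points.
    squared = [(px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, truth)]
    return math.sqrt(sum(squared) / len(squared))

def skeleton_rmsd_unoriented(pred, truth):
    """Assumption: head/tail order may be ambiguous, so take the better of both orders."""
    return min(skeleton_rmsd(pred, truth), skeleton_rmsd(pred[::-1], truth))

# Toy example: the prediction is the reference shifted by one unit in y.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(skeleton_rmsd(pred, truth))  # 1.0
```

Collecting this value over all held-out skeletons for one model gives the data for one histogram.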
Fig 3.
Tracking in challenging conditions.
(A) Multiple overlapping worms with a high density of eggs. Insets show higher-magnification images of two sets of overlapping worms in which all individuals are successfully skeletonized. (B) Examples of continuous tracks that preserve worm identity through collisions. (C) Highly curved worms that form long-lived coils can leave long gaps in Tierpsy-derived data. Here, a long high-curvature gap is recovered using DTC. (D) Improved skeletonization leads to longer tracks from DTC compared to Tierpsy. Note the log scale. Track duration is longer for videos with both 3 and 15 worms per well. (E) The number of skeletons per frame averaged over a video. DTC tracking produces numbers closer to the nominal number of worms per well. (F) The fraction of objects that are correctly tracked across frames compared to manually corrected trajectory data for videos with 3 and 15 worms per well.
Fig 4.
DTC improves the signal-to-noise ratio in a phenotypic screen.
All data are from ref. [26]. (A) Speed calculated from Tierpsy skeletons and from DTC skeletons for a random sample of 290 wells from a previously published drug screen. The dashed line is y = x. (B) Correlation coefficients for each of the Tierpsy 256 behavioural features (left). The red line indicates the modal value of 0.71. Correlation coefficient plotted against the F-statistic for each feature, calculated over the entire dataset (right). (C) Tail curvature as a function of dose for worms treated with a spiroindoline known to cause coiling. (D) A comparison of the mean absolute value of the Hedges’ d effect size calculated using features derived from Tierpsy and DTC tracking data. Each point is the mean Hedges’ d across all doses of a drug compared to DMSO controls for a single feature. Features with ‘curvature’ in the name, except for time derivatives of curvature, are shown in blue. All other features are shown in yellow.
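The quantity plotted in panel (D) can be sketched as follows. This is a minimal example that assumes the effect size is the standard small-sample-corrected standardized mean difference (pooled standard deviation with the usual correction factor J = 1 - 3/(4(n1 + n2) - 9)). The caption does not state the exact formula, and the function names and toy values are hypothetical.

```python
import math
import statistics

def hedges_effect_size(treated, control):
    """Bias-corrected standardized mean difference between two samples."""
    n1, n2 = len(treated), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled standard deviation from the two sample variances.
    pooled_var = ((n1 - 1) * statistics.variance(treated)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    d = (statistics.mean(treated) - statistics.mean(control)) / math.sqrt(pooled_var)
    # Small-sample correction factor (assumed form).
    j = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return j * d

def mean_abs_effect(values_by_dose, control):
    """Mean absolute effect size across doses for one feature of one drug."""
    effects = [abs(hedges_effect_size(v, control)) for v in values_by_dose.values()]
    return sum(effects) / len(effects)

# Toy feature values (hypothetical): two doses versus DMSO controls.
control = [1.0, 3.0, 5.0]
values_by_dose = {1.0: [2.0, 4.0, 6.0], 10.0: [0.0, 2.0, 4.0]}
print(round(mean_abs_effect(values_by_dose, control), 6))  # 0.4
```

Evaluating `mean_abs_effect` once with Tierpsy-derived feature values and once with DTC-derived values would give the two coordinates of a single point in the panel.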