Fig 1.
Example of a trajectory (top view) performed by a dog inside its pen.
Fading lines correspond to movements performed earlier; intense lines are more recent. Red dots represent the barycentre of the animal at a given instant. This bird's-eye view is obtained by re-projecting the acquired points onto the pen's ground plane through a homography. The projection matrix is computed automatically from the videos, without manual calibration of the Kinect 3D sensor.
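The re-projection described in the caption can be sketched as a standard planar homography applied to tracked image points. This is a minimal illustration, assuming the 3×3 matrix `H` is already available (the paper estimates it automatically from the videos); the function name is hypothetical.

```python
import numpy as np

def to_ground_plane(points_xy, H):
    """Re-project image points onto the pen's ground plane via a
    3x3 homography H (assumed given; estimated elsewhere)."""
    pts = np.asarray(points_xy, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones])          # (N, 3) homogeneous coordinates
    mapped = homog @ H.T                    # apply the projective mapping
    return mapped[:, :2] / mapped[:, 2:3]   # normalise by the w component

# With the identity homography, points are unchanged
ground_pts = to_ground_plane([[1.0, 2.0], [3.0, 4.0]], np.eye(3))
```

Each red dot in Fig 1 would correspond to one row of the returned array, i.e. the barycentre mapped into ground-plane coordinates.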
Fig 2.
A grid divides the floor of the pen.
The time (in seconds) spent by the dog on each square is calculated. Shades of grey quantify the amount of time spent in each square (black = the square was never entered; lighter shades = more time spent in that square).
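The per-square occupancy time of Fig 2 amounts to binning the per-frame ground-plane positions into a grid and accumulating the frame duration. A minimal sketch, assuming a rectangular pen with its origin in one corner (function and parameter names are illustrative, not the paper's):

```python
import numpy as np

def occupancy_seconds(positions, pen_size, grid_shape, fps):
    """Accumulate the time (s) a tracked point spends in each cell
    of a grid laid over the pen floor. `positions` is an (N, 2)
    array of ground-plane (x, y) coordinates, one per frame."""
    pos = np.asarray(positions, dtype=float)
    rows = np.clip((pos[:, 1] / pen_size[1] * grid_shape[0]).astype(int),
                   0, grid_shape[0] - 1)
    cols = np.clip((pos[:, 0] / pen_size[0] * grid_shape[1]).astype(int),
                   0, grid_shape[1] - 1)
    grid = np.zeros(grid_shape)
    np.add.at(grid, (rows, cols), 1.0 / fps)  # each frame contributes 1/fps s
    return grid
```

Mapping the resulting values to grey levels (zero = black, larger values = lighter) reproduces the visualisation described in the caption.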
Fig 3.
(a) Visual representation of the alignment of two sequences using Dynamic Time Warping (DTW). DTW stretches the sequences in time by matching a single point with several points of the compared time series. (b) The Needleman-Wunsch (NW) algorithm replaces the temporal stretch with gap elements (red circles in the table), inserting blank spaces instead of forcefully matching points. The alignment is achieved by arranging the two sequences in this table, the first sequence row-wise (T) and the second column-wise (S). The figure shows a score table for two hypothetical sub-sequences (i, j) and the alignment scores (numbers in the cells) for each pair of elements forming the sequences (letters in the header row and header column). Arrows show the warping path between the two series and, consequently, the final alignment. The optimal alignment score is in the bottom-right cell of the table.
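The score table of Fig 3b can be sketched with the classic Needleman-Wunsch recurrence. The match/mismatch/gap scores below are illustrative placeholders, not the values used in the paper:

```python
def needleman_wunsch(s, t, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch). Builds the score
    table row by row; the optimal alignment score ends up in the
    bottom-right cell, as in Fig 3b."""
    n, m = len(s), len(t)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # first column: gaps only
        score[i][0] = i * gap
    for j in range(1, m + 1):          # first row: gaps only
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            score[i][j] = max(diag,                  # match / mismatch
                              score[i - 1][j] + gap, # gap in t
                              score[i][j - 1] + gap) # gap in s
    return score[n][m]
```

Tracing back the arg-max choices from the bottom-right cell yields the warping path drawn with arrows in the figure.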
Table 1.
Global alignment and score table algorithms.
Fig 4.
Comparison between different frames of the dog skeleton.
For each frame a descriptor of the dog skeleton is computed (top). Then, all the descriptors (from the same or different dogs) are compared and matched. The NW algorithm computes the similarity scores and aligns the segments.
Fig 5.
Each dot represents a video segment. The distance between dots is computed by the alignment algorithm on either coordinates (i.e. trajectories) or body parts (i.e. actions). In this example, the system created three clusters (green, blue and red clouds), where the centroid (black cross) is the most representative sequence of the cluster. The closer the sequences are to one another, the higher the similarity score produced by the alignment. Sequences distant from the centroid are less similar to the computed action prototype.
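Because the "centroid" of Fig 5 is itself a sequence of the cluster, one plausible reading is a medoid: the member whose summed alignment distance to all other members is minimal. A minimal sketch under that assumption (the function name is hypothetical):

```python
import numpy as np

def medoid_index(dist):
    """Given a symmetric pairwise-distance matrix for the sequences
    in one cluster, return the index of the most representative
    sequence: the one minimising the total distance to the others."""
    dist = np.asarray(dist, dtype=float)
    return int(np.argmin(dist.sum(axis=1)))
```

Here `dist[i, j]` would hold the alignment-based distance between sequences `i` and `j`, so the selected index plays the role of the black cross (action prototype) in the figure.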
Fig 6.
Qualitative examples of body part detection.
Each image shows an example of the extracted dog body part in different conditions. Different line colours correspond to different body parts.
Table 2.
Variation in the Percentage of Correctly estimated body Parts (PCP) with increasing training size and different light conditions.
Table 3.
Comparative results of our PCP against state-of-the-art methods.
Bold values are the best.
Fig 7.
Confusion Matrix of the B.A.R.K. posture classification (predicted class) compared to the manual annotation (Ground Truth class).
The numbers and percentages of samples corresponding to the system outputs are reported in the cells. Ideally, the values on the green diagonal should approach 100% accuracy.
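The cell counts of Fig 7 can be reproduced from paired labels. A minimal sketch, assuming posture classes are integer-encoded (the encoding itself is an assumption for illustration):

```python
import numpy as np

def confusion_matrix(truth, pred, n_classes):
    """Count (ground-truth, predicted) posture pairs. Rows are the
    manual annotation, columns the system output; diagonal cells
    hold the correctly classified samples."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(truth, pred):
        cm[t, p] += 1
    return cm
```

Dividing each row by its sum would give the per-class percentages reported alongside the raw counts in the figure.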
Fig 8.
Bar-plot of the percentage of time dogs spent in a specific posture.
The y axis is the percentage of the video, while the bars graphically depict how the video was scored in terms of behaviours. On the x axis, the B.A.R.K. results and the Ground Truth are shown for every video.
Table 4.
Correlation between the manual annotation and automated scoring of behaviour using B.A.R.K. Spearman’s rho and p-values are presented.
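Spearman's rho, as reported in Table 4, is the Pearson correlation of the rank-transformed scores. A self-contained sketch (in practice `scipy.stats.spearmanr` also returns the p-value):

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the
    ranks, with tied values assigned their average rank."""
    def ranks(v):
        v = np.asarray(v, dtype=float)
        order = np.argsort(v)
        r = np.empty(len(v))
        r[order] = np.arange(1, len(v) + 1)
        for val in np.unique(v):      # average the ranks of ties
            mask = v == val
            r[mask] = r[mask].mean()
        return r
    rx, ry = ranks(x), ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])
```

Applied to the manual and automated per-video scores, a rho close to 1 indicates that B.A.R.K. ranks the videos the same way as the human annotator.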
Fig 9.
Example of automated cluster analysis. (a) Dialog window showing the cluster analysis results. In the right-hand menu, the 5-second sequences are grouped into 4 clusters that were manually renamed after visualising them all (i.e. double-click on a sequence to play it). The different trajectories performed by the dog in the sequences of cluster 1 (b), cluster 2 (c), cluster 3 (d) and cluster 4 (e) give an idea of how B.A.R.K. automatically groups the different patterns of behaviour expressed by the dog in the clip.