Fig 1.
General flow chart of the proposed approach.
Five phases are involved: video preprocessing, motion region segmentation, feature extraction, feature processing and prediction.
Fig 2.
An example of optical flow images.
Two consecutive frames, followed by the horizontal (flowx) and vertical (flowy) optical flow components of the latter frame.
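As an illustration of how the two flow images relate to a motion magnitude image, the following minimal sketch converts a pair of flow components into per-pixel magnitude and orientation. It assumes that flowx and flowy hold the horizontal and vertical displacement of each pixel; the function name `flow_to_polar` and the choice of the orientation range [0, 2π) are hypothetical and are not taken from the paper.

```python
import numpy as np


def flow_to_polar(flowx, flowy):
    """Return (magnitude, orientation) for two flow component images.

    Assumption: flowx / flowy are the horizontal / vertical displacements.
    Orientation is mapped to [0, 2*pi), an illustrative convention.
    """
    flowx = np.asarray(flowx, dtype=np.float64)
    flowy = np.asarray(flowy, dtype=np.float64)
    mag = np.hypot(flowx, flowy)                           # sqrt(fx^2 + fy^2)
    ang = np.mod(np.arctan2(flowy, flowx), 2.0 * np.pi)    # wrap to [0, 2*pi)
    return mag, ang


if __name__ == "__main__":
    fx = np.array([[3.0, 0.0]])
    fy = np.array([[4.0, -1.0]])
    mag, ang = flow_to_polar(fx, fy)
    print(mag)
    print(ang)
```

Any dense optical flow estimator that returns the two components as arrays of equal shape could supply the inputs.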
Fig 3.
General process of motion region segmentation.
(a) Motion magnitude image, (b) Canny edge image of (a), (c) sharpened motion magnitude image, (d) Canny edge image of (c), (e) closing operation on (d), (f) filling holes of (e), (g) deburring for (f), (h) segmented motion region of (a).
Fig 4.
An example of low-level feature extraction, where Mag is the motion magnitude image.
Here, each block consists of 2 × 2 cells, and blocks are sampled with a stride of half a block length. Each cell contains 4 × 4 pixels and its histogram has 12 bins, so each block yields a 48-element (2 × 2 × 12) vector for an LHOG (LHOF) descriptor.
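The block layout described in this caption can be sketched as follows. The cell size, block size, stride and bin count come from the caption; the magnitude-weighted voting, the orientation range [0, 2π), the absence of normalization and all function names are assumptions made for illustration only and may differ from the authors' implementation.

```python
import numpy as np

CELL = 4    # pixels per cell side (from the caption)
BLOCK = 2   # cells per block side (from the caption)
BINS = 12   # orientation bins per cell (from the caption)


def cell_histogram(mag, ang):
    """Orientation histogram of one cell; votes weighted by magnitude (assumption)."""
    idx = np.floor(ang / (2.0 * np.pi) * BINS).astype(int) % BINS
    return np.bincount(idx.ravel(), weights=mag.ravel(), minlength=BINS)


def block_descriptors(mag, ang):
    """Slide a 2 x 2-cell block with a half-block stride over the image.

    Returns an array of shape (number_of_blocks, 48).
    """
    size = CELL * BLOCK   # block side: 8 pixels
    step = size // 2      # half-block stride: 4 pixels
    h, w = mag.shape
    out = []
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            cells = []
            for cy in range(BLOCK):
                for cx in range(BLOCK):
                    ys, xs = y + cy * CELL, x + cx * CELL
                    cells.append(cell_histogram(mag[ys:ys + CELL, xs:xs + CELL],
                                                ang[ys:ys + CELL, xs:xs + CELL]))
            out.append(np.concatenate(cells))   # 4 cells x 12 bins = 48 values
    return np.array(out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mag = rng.random((16, 16))
    ang = rng.random((16, 16)) * 2.0 * np.pi
    print(block_descriptors(mag, ang).shape)
```

With a 16 × 16 input, the block origin takes three positions along each axis (0, 4 and 8 pixels), giving nine overlapping blocks of 48 values each.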
Fig 5.
Example frames from the three datasets used in the experiment.
The frames are taken from the Hockey Fight dataset (first row), the BEHAVE dataset (second row) and the Crowd Violence dataset (third row). The left two columns show violent frames and the right two columns show non-violent frames.
Table 1.
Comparison of accuracy rates of BoW-based methods on the Hockey Fight dataset.
Table 2.
Accuracy comparison between MoWLD with KDE and sparse coding and the BoW model based on the proposed features on the Hockey Fight dataset.
Table 3.
Results of violence detection on the BEHAVE dataset.
Table 4.
Results of violence detection on the Crowd Violence dataset.