Video-based hand gesture recognition via SPD manifold spatial representation and optical flow motion features

doi:10.1371/journal.pone.0348122

Table 1.

Summary of representative vision-based hand gesture recognition methods (2020–2025).

Fig 1.

Overall architecture of the proposed framework.

The pipeline consists of four main components: (A) keyframe extraction from input video; (B) feature extraction, including regional spatial features via SPD covariance matrices and temporal features via optical flow histograms on a grid; (C) feature unification and vectorization, where SPD features are mapped to Euclidean space and concatenated with temporal features; and (D) classification using the combined feature representation.
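The regional spatial features in component (B) can be illustrated with a minimal NumPy sketch. The caption does not state which per-pixel features enter the covariance, so the set used here (pixel coordinates, intensity, and absolute gradients), the 2×2 region grid, the regularization constant `eps`, and the function names `region_covariance` and `frame_descriptors` are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np

def region_covariance(gray, eps=1e-6):
    """Covariance descriptor of one image region (2-D array of intensities).

    Assumed per-pixel feature vector: (x, y, I, |dI/dx|, |dI/dy|).
    """
    gray = gray.astype(float)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    gy, gx = np.gradient(gray)  # derivatives along rows, then columns
    feats = np.stack([xs, ys, gray, np.abs(gx), np.abs(gy)], axis=-1)
    cov = np.cov(feats.reshape(-1, 5), rowvar=False)
    # A small ridge keeps the matrix strictly positive definite,
    # even for flat regions where some features have zero variance.
    return cov + eps * np.eye(cov.shape[0])

def frame_descriptors(gray, grid=(2, 2)):
    """Split a keyframe into grid cells and describe each by an SPD matrix."""
    rows = np.array_split(np.arange(gray.shape[0]), grid[0])
    cols = np.array_split(np.arange(gray.shape[1]), grid[1])
    return [
        region_covariance(gray[r[0]:r[-1] + 1, c[0]:c[-1] + 1])
        for r in rows
        for c in cols
    ]
```

Each keyframe then yields one 5×5 SPD matrix per region, for example `frame_descriptors(keyframe, grid=(2, 2))` returns four matrices.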

Fig 2.

Region-based covariance matrix computation for SPD manifold spatial features.

Fig 3.

Grid-based optical flow histogram extraction for temporal features.
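A grid-based optical flow histogram of this kind might be sketched as follows. The sketch assumes a dense flow field is already available as an `(H, W, 2)` array of per-pixel `(dx, dy)` displacements (for instance from any dense optical flow estimator); the 4×4 grid, the 8 orientation bins, the magnitude weighting, the per-cell normalization, and the function name `grid_flow_histogram` are illustrative assumptions rather than the paper's parameters.

```python
import numpy as np

def grid_flow_histogram(flow, grid=(4, 4), bins=8):
    """Concatenate per-cell orientation histograms of a dense flow field.

    flow: array of shape (H, W, 2) holding (dx, dy) for every pixel.
    Each pixel votes for its flow direction, weighted by flow magnitude.
    """
    mag = np.hypot(flow[..., 0], flow[..., 1])
    ang = np.arctan2(flow[..., 1], flow[..., 0]) % (2 * np.pi)
    row_blocks = np.array_split(np.arange(flow.shape[0]), grid[0])
    col_blocks = np.array_split(np.arange(flow.shape[1]), grid[1])
    hists = []
    for r in row_blocks:
        for c in col_blocks:
            cell = (slice(r[0], r[-1] + 1), slice(c[0], c[-1] + 1))
            h, _ = np.histogram(
                ang[cell].ravel(),
                bins=bins,
                range=(0.0, 2 * np.pi),
                weights=mag[cell].ravel(),
            )
            total = h.sum()
            # Normalize each cell so the descriptor does not scale with speed;
            # cells with no motion stay all-zero.
            hists.append(h / total if total > 0 else h)
    return np.concatenate(hists)
```

With these defaults the temporal descriptor for one pair of keyframes has 4 × 4 × 8 = 128 entries.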

Fig 4.

Illustration of the feature unification and classification process.

Spatial features modeled on the SPD manifold are mapped to Euclidean space using the Log-Euclidean metric and then concatenated with temporal motion features extracted from optical flow histograms. The combined feature vector is used for hand gesture classification.
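The Log-Euclidean mapping and fusion step can be sketched in a few lines of NumPy. Under the Log-Euclidean metric an SPD matrix is represented by its matrix logarithm, which is symmetric and can therefore be flattened into a vector of its upper-triangular entries; scaling the off-diagonal entries by √2 keeps the vector's Euclidean norm equal to the Frobenius norm of the logarithm. The function names `spd_log`, `log_euclidean_vector`, and `fuse` are hypothetical, and the caption does not say whether the authors apply the √2 weighting or any feature scaling before concatenation.

```python
import numpy as np

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_vector(mat):
    """Map an SPD matrix to a Euclidean vector (upper triangle of log(mat))."""
    log_m = spd_log(mat)
    iu = np.triu_indices(log_m.shape[0])
    # Off-diagonal entries appear twice in the symmetric matrix,
    # so weight them by sqrt(2) to preserve the Frobenius norm.
    weights = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    return log_m[iu] * weights

def fuse(spd_mats, motion_hist):
    """Concatenate vectorized SPD features with a temporal histogram."""
    spatial = np.concatenate([log_euclidean_vector(m) for m in spd_mats])
    return np.concatenate([spatial, np.asarray(motion_hist, dtype=float)])
```

A d×d SPD matrix contributes d(d+1)/2 entries, so two 5×5 region descriptors fused with an 8-bin histogram give a 38-dimensional vector that any Euclidean classifier can consume.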

Fig 5.

Cambridge hand gesture dataset [46]: combinations of three hand shapes (flat, spread, V-shape) with three motions (leftward, rightward, contract) for each shape.

Fig 6.

Northwestern University hand gesture dataset [24]: ten dynamic hand gestures including directional movements, rotations, circles, and symbolic gestures such as ‘Z’ and cross.

Fig 7.

Classification accuracy for different keyframe selection methods.

Table 2.

Parameter settings for the algorithms used in this paper.

Table 3.

Precision, recall, and F1 scores for hand gesture classification on Northwestern and Cambridge datasets.

Table 4.

Comparison of classification accuracies (%, mean ± std over 20 runs) on the Cambridge and Northwestern datasets.

Fig 8.

Confusion matrices of the classification results: (A) Combined Features – Cambridge, (B) Combined Features – Northwestern, (C) SPD Features – Northwestern.

Table 5.

Accuracy comparison with existing methods on the Cambridge dataset.

Table 6.

Accuracy comparison with existing methods on the Northwestern dataset.
