Table 1.
Summary of representative vision-based hand gesture recognition methods (2020–2025).
Fig 1.
Overall architecture of the proposed framework.
The pipeline consists of four main components: (A) keyframe extraction from input video; (B) feature extraction, including regional spatial features via SPD covariance matrices and temporal features via optical flow histograms on a grid; (C) feature unification and vectorization, where SPD features are mapped to Euclidean space and concatenated with temporal features; and (D) classification using the combined feature representation.
Fig 2.
Region-based covariance matrix computation for SPD manifold spatial features.
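The region-based covariance computation named in this caption can be illustrated with a minimal sketch. The caption does not specify the per-pixel features or the region layout, so the feature set (pixel coordinates, intensity, absolute gradients), the 2×2 grid, the small ridge term `eps`, and the function names `region_covariance` and `regional_covariances` are all assumptions made for illustration, not the authors' configuration.

```python
import numpy as np

def region_covariance(gray, eps=1e-6):
    """Covariance descriptor of one grayscale region.

    Assumed per-pixel features: x, y, intensity, |dI/dx|, |dI/dy|.
    """
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]          # local pixel coordinates
    gy, gx = np.gradient(gray)           # derivatives along rows, then columns
    feats = np.stack([xs, ys, gray, np.abs(gx), np.abs(gy)], axis=-1)
    feats = feats.reshape(-1, 5)         # one row per pixel
    cov = np.cov(feats, rowvar=False)    # 5 x 5 sample covariance
    # Small ridge (assumption) so the matrix is strictly positive definite.
    return cov + eps * np.eye(cov.shape[0])

def regional_covariances(gray, grid=(2, 2)):
    """Split the frame into grid cells and return one SPD matrix per cell."""
    gray = np.asarray(gray, dtype=float)
    row_blocks = np.array_split(np.arange(gray.shape[0]), grid[0])
    col_blocks = np.array_split(np.arange(gray.shape[1]), grid[1])
    return [region_covariance(gray[np.ix_(r, c)])
            for r in row_blocks for c in col_blocks]
```

Each returned matrix is symmetric with strictly positive eigenvalues, which is what allows it to be treated as a point on the SPD manifold.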
Fig 3.
Grid-based optical flow histogram extraction for temporal features.
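A grid-based flow histogram of the kind named in this caption might look like the sketch below. It takes an already computed dense flow field of shape (H, W, 2) as input and does not compute the flow itself. The 2×2 grid, the 8 orientation bins, the magnitude weighting, the per-cell L1 normalization, and the name `grid_flow_histogram` are assumptions for illustration only.

```python
import numpy as np

def grid_flow_histogram(flow, grid=(2, 2), bins=8):
    """Concatenate per-cell orientation histograms of a dense flow field.

    flow: array of shape (H, W, 2) holding (dx, dy) per pixel.
    """
    flow = np.asarray(flow, dtype=float)
    magnitude = np.hypot(flow[..., 0], flow[..., 1])
    # Orientation wrapped into [0, 2*pi).
    angle = np.arctan2(flow[..., 1], flow[..., 0]) % (2 * np.pi)

    row_blocks = np.array_split(np.arange(flow.shape[0]), grid[0])
    col_blocks = np.array_split(np.arange(flow.shape[1]), grid[1])

    hists = []
    for r in row_blocks:
        for c in col_blocks:
            idx = np.ix_(r, c)
            # Magnitude-weighted orientation histogram for this cell.
            hist, _ = np.histogram(angle[idx], bins=bins,
                                   range=(0.0, 2 * np.pi),
                                   weights=magnitude[idx])
            total = hist.sum()
            # L1-normalize unless the cell contains no motion.
            hists.append(hist / total if total > 0 else hist)
    return np.concatenate(hists)
```

The output length is `grid[0] * grid[1] * bins`, so the descriptor size is fixed regardless of the frame resolution.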
Fig 4.
Illustration of the feature unification and classification process.
Spatial features modeled on the SPD manifold are mapped to Euclidean space under the Log-Euclidean metric and then fused with temporal motion features extracted from optical flow histograms. The combined feature vector is then used for hand gesture classification.
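The mapping and fusion described in this caption can be sketched as follows, assuming the standard Log-Euclidean construction: take the matrix logarithm of each SPD matrix through its eigendecomposition, then vectorize the upper triangle with off-diagonal entries scaled by √2 so that the vector's Euclidean norm equals the Frobenius norm of the logarithm. The eigenvalue floor `eps` and the function names `log_euclidean_vector` and `fuse_features` are assumptions for illustration; the caption does not state the exact vectorization used.

```python
import numpy as np

def log_euclidean_vector(spd, eps=1e-6):
    """Map an SPD matrix to a Euclidean vector via the matrix logarithm."""
    spd = np.asarray(spd, dtype=float)
    spd = 0.5 * (spd + spd.T)                  # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(spd)
    eigvals = np.maximum(eigvals, eps)         # guard against log(<= 0)
    log_m = (eigvecs * np.log(eigvals)) @ eigvecs.T   # V diag(log w) V^T
    rows, cols = np.triu_indices(log_m.shape[0])
    # Scale off-diagonal entries by sqrt(2) to preserve the Frobenius norm.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_m[rows, cols]

def fuse_features(spd_matrices, flow_histogram):
    """Concatenate vectorized SPD features with the temporal histogram."""
    spatial = np.concatenate([log_euclidean_vector(m) for m in spd_matrices])
    temporal = np.asarray(flow_histogram, dtype=float).ravel()
    return np.concatenate([spatial, temporal])
```

A d×d SPD matrix yields a vector of length d(d+1)/2, so the fused vector can be passed to any classifier that expects fixed-length Euclidean inputs.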
Fig 5.
Cambridge hand gesture dataset [46]: combinations of three hand shapes (flat, spread, V-shape) with three motions (leftward, rightward, contract) for each shape.
Fig 6.
Northwestern University hand gesture dataset [24]: ten dynamic hand gestures including directional movements, rotations, circles, and symbolic gestures such as ‘Z’ and cross.
Fig 7.
Classification accuracy for different keyframe selection methods.
Table 2.
Parameter settings for the algorithms used in this paper.
Table 3.
Precision, recall, and F1 scores for hand gesture classification on the Northwestern and Cambridge datasets.
Table 4.
Comparison of classification accuracies (%, mean ± std over 20 runs) on the Cambridge and Northwestern datasets.
Fig 8.
Confusion matrices for analyzing the classification results: (A) Combined Features – Cambridge, (B) Combined Features – Northwestern, (C) SPD Features – Northwestern.
Table 5.
Accuracy comparison with existing methods on the Cambridge dataset.
Table 6.
Accuracy comparison with existing methods on the Northwestern dataset.