Fig 1.
A comprehensive HAR method hierarchy.
HAR is usually divided into sensor-based and vision-based approaches. This division depends on the nature of the sensors, and the data from each sensor can be used across multiple domains.
Fig 2.
The data modalities that are frequently employed in vision-based HAR techniques.
Each modality has multiple uses, so vision-based approaches can be further classified according to the tasks that follow. The most notable of these tasks include, but are not limited to, posture estimation, action identification, and human motion prediction.
Fig 3.
NTU RGB+D depth data split into Train and Test sets, showing the number of samples available for each action.
The number of samples differs across actions.
Fig 4.
(a) The I2I experimental threshold-variant feature selection mechanism, with the number of qualified features listed in tabular form. (b) A graphical view of the number of features qualified by HoD.
The minimum number of features is qualified by I2I@10 and the maximum by I2I@90.
Fig 5.
Visualization of the train and test feature samples for each schematic threshold configuration.
All features qualified by HoD are shown, covering the samples of every action. The number of features varies for each scheme, and the 2D scatter plot visualizes the range of the whole corpus.
Fig 6.
Configuration of the proposed I2I model.
The model comprises various layers, shown with their parameter details. From the input layer to the classification layer, the layer types, activations, and learnable parameters are listed.
Fig 7.
Specification of the proposed I2I model and its complete system with depth data.
The system starts by feeding features from depth images into the first layer of the proposed I2I model. The input layer passes the received features to two LSTM layers, which are followed by a series of fully connected layers. These fully connected layers include Leaky ReLU layers and produce the final outputs that classify the input features into their respective classes.
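To make the layer ordering in this caption concrete, the following is a minimal sketch that lists such a stack and counts its learnable parameters. Every size used here is hypothetical, as are the helper names `lstm_params`, `dense_params`, and `describe_stack`; the softmax classifier is also an assumption, since the caption does not name the output activation. The LSTM count uses the common single-bias formulation (four gates, each with input weights, recurrent weights, and one bias vector), which may differ from the framework used by the authors.

```python
# Sketch only: layer sizes are hypothetical and do not come from the paper.

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and one bias vector.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units


def describe_stack(n_features, lstm_units, dense_units, n_classes):
    """Return (name, activation, learnable_parameters) for each layer in order."""
    layers = []
    prev = n_features
    for i, units in enumerate(lstm_units, start=1):
        layers.append((f"lstm_{i}", "tanh/sigmoid", lstm_params(prev, units)))
        prev = units
    for i, units in enumerate(dense_units, start=1):
        layers.append((f"dense_{i}", "linear", dense_params(prev, units)))
        # Leaky ReLU with a fixed slope has no learnable parameters.
        layers.append((f"leaky_relu_{i}", "leaky_relu", 0))
        prev = units
    # Assumed softmax classifier over the action classes.
    layers.append(("classifier", "softmax", dense_params(prev, n_classes)))
    return layers


if __name__ == "__main__":
    # Hypothetical configuration: 10 input features, two LSTM layers of 8 units,
    # one fully connected layer of 4 units, and 3 output classes.
    stack = describe_stack(10, [8, 8], [4], 3)
    for name, activation, count in stack:
        print(f"{name:14s} {activation:14s} {count}")
    print("total", sum(count for _, _, count in stack))
```

The second LSTM layer takes the first layer's hidden size as its input dimension, which is why the two LSTM layers have different parameter counts even with the same number of units.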
Fig 8.
Depth data recognition with the I2I model in a complete system.
The flow starts with depth image input, from which features are selected with the help of HoD. The features of only a single schematic configuration are provided to the proposed I2I model, which classifies medical actions into their respective classes.
Table 1.
Performance comparison of the I2I model with state-of-the-art studies that use depth image data for action recognition. Each study performed its evaluation on a similar set of data using its own technique. The proposed I2I model outperforms the state-of-the-art techniques.
Table 2.
The I2I model performance at various Schematic Threshold Configurations in terms of Accuracy, Precision, Recall, and F1 Score. The table reports each scheme’s performance so that the schemes can be compared with one another.
Fig 9.
The confusion matrices show the performance of all variants of the Schematic Threshold Configuration.
Each matrix consists of the true and predicted classes for a specific Schematic Threshold Configuration.
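The Accuracy, Precision, Recall, and F1 Score named in Table 2 can all be derived from a confusion matrix of true versus predicted classes. The sketch below shows one way to do this, assuming rows are true classes, columns are predicted classes, and the per-class scores are macro-averaged. The averaging scheme is an assumption, since the captions do not state it, and the function name `metrics_from_confusion` and the example matrix are illustrative only.

```python
# Sketch only: macro averaging and the example matrix are assumptions.

def metrics_from_confusion(cm):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Made-up two-class matrix, not taken from the paper.
    example = [[5, 1],
               [2, 4]]
    for name, value in metrics_from_confusion(example).items():
        print(f"{name}: {value:.4f}")
```

Macro averaging weights every action equally, which matters when the number of samples per action is unequal, as Fig 3 indicates for this dataset.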