Fig 1.
The architecture of FaceTouch.
Fig 2.
A sample of the collected data (faces are blurred for privacy).
Table 1.
The datasets used for different tasks.
Table 2.
Evaluation metrics of the trained models of the FaceTouch framework.
Fig 3.
Evaluation of the object detection model.
(a) F1 against confidence for each class of the model. (b) Recall against confidence for each class. (c) Precision against confidence for each class. (d) Precision against Recall for each class, with the average curve highlighted in blue.
Fig 4.
ROC curves for trained action recognition models.
(a) Models trained with supervised learning. (b) Models trained with Supervised Contrastive Learning.
Fig 5.
Examples of the predicted positive and negative cases for face touches.
Fig 6.
Examples of the model's learned attention overlaid on the images, showing that the attention is accurately localised on the faces and hands.
Fig 7.
Examples of incorrectly identified cases (highlighted in red) in comparison to correctly labelled ones (highlighted in green).
Fig 8.
Deploying the FaceTouch tool on video streams from several complex settings, such as video calls, bus CCTV footage, and street CCTV.
Fig 9.
The shortcomings of applying pose estimation as the backbone for FaceTouch.
The figure shows several real-world cases under different environmental conditions and in complex urban scenes (beyond a single face or person).