Distinguishability of keystroke dynamic template

When keystroke dynamics are used for authentication, users tend to get different levels of security due to differences in the quality of their templates. This paper addresses this issue by proposing a metric to quantify the quality of keystroke dynamic templates. That is, in behavioral biometric verification, the user’s templates are generally constructed using multiple enrolled samples to capture intra-user variation. This variation is then used to normalize the distance between a set of enrolled samples and a test sample. The normalized distance is then compared against a predefined threshold value to derive a verification decision. As a result, the coverage area for accepted samples in the original space of vector representation is discrete. Therefore, users with higher intra-user variation suffer higher false acceptance rates (FAR). This paper proposes a metric that can be used to reflect the verification performance of individual keystroke dynamic templates in terms of FAR. Specifically, the metric is derived from statistical information of user-specific feature variations, and it has a non-decreasing property when a new feature is added to a template. The experiments are performed on two public keystroke dynamic datasets covering the two main types of keystroke dynamics, constrained-text and free-text: the CMU keystroke dynamics dataset and the Web-Based Benchmark for keystroke dynamics dataset. Experimental results based on multiple classifiers demonstrate that the proposed metric can be a good indicator of the template’s false acceptance rate. Thus, it can be used to enhance the security of user authentication systems based on keystroke dynamics.
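The verification step summarized in the abstract can be illustrated with a minimal sketch. It assumes a scaled Manhattan distance, with the per-feature mean absolute deviation as the measure of intra-user variation; the function names, the threshold value of 2.0, and the choice of distance are illustrative assumptions, not the classifiers evaluated in the paper.

```python
import numpy as np

def build_template(enrolled):
    """Return per-feature mean and spread (mean absolute deviation)
    computed from a user's enrolled samples (rows = samples)."""
    enrolled = np.asarray(enrolled, dtype=float)
    mean = enrolled.mean(axis=0)
    spread = np.abs(enrolled - mean).mean(axis=0)
    # Guard against division by zero for perfectly repeatable features.
    return mean, np.maximum(spread, 1e-6)

def normalized_distance(template, sample):
    """Average per-feature distance to the template mean, scaled by the
    user's own intra-user variation."""
    mean, spread = template
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(np.abs(sample - mean) / spread))

def verify(template, sample, threshold=2.0):
    """Accept the sample when its normalized distance is within the threshold.
    The threshold value here is an arbitrary illustrative choice."""
    return normalized_distance(template, sample) <= threshold
```

Under these assumptions, a user whose enrolled timings vary widely accepts a probe that a more consistent user rejects, because the same raw deviation is divided by a larger spread. This is the effect that leads to a higher FAR for templates with high intra-user variation.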

1. Abstract is not able to convey what is the technical contribution of this paper. I suggest to rewrite it.
Thank you for your comment. The following statements have been added to the abstract to describe the technical contribution of this paper.
"… That is, in behavioral biometric verification, the user's templates are generally constructed using multiple enrolled samples to capture intra-user variation. This variation is then used to normalize the distance between a set of enrolled samples and a test sample. The normalized distance is then compared against a predefined threshold value to derive a verification decision. As a result, the coverage area for accepted samples in the original space of vector representation is discrete. Therefore, users with higher intra-user variation suffer higher false acceptance rates (FAR). This paper proposes a metric that can be used to reflect the verification performance of individual keystroke dynamic templates in terms of FAR. Specifically, the metric is derived from statistical information of user-specific feature variations, and it has a non-decreasing property when a new feature is added to a template…."
2. Improve the quality of figures and explain those properly.
Thank you for your feedback. All the figures have been reworked to improve the quality and all the descriptions have been revised to improve the clarity. The changes are highlighted in blue color in the revised manuscript.
3. Result section is weak and I suggest authors to add more results and compare those with the existing approaches.
Thank you for your comment. We have included a new subsection (5.3.4 Performance comparison to existing approaches) that compares our verification performance with existing work. Specifically, Table 5 (Verification performance comparison of existing keystroke dynamic approaches) and Table 6 (Entropy measures of different authentication factors) have been added to compare the verification performance of our work with existing keystroke dynamic approaches and to compare the entropy of keystroke dynamics with that of other authentication factors, respectively.

Performance comparison to existing approaches
In this subsection, the efficacy of the proposed quality metric for keystroke dynamic templates is demonstrated in comparison with existing keystroke dynamic approaches. In addition, the security of keystroke dynamics is evaluated in comparison with other authentication factors. The performance of the algorithm used in this paper is reported in Table 5 along with that of existing work. Note that we followed the protocol described in [8] to evaluate the overall performance of the algorithm used in this work against existing approaches. That is, the threshold is set individually for each subject, and the imposter samples for each subject are the first five samples of all other subjects. While the performance reported under this protocol can be viewed as the upper bound for each algorithm, one challenge that remains when the system is deployed in practical applications is how to set a threshold for each subject that achieves this level of performance. Nevertheless, this result demonstrates that the verification algorithm used in this work is comparable to other existing work when the same dataset and validation protocol are used. In addition, it confirms that the proposed metric can be used to measure the quality of a keystroke dynamic template regardless of whether local or global thresholding is used. Lastly, entropy is another important metric for assessing the security level of the system. In Table 6, the entropy measures of different authentication factors are reported. With this information, password-based two-factor authentication is estimated at 23.48-26.48 bits. Note that the biometric entropy reported in this table is the relative entropy computed from the KL divergence between genuine and imposter score distributions [28]. While the entropy of keystroke dynamics is lower than that of the iris and fingerprint, it offers additional security at no user or hardware cost. That is, users are not required to perform additional tasks, and user devices are not required to be equipped with additional hardware.
4. Although this paper is well written, there are still some typos in the current version. I would like to suggest the authors carefully proofread this paper and correct all the typos in the revision.
Thank you very much for your suggestion. We have carefully proofread the paper and corrected all typos and spelling errors. We hope it now meets the journal's standard.
5. Some very related and recent work may be discussed to improve the quality of literature, e.g.
- Efficient and secure routing protocol based on artificial intelligence algorithms with UAV-assisted for vehicular Ad Hoc networks in intelligent transportation systems
- An overview of Internet of Things (IoT): Architectural aspects, challenges, and protocols
- Security in Internet of Things: issues, challenges, taxonomy, and architecture
- Improving Road Safety for Driver Malaise and Sleepiness Behind the Wheel Using Vehicular Cloud Computing and Body Area Networks
- A trust infrastructure based authentication method for clustered vehicular ad hoc networks
- The optimal path finding algorithm based on reinforcement learning
- The Effect of Gender, Age, and Education on the Adoption of Mobile Government Services
- IoT-based Big Data secure management in the Fog over a 6G Wireless Network
- Using vehicles as fog infrastructures for transportation cyber-physical systems (T-CPS): Fog computing for vehicular networks
Thank you for your comment. We have included more discussion on applications of keystroke dynamics in Section 2 (Related work). In addition, more references have been cited. Details are as follows.
2 Related work
As a number of user authentication approaches have been proposed as alternatives to passwords, Bonneau et al. [16] proposed a framework to compare and contrast those authentication schemes based on three main factors: usability, deployability, and security. The study revealed that, while traditional biometric-based authentication schemes (fingerprint, iris, voice) have some advantages over password-based authentication in terms of usability (memorability, scalable-for-user, and physically-effortless) and security (physical observation and throttled-guessing), deployability is their main setback, as they are not server and browser compatible. As such, their current use is limited to on-device authentication, such as unlocking a phone or authorizing device access to services. Consequently, the password still plays an important role as a general-purpose user authentication mechanism. On the other hand, many proposals have been made to resolve the issues of the original password authentication scheme. For example, password managers that require users to log in with a master password can resolve usability in terms of memorability, scalable-for-user, and physically-effortless, but fail to resolve security in terms of physical observation and throttled-guessing. In this case, behavioral biometrics gleaned from keystroke information can be used to provide resilience to physical observation and throttled-guessing. In addition, Qiu et al. [17] proposed a secure three-factor authentication protocol in which all three types of authentication factors are provided by a user. Moreover, Jiang et al. [18] proposed another secure three-factor remote authentication protocol that preserves the privacy of biometric information. These proposals allow behavioral biometrics gleaned from keystroke information to be used in a remote setting, thereby enhancing the deployability of password-based two-factor authentication (password string and typing pattern). As such, the mechanism can also be used for user-to-IoT-device authentication, where data and services on these devices are becoming more sensitive and the need for a secure authentication mechanism is increasing [19,20].
6. The authors are expected to report the running time of the proposed algorithm in the revision.
We have included this information at the beginning of Section 5.3 (Result) as follows.
"... Note that the computational overhead for deriving the distinctiveness score is O(n), where n is the number of features used to compute the metric (or the number of input characters). It took 0.127 milliseconds on average to compute the distinctiveness from a set of keystroke samples for each template with 10 characters (or 0.0127 milliseconds per character) using MATLAB R2019b running on Windows 10 with an AMD Ryzen 7 3800X. …"
7. As an additional remark, references need to be completed with all the required information (e.g. page number, name of journal/conference, vol., issues, etc). Based on the comments above, I would like to accept this paper if my following concerns are carefully addressed.
We have carefully reviewed the details of the references and revised them accordingly. Thank you very much for your valuable feedback, which helped us improve the quality of this paper.

============Reviewer 2==============================
Reviewer #2: This paper addresses the issue of varied security level of Keystroke Dynamic by proposing a metric to quantify the quality of keystroke dynamic templates. Specifically, the metric is derived from statistical information of user specific feature variations and it can be used to reflect the verification performance of individual keystroke dynamic template in terms of false acceptance rate (FAR). The results seem nice.
Overall, I like this paper, yet there are some issues to be addressed before publication.
Strength:
1) The motivation and idea are good.
2) The comparison of different template groups of keystroke behavior based user authentication in Table is very nice.
3) The experimental results seem to be reasonable and practical.
Weakness:
1. There is no rationale to explain why the proposed template/metric works. The authors shall use one or two key sentences to explain why the new template/metric works better than existing schemes.
Thank you for your comment. We have included the following statements to explain the concept of the proposed method.
"Using the proposed metric, the template with lower intra-user variation between enrolled samples (lower σ) would have higher distinctiveness as the coverage area for accepted samples in the original space of vector representation would be smaller. In addition, the keystroke template with an additional typing character would always have a higher distinctiveness. This is an important property as the distinctiveness should never decrease when additional sequences are appended to the keystroke."
2. Recently, there have been quite a number of new behavior authentication schemes proposed (as well as other kinds of authentication schemes like multi-factor authentication schemes). The authors shall compare the proposed new template with these existing schemes using a table like Table I of the IEEE S&P12 paper and Table IV of the IEEE TII'18 paper.
- "The quest to replace passwords: a framework for comparative evaluation of Web authentication schemes", IEEE S&P 2012.
Thank you for your suggestion. We have included a discussion on this topic in Section 2 as follows.

Related work
As a number of user authentication approaches have been proposed as alternatives to passwords, Bonneau et al. [16] proposed a framework to compare and contrast those authentication schemes based on three main factors: usability, deployability, and security. The study revealed that, while traditional biometric-based authentication schemes (fingerprint, iris, voice) have some advantages over password-based authentication in terms of usability (memorability, scalable-for-user, and physically-effortless) and security (physical observation and throttled-guessing), deployability is their main setback, as they are not server and browser compatible. As such, their current use is limited to on-device authentication, such as unlocking a phone or authorizing device access to services. Consequently, the password still plays an important role as a general-purpose user authentication mechanism. On the other hand, many proposals have been made to resolve the issues of the original password authentication scheme. For example, password managers that require users to log in with a master password can resolve usability in terms of memorability, scalable-for-user, and physically-effortless, but fail to resolve security in terms of physical observation and throttled-guessing. In this case, behavioral biometrics gleaned from keystroke information can be used to provide resilience to physical observation and throttled-guessing. In addition, Qiu et al. [17] proposed a secure three-factor authentication protocol in which all three types of authentication factors are provided by a user. Moreover, Jiang et al. [18] proposed another secure three-factor remote authentication protocol that preserves the privacy of biometric information. These proposals allow behavioral biometrics gleaned from keystroke information to be used in a remote setting, thereby enhancing the deployability of password-based two-factor authentication (password string and typing pattern). As such, the mechanism can also be used for user-to-IoT-device authentication, where data and services on these devices are becoming more sensitive and the need for a secure authentication mechanism is increasing [19,20].
Thank you for your suggestion. We have included a new subsection (5.3.4 Performance comparison to existing approaches) that compares our verification performance with existing work. Specifically, Table 5 (Verification performance comparison of existing keystroke dynamic approaches) and Table 6 (Entropy measures of different authentication factors) have been added to compare the verification performance of our work with existing keystroke dynamic approaches and to compare the entropy of keystroke dynamics with that of other authentication factors, respectively.

Performance comparison to existing approaches
In this subsection, the efficacy of the proposed quality metric for keystroke dynamic templates is demonstrated in comparison with existing keystroke dynamic approaches. In addition, the security of keystroke dynamics is evaluated in comparison with other authentication factors. The performance of the algorithm used in this paper is reported in Table 5 along with that of existing work. Note that we followed the protocol described in [8] to evaluate the overall performance of the algorithm used in this work against existing approaches. That is, the threshold is set individually for each subject, and the imposter samples for each subject are the first five samples of all other subjects. While the performance reported under this protocol can be viewed as the upper bound for each algorithm, one challenge that remains when the system is deployed in practical applications is how to set a threshold for each subject that achieves this level of performance. Nevertheless, this result demonstrates that the verification algorithm used in this work is comparable to other existing work when the same dataset and validation protocol are used. In addition, it confirms that the proposed metric can be used to measure the quality of a keystroke dynamic template regardless of whether local or global thresholding is used. Lastly, entropy is another important metric for assessing the security level of the system. In Table 6, the entropy measures of different authentication factors are reported. With this information, password-based two-factor authentication is estimated at 23.48-26.48 bits. Note that the biometric entropy reported in this table is the relative entropy computed from the KL divergence between genuine and imposter score distributions [28]. While the entropy of keystroke dynamics is lower than that of the iris and fingerprint, it offers additional security at no user or hardware cost. That is, users are not required to perform additional tasks, and user devices are not required to be equipped with additional hardware.
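To make the notion of relative entropy between score distributions concrete, the following is a minimal sketch of one common way to estimate it: a histogram-based KL divergence in bits. The binning, the smoothing constant, the divergence direction (genuine relative to imposter), and the function name are assumptions made for illustration; the estimator used in [28] and for Table 6 may differ.

```python
import numpy as np

def relative_entropy_bits(genuine, imposter, bins=20, eps=1e-9):
    """Histogram estimate of KL(genuine || imposter) in bits.

    Both score sets are binned on a shared range; eps smooths empty
    bins so the logarithm stays finite (an illustrative choice).
    """
    genuine = np.asarray(genuine, dtype=float)
    imposter = np.asarray(imposter, dtype=float)
    lo = min(genuine.min(), imposter.min())
    hi = max(genuine.max(), imposter.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(genuine, bins=edges)
    q, _ = np.histogram(imposter, bins=edges)
    # Smooth the counts, then normalize to probability distributions.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log2(p / q)))
```

With this estimator, identical score sets give a value of zero, and the value grows as the genuine and imposter scores separate. The absolute number depends strongly on the bin count and smoothing, so it should not be compared directly with the figures in Table 6.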