Differential privacy for eye tracking with temporal correlations

New-generation head-mounted displays, such as VR and AR glasses, are coming onto the market with integrated eye tracking and are expected to enable novel ways of human-computer interaction in numerous applications. However, since eye movement properties contain biometric information, privacy concerns have to be handled properly. Privacy-preservation techniques such as differential privacy mechanisms have recently been applied to eye movement data obtained from such displays. Standard differential privacy mechanisms, however, are vulnerable to temporal correlations between the eye movement observations. In this work, we propose a novel transform-coding-based differential privacy mechanism to further adapt it to the statistics of eye movement feature data and compare various low-complexity methods. We extend the Fourier perturbation algorithm, which is a differential privacy mechanism, and correct a scaling mistake in its proof. Furthermore, we illustrate significant reductions in sample correlations in addition to query sensitivities, which provide the best utility-privacy trade-off in the eye tracking literature. Our results provide significantly high privacy without any essential loss in classification accuracies while hiding personal identifiers.


Introduction
Recent advances in the field of head-mounted displays (HMDs), computer graphics, and eye tracking enable easy access to pervasive eye trackers along with modern HMDs. Soon, the usage of such devices might result in a significant increase in the amount of eye movement data collected from users across different application domains such as gaming, entertainment, or education. A large part of this data is indeed useful for personalized experience and user-adaptive interaction. Especially in virtual and augmented reality (VR/AR), it is possible to derive plenty of sensitive information about users from the eye movement data. In general, it has been shown that eye tracking signals can be employed for activity recognition even in challenging everyday tasks [1][2][3], to detect cognitive load [4,5], mental fatigue [6], and many other user states. Similarly, assessment of situational attention [7], expert-novice analysis in areas such as medicine [8] and sports [9], detection of personality traits [10], and prediction of human intent during robotic hand-eye coordination [11] can also be carried out based on eye movement features. Additionally, eye movements are useful for early detection of anomias [12] and diseases [13]. More importantly, eye movement data allow biometric and person identification tasks on publicly available eye movement datasets by using similar configurations to previous works by Steil et al. [20,29]. To generate differentially private eye movement data, we use the complete data instead of applying a subsampling step, used by Steil et al. 
[20] to reduce the sensitivity and to improve the classification accuracies for document type and privacy sensitivity. In addition, the previous work [20] applies the exponential mechanism for differential privacy on the eye movement feature data. The exponential mechanism is useful for situations where the best enumerated response needs to be chosen [30]. In eye movements, we are not interested in the "best" response but in the feature vector. Therefore, we apply the Laplace mechanism. In summary, we are the first to propose differential privacy solutions for aggregated eye movement feature signals by taking the temporal correlations into account, which can help provide user privacy especially for HMD or smart glass usage in VR/AR setups.
Our main contributions are as follows. (1) We propose chunk-based and difference-based differential privacy methods for eye movement feature signals to reduce query sensitivities, computational complexity, and temporal correlations. Furthermore, (2) we evaluate our methods on two publicly available eye movement datasets, i.e., MPIIDPEye [20] and MPIIPrivacEye [29], by comparing them with standard techniques such as the LPA and FPA, using the multiplicative inverse of the absolute normalized mean square error (NMSE) as the utility metric. In addition, we evaluate document type and gender classification accuracies, and privacy sensitivity classification accuracies, as classification metrics using differentially private eye movements in the MPIIDPEye and MPIIPrivacEye datasets, respectively. Classification accuracy is used in the literature as a practical utility metric that shows how useful the data and proposed methods are. Our utility metric also provides insights into the divergence trend of differentially private outcomes and is analytically tractable, unlike the classification accuracy. For both datasets, we also evaluate the person identification task using differentially private data. Our results show significantly better performance compared to previous works while handling correlated data and decreasing query sensitivities by dividing the data into smaller chunks. In addition, our methods hide personal identifiers significantly better than existing methods.

Previous research
There are few works that focus on privacy-preserving eye tracking. Liebling and Preibusch [31] provide motivation as to why privacy considerations are needed for eye tracking data by focusing on gaze and pupillometry. Practical solutions have, therefore, been introduced to protect user identity and sensitive stimuli, based on a degraded iris authentication through optical defocus [32] and an automated disabling mechanism for the eye tracker's ego-perspective camera with the help of a mechanical shutter, depending on the detection of privacy-sensitive content [29]. Furthermore, a function-specific privacy model for the privacy-preserving gaze estimation task and privacy-preserving eye videos obtained by replacing the iris textures are proposed by Bozkir and Ünal et al. [33] and by Chaudhary and Pelz [34], respectively. In addition, solutions for privacy-preserving eye tracking data streaming [35] and real-time privacy control for eye tracking systems using areas-of-interest [36] have also been introduced in the literature. However, these works do not study the effects of temporal correlations.
For user identity protection on aggregated eye movement features, works that focus on differential privacy are most relevant for us. Recently, standard differential privacy mechanisms have been applied to heatmaps [37] and to eye movement data obtained from a VR setup [20]. These works do not address the effects of temporal correlations in eye movements over time in the privacy context. In the privacy literature, there are privacy frameworks such as Pufferfish [38] or Olympus [39] for correlated and sensor data, respectively. These works, however, have different assumptions. For instance, Pufferfish requires a domain expert to specify potential secrets and discriminative pairs, and Olympus models privacy and utility requirements as adversarial networks. As our focus is to protect user identity in the eye movements, we opt for differential privacy, discussing the effects of temporal correlations in eye movements over time and proposing methods to reduce them. It has been shown that standard differential privacy mechanisms are vulnerable to temporal correlations, as such mechanisms assume that data at different time points are independent from each other or that adversaries lack information about the temporal correlations; this leads to an increased privacy loss of a differential privacy mechanism over time [40,41]. The aggregated eye movement features over time might end up in an extreme case of such correlations due to various user behaviors. Therefore, in addition to discussing the effects of such correlations on differential privacy over time, we propose methods to reduce the correlations so that the privacy leakage due to the temporal correlations is minimal.

Materials and methods
In this section, we discuss the theoretical background of differential privacy mechanisms, the proposed methods, and the evaluated datasets.

Theoretical background
Differential privacy uses a metric to measure the privacy risk for an individual participating in a database. Consider a dataset with the weights of N people and a mean function: when an adversary queries the mean function for N people, the average weight over N people is obtained. After the first query, an additional query for N − 1 people automatically leaks the weight of the remaining person. Using differential privacy, noise is added to the outcome of a function so that the outcome does not significantly change based on whether a randomly chosen individual participated in the dataset. The amount of noise added should be calibrated carefully since a high amount of noise might decrease the utility. We next define differential privacy.

Definition 1 (ε-Differential Privacy (ε-DP) [22,23]). A randomized mechanism M is ε-differentially private if, for all databases D and D′ that differ in at most one element, and for all S ⊆ Range(M),

Pr[M(D) ∈ S] ≤ e^ε Pr[M(D′) ∈ S].

The variance of the added noise depends on the query sensitivity, which is defined as follows.
Definition 2 (Query sensitivity [22]). For a random query X^n and w ∈ {1, 2}, the query sensitivity ∆_w of X^n is the smallest number such that, for all databases D and D′ that differ in at most one element,

‖X^n(D) − X^n(D′)‖_w ≤ ∆_w(X^n),

where the L_w-distance is defined as ‖X^n‖_w = (Σ_{i=1}^{n} |X_i|^w)^{1/w}.

We also use the following standard composition results [42] in the proposed methods. Theorem 1 (Sequential composition). If each mechanism M_i in a sequence (M_1, . . ., M_m) is ε_i-differentially private, then the sequence is (Σ_i ε_i)-differentially private. Theorem 2 (Parallel composition). If each ε_i-differentially private mechanism M_i is applied to a disjoint subset of the data, then the combination is (max_i ε_i)-differentially private.
We now define the Laplace Perturbation Algorithm (LPA) [22]. To guarantee differential privacy, the LPA generates the noise according to a Laplace distribution. Lap(λ) denotes a random variable drawn from a Laplace distribution with probability density function (PDF) Pr[Lap(λ) = h] = (1/(2λ)) e^{−|h|/λ}, where Lap(λ) has zero mean and variance 2λ². We denote the noisy and differentially private values as X̃_i = X_i(D) + Lap(λ) for i = 1, 2, . . ., n. Since we have a series of eye movement observations, the final noisy eye movement observations are generated as X̃^n = X^n(D) + Lap^n(λ), where Lap^n(λ) is a vector of n independent Lap(λ) random variables and X^n(D) is the eye movement observations without noise. The LPA is ε-differentially private for λ = ∆_1(X^n)/ε [22].
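As a concrete illustration, the LPA can be sketched in a few lines of Python with NumPy; the function and parameter names here are ours, not from the original implementation:

```python
import numpy as np

def lpa(x, sensitivity, eps, rng=None):
    """Laplace Perturbation Algorithm (sketch): perturb each observation with
    i.i.d. Laplace noise of scale lambda = Delta_1(X^n) / eps, which yields
    an eps-differentially private output [22]."""
    rng = np.random.default_rng() if rng is None else rng
    lam = sensitivity / eps
    return np.asarray(x, dtype=float) + rng.laplace(scale=lam, size=len(x))
```

Note that the same scale λ is used for every sample; the guarantee is with respect to the L₁ sensitivity of the whole observation vector, which is why long, correlated signals force large noise.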
We define the error function that we use to measure the differences between the original observations X^n and the noisy observations X̃^n. For this purpose, we use the normalized mean square error (NMSE), defined as

NMSE = (1/n) Σ_{i=1}^{n} (X_i − X̃_i)² / (X̄ X̃̄),

where X̄ and X̃̄ denote the mean values of X^n and X̃^n, respectively. We define the utility metric as the multiplicative inverse of the absolute NMSE. As differential privacy is achieved by adding random noise to the data, there is a utility-privacy trade-off. Too much noise leads to high privacy; however, it might also result in poor analyses in further tasks on eye movements. Therefore, it is important to find a good trade-off.
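A minimal sketch of these metrics in Python follows; we assume, as our reading of the definition, that the squared error is normalized by the product of the means of the original and noisy signals:

```python
import numpy as np

def nmse(x, x_noisy):
    """Normalized mean square error between original and noisy signals.
    Assumption: the MSE is normalized by the product of the two signal means."""
    x, x_noisy = np.asarray(x, float), np.asarray(x_noisy, float)
    return np.mean((x - x_noisy) ** 2) / (np.mean(x) * np.mean(x_noisy))

def utility(x, x_noisy):
    """Utility is the multiplicative inverse of the absolute NMSE."""
    return 1.0 / abs(nmse(x, x_noisy))
```

A smaller absolute NMSE (less divergence from the original signal) thus maps directly to a larger utility value.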

Methods
Standard differential privacy mechanisms are vulnerable to temporal correlations, since the independent noise realizations that are added to temporally correlated data could be useful for adversaries. However, decorrelating the data without domain knowledge before adding the noise might remove important eye movement patterns and provide poor results in analyses. Many eye movement features are extracted by using time windows, as in previous work [20,29], which makes the features highly correlated. Another challenge is that the duration of eye tracking recordings could change depending on the personal behaviors, skills, or personalities of the users. A longer duration causes an increased query sensitivity, which means that a higher amount of noise should be added to achieve differential privacy. In addition, when correlations between different data points exist, ε′ is defined as the actual privacy metric instead of ε [43], obtained by considering the fact that correlations can be used by an attacker to obtain more information about the differentially private data by filtering. In this work, we discuss and propose generic low-complexity methods to keep ε′ small for eye movement feature signals. To deal with correlated eye movement feature signals, we propose three different methods: the FPA, chunk-based FPA (CFPA) for original feature signals, and chunk-based FPA for difference-based sequences (DCFPA). The sensitivity of each eye movement feature signal is calculated by using the L_w-distance such that

∆_w^{(f)} = max_{p,q} ‖X^{n,(p,f)} − X^{n,(q,f)}‖_w,

where X^{n,(p,f)} and X^{n,(q,f)} denote observation vectors for a feature f from two participants p and q, n denotes the maximum length of the observation vectors, and w ∈ {1, 2}.
Definitive version: 10.1371/journal.pone.0255979
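For a single feature, the participant-level sensitivity above can be computed with a brute-force pairwise search; a sketch under our own naming, assuming shorter recordings are zero-padded to the common length n:

```python
import numpy as np
from itertools import combinations

def feature_sensitivity(signals, w=2):
    """Delta_w for one eye movement feature: the largest L_w distance between
    the feature vectors of any two participants (neighboring databases differ
    in one participant). `signals` has shape (num_participants, n)."""
    return max(np.linalg.norm(np.asarray(p, float) - np.asarray(q, float), ord=w)
               for p, q in combinations(signals, 2))
```

The quadratic pairwise search is cheap here because the number of participants is small (20 and 17 in the evaluated datasets).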
Fourier Perturbation Algorithm (FPA). In the FPA [26], the signal is represented with a small number of transform coefficients such that the query sensitivity of the representative signal decreases. A smaller query sensitivity decreases the noise power required to make the noisy signal differentially private. In the FPA, the signal is transformed into the frequency domain by applying the Discrete Fourier Transform (DFT), which is commonly applied as a non-unitary transform. The frequency domain representation of a signal consists of less correlated transform coefficients as compared to the time domain signal due to the high decorrelation efficiency of the DFT. Therefore, the correlation between the eye movement feature signals is reduced by applying the DFT. After the DFT, noise sampled from the LPA is added to the first k elements of DFT(X^n), which correspond to the k lowest frequency components; the result is denoted as F̃^k. Once the noise is added, the remaining part (of size n − k) of the noisy signal F̃^k is zero-padded, denoted as PAD^n(F̃^k). Lastly, using the Inverse DFT (IDFT), the padded signal is transformed back into the time domain. We can show that ε-differential privacy is satisfied by the FPA for λ = (√n √k ∆_2(X^n))/ε, unlike the value claimed in previous work [26], as observed independently by Kellaris and Papadopoulos [44]. The procedure is summarized in Fig 1, and the proof is provided below. Since not all coefficients are used, in addition to the perturbation error caused by the added noise, a reconstruction error caused by the lossy compression is introduced. It is important to determine the number of used coefficients k to minimize the total error. We discuss how we choose the k values for FPA-based methods below.

Proof of FPA being differentially private. We next prove that the FPA is differentially private for λ = (√n √k ∆_2(X^n))/ε. First, we prove the inequalities (a) and (b) in the following.
∆_1(F^k) = max_{D,D′} ‖F^k(D) − F^k(D′)‖_1 ≤(a) √k max_{D,D′} ‖F^k(D) − F^k(D′)‖_2 ≤(b) √k √n ∆_2(X^n),  (8)

where F^n(D) = DFT(X^n(D)) and F^k(D) consists of the first k elements of F^n(D). Consider (8)(a), which follows since F^k(D) − F^k(D′) is a vector with k elements, so that by applying the Cauchy-Schwarz inequality we obtain ‖F^k(D) − F^k(D′)‖_1 ≤ √k ‖F^k(D) − F^k(D′)‖_2. Consider next (8)(b). Since F^n has more non-zero elements than F^k, we have ‖F^k(D) − F^k(D′)‖_2 ≤ ‖F^n(D) − F^n(D′)‖_2. Recall that F^n(D) = DFT(X^n(D)), F^n(D′) = DFT(X^n(D′)), and the DFT is linear, so we have F^n(D) − F^n(D′) = DFT(X^n(D) − X^n(D′)). By applying Parseval's theorem to the non-unitary DFT, we obtain ‖DFT(X^n(D) − X^n(D′))‖_2 = √n ‖X^n(D) − X^n(D′)‖_2 ≤ √n ∆_2(X^n), which proves (8)(b). Finally, since the LPA that is applied to F^k is ε-DP for λ = ∆_1(F^k)/ε [22], and since the zero-padding and the IDFT are post-processing steps that do not affect the privacy guarantee, (8) proves that the FPA is ε-differentially private for λ = (√n √k ∆_2(X^n))/ε.

Chunk-based FPA (CFPA). One drawback of directly applying the FPA to the eye movement feature signals is large query sensitivities for each feature f due to long signal sizes. To solve this, Steil et al. [20] propose to subsample the signal using non-overlapping windows, which means removing many data points. While subsampling decreases the query sensitivities, it also decreases the amount of data. Instead, we propose to split each signal into smaller chunks and apply the FPA to each chunk so that the complete data can be used. We choose chunk sizes of 32, 64, and 128, since there are divide-and-conquer type tree-based implementation algorithms for fast DFT calculations when the transform size is a power of 2 [45]. When the signals are split into chunks, chunk-level query sensitivities are calculated and used rather than the sensitivity of the whole sequence. Differential privacy for the complete signal is preserved by Theorem 2 [42] since the chunks are non-overlapping. As the chunk size decreases, the chunk-level sensitivity decreases, as does the computational complexity. However, the parameter ε′ that accounts for the sample correlations might increase with smaller chunk sizes because temporal correlations between neighboring samples are larger in an eye movement dataset. On the other hand, if the chunk sizes are kept large, then the required amount of noise to achieve differential privacy increases due to the increased query sensitivity. Therefore, a good trade-off between query sensitivity, computational complexity, and correlations is needed to determine the optimal chunk size.
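Putting the pieces together, the FPA and its chunk-based variant can be sketched as follows. This is an illustrative implementation under our own assumptions: in particular, Laplace noise is added to the real and imaginary parts of the retained coefficients separately, a simplification whose exact privacy accounting differs from the scalar case.

```python
import numpy as np

def fpa(x, k, delta2, eps, rng=None):
    """FPA sketch: DFT, perturb the k lowest-frequency coefficients with
    Laplace noise of scale sqrt(n) * sqrt(k) * delta2 / eps, zero-pad,
    and transform back with the IDFT."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    lam = np.sqrt(n) * np.sqrt(k) * delta2 / eps
    coeffs = np.fft.fft(x)[:k]
    noisy = coeffs + rng.laplace(scale=lam, size=k) \
                   + 1j * rng.laplace(scale=lam, size=k)
    padded = np.concatenate([noisy, np.zeros(n - k, dtype=complex)])
    return np.fft.ifft(padded).real

def cfpa(x, chunk_size, k, delta2_chunks, eps, rng=None):
    """CFPA sketch: apply the FPA to non-overlapping chunks with chunk-level
    sensitivities; privacy for the whole signal follows from parallel
    composition over the disjoint chunks."""
    rng = np.random.default_rng() if rng is None else rng
    chunks = [x[i:i + chunk_size] for i in range(0, len(x), chunk_size)]
    return np.concatenate([fpa(c, min(k, len(c)), d, eps, rng)
                           for c, d in zip(chunks, delta2_chunks)])
```

With a power-of-two chunk size, each per-chunk DFT uses the fast radix-2 algorithm, which is the motivation for the chunk sizes of 32, 64, and 128 above.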

Difference-and chunk-based FPA (DCFPA)
To tackle temporal correlations, we convert the eye movement feature signals into difference signals, where differences between consecutive eye movement features are calculated as X̄_i = X_{i+1} − X_i for i = 1, 2, . . ., n − 1. Using the difference signals, denoted by X̄^{n,(f)}, we aim to further decrease the correlations before applying a differential privacy method. We conjecture that the ratio ε′/ε decreases in the difference-based method as compared to the FPA method. To support this conjecture, we show that the correlations in the difference signals decrease significantly as compared to the original signals. This results in a lower ε′ and better privacy for the same ε. The difference-based method is applied together with the CFPA; therefore, the differences are calculated inside chunks. The first element of each chunk is preserved. Then, the FPA mechanism is applied to the difference signals by using query sensitivities calculated based on differences and chunks. For each chunk, the noisy difference observations are aggregated to obtain the final noisy signals. This mechanism is differentially private by Theorem 1 [42], and is described in Algorithm 1.
Algorithm 1: DCFPA. Inputs: eye movement feature signal X^{n,(f)}, chunk size, number of transform coefficients k, and privacy parameter ε. (1) Split X^{n,(f)} into non-overlapping chunks. (2) For each chunk, preserve the first element and compute the consecutive differences. (3) Apply the FPA to the difference signal of each chunk using the chunk-level, difference-based query sensitivity. (4) For each chunk, cumulatively aggregate the first element and the noisy differences to obtain the final noisy signal. Since Theorem 1 can be applied to the DCFPA when the consecutive differences are assumed to be independent, which is a valid assumption for eye movement feature signals as we illustrate below, there is also a trade-off between the chunk sizes and ε′.
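The steps of Algorithm 1 can be sketched for a single chunk as follows (our naming; the first element of the chunk is taken over unperturbed here, matching the description above, and its contribution is assumed to be covered by the sensitivity accounting):

```python
import numpy as np

def fpa(x, k, delta2, eps, rng):
    # Minimal FPA helper (see the FPA section): perturb the k lowest DFT
    # coefficients with Laplace noise of scale sqrt(n)*sqrt(k)*delta2/eps.
    n = len(x)
    lam = np.sqrt(n) * np.sqrt(k) * delta2 / eps
    noisy = np.fft.fft(x)[:k] + rng.laplace(scale=lam, size=k) \
                              + 1j * rng.laplace(scale=lam, size=k)
    return np.fft.ifft(np.concatenate([noisy, np.zeros(n - k, dtype=complex)])).real

def dcfpa_chunk(x, k, delta2_diff, eps, rng=None):
    """DCFPA on one chunk (sketch): keep the first element, apply the FPA to
    the consecutive differences (using a difference-based sensitivity), then
    cumulatively sum the noisy differences to rebuild the noisy chunk."""
    rng = np.random.default_rng() if rng is None else rng
    d = np.diff(x)                              # consecutive differences
    d_noisy = fpa(d, k, delta2_diff, eps, rng)  # FPA on the difference signal
    return x[0] + np.concatenate([[0.0], np.cumsum(d_noisy)])
```

Note that aggregating noisy differences accumulates noise along the chunk, which is one more reason the chunk size matters for the DCFPA.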

Choice of the number of transform coefficients
The proposed methods require a selection of a value for k. A small k value increases the reconstruction error, while a large k value results in an increase in the perturbation error. Therefore, it is important to find an optimal k value that minimizes the sum of the two errors. In this work, we compare a large set of possible k values to choose the best values. We apply the aforementioned differential privacy mechanisms by using 100 noisy evaluations to find the optimal k values applied to features or chunks. Optimal k values have the minimum absolute NMSE for each chunk, eye movement feature, and document or recording type. In a distributed setting, each party should know the k values in advance. However, in a centralized setting, it is crucial to choose the k values in a differentially private manner. To evaluate differential privacy in the eye tracking area while taking the temporal correlations into account, we focus on optimal k values in this work. One shortcoming of this approach is that the optimal k value compromises some information about the data, which leaks privacy [26]. Our observation is that the information leaked by optimizing the parameter k is negligible compared to the privacy reduction due to temporally correlated data. Thus, we illustrate the results with optimal k values.
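The selection procedure can be sketched as a grid search that, for each candidate k, averages the absolute NMSE over repeated noisy runs of a given mechanism; the callable signature of `mechanism` is our assumption:

```python
import numpy as np

def choose_k(x, delta2, eps, mechanism, n_trials=100, rng=None):
    """Return the k minimizing the average absolute NMSE of
    mechanism(x, k, delta2, eps, rng), balancing reconstruction error
    (too small k) against perturbation error (too large k)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, float)
    def avg_abs_nmse(k):
        errs = []
        for _ in range(n_trials):
            x_noisy = mechanism(x, k, delta2, eps, rng)
            errs.append(abs(np.mean((x - x_noisy) ** 2)
                            / (np.mean(x) * np.mean(x_noisy))))
        return np.mean(errs)
    return min(range(1, len(x) + 1), key=avg_abs_nmse)
```

In practice the search would be repeated per chunk, per feature, and per document or recording type, as described above.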

Datasets
We evaluate our methods on two different publicly available eye movement datasets, namely MPIIDPEye and MPIIPrivacEye, that are dedicated to privacy-preserving eye tracking. Both datasets consist of aggregated, time-series eye movement feature signals related to eye fixations, saccades, blinks, and pupil diameters, which are commonly used in VR/AR applications as they represent individual user behaviors. As all minimum values of the wordbook features ranging from 1 to 4 are zeros in both datasets, we exclude them from the utility and privacy calculations. In addition, we remark that both datasets are available for non-commercial scientific purposes.
MPIIDPEye [20]: A publicly available eye movement dataset consisting of 60 recordings, collected from VR devices for a reading task of three document types (comics, newspaper, and textbook) from 20 participants (10 female, 10 male). Each recording consists of 52 eye movement feature sequences computed with a sliding window size of 30 seconds and a step size of 0.5 seconds.
MPIIPrivacEye [29]: A publicly available eye movement dataset consisting of 51 recordings from 17 participants over 3 consecutive sessions, captured with a head-mounted eye tracker and a field camera, which is similar to an AR setup. Each recording consists of 52 eye movement feature sequences computed with a sliding window size of 30 seconds and a step size of 1 second, and each observation is annotated with a binary privacy sensitivity level of the scene that is being viewed. The dataset also contains scene features extracted with convolutional neural networks. We do not evaluate the last part of recording 1 of participant 10, as the eye movement features are not available for this region. To detect the privacy level of the scene that is being viewed, we remark that the scene itself is very important [46]; however, an individual's eye movements can improve the detection rate when they are fused with the information from the scene.

Results
This section discusses data correlations, in addition to evaluations using the utility and classification metrics. The utility and classification results are averaged over 100 noisy evaluations with the optimal k values in MATLAB. We evaluate and compare the utility of differentially private eye movement feature signals by using the absolute NMSE, as this metric provides analytically tractable results. However, it does not provide implications regarding the practical usability of the private eye movement signals. Therefore, we also report classification accuracies of the document type, scene privacy sensitivity, gender prediction, and person identification tasks in order to show the usability of the private data and proposed methods. An optimal trade-off between utility tasks (e.g., low absolute NMSE, high classification accuracy in document type prediction) and privacy (e.g., low ε, low classification accuracy in person identification or gender prediction tasks) is favorable.

Correlation analysis
Using the correlation coefficient as the metric, we first illustrate the high temporal correlations between eye movement feature data. Since there are 52 eye movement features in both datasets, it is not feasible to illustrate all correlation results. Thus, in the following we illustrate the correlations for the features ratio large saccade and blink rate in the MPIIDPEye and MPIIPrivacEye datasets, respectively. The correlation coefficients of ratio large saccade and blink rate for the three document and recording types over a time difference ∆t w.r.t. the signal samples at, e.g., the fifth time instance, for the original eye movement feature signals and the difference signals, for all participants of both datasets, are depicted in Figs 3, 4 and Figs 5, 6, respectively. As the correlations between the difference signals are significantly smaller than the correlations between the original eye movement feature signals, the DCFPA is less vulnerable to privacy reduction due to temporal correlations, thus ensuring that the value of ε′ is close to the differential privacy design parameter ε.
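The qualitative effect can be reproduced with a toy experiment: for a strongly correlated AR(1)-style sequence (a synthetic stand-in for an aggregated eye movement feature signal, not real data), the lag-1 correlation of the difference signal is far smaller than that of the original signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic AR(1) signal with strong temporal correlation (phi = 0.95),
# standing in for an aggregated eye movement feature sequence.
n = 2000
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = 0.95 * x[i - 1] + rng.normal()

def lag1_corr(sig):
    """Correlation coefficient between consecutive samples."""
    return np.corrcoef(sig[:-1], sig[1:])[0, 1]

d = np.diff(x)  # difference signal, as used by the DCFPA
print(f"original: {lag1_corr(x):.2f}, differences: {lag1_corr(d):.2f}")
```

For this process, the original signal's lag-1 correlation is close to 0.95, while the differences are nearly uncorrelated, mirroring the behavior reported for the eye movement features.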

Utility results
We evaluate the utility defined in Eq (6) by applying our methods separately to different document and recording types; therefore, we report the utility results separately.As we apply the proposed methods separately to each eye movement feature, we first calculate the mean utility of each feature and then calculate the average utility over all features.
The utility results for various ε values for the aforementioned methods on the MPIIDPEye and MPIIPrivacEye datasets are given in the corresponding figures. While a high absolute NMSE, i.e., low utility, does not necessarily mean that a mechanism is completely useless, higher utility means that the mechanism would perform more effectively in various tasks. The trends in the utility results of both evaluated datasets are similar. As the query sensitivities are lower in the CFPA, the utilities of the CFPA are always higher than the utilities of the FPA, as theoretically expected. The DCFPA, particularly with small chunks, outperforms the other methods in the most private settings, namely in the lowest ε regions. When different chunk sizes are compared within the CFPA and DCFPA, different chunk sizes perform similarly for the CFPA method. For the DCFPA, there is a significant trend towards better utilities when the chunk sizes are decreased. However, as the temporal correlations are higher in the smaller chunk sizes and since a higher chunk size reduces the temporal correlations better, it is ideal to use a higher chunk size if the utilities are comparable. In general, while the LPA, namely the standard Laplace mechanism used for differential privacy, is vulnerable to temporal correlations [41], our methods also outperform it in terms of utilities. In addition to high utilities, the calculation complexities are decreased with the CFPA and DCFPA, which is another advantage of the chunk-based methods.

Classification accuracy results
We evaluate document type and gender classification results for MPIIDPEye and privacy sensitivity classification results for MPIIPrivacEye by using differentially private data generated by the methods that handle temporal correlations in the privacy context. In addition, for both datasets, we evaluate person identification tasks. While the NMSE-based utility metric provides an analytically tractable way for comparison, evaluating private data using classification accuracies gives insights about the usability of the noisy data in practice. Instead of only using Support Vector Machines (SVMs) as in previous works [20,29], we evaluate a set of classifiers including SVMs, decision trees (DTs), random forests (RFs), and k-Nearest Neighbors (k-NNs). We employ a similar setup as in previous work [20] with a radial basis function (RBF) kernel, a bias parameter of C = 1, and automatic kernel scale for the SVMs. For the RFs and k-NNs, we use 10 trees and k = 11 with a random tie breaker among tied groups, respectively. We normalize the training data to zero mean and unit variance, and apply the same parameters to the test data. Although we do not apply subsampling while generating the differentially private data, which is applied in previous work [20], we use subsampled data for training and testing for the document type, gender, and privacy sensitivity classification tasks with window sizes of 10 and 20 for MPIIDPEye and MPIIPrivacEye, respectively, to have a fair comparison and a similar amount of data. Apart from the person identification task, all the classifiers are trained and tested in a leave-one-person-out cross-validation setup, which is considered a more challenging but generic setup. For the person identification task, since it is not possible to carry out the experiments in a leave-one-person-out cross-validation setup, we opt for a similar configuration as in previous work [20] by using the first halves of the signals as training data and the remaining parts as test data. Such a setup can be considered one of the hypothetical best-case scenarios for an adversary, as it simulates some set of prior knowledge for the adversary on participants' visual behaviors. For the person identification task, in order to use a similar amount of data to the other classification tasks from each signal, we use window sizes of 5 and 10 for MPIIDPEye and MPIIPrivacEye, respectively. For MPIIDPEye, we evaluate the results both with majority voting, by summarizing classifications from different time instances for each participant and recording, and without majority voting. Privacy sensitivity classification tasks for MPIIPrivacEye are carried out only without majority voting, since the privacy sensitivity of the scene can change at each time step and applying majority voting to such a task in our setup is not reasonable.
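The person-independent evaluation protocol can be sketched with scikit-learn; this is a sketch under our assumptions (the original experiments were run in MATLAB), and the helper below only mirrors the SVM setup described above:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def leave_one_person_out_accuracy(features, labels, person_ids):
    """Leave-one-person-out cross-validation: RBF-kernel SVM with C = 1 and
    automatic kernel scale; training data normalized to zero mean and unit
    variance, with the same transform applied to the held-out person."""
    accuracies = []
    for pid in np.unique(person_ids):
        train, test = person_ids != pid, person_ids == pid
        scaler = StandardScaler().fit(features[train])
        clf = SVC(kernel="rbf", C=1, gamma="scale")
        clf.fit(scaler.transform(features[train]), labels[train])
        preds = clf.predict(scaler.transform(features[test]))
        accuracies.append(np.mean(preds == labels[test]))
    return float(np.mean(accuracies))
```

Fitting the scaler only on the training folds is essential here: normalizing with statistics from the held-out person would leak information across the person-independent split.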
While classification results cannot be treated directly as the utility, they provide insights into the usability of the differentially private data in practice. We first evaluate the document type classification task in the majority voting setting in Table 1 for the MPIIDPEye dataset, as it is possible to compare our results with the previous work [20]. As the previous results quickly drop to the 0.33 guessing probability in the high privacy regions, we significantly outperform them, particularly with the DCFPA and FPA with accuracies over 0.60 and 0.85, respectively. In the less private regions, towards ε = 48, this trend still exists with the CFPA and FPA with accuracy results over 0.7 and 0.85. The chunk-based methods perform slightly worse than the FPA in the document type classifications even though the utility of the FPA is lower. We observe that the reading patterns are hidden more easily with the chunk-based methods; therefore, the document type classification task becomes more challenging. This is especially validated with the DCFPA methods using different chunk sizes, as DCFPA-128 outperforms the smaller chunk-sized DCFPAs even though the sensitivities are higher. Therefore, we conclude that the differential privacy method should be selected for eye movements depending on the further task that will be applied. The document type classification results without majority voting are provided in the table in S1 Table. Next, we analyze the gender classification results for MPIIDPEye. All methods are able to hide the gender information in the high privacy regions, as it is already challenging to identify it with clean data, with accuracies of ≈ 0.7 in previous work [20]. While we obtain similar results compared to previous work for the gender classification task, the CFPA method is able to predict the gender information correctly in the less private regions, namely ε = 48, as it also has the highest utility values in these regions. The FPA applied to the complete signal and the DCFPA are not able to classify genders accurately. We observe that the higher amount of noise that is needed by the FPA and the noising of the fine-grained "difference" information between eye movement observations with the DCFPA are the reasons for hiding the gender information successfully in all privacy regions. Overall, the CFPA provides an optimal equilibrium between gender and document type classification success in the less private regions if gender information is not considered a feature that should be protected from adversaries. Otherwise, all proposed methods are able to hide the gender information in the higher privacy regions, as expected. The gender classification results are depicted in Table 2.
Especially for some methods with k-NNs and SVMs, the gender classification accuracies are close to zero because of the majority voting; this is validated by the results without majority voting in the table in S2 Table. Lastly for MPIIDPEye, we evaluate the person identification task using differentially private data. The resulting classification accuracies with majority voting are depicted in Table 3. By using the FPA, it is possible to identify the participants very accurately, which means that even though the document type classification accuracies of the FPA are higher than the others, a strong adversary can also identify personal ids when this method is used. The same trend also exists in the setting without majority voting, which is reported in the table in S3 Table. The CFPA and DCFPA perform well against person identification attempts in the high privacy regions. However, when the CFPA is used, it is possible to identify personal ids in the less private regions. Overall, for the MPIIDPEye dataset, the DCFPA performs better than the others due to its resistance against person identification and gender classification and its relatively high classification accuracies for the document type predictions. We conclude that this is due to the robust decorrelation effect of the DCFPA. For MPIIPrivacEye, we report the privacy sensitivity classification accuracies using differentially private eye movement features in Table 4. The FPA performs worse than our methods. The DCFPA, particularly with the chunk size of 32, slightly outperforms all other methods in the higher privacy regions, as is also the case for the utility results. In the lower privacy regions, the CFPA performs the best with ≈ 0.60 accuracy.
Since performance does not drop significantly at larger chunk sizes, it is reasonable to use larger chunk sizes, as they reduce the temporal correlations more effectively. While an accuracy of approximately 0.60 in a binary classification problem is not outstanding, according to previous work [29], privacy sensitivity classification using only eye movements on clean data in a person-independent setup also performs only marginally above 0.60. Therefore, even though we use differentially private data in the most private settings, we obtain results similar to the classification results on clean data. This means that differentially private eye movements can be used along with scene features for detecting privacy-sensitive scenes in AR setups.

The results of the person identification task on the MPIIPrivacEye dataset are similar to those on the MPIIDPEye dataset; the results with majority voting are depicted in Table 5. Personal identifiers are predicted very accurately when the FPA is used. The CFPA and DCFPA are resistant to person identification attacks in all privacy regions, performing around the random guess probability in almost all cases. Similar to the classification results on the MPIIDPEye dataset, the DCFPA method performs best when the utility-privacy trade-off is taken into consideration. The person identification results without majority voting are presented in the table in S4 Table.

Discussion
We compared our differential privacy methods with the standard Laplace mechanism as well as the FPA method, which was proposed for temporally correlated data, using the MPIIDPEye and MPIIPrivacEye datasets. The utility results based on the NMSE metric show that, due to the reduced sensitivities resulting from the chunking operations, the CFPA and DCFPA perform better than the FPA and the standard Laplace mechanism. While larger chunk sizes applied with the CFPA and DCFPA decrease the effect of temporal correlations on the differential privacy mechanisms, they also increase the sensitivities, leading to a higher amount of noise added to the data and worse utility. Utility evaluations measure how much the differentially private signals diverge from the original signals; eye movement feature signals that diverge less from the original values while still providing differential privacy lead to better performance in various tasks. While the FPA, CFPA, and DCFPA are all appropriate for temporally correlated data, the DCFPA uses the consecutive differences of the eye movement feature signals, which are significantly less correlated than the original feature signals, as illustrated in Figs 4 and 6. Thus, the DCFPA is less vulnerable to temporal correlations in the differential privacy context.
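The core of all three mechanisms is Fourier-domain perturbation. The following is a minimal NumPy sketch of that idea, not the paper's implementation: it keeps the first k DFT coefficients, adds Laplace noise to them, and inverts. The noise scale shown is a simplified placeholder (the paper derives the corrected scaling), and `fpa` is a hypothetical function name:

```python
import numpy as np

rng = np.random.default_rng(0)

def fpa(signal, k, epsilon, sensitivity):
    """Sketch of the Fourier perturbation idea: truncate the spectrum to
    the first k coefficients, perturb them with Laplace noise, and invert.
    The noise scale is a simplified stand-in for the corrected scaling."""
    n = len(signal)
    coeffs = np.fft.rfft(signal)          # real-input DFT
    scale = sensitivity * np.sqrt(k) / epsilon
    noisy = coeffs[:k] \
        + rng.laplace(0.0, scale, k) \
        + 1j * rng.laplace(0.0, scale, k)
    padded = np.zeros_like(coeffs)        # zero out the discarded tail
    padded[:k] = noisy
    return np.fft.irfft(padded, n)        # same length as the input
```

Chunking (CFPA/DCFPA) amounts to splitting the signal into fixed-size chunks and applying such a perturbation per chunk, which reduces the per-query sensitivity at the cost of weaker decorrelation within each chunk.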
In addition to the utility results, we evaluated document type, gender, and person identification tasks for the MPIIDPEye dataset, and privacy sensitivity classification of the observed scene plus a person identification task for the MPIIPrivacEye dataset, and compared our results with previous works, especially in the eye tracking literature. The FPA outperforms the CFPA and DCFPA in the document type classification task since the chunks perturb "Z"-type reading patterns. However, this might be a task-specific outcome, as the CFPA and DCFPA perform better in terms of utility. In addition, when the FPA is used, personal identifiers can be estimated with high accuracy in both datasets. In contrast, the DCFPA in particular yields decreased person identification probabilities for the MPIIDPEye, and probabilities close to the random guess probability for the MPIIPrivacEye, which is optimal from a privacy-preservation perspective. We remark that this outcome is also related to the decreased temporal correlations. Gender information is successfully hidden by all methods, and scene privacy can be predicted to some extent using differentially private eye movement signals. In addition, the privacy sensitivity detection results on the MPIIPrivacEye are consistent with the utility results based on the NMSE metric.
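The NMSE-based utility comparison can be sketched as follows; this uses one common normalization (by the mean squared magnitude of the clean signal), which may differ slightly from the exact definition used in the paper:

```python
import numpy as np

def nmse(original, private):
    """Normalized mean squared error between a clean feature signal and
    its differentially private version; lower values mean higher utility."""
    original = np.asarray(original, dtype=float)
    private = np.asarray(private, dtype=float)
    return np.mean((original - private) ** 2) / np.mean(original ** 2)

print(nmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # -> 0.0
```

Under this metric, a mechanism that adds less noise for the same privacy budget (e.g. via reduced chunk-wise sensitivity) directly shows up as a smaller NMSE.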
Due to the significant reduction of temporal correlations, high utility, and relatively accurate classification results across different tasks, the DCFPA is the best performing differential privacy method for eye movement feature signals. In addition, it is not possible to identify a person accurately from eye movement data when the DCFPA is used. From a correlation-reduction point of view, when the CFPA and DCFPA perform similarly, it is reasonable to use larger chunk sizes, as such chunks are less vulnerable to temporal correlations, as illustrated in Figs 3 and 5. Overall, our methods outperform the state of the art in differential privacy for aggregated eye movement feature signals.
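The decorrelation step that distinguishes the DCFPA can be sketched in isolation. The snippet below illustrates only the difference-based idea, with plain Laplace noise standing in for the chunk-wise Fourier perturbation the paper actually applies; `perturb_differences` is a hypothetical name:

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb_differences(signal, epsilon, sensitivity):
    """Illustration of the DCFPA's key idea: perturb the far less
    correlated consecutive differences instead of the raw signal,
    then integrate the noisy differences back into a signal."""
    signal = np.asarray(signal, dtype=float)
    # First entry is the starting value, so cumsum reconstructs exactly.
    diffs = np.concatenate(([signal[0]], np.diff(signal)))
    noisy = diffs + rng.laplace(0.0, sensitivity / epsilon, len(diffs))
    return np.cumsum(noisy)
```

Because smooth feature signals change slowly, their consecutive differences behave almost like uncorrelated noise, which is exactly what weakens the correlation-based attacks on standard differential privacy mechanisms.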

Conclusion
We proposed different methods to achieve differential privacy for eye movement feature signals by correcting, extending, and adapting the FPA method. Since eye movement features are temporally correlated and high dimensional, standard differential privacy methods provide low utility and are vulnerable to inference attacks. Thus, we proposed privacy solutions for temporally correlated eye movement data. Our methods can easily be applied to other biometric human-computer interaction data as well, since they are independent of the underlying data; they outperform the state-of-the-art methods in terms of both NMSE and classification accuracy and reduce the correlations significantly. In future work, we will analyze the actual privacy metric that takes the data correlations into account and choose the k values in a private manner for the centralized differential privacy setting.

Fig 3. Correlation coefficients of the raw signals of the feature ratio large saccade in the MPIIDPEye dataset. The values are calculated over a time difference of Δt (each time step corresponds to 0.5 s) w.r.t. the samples at the fifth time instance.
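The correlation analysis behind Figs 3-6 can be sketched as follows: across recordings, the feature value at a fixed reference time instance is correlated with the value Δt steps later. This is a minimal sketch with a hypothetical function name, assuming the data are arranged as a (recordings × time steps) matrix:

```python
import numpy as np

def lagged_correlations(recordings, ref_index=4, max_lag=10):
    """Pearson correlation between feature values at a reference time
    instance (fifth sample, index 4) and the values dt steps later,
    computed across recordings (one recording per row)."""
    recordings = np.asarray(recordings, dtype=float)
    ref = recordings[:, ref_index]
    return [np.corrcoef(ref, recordings[:, ref_index + dt])[0, 1]
            for dt in range(1, max_lag + 1)]
```

For raw feature signals these coefficients stay high even at large Δt, whereas for the difference signals they drop toward zero almost immediately, which is what makes the DCFPA less vulnerable to temporal correlations.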

Fig 6. Correlation coefficients of the difference signals of the feature blink rate in the MPIIPrivacEye dataset. The values are calculated over a time difference of Δt (each time step corresponds to 1 s) w.r.t. the samples at the fifth time instance.
If a large chunk size is chosen, the total value could be very large, which reduces privacy. Therefore, we choose chunk sizes of 32, 64, and 128 for the DCFPA as well for the evaluation. We illustrate the CFPA and DCFPA in Fig 2, for instance with three chunks.

Table 1.
Document type classification accuracies in the MPIIDPEye dataset using differentially private eye movement features with majority voting.

Table 3.
Person identification accuracies in the MPIIDPEye dataset using differentially private eye movement features with majority voting.

Table 4.
Privacy sensitivity classification accuracies in the MPIIPrivacEye dataset using differentially private eye movement features.

Table 5.
Person identification accuracies in the MPIIPrivacEye dataset using differentially private eye movement features with majority voting.