Ensuring privacy and confidentiality of cloud data: A comparative analysis of diverse cryptographic solutions based on run time trend

As technology develops, the cloud is becoming a hub for sensitive data, making it increasingly vulnerable, especially as more people gain access. Since a growing number of individuals use the cloud for a variety of purposes, data should be protected and secured. Confidentiality and privacy of data are attained through the use of cryptographic techniques. While every cryptographic method accomplishes the same objective, they differ in CPU usage, memory usage, throughput, and encryption and decryption times. It is necessary to contrast the various options in order to choose the optimal cryptographic algorithm. Data sizes of 5n×10² KB (n ∈ {1, 2, 4, 10, 20, 40}) are evaluated in this article. Performance metrics including run time, memory usage, and throughput were used in the comparison. To determine the effectiveness of each cryptographic technique, each data size was run fifteen (15) times, and the mean simulation results were reported. In terms of run time trend, NCS is superior to the other algorithms according to Friedman's test and Bonferroni's post hoc test.


Introduction
Human activity has risen, making communication more difficult and necessitating data protection [1]. A paradigm shift in data storage needs to be implemented in order to secure these enormous amounts of data [2]. Due to the enormous amount of data produced by numerous social media platforms, including Facebook, Twitter, Instagram, and e-commerce websites, cloud computing is currently the preferred option [3].
Amazon's four-hour cloud computing downtime in 2017 cost S&P 500 companies $150 million, according to Maeser [4]. Apica, a network traffic control organization, predicted that the top 54 e-commerce sites would experience a decline in activity of at least 20% [4]. According to Ponemon, Fortune 1000 companies lost just over $2.5 billion in 2015 as a result of data center shutdowns brought on by hackers. According to Maeser [4], the demand for cloud computing will increase by about 266% between 2013 and 2020 as a result of the massive volumes of data that the Internet of Things will produce.
Once more, Maeser [4] stressed that the Infrastructure-as-a-Service aspect of cloud computing will see an increase in demand of roughly 85%.
Because of the benefits of agility, scalability, availability, accelerated development, and lower operating costs through pay-as-you-use services, cloud computing continues to gain popularity over traditional on-site data centers [5,6]. As a result, Information Technology giants are now investing far more money in cloud computing than they did in the past. These advantages have led businesses to use cloud services such as Software-as-a-Service (SaaS), Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Container-as-a-Service (CaaS) for their pay-as-you-use activities on the cloud [7][8][9].
The adoption of cloud computing is accompanied by a number of security issues, such as data privacy and confidentiality [10][11][12]. In order to guarantee the secrecy and privacy of cloud data, cryptographic techniques have proven to be effective and efficient methods [12][13][14][15][16][18].
In this study, the algorithms Enhanced RSA (ERSA) [17], the Non-Deterministic Cryptographic Scheme (NCS) [18], the Enhanced Homomorphic Scheme (EHS) [19], Chacha20, and Salsa20 are compared based on run time trend, throughput, and memory usage to determine which of them could be used to ensure the confidentiality and privacy of cloud data.

Identified problem
The biggest problem with cloud computing has been data security. Researchers have suggested many variations of encryption techniques to protect cloud data [20]. Enhanced RSA [17], the Non-Deterministic Cryptographic Scheme [18], the Enhanced Homomorphism Scheme [19], Chacha20, and Salsa20 are a few examples of these techniques. These encryption techniques are effective in limiting unauthorized access to sensitive information.
In terms of run time trend, throughput, and memory complexity, however, it is unclear which of them performs best. Such knowledge is essential for industry startups, professionals, and researchers interested in using effective algorithms to protect the privacy and confidentiality of data in the cloud. Therefore, the major goal of this research is to test Enhanced RSA, the Non-Deterministic Cryptographic Scheme, the Enhanced Homomorphism Scheme, Chacha20, and Salsa20 in order to determine the computational statistics of the best method. This study also offers a solid framework that theoretically and practically combines all of the recognized algorithms into a robust system. The principal contribution of this paper is the proposition of comprehensive cryptographic scheme(s) that can be used to ensure the confidentiality and privacy of cloud data.

Literature review
Cloud computing users save their data in the cloud, making it a remote, location-based system. This compromises secrecy, which is essential for cloud computing to be acceptable. To boost security and trustworthiness, cryptography is often used in cloud computing.
ALmarwani et al. [21] presented a unique tagging approach called Tagging of Outsourced Data (TOD) in an endeavor to protect the secrecy of data stored in the cloud. Their method supported the verification of cloud data and had a short run time, enabling widespread use by mobile devices. Tahir et al. [22] presented CryptoGA, a genetic-algorithm-based scheme that likewise targets data privacy. When compared to state-of-the-art algorithms like AES, RSA, and DES, their approach had shorter execution times.
Shen et al. [23] advocated using proxy re-encryption and Oblivious Random Access Memory (ORAM). Their technique was designed to ensure multi-user data sharing on the cloud. The ciphertext obtained through proxy re-encryption enabled members to regulate access and, as a consequence, establish data privacy.
Garad et al. [24] suggested a cryptosystem to protect files submitted to the cloud server. AES-CCM, AES-GCM, and CHACHA20_POLY1305 were the symmetric cryptographic algorithms they employed. They divided the file into N pieces, then used various cryptographic techniques to encrypt each portion. Thabit et al. [25] ensured the confidentiality and privacy of cloud data by proposing a lightweight cryptographic algorithm. Their algorithm integrated Feistel and substitution schemes to raise its encryption complexity and was very effective with regard to run time. Tiwari and Neogi [26] proposed a security scheme that secured a multi-tenant hybrid cloud by combining the Kerberos Authentication Protocol with a Resource Allocation Manager Unit (RAMU). Their scheme allowed for more resource access while also improving client confidentiality and security. The model validates the user's request before providing access, preventing the password from being revealed to hackers during transmission. The Key Distribution Centre (KDC) validates the request, and RAMU grants access after reviewing the control database and resource allocation map. Gadde et al. [27] suggested an improved Blowfish cryptography strategy for encrypting and decrypting sensitive data in the cloud server using an optimal key. Optimal key creation is a critical method for achieving integrity and confidentiality goals. Similarly, data restoration (decryption) is the inverse of sanitization. In [28], a four-step data security approach for cloud computing was proposed. The authors used the least-significant-bit (LSB) approach to integrate three cryptographic algorithms, RSA, AES, and identity-based encryption, with steganography to attain the confidentiality and privacy of cloud data.

Enhanced RSA
Enhanced RSA improved traditional RSA's security by combining classic RSA with the Gaussian interpolation formula. The integration raises the security of RSA to the fifth level. After encrypting the message's ASCII values with Gaussian first forward interpolation, conventional RSA is used to encrypt and decrypt the message at the second and third levels. The last stage uses Gaussian first backward interpolation to decode the data again, as seen in Fig 1. The integration helps to overcome the classic RSA factorization problem [17].

Non-Deterministic Cryptographic Scheme
This method consists of three stages: key generation, encryption, and decryption. The three levels of key generation are meant to produce secret keys that help secure the algorithm [18]. These include the use of good prime numbers, a Linear Congruential Generator, a Fixed Sliding Window Algorithm, and XORing the output with the plaintext. In NCS, a sub-array of n values a_i is computed over the twelve numbers generated after applying the Fixed Sliding Window [18].
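As an illustration of the pipeline described above, the sketch below chains a Linear Congruential Generator seeded by the product of two primes, a fixed-width sliding window over the twelve generated numbers, and an XOR combination with the plaintext. All concrete values here (the primes, the LCG parameters, the window width, and the key-derivation rule) are assumptions for demonstration only, not the values specified for NCS.

```python
# Illustrative sketch of an NCS-style key pipeline; all parameters are
# assumed for demonstration, not taken from the NCS specification.

def lcg(seed, n, a=1103515245, c=12345, m=2**31):
    """Linear Congruential Generator: x_{k+1} = (a*x_k + c) mod m."""
    out, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out

def sliding_windows(values, width):
    """Fixed-width sliding window over the generated numbers."""
    return [values[i:i + width] for i in range(len(values) - width + 1)]

# Two illustrative primes supply the seed (not "good primes" per the scheme).
p, q = 10007, 10009
stream = lcg(p * q, 12)              # twelve numbers, as in the scheme
windows = sliding_windows(stream, 3) # assumed window width of three

# Derive key bytes from the windows and XOR with the plaintext (toy step).
key = [w[0] % 256 for w in windows]
plaintext = b"cloud data"
cipher = bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))
restored = bytes(b ^ key[i % len(key)] for i, b in enumerate(cipher))
```

Because XOR is its own inverse, applying the same key stream to the ciphertext restores the plaintext, which is the property the decryption stage relies on.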

Enhanced Homomorphism Scheme
The Enhanced Homomorphism Scheme (EHS) is developed and implemented by the amalgamation of Good Prime Numbers (GPN), a Linear Congruential Generator (LCG), a Fixed Sliding Window Algorithm (FSWA), and Gentry's algorithm. Two stages are considered in this algorithm: the generation of keys and the application of the homomorphism scheme. Three procedures are used to generate the keys, beginning with the generation of two good prime numbers whose product serves as the seed for the Linear Congruential Generator to produce twelve numbers. The sliding window algorithm is then applied to the twelve numbers using a sub-array of three. The first value is s_i, the second s_j, the third s_k, and the fourth s_l; together with the plaintext M, these are used for data encryption as seen in Eq 1 [19].

Salsa20
The three operations of Salsa20, addition, rotation, and XOR, provide the desired cryptographic features, with modular addition providing non-linearity and bit rotation providing diffusion within a word. The diffusion attribute is propagated from one word to the next via the XOR operation [29].

Chacha20
Chacha20 is a 2008 update of Salsa20 that uses a new round function to boost diffusion. Salsa20 uses a core hash function based on 32-bit modular addition, XOR, and rotation, whereas Chacha20 uses an internal state of sixteen 32-bit words arranged as a 4 × 4 matrix to map a 256-bit key (a 128-bit key is also supported, as in Salsa20), a 64-bit nonce, and a 64-bit counter to a 512-bit keystream block. While the configurations differ, both initial states are made up of 8 words of key, 2 words of stream position (counter), 2 words of nonce, and 4 words of constant. Chacha20 replaces Salsa20's quarterround with one that updates each word twice using the same number of operations, as shown in Eq 3. Unlike Salsa20, which alternates quarterrounds down columns and across rows, Chacha20 executes quarterrounds down columns and along diagonals within a doubleround. Chacha20 likewise employs 10 doubleround iterations and the same output function [30].
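Eq 3 is not reproduced here, but the Chacha20 quarterround is standardized in RFC 8439; the sketch below implements it directly, showing how each of the four words is updated twice per quarterround using only 32-bit modular addition, XOR, and fixed-distance rotation.

```python
# ChaCha20 quarter-round as specified in RFC 8439.
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def quarter_round(a, b, c, d):
    """Each input word is updated twice using 32-bit modular addition,
    XOR, and fixed rotations by 16, 12, 8, and 7 bits."""
    a = (a + b) & MASK32; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 7)
    return a, b, c, d

# RFC 8439 Section 2.1.1 test vector for a single quarter-round.
out = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
# -> (0xEA2A92F4, 0xCB1CF8CE, 0x4581472E, 0x5881C4BB)
```

A doubleround applies this function four times down the columns of the 4 × 4 state and then four times along its diagonals.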

The proposed framework of the system
This section provides a broad summary of the comparative study of the Enhanced RSA, Non-Deterministic Cryptographic Scheme, Enhanced Homomorphism Scheme, Chacha20, and Salsa20. For ERSA, NCS, EHS, Chacha20, and Salsa20, the architecture is divided into five phases: key generation, encryption, decryption, memory utilization, and throughput. Fig 4 shows a snapshot of the run time for 4KB of data for Chacha20 using the system, with further information available at https://github.com/Elkie1/Chacha20.

Experimentation
The comparative analysis of the Enhanced RSA [17], Non-Deterministic Cryptographic Scheme [18], Enhanced Homomorphism Scheme [19], Salsa20, and Chacha20 was implemented on an i7 Lenovo computer with a 2.10 GHz CPU using the C# language. The implementation language influences execution time: a .NET C# implementation used to test the AES algorithm achieved 300 MB/second, while an OpenSSL C simulation produced an average speed of 960 MB/second [31].

Description of dataset used in this work
The study's dataset was collected from the Kaggle database [32]. The dataset was used to assess the robustness of the algorithms in terms of run time trend, memory use, and throughput. The dataset is an English-to-French translation corpus that includes text, numbers, and special characters. This is critical since Loyka et al.'s [33] investigation showed divergent results when only text and numbers were used. The algorithms were evaluated using data sizes of 5n×10² KB (n ∈ {1, 2, 4, 10, 20, 40}). Each data size was run fifteen (15) times to ensure the accuracy of the run time, and the mean and standard deviation of the executions were computed.
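As a quick check of the size formula, the snippet below expands 5n×10² KB over the stated values of n and sketches the mean/standard-deviation reporting used for the fifteen runs; the timing values are hypothetical, not the paper's measurements.

```python
from statistics import mean, stdev

# The six evaluated data sizes follow 5n * 10^2 KB for n in {1, 2, 4, 10, 20, 40}.
sizes_kb = [5 * n * 10**2 for n in (1, 2, 4, 10, 20, 40)]
# -> 500, 1000, 2000, 5000, 10000, 20000 KB

# Each size is run repeatedly and the mean and standard deviation are
# reported. The run times below are hypothetical placeholders.
runs_ms = [15.2, 16.1, 15.7]
report = (round(mean(runs_ms), 2), round(stdev(runs_ms), 2))
```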

Encryption time
From Table 1, Chacha20 had the lowest mean encryption time of 15.9047±1.69 milliseconds, followed by NCS (52.13±31.1766 milliseconds), with Salsa20 having the highest mean encryption time of 853±85.06 milliseconds when a data size of 500KB (0.5MB) was executed. The encryption time for Salsa20 increased from 853±85.06 milliseconds to 1302.8±703.97 milliseconds when the data size was increased to 1000KB, making it linear [34,35]. The encryption time for ERSA also increased, from 462.93±40.93 milliseconds to 575.67±57.05 milliseconds, when the data size was raised to 1000KB. However, the encryption time for NCS reduced from 147.33±172.41 milliseconds to 85.8±54.46 milliseconds and then to 82.2±75.17 milliseconds when the data size was increased from 500KB to 1000KB and from 2000KB to 5000KB. From Table 2, it can be seen that for Salsa20, Chacha20, and ERSA, run time increases as data size increases, making them linear (O(N)). However, the run time for NCS and EHS alternates as the data size increases, making it non-linear.

Encryption throughput (KB/ms)
The number of units of data that can be processed in a given time is considered the throughput [36]. This is computed using Eq 4.

Throughput = Size of Data / Run Time (4)
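Eq 4 can be sketched directly; the example below feeds in the 500KB Chacha20 mean encryption time reported in Table 1, and the helper name is illustrative rather than code from the study.

```python
# Throughput per Eq 4: size of data divided by run time.
def throughput_kb_per_ms(size_kb, run_time_ms):
    return size_kb / run_time_ms

# 500 KB encrypted in 15.9047 ms (Chacha20's mean encryption time from
# Table 1) yields roughly the 31.44 KB/ms reported in Table 3.
t = throughput_kb_per_ms(500, 15.9047)
```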
From Table 3, with 500KB of data, Salsa20 had the lowest mean encryption throughput of 0.039078 KB/ms, with Chacha20 having the highest throughput of 31.43731 KB/ms. Also, when the data size was increased to 2000KB, NCS had the highest mean encryption throughput of 23.31002 KB/ms, with Salsa20 having the lowest (0.04541 KB/ms), followed by ERSA with a mean encryption throughput of 2.87383 KB/ms. When the data size was increased to 10000KB, NCS had the highest encryption throughput of 101.9714 KB/ms. However, with a data size of 20000KB, EHS had the highest mean encryption throughput of 134.6499 KB/ms, followed by NCS with a mean encryption throughput of 122.6492 KB/ms.

Decryption throughput
From Table 4, NCS had the highest mean decryption throughput of 6.756757 KB/ms, with Salsa20 having the lowest mean decryption throughput of 0.575109 KB/ms, when a data size of 500KB was executed. Again, when the data size was increased to 2000KB, NCS had the highest mean decryption throughput of 12.53656 KB/ms, followed by EHS with a mean decryption throughput of 8.5179 KB/ms. However, with a data size of 20000KB, EHS had the highest mean decryption throughput of 145.0677 KB/ms, followed by NCS with a mean decryption throughput of 125.8917 KB/ms.

Encryption memory usage
From Table 5, with a data size of 500KB, Salsa20 had the highest memory complexity of 196.4±32.24 megabytes, followed by Chacha20 (191.8±35.91 megabytes), with EHS having the lowest memory complexity of 16±2.75 megabytes. When the data size was increased to 20000KB, Salsa20 used 6160.27±650.17 megabytes of memory, still the highest, followed by Chacha20 (6092.27±653.58 megabytes), while NCS had the lowest memory complexity of 16.4±3.78 megabytes.

Decryption memory usage
From Table 6, Salsa20 had the highest mean memory complexity of 192.13±34.41 megabytes when 500KB of data was decrypted, with NCS having the lowest mean memory complexity of 15.87±3.04 megabytes. However, when the data size was increased to 20000KB, Chacha20 had the highest memory complexity of 6281.4±713.08 megabytes, with NCS still having the lowest memory complexity of 17.07±3.28 megabytes.

Comparing significant differences between the encryption and decryption times using Friedman's test and the Bonferroni post hoc test
Friedman's test evaluates the null hypothesis that "all treatment effects are zero" against the alternative hypothesis that "not all treatment effects are zero". From the output in Tables 7 and 9, the P-value = 0.00 < 0.05 (the alpha value), which indicates that the difference in encryption times is statistically significant across the different algorithms and data sizes.
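To make the test concrete, the Friedman statistic can be computed by ranking the algorithms within each block (here, each data size) and comparing rank sums. The pure-Python sketch below ignores ties and uses hypothetical run times; it is not the statistical package used in the study.

```python
# Minimal Friedman test statistic (chi-square approximation, ties ignored).
# chi2 = 12 / (n*k*(k+1)) * sum(R_j^2) - 3*n*(k+1), where n is the number
# of blocks, k the number of treatments, and R_j the rank sum of treatment j.

def friedman_statistic(blocks):
    """blocks: list of per-block measurement lists, one value per treatment."""
    k = len(blocks[0])
    n = len(blocks)
    rank_sums = [0.0] * k
    for block in blocks:
        order = sorted(range(k), key=lambda j: block[j])  # rank within block
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical run times (ms) for 3 algorithms across 4 data sizes.
times = [[12.0, 20.0, 35.0],
         [15.0, 22.0, 40.0],
         [11.0, 25.0, 38.0],
         [14.0, 21.0, 37.0]]
stat = friedman_statistic(times)
```

With the first algorithm fastest in every block, the statistic reaches its maximum n(k−1) = 8, which would then be compared against a chi-square distribution with k−1 degrees of freedom.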
A Bonferroni post hoc pairwise comparison test was used to determine whether there was a significant difference between pairs of algorithms and data sizes. According to Table 8, the encryption times for EHS and NCS were statistically different from those of Salsa20, Chacha20, and ERSA, with P-values less than 0.05 (P-value < 0.05, reject H0).
From Table 10, the decryption times for the pairs Salsa20-Chacha20, ERSA-EHS, and NCS-EHS are not statistically different.

Discussions
From Table 1, it can be deduced that the encryption times for Salsa20 and Chacha20 were proportional to the data sizes executed, which resulted from the additive, XORing, and constant-distance rotation operations during execution [37,38]. ERSA encryption times also showed a proportional relationship between data size and encryption time [17]. This made their encryption times predictable, deterministic, and patterned, which confirms the work of Masram et al. [39] and [38,[40][41][42][43][44][45]. The use of longer keys ensures higher security but results in higher CPU utilization when encryption time is dependent on data size (O(N)) [46]. However, smaller keys are best employed in cloud computing due to less CPU engagement [47].
The encryption time for NCS and EHS is non-patterned, non-deterministic, and unpredictable because of the disintegration of the keys through the application of a Fixed Sliding Window Algorithm and the XORing of the keys with the plaintext, which makes NCS and EHS resistant to attacks that attempt to break the resultant cipher by XORing any captured encoded text [48]. Again, the randomization from the application of the Sliding Window Algorithm helps to increase the security of the encrypted data and also reduces the time complexity of the Non-Deterministic Cryptographic Scheme [49,50]. From the trend of the encryption times in Table 1, it can be concluded that the encryption times for NCS and EHS depend not on the data size but on the size of the key, while the encryption times for Salsa20, Chacha20, and ERSA are influenced by data size.
From these discussions, it can be concluded that data size is proportional to the decryption time for Salsa20, Chacha20, and ERSA, as indicated in Table 2. This makes their decryption time trend deterministic, predictable, and linear, which is supported by the works of [44,51]. With a linear trend of decryption time, hackers can predict, intercept, and modify data [52].
Based on these discussions, it is possible to conclude that ERSA, Salsa20, and Chacha20 produced linear, predictable, deterministic, and high decryption times, making them vulnerable to side-channel attacks; thus they do not guarantee the absolute privacy and confidentiality of data on the cloud, as suggested by Kumar et al. [34] and Karthik [35]. The application of the Fixed Sliding Window Algorithm, which disintegrates the huge numbers obtained from the selection of the good prime numbers as the initial keys, the Linear Congruential Generator, and the XORing of the keys with the ciphertext to obtain the plaintext caused NCS and EHS to have unpredictable, non-deterministic, and non-linear decryption times.
This has the advantage of reducing bandwidth utilization, since data encryption and decryption raise the overhead cost of data processing [53]. Again, non-linear encryption timings aided in increasing data secrecy and privacy while reducing device wear and tear for industry participants and individuals [18].
According to Tables 1 and 3, encryption time is inversely related to throughput. Algorithms with higher throughput use less CPU, and vice versa [54].
When Tables 2 and 4 were compared, it could be deduced that a long decryption time corresponded to a low throughput. This supports the findings of Abolade et al. [54], who discovered that algorithms with a high throughput need less CPU time.
Table 5 shows that algorithms that consume less memory during execution reduce computational bottlenecks for the CPU and, as a consequence, are regarded as the best [55]. According to Table 6, Salsa20 is more memory-intensive since its operation is based on 20 rounds with 10 repeated iterations [56]. Because the secret key and the ciphertext are XORed without padding, NCS had the lowest mean memory usage.
The Friedman Test and Bonferroni Post Hoc test results from Tables 7-10 show that the encryption and decryption times for NCS and EHS are statistically different from Salsa20, Chacha20, and ERSA.
It can be summarized that NCS and EHS produced lower, non-deterministic, non-patterned, and secret-key-dependent run times, which challenges the standing of ERSA, Salsa20, and Chacha20 as the fastest symmetric algorithms, since NCS and EHS also used less memory during data execution [49]. This makes NCS a lightweight algorithm suitable for the cloud and other areas where fast, lightweight algorithms are needed. It could also be used in environments with mobile and other memory-constrained devices, such as the Internet of Things.

Conclusion
To ensure the secrecy and privacy of data in the cloud, modern cryptographic techniques are applied to encode and decode data. These encryption techniques have computational overheads that have an impact on cloud performance. The symmetric stream-cipher algorithms ERSA, Salsa20, Chacha20, NCS, and EHS have all been thoroughly examined. ERSA, Salsa20, and Chacha20 are strong cryptographic schemes for securing the privacy and secrecy of data stored in the cloud. However, compared to NCS and EHS, their run times are linear, predictable, and long, rendering them vulnerable to side-channel attacks.
Their linear run-time trends result in significant bandwidth use and hardware wear and tear during the transfer of large amounts of data, making them unsuitable for a cloud computing environment. Additionally, because of their linear run times, hackers can estimate the execution time of any piece of data. The Friedman and Bonferroni post hoc tests, however, showed that NCS and EHS had the advantage of producing non-linear, non-patterned, non-deterministic run-time trends, the lowest run times, high throughput, and lower memory consumption during execution.
Since NCS and EHS ensure reduced bandwidth usage, prevent wear and tear of hardware, and maximize the utilization of any device without much attention to hardware specifications, industry players and academia can be optimistic about fully embracing cloud computing. Future studies should focus on experiments using computers with higher specifications. Additionally, research should be done to compare the security strength of NCS with other cutting-edge algorithms.

Table 2 presents the comparison of the mean decryption time trend for Salsa20, Chacha20, ERSA, NCS, and EHS. With a data size of 500KB, NCS had the lowest decryption time of 74±45.16 milliseconds, followed by Chacha20 (281.33±35.42 milliseconds) and EHS with a decryption time of 368.4±133.88 milliseconds. When the data size was increased to 1000KB, NCS had the lowest decryption time of 105.6±71.78 milliseconds, with Salsa20 having the highest run time of 869.4±223.18 milliseconds.