New insights into the existing image encryption algorithms based on DNA coding

Because a DNA nucleotide sequence has the characteristics of large storage capacity, high parallelism, and low energy consumption, DNA cryptography is favored by information security researchers. Image encryption algorithms based on DNA coding have become a research hotspot in the field of image encryption and security. In this study, based on a comprehensive review of the existing studies and their results, we present new insights into the existing image encryption algorithms based on DNA coding. First, the existing algorithms were summarized and classified into five types, depending on the type of DNA coding: DNA fixed coding, DNA dynamic coding, different types of base complement operation, different DNA sequence algebraic operations, and combinations of multiple DNA operations. Second, we analyzed and studied each classification algorithm using simulation and obtained their advantages and disadvantages. Third, the DNA coding mechanisms, DNA algebraic operations, and DNA algebraic combination operations were compared and discussed. Then, a new scheme was proposed by combining the optimal coding mechanism with the optimal DNA coding operation. Finally, we revealed the shortcomings of the existing studies and the future direction for improving image encryption methods based on DNA coding.


Introduction
Since Dr. Adleman of the United States used DNA molecular biological computing to solve the directed path problem of seven nodes [1] in 1994, DNA computing has attracted the attention of researchers across the world [2][3][4][5]. In DNA computing, coding sequences of the DNA nucleotide bases A, C, G, and T are used as carriers of information, which offers great advantages in storage capacity, parallelism, and energy consumption [6][7][8][9]. DNA computing essentially uses biochemical experiments to address practical problems. However, because of the limitations of biochemical reaction conditions, such as expensive experimental equipment, environmental requirements, difficulty in extracting DNA sequences, and difficulties in controlling the concentration, temperature, and pH of the reactants, studying DNA computing is difficult. For image encryption, researchers therefore bypass the complex experimental steps of DNA computing, use DNA coding only to carry image information, and design a reasonable and effective encryption algorithm by combining chaotic mapping with different DNA coding algorithms. This idea gives rise to new perspectives in the research on image encryption.
From 2009 to 2014, Zhang et al. innovatively used DNA coding methods for image encryption, and proposed many image encryption algorithms based on DNA coding [10][11][12][13][14][15][16][17]; this created new ideas in DNA cryptography. Their main encryption ideas involve the following four steps. First, the original image is encoded using the DNA encoding rule. Then, the positions in the DNA-encoded matrix are scrambled by the chaotic sequences generated by a one-dimensional chaotic system [14], a combination of several low-dimensional chaotic systems [10][11][12], a hyperchaotic system [13,15], or a combination of several hyperchaotic systems [16,17]. Subsequently, the values of the image pixels are changed by various operations, namely, DNA addition and subtraction [10,12,13], XOR [15], DNA subsequence [16], DNA complement [11], or a combination of these operations [14,17], under the control of chaotic sequences. Finally, the encrypted image is obtained by DNA decoding and regrouping. The image encryption technology based on DNA coding proposed by Zhang et al. combined DNA coding and image encryption effectively. Consequently, the application of DNA cryptography in image encryption garnered the attention of many researchers at home and abroad. However, in recent years, researchers have highlighted many shortcomings in the methods proposed by Zhang et al. For example, [18,19] observed that the method of Zhang [12], which combines DNA addition with chaotic mapping to encrypt an image, uses a non-invertible addition operation; therefore, the decrypted image cannot be obtained. Belazi [20], Liu [21], and Wang [22] showed that the RGB image encryption algorithm based on DNA coding and chaotic mapping proposed by Liu and Zhang [14] can be broken by a chosen-plaintext attack (CPA) that restores the original images. Wang indicated that the main drawback of this algorithm is that the key used for the encryption is independent of the original image.
Furthermore, changing a base in DNA can only affect the base in one position; moreover, the diffusion ability of the base is poor, resulting in poor diffusion ability of the image pixels. Similarly, an image fusion encryption algorithm based on the DNA sequence operation and hyper chaotic system proposed by Zhang and Guo et al. [15] was broken by choosing different sizes of plain images in [23][24][25][26]. Li [27] analyzed some image encryption algorithms based on the DNA subsequence operation and chaotic system proposed by Zhang [16] using CPA to obtain original images. In summary, although the methods proposed by Zhang et al. are effective, their security is very poor.
Since 2015, some researchers have combined multiple simple chaotic systems, or adopted more complex or more secure chaotic systems, to encrypt images with enhanced security. For instance, Wu [28] used three one-dimensional chaotic systems combined with the DNA addition, subtraction, and XOR operations. Zhang [29] and Mondal [30] used a mixed linear-nonlinear coupled map lattices (MLNCML) system with an embedded logistic map, and Zhang [31] used a fractional-order piecewise-linear (PWL) hyperchaotic map combined with DNA addition and subtraction, to realize image encryption. Li [32] used the fractional-order hyperchaotic Lorenz (FOHCL) mapping to determine the rules of DNA addition and subtraction as well as DNA XOR. In [33], color images are encrypted using a logistic system in combination with DNA addition, subtraction, and XOR. Similarly, Zhang [34] used a logistic system combined with DNA addition, subtraction, and complementation. The DNA coding method used in the above-mentioned algorithms is the same as that in the method of Zhang et al.: it adopts fixed DNA coding with given DNA coding operation rules (DNA addition and subtraction, DNA XOR, and DNA complementation). The security of these encryption algorithms is guaranteed by the chaotic system. Without a chaotic system, image encryption based on DNA would only be a binary bit computation; security would be greatly threatened, and it would be difficult to resist the brute-force attack, known-plaintext attack (KPA), and CPA. For instance, Dou [35] used a brute-force attack and CPA to analyze Ref. [30], and restored the plaintext image. In addition, Kumar [36] used DNA encoding and elliptic curve Diffie-Hellman cryptography without a chaotic system to realize RGB image encryption. However, this method was less secure, and was broken by Akhavan [37] via CPA.
For these reasons, dynamic DNA coding and more complex DNA coding algorithms have been proposed in recent years. Kalpana [38], Zhang [39], Zhen [40], Cai [41], Rehman [42], and many others proposed various dynamic DNA coding algorithms. They first defined different DNA coding rules, and then used chaotic sequences to select DNA coding rules dynamically to encode images. Further, [39,43,44] proposed replacing the conventional single-base complement operation with a complement operation based on the principle of base complementation, to increase the complexity of the DNA operation. These encryption algorithms achieved better encryption results. Belazi [45] proposed a novel medical image encryption scheme based on chaos and DNA encoding, and realized two encryption rounds using different coding rules and the complement and XOR operations under chaotic control. This method achieves good encryption, and could resist all types of attacks. Furthermore, a new image encryption scheme based on CML and DNA sequences was proposed by Wang [46], which also performs good encryption. Nevertheless, because fixed DNA coding was selected in their coding process, the security could be improved. Dagadu [47] proposed a medical image encryption scheme based on multiple chaos and DNA coding, which uses different DNA coding rules and the XOR operation combined with a chaotic map to realize image encryption. Although the scrambling degree in their method was high, it could not resist CPA and KPA.
This literature review reveals that researchers generally improve the performance of the encryption algorithms by changing the DNA coding methods and operations. The existing studies simply use a DNA coding method, DNA coding operation (addition, subtraction, complement, XOR, etc.), or a combination of multiple coding operations to realize image encryption, without comprehensively comparing or analyzing the methods or operations. Put differently, in any existing study, the reasons for selecting a specific DNA coding (fixed or dynamic coding) method, DNA coding operation, or a combination of multiple coding operations to realize image encryption are unclear.
Therefore, this study provides new insights into the existing image encryption algorithms based on DNA coding. We categorized the existing image encryption algorithms based on DNA coding into five types, depending on the type of DNA coding: fixed DNA coding, dynamic DNA coding, different types of DNA base complement operations, different DNA sequence algebraic operations, and combinations of multiple DNA operations. Then, a more detailed classification is performed according to the characteristics of the different algorithms. Furthermore, we comprehensively compared and analyzed all these methods, and revealed their advantages and disadvantages. Further, the DNA coding mechanism, DNA algebraic operation, and DNA algebraic combination operation are compared, and the newly proposed scheme is discussed. Finally, the study highlights the shortcomings and indicates the future research direction for improving image encryption based on DNA coding.
The remainder of this manuscript is structured as follows. In the second section, the theoretical basis of this study is introduced. In the third section, we categorize the existing DNA coding-based image encryption methods, and discuss their advantages and disadvantages by a comprehensive comparison and analysis, and on the basis of above discussion a new method is proposed. Finally, the last section discusses the shortcomings and the future research direction for improving image encryption based on DNA coding.

DNA coding
A DNA sequence is composed of four nucleic acid bases, namely, adenine (A), cytosine (C), guanine (G), and thymine (T), wherein A and T, and G and C, are complements. Researchers use the binary values 00, 01, 10, and 11 to denote these four bases. A total of 24 types of coding can be listed; however, according to the complementary relation between 0 and 1 in binary, 00 and 11 are complements, and 01 and 10 complement each other. Therefore, only 8 of the 24 coding rules satisfy the base complementary criterion; they are shown in Table 1. The pixel value of an image lies in [0, 255] and is represented by an eight-bit binary number; therefore, a single pixel value can be represented by four DNA bases. For example, for a pixel value of 175, the corresponding binary representation is "10101111," the DNA sequence generated by the R1 rule presented in Table 1 is "GGTT," and decoding is performed with the same rule or with one of the other seven rules. Almost all existing image encryption algorithms based on DNA coding rely on these coding rules or on variants of them.
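The counting argument and the worked example above can be checked with a short sketch. It enumerates all 24 assignments of 2-bit codes to bases and keeps those in which complementary bases receive complementary codes. The function names are illustrative, and the order in which the rules are produced is arbitrary and is not assumed to match the numbering of Table 1; only R1 (A = 00, C = 01, G = 10, T = 11) is taken from the text, inferred from the 175 to "GGTT" example.

```python
from itertools import permutations

BASES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def valid_rules():
    """Return every base -> 2-bit code mapping in which complementary bases
    get complementary codes (00 <-> 11 and 01 <-> 10, i.e. codes summing to 3)."""
    rules = []
    for codes in permutations(range(4)):      # the 24 candidate codings
        rule = dict(zip(BASES, codes))
        if all(rule[b] + rule[COMPLEMENT[b]] == 3 for b in BASES):
            rules.append(rule)
    return rules

def encode_pixel(value, rule):
    """Encode one 8-bit pixel as four bases, most significant bit pair first."""
    base_of = {code: base for base, code in rule.items()}
    return "".join(base_of[(value >> shift) & 3] for shift in (6, 4, 2, 0))

def decode_pixel(bases, rule):
    """Decode four bases back into one 8-bit pixel value."""
    value = 0
    for base in bases:
        value = (value << 2) | rule[base]
    return value

RULES = valid_rules()
# R1 as implied by the worked example in the text (175 -> "GGTT").
R1 = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

print(len(RULES))             # number of rules that satisfy the criterion
print(encode_pixel(175, R1))  # DNA sequence for pixel value 175
```

Decoding with the same rule always restores the pixel, which is why a fixed rule on its own provides no secrecy.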

DNA complement rules
Two methods are available for image encryption based on DNA sequence complement: (1) the single base direct complement method, and (2) the method that uses the principle of single base and double base complementary pairing in biotechnology to perform the complement operation. A single base direct complement is defined as follows:

$$\mathrm{complement}(A) = T,\quad \mathrm{complement}(T) = A,\quad \mathrm{complement}(C) = G,\quad \mathrm{complement}(G) = C \tag{1}$$

where complement(·) is the complement function. The complement of base A is T, and that of base C is G. The corresponding binary complement is satisfied because the complement of 00 is 11 and that of 01 is 10, and vice versa. The second complement rule is defined based on the double helix structure, in which nucleosides are paired. Supposing the complementary transformation is D, every nucleoside $x_i$ satisfies the following equation:

$$\begin{cases} x_i \neq D(x_i) \neq D(D(x_i)) \neq D(D(D(x_i))) \\ x_i = D(D(D(D(x_i)))) \end{cases} \tag{2}$$

where $x_i$ and $D(x_i)$ are complementary, i.e., $x_i$ and $D(x_i)$ form a base pair. These base pairs must satisfy the condition of single-shot (one-to-one) mapping. Based on the above equation, the base pairs satisfying single-shot mappings are listed in Table 2 [48].

DNA addition, subtraction, and XOR algebra operations
Because encoding and decoding processes are implemented in binary, the algebraic operation of the DNA sequence (addition or subtraction) follows the binary operation rule. The coding of eight different rules corresponds to eight different operations (addition or subtraction). In the processes of encryption and decryption, the addition and subtraction operations are reciprocal. Furthermore, the inverse operation of XOR remains an XOR operation. Therefore, we only list the addition and XOR operations, and Table 3 is based on the addition and XOR operations of R1 in Table 1.
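As an illustration of these operations under R1, the following sketch builds the addition, subtraction, and XOR tables on bases. It assumes that DNA addition is 2-bit binary addition with the carry discarded (addition modulo 4); this is an interpretation of "follows the binary operation rule," not a definition taken from Table 3. The helper names are hypothetical.

```python
# R1 coding as implied by the text: A=00, C=01, G=10, T=11.
R1 = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE_OF = {code: base for base, code in R1.items()}

def dna_add(x, y):
    """Base addition, assumed here to be 2-bit addition modulo 4."""
    return BASE_OF[(R1[x] + R1[y]) % 4]

def dna_sub(x, y):
    """Base subtraction, the inverse of dna_add under the same assumption."""
    return BASE_OF[(R1[x] - R1[y]) % 4]

def dna_xor(x, y):
    """Base XOR: bitwise XOR of the two 2-bit codes."""
    return BASE_OF[R1[x] ^ R1[y]]

def table(op):
    """Return the operation table as {row base: results for columns A, C, G, T}."""
    return {x: "".join(op(x, y) for y in "ACGT") for x in "ACGT"}

for name, op in (("add", dna_add), ("xor", dna_xor)):
    print(name, table(op))
```

Under this assumption, subtraction undoes addition and XOR undoes itself, which matches the statement that only the addition and XOR tables need to be listed.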

Correlation equation of this study
For a comprehensive comparison, we apply the logistic chaotic sequence with the same parameters to every DNA sequence operation. Additionally, the histogram, information entropy [49], rate of base distribution, Hamming distance [50], fixed point ratio [51], histogram variance [52], correlation coefficient, number of pixel change rate (NPCR), and unified average changing intensity (UACI) are used as the evaluation indexes. The relevant theories of these evaluation indexes are given by Eqs (3)-(16).

Logistic map
The logistic map is a widely used chaotic map. It is used in this study for experimental simulation, and is described as follows:

$$x_{n+1} = \mu x_n (1 - x_n) \tag{3}$$

where $\mu \in [0, 4]$, $x_n \in (0, 1)$, and $n = 0, 1, 2, \ldots$. The system is in a chaotic state when $3.569945 < \mu \le 4$.

Table 3. DNA sequence of addition and XOR operations.

The global information entropy is defined below.

Global information entropy and local information entropy
$$H(m) = \sum_{i=0}^{L-1} P(m_i) \log_2 \frac{1}{P(m_i)} \tag{4}$$

where $m_i$ denotes the $i$th grey value of the $L$-level grey image, and $P(m_i)$ represents the probability of occurrence of $m_i$. The information entropy of an ideal random image is eight. The local information entropy is defined as follows. In a test image $S$, we randomly select $k$ non-overlapping image blocks $S_1, S_2, S_3, \ldots, S_k$, and calculate the information entropy of each image block using Eq (4). Here, the intensity scale of the test image is $L$; generally, $L = 256$. Finally, the following equation is used to calculate the average value of the information entropy of the image blocks:

$$H(S)_{local} = \frac{1}{k} \sum_{i=1}^{k} H(S_i) \tag{5}$$

where $H(S)_{local}$ denotes the local information entropy of the image blocks, and $i = 1, 2, 3, \ldots, k$.
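The two entropy measures can be sketched as follows, using the standard Shannon entropy for the global value and the block average described above for the local value. The block size of 16×16, the number of blocks, the grid-aligned sampling used to guarantee non-overlap, and the function names are all assumptions made for illustration; the text does not specify them.

```python
import math
import random

def entropy(pixels, levels=256):
    """Shannon entropy (in bits) of a flat list of grey values."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    # P(m_i) * log2(1 / P(m_i)), skipping grey values that never occur.
    return sum((c / total) * math.log2(total / c) for c in counts if c)

def local_entropy(image, block=16, k=30, seed=0):
    """Average entropy of up to k randomly chosen, non-overlapping blocks.

    image is a list of rows; blocks are drawn from a fixed grid so that
    they cannot overlap (a sampling choice assumed for this sketch).
    """
    rows, cols = len(image), len(image[0])
    origins = [(r, c)
               for r in range(0, rows - block + 1, block)
               for c in range(0, cols - block + 1, block)]
    chosen = random.Random(seed).sample(origins, min(k, len(origins)))
    values = []
    for r, c in chosen:
        pixels = [image[i][j]
                  for i in range(r, r + block)
                  for j in range(c, c + block)]
        values.append(entropy(pixels))
    return sum(values) / len(values)
```

A block of 16×16 pixels holds exactly 256 samples, so its entropy can reach eight only if every grey value occurs once.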

Rate of base distribution
The rate of base distribution is defined below.

$$AP = \frac{\mathrm{count}(A)}{M \times N \times 4} \times 100\% \tag{6}$$

where count(A) denotes the number of bases "A" in the entire image coding matrix; M × N denotes the size of the image (a pixel can be represented by four bases; therefore, M × N × 4 is the total number of bases); and AP denotes the percentage of base "A" in the entire coding matrix (the distribution of the other bases is defined similarly). Because DNA coding consists of four different bases (A, C, G, and T), the distribution rate of each DNA base in an ideal random image should be 25%.
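A minimal sketch of this index is given below. It also computes the largest deviation of any base from the ideal 25%, which is the quantity reported later as the maximum distance between the base distribution and 25%. The function names are illustrative.

```python
from collections import Counter

def base_distribution(dna_sequence):
    """Percentage of each base in a DNA-coded image, given as a string or
    list of bases of length M * N * 4."""
    counts = Counter(dna_sequence)
    total = len(dna_sequence)
    return {base: 100.0 * counts[base] / total for base in "ACGT"}

def max_deviation(distribution):
    """Largest absolute distance of any base percentage from the ideal 25%."""
    return max(abs(p - 25.0) for p in distribution.values())

# The single pixel 175 encodes to "GGTT" under R1, a very uneven distribution.
dist = base_distribution("GGTT")
print(dist, max_deviation(dist))
```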

Hamming distance
The Hamming distance is the total number of positions at which two sequences of equal length have different bases, and is given by

$$D(M, N) = \sum_{i} d(m_i, n_i), \qquad d(m_i, n_i) = \begin{cases} 0, & m_i = n_i \\ 1, & m_i \neq n_i \end{cases}$$

where $m_i$ and $n_i$ denote the $i$th base of the DNA sequences $M$ and $N$, respectively, and $D(M, N)$ denotes the Hamming distance of $M$ and $N$; the greater the Hamming distance, the greater the difference between the bases in the two sequences.

Fixed point ratio
Let $O = (o_{ij})_{M \times N}$ and $N = (n_{ij})_{M \times N}$ represent the original and encrypted images, respectively, where $M \times N$ defines their sizes. If the pixel at position $(i, j)$ in image $O$ does not change its gray value after scrambling (i.e., $o_{ij} = n_{ij}$), the pixel is a fixed point. We use the following equation to represent the fixed point ratio:

$$\text{fixed point ratio} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} f(i, j)$$

where $f(i, j) = 1$ if $o_{ij} = n_{ij}$, and $f(i, j) = 0$ if $o_{ij} \neq n_{ij}$. Clearly, the smaller the fixed point ratio of the two images, the greater the difference between the encrypted image and the original image, and the better the scrambling effect.

Histogram variance
Because visual analysis of image histograms is often unreliable, the histogram variance can be used for quantitative analysis. The histogram variance is defined below.

$$\mathrm{var}(M) = \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \frac{1}{2} (m_i - m_j)^2$$

where $M = (m_0, m_1, \ldots, m_{255})$ denotes the vector of histogram values, $m_i$ and $m_j$ denote the numbers of pixels whose gray values are $i$ and $j$, respectively, and $n$ is the number of grey levels. Clearly, the smaller the histogram variance, the more uniform the histogram distribution of the image.

Correlation coefficient
The correlation of the adjacent pixels in an original image is very high. Therefore, the better the encryption, the smaller the correlation coefficient (close to 0). In this study, we randomly select 8000 pairs (horizontal, vertical, and diagonal) of adjacent pixels from the original and encrypted images. Then, we use the following equation [53] to calculate the correlation coefficient:

$$r_{xy} = \frac{\sum_{i=1}^{K} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{K} (x_i - \bar{x})^2 \sum_{i=1}^{K} (y_i - \bar{y})^2}}$$

where $x$ and $y$ denote the grey values of two adjacent pixels in the image, $\bar{x}$ and $\bar{y}$ are their mean values, and $K$ is the number of sampled pixel pairs.

Analysis of differences between two images
The NPCR and UACI are used in this study to analyze the differences between two images. They are defined as follows [54]:

$$\mathrm{NPCR} = \frac{1}{M \times N} \sum_{i,j} \delta(i, j) \times 100\%, \qquad \mathrm{UACI} = \frac{1}{M \times N} \sum_{i,j} \frac{|c_1(i, j) - c_2(i, j)|}{255} \times 100\%$$

where $c_1$ and $c_2$ are the two images being compared, and $\delta(i, j) = 0$ if $c_1(i, j) = c_2(i, j)$ and $\delta(i, j) = 1$ otherwise.
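The remaining indexes can be sketched in a few lines each. The code below uses the commonly cited forms of these measures and operates on flat lists of grey values (or strings of bases for the Hamming distance); the function names, the choice to return the fixed point ratio as a fraction rather than a percentage, and the assumption of 8-bit images in UACI are illustrative choices, not details confirmed by the text.

```python
import math

def hamming(seq_m, seq_n):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_m) != len(seq_n):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_m, seq_n))

def fixed_point_ratio(original, encrypted):
    """Fraction of pixels whose grey value is unchanged by encryption."""
    return sum(o == n for o, n in zip(original, encrypted)) / len(original)

def histogram_variance(pixels, levels=256):
    """Mean of (m_i - m_j)^2 / 2 over all pairs of histogram bins."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return sum(0.5 * (a - b) ** 2 for a in hist for b in hist) / levels ** 2

def correlation(xs, ys):
    """Pearson correlation of paired grey values of adjacent pixels."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def npcr_uaci(image1, image2):
    """NPCR and UACI (both in percent) between two 8-bit images."""
    n = len(image1)
    npcr = 100.0 * sum(a != b for a, b in zip(image1, image2)) / n
    uaci = 100.0 * sum(abs(a - b) for a, b in zip(image1, image2)) / (255 * n)
    return npcr, uaci
```

The correlation function is undefined for a constant sequence (zero variance), so in practice the sampled pairs must contain some variation.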

Analysis and comparison of the existing DNA coding-based image encryption methods
The existing image encryption algorithms based on DNA coding involve four basic processes: (1) scrambling the pixel position of the image by using a chaotic sequence; (2) encoding the scrambled image matrix to the DNA sequence; (3) disturbing the DNA sequence matrix by using a chaotic sequence combined with addition, subtraction, XOR, or complement operation, or a combination of these operations; (4) obtaining the encrypted image by DNA decoding and recombination. The block diagram of these processes is shown in Fig 1. As discussed already, a DNA chain consists of four bases (A, C, G, and T), which are used as the carriers of information. The process of converting the information into a DNA nucleotide chain is known as encoding. The opposite process of converting a DNA nucleotide into information is known as decoding. These processes are depicted in Fig 1. DNA encoding and decoding are the key problems in image encryption based on DNA coding. As discussed already, a DNA operation includes addition, subtraction, XOR, and complement. Apparently, different types of encoding, operations, and decoding can produce different encryption results. Therefore, researchers use different methods of DNA encoding and decoding, and different DNA operations, to achieve safer and more effective image encryption. Now, the existing image encryption algorithms based on DNA coding are categorized into five, depending on the type of DNA coding: fixed DNA coding, dynamic DNA coding, different types of base complement operations, different DNA sequence algebra operations, and combinations of multiple DNA operations. Dynamic DNA coding is further classified into three categories: image block or row-column dynamic coding (row-by-row), pixel dynamic coding (pixel-by-pixel), and binary bit dynamic coding. 
The image encryption based on different types of base complement operations is categorized into two: single base direct complement method and base complementary method based on the principle of base complement (static regular base complement and dynamic regular base complement). Further, addition and subtraction operations and the XOR operation based on different DNA sequence algebra operations are available. We not only analyzed and compared all the five methods independently, but also compared and studied the DNA coding mechanism, DNA coding operation, and DNA combination operation intensively. The specific classification scheme is shown in

Image encryption based on fixed DNA coding
Researchers typically use a DNA coding rule listed in Table 1 for encoding; after performing some DNA operations, the DNA sequence is decoded using the same encoding rule or other encoding rules listed in Table 1. We refer to the way of coding as the fixed DNA coding method in this study. The previous studies [55][56][57][58] adopted a specific coding rule for encoding, and used the same rule for decoding. For example, for the pixel value of 175, the corresponding binary bit is "10101111," the DNA sequence generated using the R1 rule defined in Table 1 is "GGTT," and decoding is performed using the same rule. Apparently, multi-step encryption operations must be performed before decoding. However, [10-14, 16, 17] randomly selected one of the eight coding rules for encoding, and selected other rules for decoding using seed key1 and key2. For instance, the pixel value of 175 was encoded with the R1 rule to generate the DNA sequence "GGTT," and "GGTT" was decoded using the R5 rule to obtain the binary bit "00001010"; the corresponding decimal number is 10. Apparently, the pixel values are changed in this manner.
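The second example above (encode with R1, decode with R5) can be reproduced with a short sketch. Both rule tables are inferred from the worked examples in the text rather than copied from Table 1: R1 follows from 175 encoding to "GGTT", and R5 follows from "GGTT" decoding to "00001010" together with the complement constraint. The function names are illustrative.

```python
# Rule tables inferred from the worked examples in the text (assumption:
# they agree with R1 and R5 of Table 1, which is not reproduced here).
R1 = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
R5 = {"A": 0b01, "C": 0b11, "G": 0b00, "T": 0b10}

def encode_pixel(value, rule):
    """Encode one 8-bit pixel as four bases, most significant bit pair first."""
    base_of = {code: base for base, code in rule.items()}
    return "".join(base_of[(value >> shift) & 3] for shift in (6, 4, 2, 0))

def decode_pixel(bases, rule):
    """Decode four bases into one 8-bit pixel value."""
    value = 0
    for base in bases:
        value = (value << 2) | rule[base]
    return value

def fixed_recode(value, enc_rule=R1, dec_rule=R5):
    """Fixed DNA coding: encode with one rule, decode with another."""
    return decode_pixel(encode_pixel(value, enc_rule), dec_rule)

print(encode_pixel(175, R1), fixed_recode(175))

# The whole transformation is one fixed substitution table on grey values.
substitution = [fixed_recode(v) for v in range(256)]
```

Because the same substitution is applied to every pixel, equal grey values in the plain image stay equal in the result, which is one way to see why image contours can survive fixed coding.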
To further analyze and study the image encryption algorithm based on DNA fixed coding, we conducted two experiments. Experiment 1: A "lena" image of size 256×256 was first encoded using the R1 rule defined in Table 1. Then, the encrypted images were obtained by decoding with each of the other seven rules. The results are shown in Fig 3. Experiment 2: Eqs (4), (6) and (8) are used to calculate the information entropy, base distribution, and histogram variance of the eight different codings. Then, the time complexity of the algorithm and the maximum distance between the base distribution and 25% are calculated. The results are presented in Table 4.
The results of Experiment 1, in which the image is encoded by R1 and decoded by the other seven rules, are shown in Fig 3. Fig 3(a) can be regarded as an image encoded by R1 and decoded by R1. Based on visual observation, the results shown in Fig 3(a) and 3(b), 3(d) and 3(e), 3(c) and 3(f), and 3(g) and 3(h) are similar in pairs. Therefore, it can be concluded that among the eight types of coding mechanisms, only four are effective. The contour of the original image can be observed in all the encrypted images. Table 4, which presents the results of Experiment 2, shows that the base distributions of the eight coding rules are not uniform. The maximum distance between the base distribution and 25% is 10.6%, and the base distributions of R1 and R2, R3 and R4, R5 and R6, and R7 and R8 are very similar. Their information entropy is not close to eight, and the histogram variance is very large. However, the complexity order is only O(8MN), where M × N denotes the size of the image.

Therefore, the following conclusions can be drawn: fixed DNA coding rules are simple to implement and computationally efficient, and a pixel value can even be disturbed by selecting a decoding rule that is different from the encoding rule. However, there are only eight coding combinations, and effective results are obtained for only four of them. Further, the encryption is poor, the ability to resist exhaustive attacks is poor, the distribution of the bases is not uniform, and the degree of scrambling is low. Therefore, it is difficult to encrypt an image, especially a single-pixel image such as a medical image. To our knowledge, most existing image encryption algorithms based on DNA coding adopt fixed coding; therefore, their security is apparently under threat.

Image encryption based on dynamic DNA coding
In dynamic DNA coding of images, the DNA sequence is obtained using different rules (the eight rules listed in Table 1) to encode each row, each column, each pixel, or each binary bit of the whole image. Selecting encoding patterns for different encoding objects randomly makes the coding system more complex, renders decoding more difficult, and enhances the image encryption security.
In this study, the existing dynamic coding methods are categorized into three, as follows. (1) Dynamic coding according to image block or row-column (row-by-row, column-by-column, or block-by-block). For example, Zhen [40] encoded each row of the original image using different rules that were controlled by a logistic chaotic sequence, and obtained the encrypted image by decoding with one of the rules. (2) Dynamic coding according to pixel (pixel-by-pixel). For instance, Kalpana [38], Wang [48], and Dagadu et al. [47] chose different encoding rules to encode each pixel in the image under the action of chaotic sequences. (3) Dynamic coding according to binary bit (bit-by-bit), wherein a single pixel is converted to eight binary bits, and every two binary bits are encoded into one base using different rules that are controlled by the chaotic sequence; this method was used by [39]. The decoding process of all the above three methods is simply the opposite of the encoding process, and is thus not discussed here. Now, the three dynamic DNA coding methods are comprehensively analyzed. We assume that the chaotic sequences E and D control the encoding and decoding rules, respectively. Fig 4(a) depicts dynamic DNA encoding and decoding by row; because the first elements in E and D are 7 and 5, respectively, the first row of pixels of the "lena" image is encoded with R7 and decoded with R5. Similarly, the second row is encoded with R3 and decoded with R4. Fig 4(b) depicts dynamic DNA encoding and decoding by pixels. The pixels in the first row of the "lena" image are encoded with R7, R3, R8, R6, ..., R6, and decoded with R5, R4, R5, R1, ..., R5. Fig 4(c) depicts dynamic DNA encoding and decoding by binary bit. Take the first pixel value (162) of the "lena" image as an example; the corresponding binary representation is "10100010." In this case, R7, R3, R8, and R6 are used to encode the four successive bit pairs, and R5, R4, R5, and R1 are used for decoding; finally, a pixel value of 24 is obtained.
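The three granularities can be compared with a compact sketch. The eight rules are generated programmatically, so their order is arbitrary and does not correspond to the R1 to R8 numbering of Table 1; the way the control sequences E and D are indexed (one entry per row, per pixel, or per bit pair) and all function names are assumptions made for illustration.

```python
from itertools import permutations

def build_rules():
    """The eight codings in which A/T and C/G receive complementary codes.
    NOTE: the list order is arbitrary, not the numbering of Table 1."""
    rules = []
    for codes in permutations(range(4)):
        rule = dict(zip("ACGT", codes))
        if rule["A"] + rule["T"] == 3 and rule["C"] + rule["G"] == 3:
            rules.append(rule)
    return rules

RULES = build_rules()

def recode_pixel(value, enc_rules, dec_rules):
    """Encode each of the four bit pairs with its own rule number (1..8),
    then decode the resulting base with the matching decoding rule number."""
    out = 0
    for k, shift in enumerate((6, 4, 2, 0)):
        enc = RULES[enc_rules[k] - 1]
        base = next(b for b, c in enc.items() if c == (value >> shift) & 3)
        out = (out << 2) | RULES[dec_rules[k] - 1][base]
    return out

def by_row(image, E, D):
    """One rule pair per row: E[r] encodes and D[r] decodes every pixel in row r."""
    return [[recode_pixel(p, [E[r]] * 4, [D[r]] * 4) for p in row]
            for r, row in enumerate(image)]

def by_pixel(image, E, D):
    """One rule pair per pixel, indexed in raster order."""
    cols = len(image[0])
    return [[recode_pixel(p, [E[r * cols + c]] * 4, [D[r * cols + c]] * 4)
             for c, p in enumerate(row)]
            for r, row in enumerate(image)]

def by_bit(image, E, D):
    """One rule pair per 2-bit group: four consecutive entries per pixel."""
    cols = len(image[0])
    result = []
    for r, row in enumerate(image):
        new_row = []
        for c, p in enumerate(row):
            k = 4 * (r * cols + c)
            new_row.append(recode_pixel(p, E[k:k + 4], D[k:k + 4]))
        result.append(new_row)
    return result
```

With row-level control, every pixel in a row passes through the same substitution, so a run of identical pixels remains a run of identical pixels; pixel-level and bit-level control break this regularity at the cost of longer control sequences.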
Experiment 3: Using the logistic map described in Eq (3), the initial values and parameters x0 = 0.35, u1 = 3.95, y0 = 0.38, and u2 = 3.92 are chosen to generate two chaotic sequences of size 1×256, which are mapped to the integer interval [1,8] to obtain the encoding rule E and the decoding rule D. Then, we implement the three dynamic DNA coding methods using E and D to encode and decode the "lena" gray scale image. The encrypted images are shown in Fig 5(a)-5(c), and the evaluation results are presented in Table 5. Fig 4 indicates that dynamic coding by row cannot reduce the correlation of the image pixels, because the encoding and decoding operations merely convert (162,162,162,...) to (8,8,8,...). However, dynamic coding by pixel and by binary bit can reduce the correlation of the image pixels. In Fig 5, the contour of the original image is clearly seen in the image processed using dynamic encoding and decoding by row, but is completely invisible in the images processed using dynamic encoding and decoding by pixel and by binary bit. Table 5 shows that the base distributions of all three methods are close to 25%. Moreover, the base distribution of dynamic coding by binary bit is much closer to 25% than that of the other two methods. The maximum distance between the base distribution and 25% is 0.04% for this method; additionally, the information entropy of its encrypted image is 7.9976, which is very close to eight. Furthermore, its histogram variance is the smallest, but its complexity order is the highest.
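The generation of the control sequences E and D in Experiment 3 can be sketched as follows. The logistic iteration and the parameter values come from the text; the quantization used to map a value in (0, 1) to an integer in [1, 8] (here floor(8x) + 1) is an assumption, since the text does not state the mapping, so the resulting rule numbers need not match those shown in Fig 4.

```python
def logistic_sequence(x0, mu, length):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n) and return x_1 .. x_length."""
    x = x0
    values = []
    for _ in range(length):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values

def to_rule_indices(sequence):
    """Map chaotic values in (0, 1) to rule numbers 1..8.
    The floor(8x) + 1 quantization is an assumed mapping."""
    return [min(int(8 * x), 7) + 1 for x in sequence]

# Parameters of Experiment 3 (u1 and u2 play the role of mu in Eq (3)).
E = to_rule_indices(logistic_sequence(0.35, 3.95, 256))  # encoding rules
D = to_rule_indices(logistic_sequence(0.38, 3.92, 256))  # decoding rules
```

Because the receiver regenerates E and D from the same initial values and parameters, these four numbers act as the key material of the coding stage.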
In summary, the implementation of dynamic coding by binary bit is more complex than that of the other two methods. Nevertheless, dynamic coding by binary bit performs better encryption, has higher degree of scrambling, and can resist exhaustive attacks.

Image encryption based on different types of base complement operations
The complement operation of a DNA sequence under chaotic sequence control is a common encryption tool for changing the pixel values of an image. According to the principle of base complementarity in Table 2, two methods are available for base diffusion. The first method selects one of the six rules randomly, and then uses Eqs (2) and (17) to select the corresponding complementary bases [39,43]; this method is called the static regular base complement method in this study. The other diffusion method [48] involves two steps: (a) selecting the complement rule for each base by a chaotic sequence (i.e., randomly selecting a rule from among R1 to R6 in Table 2), and (b) selecting the corresponding complement base using Eqs (2) and (17); this method is called the dynamic regular base complement method in this study. Moreover, [16] adopted a single base direct complement method (for details, see Section 2.2). Therefore, we classified the existing image encryption algorithms based on DNA complement into three: (1) single base direct complement method, (2) static regular base complement method, and (3) dynamic regular base complement method. The selection of the complement rule is expressed in the equation below:

$$x_i' = \begin{cases} x_i, & L(i) = 0 \\ D(x_i), & L(i) = 1 \\ D(D(x_i)), & L(i) = 2 \\ D(D(D(x_i))), & L(i) = 3 \end{cases} \tag{17}$$

where $x_i'$ denotes the complemented base, $L(i)$ is a chaotic sequence mapped to the integer region [0, 3], and $x_i$ and $D(x_i)$ are the same as those in Eq (2). For the complement rule R1, if the base at the current position is $x_i = A$ and $L(i) = 3$, the complementary base of $x_i$ is $D(D(D(x_i)))$. From Table 2, we obtain the base G; in other words, the complement of A is G.
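A minimal sketch of the static and dynamic regular base complement operations follows. A transformation D that satisfies Eq (2) is a single cycle through all four bases, and there are exactly six such cycles, which agrees with the six rules mentioned above. The concrete cycle used for R1 below (A to T to C to G to A) is an assumption chosen to reproduce the worked example, since Table 2 is not reproduced here; the function names are illustrative.

```python
from itertools import permutations

def cyclic_rules():
    """All transformations D on {A, C, G, T} that form one cycle of length 4."""
    rules = []
    for rest in permutations("CGT"):
        cycle = ("A",) + rest
        rules.append({cycle[i]: cycle[(i + 1) % 4] for i in range(4)})
    return rules

def complement(base, rule, times):
    """Apply D to a base the given number of times (0..3), as in Eq (17)."""
    for _ in range(times):
        base = rule[base]
    return base

def static_complement(sequence, rule, L):
    """Static regular complement: one rule for every base, L(i) applications."""
    return "".join(complement(b, rule, t) for b, t in zip(sequence, L))

def dynamic_complement(sequence, rules, rule_index, L):
    """Dynamic regular complement: rule_index[i] picks the rule for base i."""
    return "".join(complement(b, rules[r], t)
                   for b, r, t in zip(sequence, rule_index, L))

# Assumed form of R1, consistent with D(D(D(A))) = G in the text.
ASSUMED_R1 = {"A": "T", "T": "C", "C": "G", "G": "A"}
print(complement("A", ASSUMED_R1, 3))
```

Decryption applies the same rule (4 - L(i)) mod 4 further times, because four applications of D return the original base.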
To compare the three complement methods comprehensively, we use the logistic chaotic sequence generated by the same initial value to control the position of the complementary base in the simulations of all three methods. In addition, to avoid the interference of dynamic coding, we use R1 from Table 1 for encoding and decoding.
Experiment 5: The methods of Zhang [16] and Jian [44] were applied to the single base direct complement and the static regular base complement operation for the "lena" gray scale image of 256×256, using the logistic system with the initial values x0 = 0.38 and u1 = 3.95. Similarly, using the logistic system, the initial values x0 = 0.35, u1 = 3.95, y0 = 0.38, and u2 = 3.92 were taken to perform the dynamic regular base complement operation on the "lena" gray scale image of 256×256, based on the ideas from [48]. Fig 6 shows the "lena" partial base sequence and complemented base sequence obtained by the three methods. Fig 7 shows the encrypted images and the histograms obtained by the three methods. Table 6 compares the three methods in terms of information entropy, the hamming distance of the corresponding position base, and histogram variance. We use "DBCO," "SRCO," and "DRCO" to denote the single base direct complement, static regular base complement, and dynamic regular base complement methods, respectively, in Table 6, Figs 6 and 7.
From Fig 6, it is seen that the base diffusion of the corresponding position with static regular base complement operation or dynamic base complement operation is better than that of the direct base complement operation. Visual observation of Fig 7(a), 7(c), 7(e), 7(g), 7(i), 7(k), 7(m) and 7(o) shows that the encrypted images of the static regular base complement and the dynamic regular base complement methods are completely different from the original image, whereas the contour of the original image is seen in the image encrypted by the direct base complement method. Analyzing the histograms of Fig 7(b), 7(d), 7(f), 7(h), 7(j), 7(l), 7(n), and 7(p) shows that the histogram distributions for direct base complement and dynamic regular base complement are more uniform. The six histograms of the static regular base complement method showed three forms. The histograms of R1 and R2, R3 and R5, R4 and R6 have similar shapes, and the other four regular distributions are not uniform, except the histograms of R4 and R6. Here, R1-R6 were obtained from Table 2. A comparison of the information entropy and hamming distance of the three methods indicates that the information entropy of the dynamic regular base complement method is the largest, and the static regular base complement is not very stable. The hamming distance of the static regular base complement and dynamic regular base complement methods is the same and larger than that of the direct base complement method, because the chaotic sequence of the control complement operation is the same. The smaller the histogram variance, the better the encryption. Further, the variance value of the dynamic regular base complement method is the least of all methods.

PLOS ONE
In summary, the dynamic regular base complement operation performs better encryption, and has higher diffusion degree of pixels. Moreover, the complexity of this algorithm makes it robust to image encryption attacks.
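The three complement operations compared in Experiment 5 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in [16], [44], or [48]: the six cyclic rules are the ones commonly cited in the literature and their order need not match R1-R6 in Table 2; the mapping from a chaotic value to "complement or not" (DBCO), to the number of complement steps (SRCO), and to the rule index (DRCO) is hypothetical.

```python
# Six cyclic complement rules (assumed; order may differ from Table 2).
RULES = [
    dict(zip("ATCG", "TCGA")),  # A->T->C->G->A
    dict(zip("ATGC", "TGCA")),  # A->T->G->C->A
    dict(zip("ACTG", "CTGA")),  # A->C->T->G->A
    dict(zip("ACGT", "CGTA")),  # A->C->G->T->A
    dict(zip("AGTC", "GTCA")),  # A->G->T->C->A
    dict(zip("AGCT", "GCTA")),  # A->G->C->T->A
]
DIRECT = {"A": "T", "T": "A", "C": "G", "G": "C"}  # Watson-Crick pairs


def logistic(x0, u, n):
    """Return n iterates of the logistic map x <- u*x*(1-x)."""
    xs, x = [], x0
    for _ in range(n):
        x = u * x * (1.0 - x)
        xs.append(x)
    return xs


def apply_rule(base, rule, k):
    """Apply a cyclic complement rule k times to one base."""
    for _ in range(k):
        base = rule[base]
    return base


def steps(x):
    """Assumed mapping of a chaotic value in (0, 1) to 0..3 complement steps."""
    return min(int(x * 4), 3)


def dbco(seq, control):
    """Direct complement: complement a base when its control value >= 0.5."""
    return "".join(DIRECT[b] if x >= 0.5 else b for b, x in zip(seq, control))


def srco(seq, control, rule):
    """Static regular complement: one fixed rule, chaotic number of steps."""
    return "".join(apply_rule(b, rule, steps(x)) for b, x in zip(seq, control))


def drco(seq, control, selector):
    """Dynamic regular complement: a second chaotic sequence picks the rule."""
    return "".join(
        apply_rule(b, RULES[min(int(y * 6), 5)], steps(x))
        for b, x, y in zip(seq, control, selector)
    )


def hamming(a, b):
    """Number of positions at which two base sequences differ."""
    return sum(p != q for p, q in zip(a, b))
```

Under these assumptions, SRCO and DRCO change exactly the same positions when they share the control sequence, because any 4-cycle applied one to three times changes the base; this mirrors the equal hamming distances reported for the two methods.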

Image encryption based on different DNA sequence algebra operations
Based on the discussion in Section 2.1, image pixels can be represented as DNA sequences by encoding. These DNA sequences can be changed by using different DNA sequence algebra operations. When the DNA sequence is changed, the pixel values of the image are disrupted. Thus, the addition, subtraction, and XOR algebra operations of DNA sequences are widely used in image encryption. Table 3 shows that the rows and columns corresponding to bases A and G are the same for the addition and XOR operations, and that the two operations are very similar. The main difference is that the addition operation is irreversible, whereas the XOR operation is reversible. Zhang et al. [12] used chaotic mapping and the DNA addition operation to perform image encryption. References [18,19] indicated that the encrypted image in [12] cannot be restored to the original image. The method in [59] also suffers from the irreversibility of the addition operation. To overcome this disadvantage, the DNA sequence matrix of the original grayscale image is usually added to that generated by the chaotic sequence [29,30,34,60]. In the case of a color image, the image is divided into three channel matrices (R, G, and B), and the channel matrices are transformed into DNA sequences, which are then added [14,38]. The XOR operation also mainly follows these two ideas. For example, the XOR operation is performed between the DNA sequence matrix of the original image and the DNA matrix generated by the chaotic system [61][62][63]. Reference [64] implemented the XOR operation between the three channel DNA matrices of a color image. To compare the encryption performance of the addition and XOR operations, we do not scramble the original gray image. We use R1 in Table 1 to encode the original image, and then perform the addition and XOR operations using the DNA matrix produced by the chaotic sequence. The experimental procedure is as follows.
Experiment 6: The initial values x0 = 0.38 and u1 = 3.95 are chosen to generate a chaotic sequence X of length 256×256×8 using the logistic system. If x(i) ≥ 0.5, x(i) = 1, and if x(i) < 0.5, x(i) = 0; here, x(i) is an element of X. Then, the X sequence is encoded according to the fixed coding rule R1 in Table 1 to obtain the DNA matrix L, whose size is 256×256×4. The original "lena" image is also encoded into a DNA matrix P according to the fixed coding rule R1 in Table 1. We performed the P+L and P XOR L operations by using the addition and XOR rules defined in Table 3. Finally, we obtained the encrypted image after decoding and reconstructing (Figs 8 and 9). To our knowledge, these methods have been adopted in [29,30,61,62,63].
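The steps of Experiment 6 can be sketched in Python as follows. This is a minimal sketch under assumptions, not the authors' code: rule R1 is taken to be 00-A, 01-C, 10-G, 11-T, and the addition and XOR rules of Table 3 are taken to be addition modulo 4 and bitwise XOR on the 2-bit base codes (which agrees with the observation that the A and G rows coincide). The function names are hypothetical.

```python
import numpy as np

# Assumed rule R1: 00 -> A, 01 -> C, 10 -> G, 11 -> T (bases stored as 0..3).
BASES = np.array(list("ACGT"))


def logistic_bits(x0, u, n):
    """Iterate x <- u*x*(1-x) and threshold each iterate at 0.5."""
    bits = np.empty(n, dtype=np.uint8)
    x = x0
    for i in range(n):
        x = u * x * (1.0 - x)
        bits[i] = 1 if x >= 0.5 else 0
    return bits


def bits_to_bases(bits, height):
    """Group a flat bit array into 2-bit base codes, shape (height, -1)."""
    pairs = bits.reshape(-1, 2)
    return (pairs[:, 0] * 2 + pairs[:, 1]).reshape(height, -1)


def encode(img):
    """Encode a uint8 image of shape (H, W) as an (H, 4W) base matrix."""
    return bits_to_bases(np.unpackbits(img.ravel()), img.shape[0])


def decode(bases):
    """Rebuild the uint8 image from an (H, 4W) base matrix."""
    q = bases.reshape(bases.shape[0], -1, 4).astype(np.uint8)
    return (q[..., 0] << 6) | (q[..., 1] << 4) | (q[..., 2] << 2) | q[..., 3]


def key_matrix(x0, u, shape):
    """DNA matrix L generated from the thresholded logistic sequence."""
    h, w = shape
    return bits_to_bases(logistic_bits(x0, u, h * w * 8), h)


def dna_add(p, l):
    return (p.astype(np.int16) + l) % 4  # assumed Table 3 addition


def dna_sub(c, l):
    return (c.astype(np.int16) - l) % 4  # inverse of dna_add for the same L


def dna_xor(p, l):
    return p ^ l  # assumed Table 3 XOR; applying it twice restores p


def as_text(bases):
    """Render a base matrix as a string of A/C/G/T for inspection."""
    return "".join(BASES[bases].ravel())
```

For a 256×256 image, `decode(dna_xor(encode(img), key_matrix(0.38, 3.95, img.shape)))` would give the XOR-encrypted image, and replacing `dna_xor` with `dna_add` gives the addition variant.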

In Figs 8(a) and 9(a), the contours of the original "lena" image can be seen, because the original image was not scrambled before the addition and XOR operations. A comparison of Figs 8(b) and 9(b) shows that the histogram distribution of the encrypted image obtained by the XOR operation is more uniform. Table 7 indicates that the information entropy of the XOR operation is closer to eight. Because the two methods use the same chaotic mapping with the same parameters, they have the same hamming distance, and approximately 85% of their bases have changed (221592/(256×256×4)×100% ≈ 85%).
The smaller fixed point ratio and histogram variance indicate a better degree of disturbance. Clearly, the fixed point ratio and the histogram variance of the XOR operation are smaller as observed from Table 7. This indicates that the encryption of the DNA XOR operations is better than that of DNA addition.
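The metrics used in this comparison can be computed as in the following sketch. The definitions are assumptions based on common usage, since the paper's own equations are not reproduced in this section: histogram variance is taken as the population variance of the 256 gray-level counts, and the fixed point ratio as the fraction of pixels whose value is unchanged by encryption. The function names are hypothetical.

```python
import numpy as np


def histogram_variance(img):
    """Population variance of the 256 gray-level counts of a uint8 image."""
    hist = np.bincount(img.ravel(), minlength=256)
    return float(np.var(hist))


def fixed_point_ratio(plain, cipher):
    """Fraction of pixels that keep their value after encryption."""
    return float(np.mean(plain == cipher))


def changed_base_ratio(hamming_distance, height, width):
    """Fraction of bases changed, with four bases per 8-bit pixel."""
    return hamming_distance / (height * width * 4)


# Reproduces the arithmetic quoted above: 221592 changed bases in a
# 256x256 image is about 84.5%, which the text rounds to 85%.
ratio = changed_base_ratio(221592, 256, 256)
```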

Image encryption based on combinations of multiple DNA operations
Recently, combining DNA coding mechanisms with different DNA algebraic operations (such as addition and subtraction, or XOR and complement) to perform image encryption based on DNA coding has attracted the attention of scholars. Chai [65] used 2D logistic mapping to dynamically encode a plaintext image pixel by pixel, and then scrambled the DNA matrix by cyclic displacement. The addition and XOR operations were then performed on the scrambled DNA matrix and the DNA matrix generated by the chaotic system under a new key updated by the hamming distance of the plaintext image. Finally, the encrypted image was obtained by decoding and recombining. Zhang [39] used the Lorenz system to generate chaotic sequences, which were deformed to obtain a, b, and c, and then used the Chen chaotic map to generate chaotic sequences, which were deformed to obtain A, B, C, and D. The plain image was scrambled using a and b to obtain E1, and E1 and C were dynamically encoded by binary bit using A and B, respectively, to obtain E2 and E3. The addition operation was performed on E2 and E3 under the control of the c sequence, and E4 was obtained by using D to perform the direct complement operation on the result of the addition, finally yielding the encrypted image.
These methods perform good encryption, are highly secure, and can resist common attacks. However, most combination encryption methods are not secure, because most of them use the fixed coding rule, or the base is directly complemented, which was proven to be unsafe in Sections 3.1 and 3.3. Furthermore, some methods cannot restore the plaintext image, and cannot resist CPA and KPA. A detailed analysis is presented in Table 8.

Further analysis and comparison of methods
The previous sections independently discussed the advantages and disadvantages of the methods. In this section, all these methods are compared comprehensively. Image encryption based on DNA coding involves two main aspects: DNA coding mechanisms and DNA coding operations. We compare these two aspects individually.

3.6.1 Comparison of DNA coding mechanisms.
DNA coding (encoding and decoding) is the interface between image pixels and DNA sequences. Different DNA coding mechanisms yield different encryption performances. Therefore, choosing a better DNA coding method is a key step in the DNA image encryption process. The existing DNA coding methods are compared in Tables 4 and 5. The performance of the three dynamic DNA coding methods is better than that of fixed DNA coding. Among the dynamic DNA coding methods, dynamic coding by binary bit has the best performance: its information entropy is 7.9976, which is very close to 8. Its histogram variance is the smallest, indicating that the histogram of its encrypted image is the most uniform of all the methods. It also has the smallest maximum distance between the base distribution rate and 25%, indicating that its base distribution is the most uniform of all the methods. However, because it operates on individual binary bits, this algorithm is more complex than the others. We believe that this additional cost is affordable on current computers.
3.6.2 Comparison of DNA coding operations.
The existing DNA coding operations include individual DNA operations and combinations of multiple operations. We compared them based on information entropy, histogram variance, correlation coefficient, and time complexity (Tables 9 and 10). Table 9 indicates that the information entropy of DRCO is much closer to 8 than that of the other DNA coding operations. Further, its histogram variance and correlation coefficient (very close to 0) are the smallest. However, it has the highest complexity order. The order of superiority of the other DNA coding operations is as follows: SRCO (Rule 4 and Rule 6 from Table 6) > DBCO > XOR > addition. This order can be used to determine whether a specific DNA operation is reasonable. Because DRCO exhibits the best performance of all the base complement methods, we focus on the combination of DRCO with other DNA operations. All DNA operations in this section used the same experimental data and environment as those in the previous sections. Table 10 shows that DRCO + Addition is better than Addition + DRCO, DRCO + XOR is better than XOR + DRCO, and XOR + Addition is better than Addition + XOR. Table 9 shows that DRCO > Addition, DRCO > XOR, and XOR > Addition.
These results show that when performing combinatorial DNA operations, better encryption performance can be obtained by placing the better operation before other operations. Table 10 indicates that DRCO + XOR is the best encryption method of all combinatorial operations, because DRCO and XOR operations occupy superior positions in the order DRCO > XOR > addition. If time complexity permits, the superposition of the operation with better performance will result in better encryption; for instance, DRCO + XOR + Addition > DRCO + XOR. This would indicate the direction for implementing DNA combinatorial operation encryption later.

Proposed DNA coding encryption scheme
Section 3.6.1 revealed that dynamic DNA coding by binary bit is the best DNA coding method; Section 3.6.2 revealed that DRCO + XOR + Addition is the best DNA coding operation. We now develop a new encryption scheme by combining this optimal DNA coding with this optimal DNA operation. First, the original image is dynamically encoded by binary bit; then, the DRCO + XOR + Addition operation is performed on the coding matrix. Finally, the encrypted image is obtained by dynamically decoding the DNA coding matrix to a binary sequence. The encryption process is the same as the decomposition process mentioned above, and is thus not discussed again. However, as described by Eqs (18) and (19), the key is treated as follows to make the proposed scheme more secure. We apply the proposed algorithm to ten standard gray-scale images, of which seven are 256×256 and three are 512×512. The encrypted images of "Cameraman" and "Woman" are shown in Fig 10. The encrypted images cannot be recognized visually, but this is only a visual assessment. Next, we evaluate the proposed algorithm in terms of information entropy, resistance to exhaustive attacks, resistance to statistical attacks, and resistance to differential attacks, and conduct a NIST randomness test.

Analysis of global and local information entropy.
To analyze the information entropy more accurately, we calculate both the global information entropy and local information entropy of different scales. This is because the local information entropy can effectively overcome the weaknesses of the global information entropy, such as inaccuracy, inconsistency, and low efficiency. Table 11 shows that the global information entropy and local information   [68]. Table 12 compares the developed encryption scheme with the existing encryption schemes. The comparison indicates that the information entropy obtained by our scheme is higher, and the encrypted image is similar to the ideal random image.
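Global and local information entropy can be estimated as in the sketch below. It assumes the usual Shannon entropy over 256 gray levels and a local entropy defined as the mean entropy of k randomly chosen non-overlapping blocks; the block count `k`, the block side `side`, and the function names are illustrative choices, not values taken from the paper.

```python
import numpy as np


def shannon_entropy(block):
    """Shannon entropy in bits of the gray-level distribution of a uint8 array."""
    counts = np.bincount(block.ravel(), minlength=256)
    p = counts[counts > 0] / block.size
    return float(-(p * np.log2(p)).sum())


def local_entropy(img, k=16, side=32, seed=0):
    """Mean entropy of k random non-overlapping side x side blocks."""
    h, w = img.shape
    rows, cols = h // side, w // side
    if k > rows * cols:
        raise ValueError("image too small for k non-overlapping blocks")
    rng = np.random.default_rng(seed)
    picks = rng.choice(rows * cols, size=k, replace=False)
    total = 0.0
    for p in picks:
        r, c = divmod(int(p), cols)
        total += shannon_entropy(img[r * side:(r + 1) * side,
                                     c * side:(c + 1) * side])
    return total / k
```

Calling `shannon_entropy(img)` on the whole image gives the global value, whose upper bound is 8 bits for an 8-bit image; varying `side` gives local entropy at different scales.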

Key sensitivity analysis (resistance to exhaustive attacks).
The key is determined to be sensitive or insensitive depending on whether the encryption scheme can resist exhaustive attacks. We use NPCR and UACI to analyze the difference between two encrypted images with a key (x0) difference of 10⁻¹⁴. Table 13 lists the NPCR and UACI values of the encrypted images before and after minor changes to the key. NPCR = 99% and UACI = 33% are the benchmark values [53]. Table 13 reveals that all the NPCR and UACI values of our method are larger than the benchmark values. Table 14 compares our algorithm with the existing algorithms, and reveals that the NPCR and UACI values of our algorithm are larger than those of the existing methods. Therefore, our algorithm has high key sensitivity and is robust against exhaustive attacks.
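NPCR and UACI can be computed as in the following sketch, which assumes the standard definitions for 8-bit images: NPCR is the percentage of pixel positions whose values differ, and UACI is the mean absolute intensity difference normalized by 255. The function name is hypothetical.

```python
import numpy as np


def npcr_uaci(c1, c2):
    """Return (NPCR, UACI) in percent for two uint8 images of equal shape."""
    a = c1.astype(np.int16)  # widen so the difference cannot wrap around
    b = c2.astype(np.int16)
    npcr = float(np.mean(a != b) * 100.0)
    uaci = float(np.mean(np.abs(a - b) / 255.0) * 100.0)
    return npcr, uaci
```

In a key sensitivity test, `c1` and `c2` would be the two cipher images obtained from the same plaintext with x0 and x0 + 10⁻¹⁴. For two independent, uniformly random 8-bit images, the expected values are about 99.61% and 33.46%, so results close to these indicate that the two cipher images are essentially unrelated.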

Resistance to statistical attacks.
Here, we discuss the effectiveness of the proposed algorithm in resisting statistical attacks. Cryptanalysts generally use the distribution of histograms and the correlation between adjacent pixels to extract statistical information related to plaintext images [79,80]. Therefore, the histograms of "Cameraman" and "Woman" are shown in Fig 11, and the histogram variances of all 10 standard original and encrypted images are listed in Table 15. Further, we use Eqs (10)-(13) to calculate the correlation coefficients of the 10 standard gray-scale images and the corresponding encrypted images (Table 16). The histograms of the encrypted images are very uniform, and the histogram variance of every encrypted image is much lower than that of the corresponding original image. The correlation coefficients in the horizontal, vertical, and diagonal directions of the encrypted images are very close to 0, suggesting that the correlation between adjacent pixels is very low. Therefore, the proposed scheme can resist statistical attacks very well.
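The adjacent-pixel correlation test can be sketched as follows. It assumes that Eqs (10)-(13) define the usual Pearson correlation coefficient over randomly sampled pairs of adjacent pixels; the sample size `n`, the seed, and the function name are illustrative.

```python
import numpy as np

# Offsets (dy, dx) of the neighbor in each direction.
OFFSETS = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}


def adjacent_correlation(img, direction, n=3000, seed=0):
    """Pearson correlation of n random pixel pairs adjacent in one direction."""
    dy, dx = OFFSETS[direction]
    h, w = img.shape
    rng = np.random.default_rng(seed)
    ys = rng.integers(0, h - dy, n)  # keep the neighbor inside the image
    xs = rng.integers(0, w - dx, n)
    a = img[ys, xs].astype(np.float64)
    b = img[ys + dy, xs + dx].astype(np.float64)
    return float(np.corrcoef(a, b)[0, 1])
```

A natural image typically gives values close to 1 in all three directions, whereas a well-encrypted image should give values close to 0.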
3.7.4 Resistance to differential attacks.
To attack an encryption algorithm, a cryptographic attacker searches for loopholes by detecting the sensitivity of the keys to plaintext. If the keys are more sensitive to plaintext, it is difficult to perform a differential attack. Similar to Refs. [80,81], we use NPCR and UACI to calculate the difference between the encryptions of two plaintext images that differ only slightly. Table 17 lists the NPCR and UACI values for five standard grayscale images, and indicates that the averages of the NPCR and UACI values are higher than the corresponding benchmark values. Further, we compare our scheme with other existing methods in Table 18, which shows that our scheme is highly superior to the existing methods.

3.7.5 NIST randomness test.
The SP800-22 test package of the National Institute of Standards and Technology (NIST) can detect the randomness of digital information, and is currently the most authoritative method for detecting the randomness of binary sequences [81,84,85]. Each test in this package is passed only if its P-value is greater than 0.01. Further, to preserve generality, we test the randomness of both encrypted grayscale images and three-channel color images, and the results are presented in Table 19. Table 19 shows that all test values are greater than 0.01, and that the results are "success." This shows that our algorithm can obtain random sequences by encrypting grayscale and color images.
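As an illustration of how one SP800-22 test yields a P-value that is compared with 0.01, the following sketch implements only the frequency (monobit) test, the first test in the package. It is not the full suite used for Table 19, and the function name is hypothetical.

```python
import math


def monobit_p_value(bits):
    """SP800-22 frequency (monobit) test: P-value for a sequence of 0/1 bits."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # map 0 -> -1 and 1 -> +1, then sum
    return math.erfc(abs(s) / math.sqrt(2.0 * n))


# A sequence passes this test at the 1% significance level when p >= 0.01.
p = monobit_p_value([1, 0, 1, 1, 0, 1, 0, 1, 0, 1])
```

To apply it to a cipher image, the pixel bytes would first be unpacked into a flat bit sequence; the remaining tests in the package follow the same pattern of computing a statistic and converting it to a P-value.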

Shortcomings
Section 3 indicates that the existing methods have some shortcomings. First, the existing DNA coding mechanisms are fragile. Although DNA coding is a crucial step in encryption, most of the existing coding methods use fixed coding. Section 3.1 showed that fixed coding yields poor encryption performance, has poor resistance to exhaustive attacks, and produces a non-uniform distribution of bases. Although some studies have used dynamic coding, most of them use row-by-row (image block) or pixel-by-pixel dynamic coding. These two methods are not as secure as dynamic coding by binary bit. Second, the DNA sequence addition operation is often applied improperly, which makes the image encryption method irreversible. The inverse operation of DNA sequence addition is the DNA sequence subtraction operation. Therefore, if addition is performed between the pixels of an original image, the DNA subtraction operation cannot be performed, and it would be difficult to decrypt the encrypted image. Third, the security of the DNA complement operations is poor. In recent years, various studies have widely used DNA complement methods to diffuse pixel values. However, most of them used the direct base complement or static regular base complement method. Section 3.3 showed that the static regular base complement method comprises three different encryption forms. Except for the fourth and sixth rules, the information entropy of the static regular complement methods is low, and the direct base complement method exhibits poor encryption performance. Fourth, because the combinatorial DNA operations are selected arbitrarily, image encryption based on combinations of multiple DNA operations is not highly secure. Fifth, the diffusion capacity of DNA bases is poor. Most studies only use the relevant theory of DNA coding to encrypt images and ignore the diffusion of bases, making their methods vulnerable to CPA and KPA.
Sixth, the existing methods use chaotic systems combined with DNA coding to achieve image encryption, whose security (i.e., key space, key sensitivity, and degree of image scrambling) thus depends on the security of the chaotic system. Furthermore, parallelism and high storage of DNA computing were not applied to these image encryption methods.

Future direction for improvements
We propose the following improvements to alleviate the above shortcomings. First, a dynamic DNA coding mechanism should be used to transform the original image into a DNA sequence matrix, and dynamic coding by binary bit is preferred. Table 5 shows that the information entropy of dynamic coding by binary bit reaches 7.9976 and that its base distribution is very uniform, which provides a good starting point for the encryption process. Second, the DNA addition operation can be replaced with the DNA XOR operation to perform image encryption. Table 3 indicates that the DNA XOR operation is very similar to the addition operation. Its main advantage is that it is reversible, which keeps the algorithm simple. The data compared above show that this replacement solves the problem of irreversible addition while also giving good encryption performance. Third, the dynamic regular base complement method should be selected to improve the diffusion capacity of pixels. Fourth, a reasonable and effective DNA combination operation should be chosen to change the pixel values of images based on Table 10. Fifth, information related to the plaintext should be used as a part of the encryption key, for example by combining the hamming distance of the DNA sequence from the plaintext image with the encryption key (chaotic initial value) to form the final key, or by combining DNA dynamic coding with a chaotic system. This can improve the diffusion capacity of the DNA bases and effectively resist CPA and KPA. Sixth, in addition to using more secure chaotic systems combined with DNA coding to perform image encryption, researchers should combine the DNA coding methods effectively to improve the security of DNA coding encryption methods. Future studies should emphasize the use of the parallelism and large storage capacity of DNA computing in image encryption to quickly encrypt many images and even video files, while ensuring security.

Conclusion
This study first reviewed the existing DNA coding-based image encryption methods. Image encryption based on DNA coding was classified into five types, depending on the type of DNA coding: DNA fixed coding, DNA dynamic coding, different types of base complement operation, different DNA sequence algebraic operations, and combinations of multiple DNA operations. All these methods and other existing methods were compared and explained. Furthermore, we combined the optimal coding mechanism with the optimal DNA coding operation to develop a new encryption scheme, and demonstrated its effectiveness and security. Finally, the shortcomings of the existing image encryption methods and the future direction for improvement were discussed. In the future, we will study the advantages and disadvantages of image encryption methods based on different combinations of DNA coding and dynamic DNA operations. We will also study the influence of different chaotic systems on DNA coding schemes.