A Novel Fractal Coding Method Based on M-J Sets

In this paper, we present a novel fractal coding method with the block classification scheme based on a shared domain block pool. In our method, the domain block pool is called dictionary and is constructed from fractal Julia sets. The image is encoded by searching the best matching domain block with the same BTC (Block Truncation Coding) value in the dictionary. The experimental results show that the scheme is competent both in encoding speed and in reconstruction quality. Particularly for large images, the proposed method can avoid excessive growth of the computational complexity compared with the traditional fractal coding algorithm.


Introduction
Image compression reduces the irrelevance and redundancy of image data and thus provides an efficient way to store and transmit digital images. Among the various image compression techniques, fractal coding is attractive for its high compression ratio. Fractal image coding is based on the construction of an image transformation of a special kind which, when iterated on any initial image, produces a sequence of images that converges to a fractal approximation of the original [1]. Since the method was first proposed by Barnsley in 1988 [2], it has been widely used in practical applications and has been improved upon by Jacquin [3]. In general, we refer to Barnsley's method [4] as the traditional fractal coding algorithm. Fractal image coding has been used not only for image coding but also for other image problems, such as image retrieval [5]. However, compared with traditional image coding technologies, fractal compression suffers from high computational complexity in encoding [6].
The main goal of fractal coding research is to reduce the coding time without loss of restored image quality. In recent years, much work has been done on fractal compression. Qu et al. presented an algorithm combined with a wavelet algorithm [7]. Duh et al. applied the DCT in fractal image compression [8]. Jaferzadeh et al. accelerated compression by block classification [9]. Sze et al. used a quadtree search algorithm for fractal compression [10]. Based on the spatial correlation between range and domain blocks, Truong et al. proposed searching for the matching domain block among the domain blocks adjacent to the current range block [11]. Chen et al. used the normalized one-norm and a kick-out condition to encode images, which reduced the execution time by 22% on average compared with the traditional method [12]. Amol et al. extracted image features for fractal image encoding, which reduced the encoding and decoding time and achieved a good quality of the compressed image [13]. Schwartz et al. presented a scheme based on robust feature descriptors to speed up encoding [14]. To reduce the encoding time, Doudal et al. proposed a faster method that reduces the size of the domain pool based on the lowest horizontal and vertical DCT coefficients of the domain blocks, and combined it with the AP2D approach, which uses two domain pools in two steps of encoding [15]. All these methods share the idea that the best matching domain block is sought in the original image itself. However, when the image size grows, the number of domain blocks to be matched grows multiplicatively, making the computation more time-consuming. In this paper, we propose a new fractal coding method, called fractal dictionary coding (FDC), based on a predefined dictionary. The paper is organized as follows: In Section 2, we introduce the theory of the traditional fractal coding algorithm. In Section 3, we introduce our algorithm. The experimental results are then presented in Section 4, which is followed by conclusions.

The Traditional Fractal Coding Algorithm
Fractal coding is based on Iterated Function System. An image is first partitioned into non-overlapping cells as range blocks. Then, the image is also divided into overlapping sub-blocks, which are named as domain blocks. Each range block is mapped from one of the domain blocks (See Figure 1 [16]).
Generally, the domain block is twice the size of the range block. Each square domain cell is mapped to a square range cell [1] by Eq. (1), where D is the domain block, $\bar{D}$ is the mapped domain block, and k and l are the horizontal and vertical cell positions in the image, respectively.
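As an illustration of the mapping from a domain block to a block of range size, the following minimal sketch assumes that Eq. (1) takes the common form of averaging each 2×2 group of domain pixels into one pixel. The function name `shrink_domain` is hypothetical and is not taken from the original method.

```python
import numpy as np

def shrink_domain(domain):
    """Average each non-overlapping 2x2 pixel group (assumed form of Eq. (1))."""
    d = np.asarray(domain, dtype=float)
    h, w = d.shape
    assert h % 2 == 0 and w % 2 == 0, "domain block sides must be even"
    # Split rows and columns into pairs, then average over each pair of axes.
    return d.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

Under this assumption an 8×8 domain block becomes a 4×4 block that can be compared pixel by pixel with a 4×4 range block.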
To find a best matching domain block, a range block has to be compared with each linear processed domain block to record the minimum distortion. The linear processing includes eight affine transformations. These transformations do not modify pixel values; they simply shuffle pixels within a range block, in a deterministic way. Besides, we denote d as the distortion criterion between the range block and the transformed domain block. See Eq. (2).
where N is the number of pixels in a block, s is the contrast factor, o is the luminance factor, $x_{ij}$ represents the value of pixel (i, j) in range block R and $y_{ij}$ represents the value of pixel (i, j) in the affine transformed domain block D. After the comparisons with the range block, the transform that minimizes the distortion between the range block and the selected domain block is recorded.
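In the coding process, the domain block with the minimum distortion d is regarded as the proper match for the range block. Applying the least-squares method to Eq. (2), the values of s and o that minimize d are obtained as follows:
$$s = \frac{N\sum_{i,j} x_{ij} y_{ij} - \sum_{i,j} x_{ij} \sum_{i,j} y_{ij}}{N\sum_{i,j} y_{ij}^2 - \left(\sum_{i,j} y_{ij}\right)^2} \quad (3)$$
$$o = \frac{1}{N}\left(\sum_{i,j} x_{ij} - s\sum_{i,j} y_{ij}\right) \quad (4)$$
After encoding the image, five parameters in total are stored for each range block: the best domain block's position (i, j), s, o and the affine transformation t.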
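The search over the eight transformations and the least-squares fit can be sketched as follows. This is a minimal illustration under stated assumptions: the distortion is taken as the mean of (s·y + o − x)², the numbering of the eight transformations (four rotations, each optionally mirrored) is arbitrary, and the helper names `isometries` and `fit_contrast_brightness` are hypothetical.

```python
import numpy as np

def isometries(block):
    """Yield (t, transformed block) for the 8 symmetries of a square block."""
    b = np.asarray(block, dtype=float)
    for t in range(8):
        rotated = np.rot90(b, k=t % 4)            # 0, 90, 180, 270 degrees
        yield t, (np.fliplr(rotated) if t >= 4 else rotated)

def fit_contrast_brightness(range_block, domain_block):
    """Least-squares s and o, plus the resulting distortion d (assumed MSE)."""
    x = np.asarray(range_block, dtype=float).ravel()
    y = np.asarray(domain_block, dtype=float).ravel()
    n = x.size
    denom = n * np.dot(y, y) - y.sum() ** 2
    # A flat domain block carries no contrast information: fall back to s = 0.
    if abs(denom) < 1e-12:
        s = 0.0
    else:
        s = (n * np.dot(x, y) - x.sum() * y.sum()) / denom
    o = (x.sum() - s * y.sum()) / n
    d = float(np.mean((s * y + o - x) ** 2))
    return float(s), float(o), d
```

For a range block and a candidate domain block of the same size, the encoder would call `fit_contrast_brightness` on each of the eight outputs of `isometries` and keep the triple (t, s, o) with the smallest d.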
The decoding operation simply iterates on an initial image with the encoding parameters, as defined in Eq. (5), until it converges to the final decoded image. The initial image $D_0$ can be an arbitrary image with the same size as the encoded image.
$$R_j = s \cdot D_{j-1} + o \cdot U \quad (5)$$
where $D_{j-1}$ represents the domain block at the (j-1)-th iteration, U is a matrix with the same size as $D_{j-1}$ whose elements are all 1, and $R_j$ represents the result of the j-th iteration. In general, 7 to 8 iterations are needed to obtain a good decoded image.

The Proposed Method
For traditional fractal encoding, the matching procedure with domain blocks is very time consuming, and reducing the number of comparisons is a challenge. In the traditional method, each image has its own domain block pool, so the domain block pools of different images generally differ. Because domain blocks cannot be reused, extra computation time is incurred. Another problem is that the number of domain blocks increases quickly as the image size grows. What's more, in the decoding process the iteration has to be repeated several times to reduce the decoding error. Ozawa [17] shows that any two images can be used as each other's domain block pool and can be mutually encoded; we believe that a similar pairing relation exists between an image and a fractal image. With this in mind, we establish a codebook, which is called the dictionary in the algorithm. For a range block in an image, we can search for a matching domain block in a public domain block pool. Once the distortion criterion satisfies Eq. (6), the best matching domain block is found.
$$d(R_i, D_{m(i)}) = \min_{k} d(R_i, D_k) \quad (6)$$
where $R_i$ represents the i-th range block in the original image, $D_k$ is the k-th domain block in a BTC (block truncation coding [18]) queue, which will be discussed in Section 3.1, and $D_{m(i)}$ is the domain block that minimizes the value of d with the range block.
In the decoding process, the reconstruction is completed by calculation with the domain blocks in the dictionary only once. Thereby, the decoding process operation can be accomplished quickly and has no iterative errors.

Block Truncation Coding
Delp et al. [18] presented a block truncation coding (BTC) scheme for image compression. It is a lossy image compression technique for grayscale images. In this method, each block can be converted to a BTC value. First, the original image is divided into non-overlapping blocks. Within each block, every pixel is represented by one bit. As shown in Eq. (7), if the pixel value x is greater than or equal to the average value of the block, $x_{ave}$, the bit is set to 1; otherwise it is set to 0.
$$\bar{x} = \begin{cases} 1, & x \ge x_{ave} \\ 0, & x < x_{ave} \end{cases} \quad (7)$$
We can treat the resulting bit matrix as a binary sequence and calculate its decimal value, which is called the BTC value. Because each domain block has exactly one BTC value and a BTC value can be shared by a series of domain blocks, the BTC value can be treated as a classifier for the domain blocks. Before generating a dictionary in our algorithm, each domain block is classified by its BTC value, i.e. the domain blocks sharing the same BTC value are added to the corresponding BTC queue, as shown in Figure 2.
In the classifying process, the matching norm d is calculated between each block. The block will be added into the dictionary either if the norm value is greater than a threshold or if the number of the blocks with the same BTC value is less than the upper limited number set in the algorithm.
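The BTC value of a block can be computed as in the following minimal sketch. The bit order (row by row, most significant bit first) is an assumption, since the text only states that the bit matrix is read as a binary sequence, and the function name `btc_value` is hypothetical.

```python
import numpy as np

def btc_value(block):
    """Threshold a block at its mean (Eq. (7)) and read the bits as an integer."""
    b = np.asarray(block, dtype=float)
    bits = (b >= b.mean()).astype(int).ravel()   # row-major order (assumed)
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)          # most significant bit first
    return value
```

For a 4×4 block the result lies between 1 and 65,535; a perfectly flat block yields 65,535 because every pixel equals the average.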

Fractal Dictionary
As mentioned above, the fractal dictionary (FD) is a set of domain blocks. Obviously, for various kinds of images, a fractal dictionary with rich domain block contents helps achieve good coding results. These rich contents characterize comprehensive image patch features, such as edge features, smooth features and texture features. To obtain abundant image blocks, we utilize the Mandelbrot sets and Julia sets (abbreviated as M sets and J sets) to generate the dictionary. M sets and J sets are the classic fractal images. They carry rich information (see the discussion in Section 4.4.3) and display embedded topological structures at different scales.
Take the mapping function f(z) = z^2 + c as an example. The M set is obtained by varying c: each point of the M set image corresponds to one value of c, iterated under the rule $z_{n+1} = z_n^2 + c$. The J set is obtained for a fixed value of c by varying the starting location z under the same rule, as depicted in Figure 3. The points in the M set and on its border converge (their orbits stay bounded), while the points outside diverge.
When a converged point is selected from the M set, a J set can easily be constructed. As Figure 3 shows, the J sets vary with the selected value of c. In our research, all the converged points from the M set are selected to construct the J sets whose image blocks are used for generating the dictionary. The details are as follows:
(1) Obtain the parameters of the J sets. The first step is to create the standard M set image of size W×W. We choose the converged points from the M set image, denoted by $W_N$, where N is the number of selected points.
(2) Construct the J sets. For each point in $W_N$, we obtain the unique c value and create the corresponding J set of size W×W by the escape time algorithm [19]. In the algorithm, the escape time is recorded as the pixel value, denoted by V(k, l), where (k, l) represents the pixel coordinates. All the values should satisfy Eq. (8), where Max_Iterative represents the maximum number of iterations.
Note that $\bar{V}(k, l)$ is the value at (k, l) after normalization and 256 is the number of grey levels of a greyscale image.
(4) Generate domain blocks. After the preprocessed J sets are constructed, we regard each J set as a domain block pool and divide it into domain blocks, as depicted in Figure 1.
(5) Classify the domain blocks. As discussed in Section 3.1, we construct the fractal dictionary. The details are shown in Figure 2.
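The construction of a J set image by the escape time algorithm can be sketched as follows. This is an illustration only: the viewing window (`extent`), the escape radius of 2, the default iteration limit and the linear scaling of escape times to 8-bit grey levels are assumptions, and the function names are hypothetical.

```python
import numpy as np

def in_mandelbrot(c, max_iter=255):
    """True if the orbit of 0 under z -> z^2 + c stays bounded for max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return False
    return True

def julia_escape_image(c, size=256, max_iter=255, extent=1.5):
    """Escape-time image of the J set of f(z) = z^2 + c, scaled to 0..255."""
    axis = np.linspace(-extent, extent, size)
    z = axis[np.newaxis, :] + 1j * axis[:, np.newaxis]    # one start point per pixel
    escape = np.full(z.shape, max_iter, dtype=int)        # V(k, l)
    alive = np.ones(z.shape, dtype=bool)                  # pixels not yet escaped
    for n in range(max_iter):
        z[alive] = z[alive] ** 2 + c
        escaped = alive & (np.abs(z) > 2.0)
        escape[escaped] = n + 1
        alive &= ~escaped
    # Assumed normalization of escape times to 8-bit grey levels.
    return (escape * 255 // max_iter).astype(np.uint8)
```

A dictionary builder would call `julia_escape_image(c)` for each value of c accepted by `in_mandelbrot` and cut the resulting images into domain blocks.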
Once the fractal dictionary is generated, it should be optimized to make it rich enough before it is used for coding. The following aspects should be considered.
• The number of domain blocks in a BTC queue. First, a BTC queue should have enough blocks. Only in this case can it be guaranteed that a range block of the image finds a suitable domain block in the dictionary. However, if the dictionary has too many blocks, coding slows down, as discussed in Section 4.5. In our experiment, the size of a BTC queue is 10.
• The number of redundant blocks in the dictionary. Due to the similarity within the M-set image, the dictionary also includes a large number of similar blocks, which are considered duplicate blocks. Obviously, these duplicate blocks should be removed from the dictionary. A BTC queue in the dictionary should consist of K domain blocks. Initially, a block can be added to the queue if its distortion with respect to the other blocks exceeds 30, where 30 is an estimated value. In the optimization process, each image block in the dictionary is multiplied by a large prime number (similar to Eq. (9)

Image Coding
Once the fractal dictionary is constructed, the image can be encoded as follows: for each range block $R_i$, the domain block $D_{m(i)}$ is selected as the best match for $R_i$, i.e. the block with the minimum value of $d(R_i, D_{m(i)})$. The process is as follows.
1) Calculate the BTC value of the range block and locate the corresponding BTC queue.
2) Calculate the distortion metric. From the above analysis, the block with the affine transformation (t) that minimizes the distortion metric of Eq. (2) is the best matching domain block. Record its position (offset) in the BTC queue, the contrast factor (s), the luminance factor (o) and the affine transformation (t).
Finally, each range block of the encoded image is stored in the format of (BTC, offset, t, s, o). In the proposed algorithm, the searching process is just in a shared dictionary file. The dictionary file can be pre-loaded in the memory, reducing the reading time from the disk.
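The encoding of a single range block against the dictionary can be sketched as below. The sketch assumes that the dictionary is held in memory as a mapping from BTC value to a list of domain blocks, that the distortion is the mean squared error after the least-squares fit, and that the eight transformations are numbered as four rotations followed by their mirrored versions. All names are hypothetical.

```python
import numpy as np

def btc_value(block):
    """Bits of (pixel >= block mean), read row by row as a binary number."""
    bits = (block >= block.mean()).astype(int).ravel()
    return int("".join(str(b) for b in bits), 2)

def transform(block, t):
    """One of 8 square symmetries; the numbering of t is an assumption."""
    rotated = np.rot90(block, k=t % 4)
    return np.fliplr(rotated) if t >= 4 else rotated

def fit(x, y):
    """Least-squares s, o and the resulting mean squared distortion d."""
    x, y = x.ravel(), y.ravel()
    n = x.size
    denom = n * np.dot(y, y) - y.sum() ** 2
    s = 0.0 if abs(denom) < 1e-12 else (n * np.dot(x, y) - x.sum() * y.sum()) / denom
    o = (x.sum() - s * y.sum()) / n
    return s, o, np.mean((s * y + o - x) ** 2)

def encode_range_block(range_block, dictionary):
    """Return (BTC, offset, t, s, o) for the best match in the BTC queue."""
    x = np.asarray(range_block, dtype=float)
    btc = btc_value(x)
    best = None
    for offset, domain in enumerate(dictionary.get(btc, [])):
        for t in range(8):
            s, o, d = fit(x, transform(np.asarray(domain, dtype=float), t))
            if best is None or d < best[0]:
                best = (d, offset, t, float(s), float(o))
    if best is None:
        raise KeyError(f"no domain block stored for BTC value {btc}")
    return (btc,) + best[1:]
```

Only the blocks in one BTC queue are visited, so the number of comparisons per range block is bounded by the queue length times eight, independent of the image size.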

Decoding Process
Because the dictionary is fixed, the iteration in the decoding process executes only once. The algorithm is described as follows: 1. Load the dictionary. ... the dictionary. The reconstruction needs only one iteration, so j = 1.
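The single-pass reconstruction of one range block can be sketched as follows, assuming the stored format (BTC, offset, t, s, o), an in-memory dictionary that maps each BTC value to a list of domain blocks, and the same hypothetical numbering of the eight transformations as an encoder would use (four rotations followed by their mirrored versions).

```python
import numpy as np

def transform(block, t):
    """One of 8 square symmetries; the numbering of t is an assumption."""
    rotated = np.rot90(block, k=t % 4)
    return np.fliplr(rotated) if t >= 4 else rotated

def decode_range_block(code, dictionary):
    """Rebuild one range block as s * t(D) + o in a single pass."""
    btc, offset, t, s, o = code
    domain = np.asarray(dictionary[btc][offset], dtype=float)
    block = s * transform(domain, t) + o
    # Round and clamp to the valid 8-bit grey-level range.
    return np.clip(np.rint(block), 0, 255).astype(np.uint8)
```

Because the domain block comes from the fixed dictionary rather than from the image being reconstructed, no further iteration is required.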

Simulation Results
We use the Miscellaneous image set [20] as our database, which consists of 16 color images and 28 monochrome images. The compression system is shown in Figure 4. All experiments were conducted on a PC with a Core(TM) i5 (2.40 GHz) processor. In the experiment, the sizes of the M set image and of each J set image are both set to 256. The mapping function of both the M set and the J sets is f(z) = z^2 + c. We construct various J sets based on the converged points in the M set.
The J sets are divided into 8×8 blocks, each of which is reduced to a 4×4 block by Eq. (1), so that the encoding time can be reduced when compressing an image. The sixteen binarized pixel values of a domain block can be converted to a decimal number, the BTC value, ranging from 0 to 65,535. Because the pixel values of a block cannot all be smaller than the block average, the minimum value 0 does not occur. Although the pixel values cannot all be larger than the average, they may all be equal to it; according to Eq. (7), the maximum value 65,535 can therefore occur. The number of blocks K with the same BTC value is set to at most 10, so the final dictionary used in the experiments nominally contains (65,536 − 1) × 10 = 655,350 domain blocks. Moreover, for any two blocks with the BTC value 65,535, the matching norm is 0 according to Eq. (2). This means that a single block with BTC value 65,535 can represent all blocks with that BTC value. Therefore, the number of blocks in the dictionary is in fact 655,350 − 9 = 655,341.
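The dictionary size stated above follows from simple counting, which the short sketch below reproduces. The function name `dictionary_size` and its parameter names are hypothetical.

```python
def dictionary_size(block_pixels=16, queue_len=10):
    """Nominal and effective number of domain blocks in the dictionary."""
    patterns = 2 ** block_pixels - 1          # BTC value 0 cannot occur
    nominal = patterns * queue_len            # every queue filled to its limit
    # The all-ones queue holds only flat blocks, so a single block suffices.
    effective = nominal - (queue_len - 1)
    return nominal, effective
```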

Compression Ratio
The traditional scheme needs five parameters for image reconstruction, namely the domain block position ($D_x$, $D_y$), s, o and t. Suppose s and o are each assigned eight bits and the affine transformation t is assigned three bits; for an image of size 256×256, $D_x$ and $D_y$ are each assigned eight bits. The compression ratio is then 3.66 when the range block size is 4×4. However, if the image becomes larger, such as 512×512, the bit allocation for the domain block position is eighteen bits in total, and the ratio drops to 3.46. In the proposed method, a fixed domain block size, 4×4, is used. The bit allocation for each parameter is shown in Table 1.
The corresponding compression ratio is (4×4×8)/(16+4+3+8+8) ≈ 3.28. As we do not record the domain block position, the compression ratio does not change with the image size. In addition, two bits can be allocated for the quantization of s, which then takes four quantized values {0.25, 0.5, 0.75, 1} with approximately zero quantization error [21]. In this way, the ratio rises to (4×4×8)/(16+4+3+2+8) ≈ 3.88. However, if s and o are quantized too coarsely, the compression ratio grows at the cost of poor reconstruction quality. Besides, if we set the range block size to 8×8, the corresponding BTC bit allocation would be 64, so the compression ratio is (8×8×8)/(64+4+3+8+8) ≈ 5.89. From the above analysis, in this fractal coding method the domain block size determines the compression ratio when the sizes of the other parameters are fixed.
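The ratios in this section all follow the same pattern: the raw bits of one range block divided by the bits stored for it. The sketch below makes that explicit; the function name is hypothetical, and the interpretation of the bit counts in the comment (BTC value, offset, t, s, o) is inferred from the stored format rather than quoted from Table 1.

```python
def compression_ratio(block_side, param_bits, bits_per_pixel=8):
    """Raw bits of a square block divided by the bits stored for it."""
    raw_bits = block_side * block_side * bits_per_pixel
    return raw_bits / sum(param_bits)

# Assumed order of the stored fields: BTC value, offset, t, s, o.
fdc_4x4 = compression_ratio(4, [16, 4, 3, 8, 8])
# Traditional scheme on a 256x256 image: D_x, D_y, s, o, t.
tfc_256 = compression_ratio(4, [8, 8, 8, 8, 3])
```

Since no term in the proposed scheme's bit budget depends on the image dimensions, `fdc_4x4` is the same for every image size, whereas the position fields of the traditional scheme grow with the image.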

PSNR and Time Consumption
PSNR (peak signal-to-noise ratio) is commonly used to measure the quality of reconstruction for images. When comparing compression algorithms, PSNR is an approximation to human perception of reconstruction quality. It is defined as follows:
$$PSNR = 10 \log_{10}\left(\frac{max^2}{MSE}\right)$$
where MSE is the mean squared error between the two images, and max is the maximum possible pixel value of the image. In this case, max = 255. Typical PSNR values in lossy image compression are between 30 and 50 dB for a bit depth of 8 bits, where higher is better. In addition, because encoding speed is a known problem in fractal coding, time consumption is another index that we should take into consideration. For a large image, the encoding process takes a long time. The performance is therefore evaluated by two comparative indices, the time consumption and the PSNR value. Figure 5 shows the images decoded by the traditional fractal coding algorithm (TFC) and by our proposed algorithm. Figure 6 shows the decoded images of Baboon; its second row compares region A and region B. It can be seen from Figure 6 that TFC fails on patches with details: as regions A and B show, its recovery is blurred.
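A minimal sketch of the PSNR computation for 8-bit images is given below, using the standard definition in terms of the mean squared error. The function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(original, decoded, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(decoded, dtype=float)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```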

Comparison of Other Scheme
In this section, we compare our proposed method (FDC) with five schemes: traditional fractal coding (TFC), EP_NRS [22], the mutual coding algorithm [17], random sequence dictionary based encoding and the VQ method. The experimental results show that FDC works better both in image reconstruction and in reduction of encoding time. The comparison with TFC is given in Table 2. FDC is at least 158 times faster than TFC, and its PSNR is above 30 dB in almost all cases, which is also higher than that of TFC. The experimental results validate the efficiency of the proposed algorithm. With FDC, the coding efficiency improves greatly. The computational complexity of traditional fractal encoding is O((n/4)×(n/4)×(n−K+1)×(n−K+1)) ≈ O(n^4) for an n×n image when the size of the domain block is K, while the computational complexity of our proposed method is 10×O((n/4)×(n/4)) ≈ O(n^2), which is an obvious advantage over the traditional method.
Therefore, it shows that the FDC can achieve a good coding performance and has a great advantage in coding time.
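The difference between the two complexity estimates can be made concrete by counting block comparisons, as in the sketch below. It counts one comparison per pair of range block and candidate domain block and ignores the eight transformations, which apply to both schemes; the default domain size of 8 and the function names are assumptions.

```python
def tfc_comparisons(n, range_side=4, domain_side=8):
    """Range blocks times overlapping domain blocks in an n x n image."""
    ranges = (n // range_side) ** 2
    domains = (n - domain_side + 1) ** 2
    return ranges * domains

def fdc_comparisons(n, range_side=4, queue_len=10):
    """Range blocks times the (fixed) length of one BTC queue."""
    return queue_len * (n // range_side) ** 2
```

Doubling n multiplies `fdc_comparisons` by four, whereas `tfc_comparisons` grows by roughly a factor of sixteen.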
In addition, we test the time consumption and PSNR value of Lena images with different sizes and the results are shown in   [22] and the mutual coding algorithm [17].
The results are shown in Table 4. Considering that the image size is 512×512 and that a small range block would significantly reduce the encoding speed, we select a range block size of 8×8 when applying the above three algorithms. The green channel of the Lena image is processed in the experiments since it is similar to the greyscale Lena image. The experimental results of the EP_NRS algorithm are obtained from Lin [22]. The efficiency of the mutual coding algorithm depends on the choice of range image, which could enhance the PSNR, but it does not reduce the coding time.
The EP_NRS algorithm has an advantage in time consumption through classifying the range blocks of the image. However, it searches domain blocks in a relatively comprehensive and strict way, which lowers the PSNR value correspondingly. In our proposed algorithm, the PSNR value is competitive. What's more, the number of compared blocks is the smallest, which reduces the coding time greatly. The FDC algorithm is 400 times faster than the TFC algorithm and 100 times faster than the EP_NRS algorithm. In conclusion, the proposed algorithm based on the fractal dictionary achieves a good performance in both reconstruction quality and time consumption.

Comparison with Random Sequence based Dictionary. Considering that the computer can neither produce a large amount of random sequences nor produce a large amount of random blocks with BTC classification at one time, we present the following algorithm to construct the random sequence based dictionary:
Step 1: Generate an array of 256 random numbers ranging from 0 to 255.
Step 2: Randomly produce a number as the array index to select a number from the array.
Step 3: Repeat step 1 and step 2 sixteen times to get a 4×4 block. Calculate its BTC value and add the block to the corresponding BTC queue. If the BTC queue is full, calculate the distortions between the new block and the blocks in the BTC queue, and replace the domain block with the minimum distortion by the new one.
Step 4: Repeat step 1 to step 3 until the number of blocks in the dictionary reaches 10×65,535 (the number of blocks in a BTC queue multiplied by the number of BTC values).
Figure 7 is the comparison between the random dictionary and the fractal dictionary. As Figure 7 shows, both the random dictionary and the fractal dictionary have a good recovery performance in smooth patches and texture patches. However, when it comes to apparent edge blocks, the fractal dictionary does better.
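Steps 1 to 4 can be sketched as follows. The sketch is illustrative: it assumes that the distortion between two blocks is the mean squared error after a least-squares contrast and luminance fit, it uses NumPy's seeded random generator, and it stops at a caller-supplied number of blocks, since filling all 10×65,535 slots with random blocks would take a very long time. All function names are hypothetical.

```python
import numpy as np

def btc_value(block):
    """Bits of (pixel >= block mean), read row by row as a binary number."""
    bits = (block >= block.mean()).astype(int).ravel()
    return int("".join(str(b) for b in bits), 2)

def distortion(x, y):
    """Mean squared error after the least-squares contrast/luminance fit."""
    x, y = x.ravel(), y.ravel()
    n = x.size
    denom = n * np.dot(y, y) - y.sum() ** 2
    s = 0.0 if abs(denom) < 1e-12 else (n * np.dot(x, y) - x.sum() * y.sum()) / denom
    o = (x.sum() - s * y.sum()) / n
    return float(np.mean((s * y + o - x) ** 2))

def random_block(rng, side=4):
    """Steps 1-3: draw each pixel by indexing a fresh random array."""
    pixels = []
    for _ in range(side * side):
        table = rng.integers(0, 256, size=256)   # Step 1: 256 values in 0..255
        index = rng.integers(0, 256)             # Step 2: random array index
        pixels.append(table[index])
    return np.array(pixels, dtype=float).reshape(side, side)

def build_random_dictionary(n_blocks, queue_len=10, seed=0):
    """Step 4: fill BTC queues until n_blocks blocks are stored."""
    rng = np.random.default_rng(seed)
    dictionary, count = {}, 0
    while count < n_blocks:
        block = random_block(rng)
        queue = dictionary.setdefault(btc_value(block), [])
        if len(queue) < queue_len:
            queue.append(block)
            count += 1
        else:
            # Queue full: replace the stored block closest to the new one.
            nearest = min(range(len(queue)), key=lambda i: distortion(queue[i], block))
            queue[nearest] = block
    return dictionary
```

A small call such as `build_random_dictionary(1000)` is enough to inspect the kind of blocks this procedure produces.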
To demonstrate that the proposed method recovers apparent edges well, we select a J set with the mapping parameter c = -0.75+0.056i and enlarge its bottom right part (see Figure 8). From the left image, we can easily see that the J set contains apparent edge patches and smooth patches. From the right image, we can see that the border of the J set is not smooth, which contributes to restoring the texture patches in the image. We therefore consider the J sets to be rich in information. Based on the above discussion, we conclude that a suitable domain block from the J sets, together with a transformation that minimizes the distortion, can be found to match a range block in an image. That is to say, an image can be encoded by the J sets. For a random sequence, however, it is hard to present an apparent and smooth edge.
4.4.4 Comparisons with VQ method. In addition, we use the Lena image and Baboon image as training sets, constructing a VQ (Vector Quantization) codebook with 256 vectors based on LBG [23]. The details are as follows: Codebook1 is trained on both Lena and Baboon image, Codebook2 is trained only on Baboon image and Codebook3 is trained only on the Lena image.
As the proposed method has extra affine transformation calculation processes, the distortion between the range block and domain block becomes more accurate, which contributes to the good reconstruction quality. Table 5 shows that our proposed method achieves better performance than the VQ scheme.
In the end, we make a comparison with VQ method, TFC method and FDC method, shown in Table 6. Also, Table 7 lists the PSNR values of the whole dataset with random sequence based dictionary method, VQ method, TFC method and FDC method.

The Number of Blocks in a BTC Queue
A BTC queue should have enough blocks for matching the range blocks, but too many blocks will reduce the speed of the algorithm. If a BTC queue has very few blocks, it is hard for a range block to find a suitable domain block together with the transformation that minimizes the distortion. On the other hand, if a BTC queue has too many blocks, the search becomes time consuming, because the distortion between the range block and each domain block in the queue has to be calculated to find the closest one. In the experiment, we test the 256×256 Stream and bridge image. This image has many texture patches of various kinds. Therefore, enough domain blocks should be preserved in the dictionary so that a range block with texture can find a suitable domain block with the transformation minimizing the distortion, while smooth patches do not need many domain blocks for matching. The number of domain blocks in a BTC queue thus affects the restoration quality of images with a certain amount of texture patches. In our experiments, we also test the 256×256 Lena image and the 256×256 Clock image, both of which have many smooth patches.
From Figure 9, we can see that when the block number increases, its PSNR grows. However, its comparison time is in linear growth, which means the time in encoding increases at the same time. In the experiments, we select 10 domain blocks in a queue as a compromise proposal.

Conclusions
This paper has presented a new method of fractal image coding. It is based on a fractal dictionary consisting of rich domain blocks generated from J sets. An image range block can be matched with the best-matching block in the dictionary with fewer comparisons and without losing reconstruction quality. Experimental results show that the number of blocks compared during coding is clearly smaller than in the algorithms from the literature, which explains why our algorithm is faster than the other schemes. In addition, the PSNR is satisfactory. The method therefore gives a good reconstruction quality when a fixed fractal dictionary is adopted. What's more, the proposed algorithm is superior in speed, especially for large images.
For future work, we will consider the bit allocation for coding parameters to improve the compression ratio and try to make the fractal dictionary adaptive to different size of domain blocks.

Author Contributions
Conceived and designed the experiments: YYS. Performed the experiments: RDX. Analyzed the data: LNC. Contributed reagents/materials/ analysis tools: RQK XPH. Wrote the paper: YYS RDX.