An image mosaic algorithm based on a binary tree and distortion error elimination

The traditional image mosaic result based on SIFT feature point extraction suffers, to some extent, from distortion errors: the larger the input image set, the greater the distortion of the spliced panorama. To create a high-quality panorama, this paper proposes a new and improved algorithm based on the A-KAZE feature. It changes the way the reference image is selected, putting forward a reference-image selection method based on a binary tree model: the input image set is taken as the leaf node set of a binary tree, and a complete binary tree is constructed bottom-up. The root node image of the binary tree is the ultimate panorama obtained by stitching. Compared with the traditional approach, the novel method improves the accuracy of feature point detection and enhances the stitching quality of the panorama. Additionally, the improved method proposes an automatic image straightening model to rectify the panorama, which further reduces panoramic distortion. The experimental results show that the proposed method can not only enhance the efficiency of image stitching but also reduce panoramic distortion errors and obtain a better-quality panoramic result.


Introduction
Image mosaic is the integration of multiple images with overlapping regions into a non-distorted, high-resolution panoramic image [1][2]. Improving the real-time performance and splicing quality of image mosaic has become an important research topic in the fields of computer vision and graphics [3][4].
In the field of image stitching, related algorithms can be divided into two kinds: image mosaic based on gray-level information, and image mosaic based on features [5]. The former calculates the degree of similarity between the two images using the pixel values of the images to be spliced, in order to determine the overlapping area and realize the splicing. However, this method is computationally intensive and not robust enough [6]. The latter extracts the relevant feature information in each image, then matches the features of the two images to obtain the mapping relation between them. Most algorithms are based on local features because of their efficiency and robustness.
In 2004, Lowe D. G. summarized and formalized the SIFT (Scale Invariant Feature Transform) algorithm [7]. In 2007, Lowe D. G. expanded this work and presented panoramic automatic stitching software [8], which is very robust to image rotation, scaling, and scale transformation. However, because of its large amount of computation, it is difficult to meet real-time requirements. In 2008, Bay H. proposed the SURF (Speeded Up Robust Features) algorithm [9]. Its descriptor has lower complexity than SIFT, which improves the real-time performance of the algorithm. However, in constructing the image pyramid, both algorithms use linear Gaussian filters, which blur boundaries and lose important details [10], thus affecting the accuracy of the feature points. In 2011, Rublee E. proposed the ORB (Oriented FAST and Rotated BRIEF) algorithm [11], a fast feature extraction and matching algorithm. It is very quick, but it is less effective with respect to scale. In recent years, many researchers have studied image mosaics [12][13][14][15][16]. The A-KAZE algorithm, based on nonlinear scale decomposition, can solve the above problems. Therefore, we use the A-KAZE algorithm to extract image feature points [17][18], ensuring real-time and accurate feature point extraction and localization.
The image splicing process of Song F. H. takes the first image of the sequence as the reference image [19] and gradually splices the panorama from left to right. When the number of input images is large, the final spliced result is seriously distorted. Later, the authors proposed an improved algorithm that selects the middle position of the scene as the reference image [20], gradually splicing the panorama from the middle outward to the two sides. Theoretically, the improved method halves the distortion compared with the original method. The camera calibration method has also been used for image stitching [21]. On this basis, we put forward a method of image splicing based on a binary tree model. The proposed method takes the input image set as the leaf node set of a binary tree, then uses a bottom-up approach to construct a complete binary tree whose root node image is the ultimate panorama obtained by stitching. We also propose an automatic image straightening model according to the different degrees of distortion and morphology of the panorama. It has been demonstrated that this method can significantly reduce the distortion of the panorama compared with traditional digital image mosaicking. The overall flow chart of the proposed method is shown in Fig 1. The improved image splicing method proceeds as follows. Input: a sequence of n (n ≥ 2) images S(S1, S2, ..., Sn) with overlapping regions.
(i) Image preprocessing: This stage mainly includes image denoising, image geometric correction, color correction and cylindrical projection to facilitate subsequent image stitching.
(ii) Construction of the binary tree model: According to the properties of a binary tree, there are at most 2^(i−1) (i ≥ 1) nodes in the i-th layer of a non-empty binary tree. The sequence S(S1, S2, ..., Sn) of the n input images is taken as the leaf node set of the binary tree. The number of layers i of the binary tree is obtained by the formula i = ⌈log₂ n⌉ + 1. Construct the binary tree model with the n input images as leaf nodes.
(iii) Image registration of all left and right subtrees in the binary tree model: (a) Let the images of the left subtree and the right subtree be Sk and Sk−1 respectively, k ∈ [1, n−1]. Use the A-KAZE algorithm to extract the feature points of images Sk and Sk−1, then use the bidirectional KNN algorithm to search for matching feature points between Sk and Sk−1 according to the shortest Euclidean distance; the feature point set is stored in the array featureList.
(b) Calculate the affine transformation model mapping image Sk to Sk−1 based on the matching feature point data set featureList, and save the result in the array HList; (c) Return to step (a) to continue the image registration of the left and right subtrees until all leaf node image registration is completed.
(iv) According to the array HList, use formula (5) to calculate the affine transformation matrix H mapping the right subtree image to the left subtree image among the leaf nodes of the binary tree.
(v) Apply the obtained affine matrix to the right subtree image, so that the right subtree image shares the same coordinate system as the left subtree image.
(vi) Find the optimal stitching line between the right subtree image and the left subtree image, then apply Laplacian fusion based on the optimal stitching seam to achieve seamless splicing of the two images and obtain the splicing result ImgResult.
(vii) Add ImgResult to the set S to replace the two spliced input images; in the next splicing pass, ImgResult is regarded as a new input image.
(viii) Go to step (iv) to perform the leaf node image splicing of the next group of left and right subtrees, until all leaf node images are spliced.
(ix) Go to step (ii) to construct the complete binary tree from the next bottom-up recursion, until there is only one image in the set S, which is the panoramic view.
(x) Pass the panorama through the automatic image straightening model to correct it. Output: a panorama produced by the binary tree splicing with reduced distortion error.
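The bottom-up recursion in steps (ii)-(ix) can be sketched in Python. This is a minimal illustration only: `stitch_pair` is a placeholder for one registration-and-fusion step (steps (iii)-(vi)) and simply records the pairing order, so the structure of the binary tree recursion is visible.

```python
import math

def layer_count(n):
    # Number of binary-tree layers for n leaf images: i = ceil(log2 n) + 1
    return math.ceil(math.log2(n)) + 1

def stitch_pair(left, right):
    # Placeholder for one registration + fusion step; records the pairing only.
    return f"({left}+{right})"

def binary_tree_stitch(images):
    # Bottom-up recursion: pair adjacent nodes on each pass until a single
    # root image -- the panorama -- remains.
    level = list(images)
    while len(level) > 1:
        nxt = [stitch_pair(level[k], level[k + 1])
               for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an odd leftover node is carried up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]

print(layer_count(8))                                  # 4
print(binary_tree_stitch(["S1", "S2", "S3", "S4"]))    # ((S1+S2)+(S3+S4))
```

For n = 8 the tree has ⌈log₂ 8⌉ + 1 = 4 layers, matching the example discussed later for Fig 10.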

A-KAZE feature point extraction
Alcantarilla et al. [17] proposed a fast multi-scale feature detection and description algorithm called A-KAZE in 2013. The three main steps of extracting image features with the A-KAZE algorithm are: (i) The nonlinear scale space is constructed by using the principle of the nonlinear diffusion filter and the fast explicit diffusion (FED) algorithm to solve the implicit difference equations [22][23]. The nonlinear diffusion equation is ∂L/∂t = div(c(x, y, t) · ∇L), where L is the brightness of the image, t is the scale parameter, div and ∇ represent the divergence and gradient operators respectively, and c(x, y, t) is the conductivity function.
(ii) The feature points of interest are detected. These feature points are the local maxima (over a 3×3 pixel neighborhood) of the scale-normalized Hessian determinant in the nonlinear scale space. The Hessian is calculated as shown in formula (2): L^i_Hessian = σ²_{i,norm} (L^i_xx L^i_yy − (L^i_xy)²). In formula (2), σ²_{i,norm} is the normalized scale factor of the octave of each image in the nonlinear scale space; L^i_xx and L^i_yy are the horizontal and vertical second-order partial derivatives of the image, respectively, and L^i_xy is the second-order cross partial derivative. (iii) The feature vectors are constructed and the main orientations of the feature points are calculated. Based on first-order differential images, feature vectors with scale and rotation invariance are extracted. A-KAZE uses a new binary descriptor, M-LDB (Modified-Local Difference Binary), to describe the feature points. We select a patch around the feature point, divide each image patch into n×n equal-sized grids, and extract representative information from each grid cell. Then a binary test is performed on each pair of grid cells (i and j). The binary test operation ϖ returns 1 if Func(i) > Func(j) and 0 otherwise, where Func(·) represents the function for extracting information from a grid cell.

A-KAZE feature registration
After extracting the A-KAZE feature points, two KD-trees are constructed for the reference image and the target image respectively. The next step is to take each in turn as the reference for KNN (K Nearest Neighbor) matching, then keep the matching pairs common to both matching directions as the initial matches. Finally, the RANSAC algorithm is adopted to remove outliers and estimate the affine transformation matrix between the images: r (r = 3) pairs are randomly selected from the N matched pairs in the rough matching to estimate the parameters of the affine transformation matrix. In homogeneous form, the matrix has entries h11, h12, h13 in the first row and h21, h22, h23 in the second, which make up the affine transformation, and corresponding points are related by formula (5): [x′, y′, 1]ᵀ = H[x, y, 1]ᵀ. We choose as the affine transformation matrix the matrix H that corresponds to the maximum number of inliers. Since the affine transformation matrix has 6 degrees of freedom, 3 pairs of non-collinear matching feature points are randomly selected to estimate it.

Find the optimal stitching line
The goal of image fusion is to automatically transfer the meaningful information contained in multiple source images to a single fused image without information loss [24][25][26]. After image registration, direct synthesis leads to discontinuous color transitions [27], and image artifacts appear when there are moving objects [28][29][30]. Therefore, an optimal stitching line must be found to eliminate the artifacts and hide the image edges [31]; along this line, the color difference between the two sides of the image should be minimal and the geometry of the neighborhoods should be similar. Dynamic programming is used to obtain the optimal stitching line with minimum energy. The energy is defined as E = αE_C + βE_G, where E_C represents the color difference in a 5×5 rectangular neighborhood around stitching line pixels, E_G represents the change of texture, and α and β are weights with α + β = 1. After many experiments with α and β, we found the best outcome at 0.83 and 0.17 respectively.
According to the energy formula, an overlapping point is taken as the starting point P, and the three pixels adjacent below P are taken as the directions of expansion to find the optimal stitching line.
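The dynamic-programming search can be sketched as follows over a precomputed energy map (a toy array stands in here for E = αE_C + βE_G): each row extends the seam to one of the three adjacent pixels below, minimizing accumulated energy, and the path is recovered by backtracking.

```python
import numpy as np

def optimal_seam(energy):
    """Dynamic-programming seam: each row extends to one of the three
    adjacent pixels below, minimizing accumulated energy."""
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    for y in range(1, h):
        left  = np.r_[np.inf, cost[y - 1, :-1]]   # down-left predecessor
        mid   = cost[y - 1]                       # straight-down predecessor
        right = np.r_[cost[y - 1, 1:], np.inf]    # down-right predecessor
        cost[y] += np.minimum(np.minimum(left, mid), right)
    # Backtrack from the minimum-cost pixel in the last row.
    seam = [int(np.argmin(cost[-1]))]
    for y in range(h - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    return seam[::-1]

# Toy overlap energy: a zero-cost vertical corridor at column 2.
E = np.full((5, 5), 9.0)
E[:, 2] = 0.0
print(optimal_seam(E))   # [2, 2, 2, 2, 2]
```

In the real pipeline, `energy` would be computed per overlap pixel from the color and texture terms above rather than given directly.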

Elimination of the stitching line
In actual operation, mosaic traces may still exist. This is because different shooting angles lead to different image exposures.
To give the stitching line a natural transition, Laplacian fusion is used to eliminate the stitching line by creating a mask image I_R of the stitching line. The area on the left side of the stitching line is filled with a pixel value of 255, and the right side is filled with 0, as shown in Fig 6(C). The minimum bounding rectangle R of the optimal stitching line is the area framed by the dashed frame, and the left and right boundaries of R are x_min and x_max. According to the experimental results, an empirical threshold ξ (ξ = 30) is obtained. We take the rectangular area R′ with left boundary x_min − ξ and right boundary x_max + ξ; R′ is framed by the solid frame. The steps of the Laplacian fusion algorithm are as follows: (i) The target image I_1 and reference image I_2 after registration are expanded to the same size as the mask image, and the extended pixel values are set to 0, as shown in Fig 6(A) and 6(B). (ii)-(iii) The Gaussian pyramids G_l of the two images are built, and the Laplacian pyramid levels are obtained as L_l = G_l − expand(G_{l+1}), where expand(G_{l+1}) is the same size as G_l and is obtained by up-sampling G_{l+1}. (iv) The mask image is processed by Gaussian expansion, which smooths the area around the stitching line; the Gaussian pyramid of I_R is then created and recorded as G_R.
(v) According to the fusion criteria, the Laplacian pyramids of the two images are combined level by level under the weighting of the mask pyramid G_R, and the fused panorama is reconstructed by collapsing the combined pyramid.

Image mosaic based on binary tree model
Song F. H. [19] uses the first image in the sequence as the reference image, and each splicing step uses the previous result as the new reference. As splicing proceeds, the overlapping area between the new input image and the reference image occupies a smaller and smaller proportion of the total reference image area, so image matching consumes a lot of system resources and the splicing speed is very slow.
Later, reference [20] proposed an improved algorithm that starts splicing from the middle of the scene's image sequence. The image in the intermediate position is used as the reference image. By calculating the affine transformation matrix H[i] between adjacent images, the affine transformation matrix from an image at an arbitrary position to the reference image is obtained indirectly as the product of the adjacent-pair matrices, as in formula (14) (k stands for the intermediate image position index); the next image to splice is selected dynamically according to the counted number of feature points matching its adjacent image. The process of calculating the affine transformation matrix is shown in Fig 9. We propose a method of image splicing based on a binary tree model. The main idea is to build the binary tree from the leaves upward, as shown in Fig 10. According to the nature of the binary tree, the i-th layer of a non-empty binary tree has at most 2^(i−1) nodes (i ≥ 1). The set of n (n ≥ 2) input images is taken as the leaf node set S(S1, S2, ..., Sn) of the binary tree, and the number of layers i of the binary tree built from the n images is given by i = ⌈log₂ n⌉ + 1. The bottom-up method recurses i times. In each recursion, the reference image for each splice is selected according to the counted number of matching feature points between node images, and the complete binary tree is constructed; the binary tree root node is then obtained, that is, the panorama of the multiple stitched images.
In Fig 10, for n = 8 we have i = ⌈log₂ 8⌉ + 1 = 4: the leaf nodes at layer i = 4 are the input images, and the layers with i < 4 are constructed by bottom-up recursive splicing. When i = 1 the recursion ends, and the root node is the panorama of the multiple stitched images; the experimental results are shown in Fig 11.
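The indirect computation of a leaf-to-reference affine matrix by chaining adjacent-pair matrices (the idea behind formula (14)) amounts to a matrix product once each 2×3 affine is promoted to homogeneous 3×3 form. The H[i] values below are hypothetical, purely to show the composition.

```python
import numpy as np

def to3x3(A):
    # Promote a 2x3 affine matrix to 3x3 so transforms compose by product.
    return np.vstack([A, [0.0, 0.0, 1.0]])

# Hypothetical adjacent-pair affines H[i]: image i -> image i+1
# (pure translations here, for readability).
H = [np.array([[1.0, 0.0, 10.0], [0.0, 1.0, 0.0]]),
     np.array([[1.0, 0.0, 12.0], [0.0, 1.0, 1.0]])]

# Map image 0 into the frame of reference image 2 by chaining: H02 = H1 @ H0.
H02 = to3x3(H[1]) @ to3x3(H[0])
print(H02[:2])   # the translation accumulates to (22, 1)
```

The binary tree model relies on the same composition, but only across the O(log n) levels between a leaf and the root, which is why accumulated error grows more slowly than in left-to-right chaining.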

Model of automatic image straightening
When splicing multiple image sequences, oblique distortion occurs due to the accumulation of errors, and it becomes more obvious as the number of input images increases [32][33]. Therefore, we put forward an automatic image straightening model for the different slant degrees and slant morphologies of the multi-image mosaic. As shown in Fig 12, the four vertex coordinates of the panorama are denoted (a.x, a.y), (b.x, b.y), (c.x, c.y) and (d.x, d.y). The tilt angle θ of the panorama is then calculated using the arctangent formula. If T1 < θ < T2 (0 ≤ T1 < T2 ≤ π/2), where T1 and T2 are the thresholds of the correction model, the panorama is corrected; after many experiments, T1 and T2 were set to 1° and 10° respectively. As shown in Fig 12(D), the four vertex coordinates after straightening are set as follows: the top left and bottom left coordinates are unchanged, and the top right and bottom right coordinates become (X, a.y) and (X, b.y), where X is the estimated length of the panoramic image. Since the panorama before correction is distorted, the maximum of X1 and X2 is taken as the length of the panorama after correction.
The perspective transformation matrix can be calculated from the four pairs of coordinate points, applied to the whole panorama, and bilinear interpolation is then used to complete the image straightening. The resulting image correction is S′ = H · S, where S is the image matrix and H is the perspective transformation matrix. The experimental results are shown in Fig 13.

The experimental results and analysis
The proposed method was tested on multiple sets of images from multiple scenes. We selected 4 sets of comparative experimental results from the experiments to present and analyze in this article. The experimental software and hardware environment was: CPU: Intel(R) Core(TM) i3-2330M 2.20 GHz, OS: Windows 7, Library: OpenCV 3.0.0. It can be seen from Table 1 that even though the number of feature points extracted by SIFT is larger than that of A-KAZE, for the same number of feature points the time cost of A-KAZE is significantly less than that of the traditional SIFT algorithm.
The correct matching probability of the image is defined in formula (19). The experimental results, combined with Fig 22, show that after RANSAC eliminates the false matches, the proposed method has a higher correct matching rate and stronger robustness than the SIFT feature matching algorithm. Fig 23 compares the mosaic times of four methods for the panorama. We can clearly see that the total stitching time of the proposed algorithm is similar to that of the camera-calibration-based method, and the splicing efficiency of these two algorithms is obviously better than that of the other splicing algorithms. The camera calibration method, however, is based on image blocks: it makes a rough match on the blocked images and a fine match on the most similar blocks by taking advantage of the FAST algorithm. Although the camera-calibration-based method is slightly better in time efficiency than the proposed algorithm, it can be seen from Table 2 and Table 3 that it easily causes a certain degree of distortion in the splicing, resulting in distortion of the panorama. Therefore, on comprehensive consideration, the proposed algorithm is still superior to the camera calibration method.
Table 1. Comparison of experimental data for extracting feature points.
The angle β is used to express the degree of distortion, as shown in Fig 24. Points P1 and P2 are the midpoints of the left and right boundaries of the panoramic image, respectively. The two points are connected by a straight line, and β is the angle between this line and the horizontal line.
It can be seen from Table 1 that the panorama obtained by the proposed algorithm has almost no distortion; compared with the other four algorithms, ours is clearly superior.
We also define the variable Info_proportion, which is the ratio of the number of useful scene pixels to the total number of pixels of the whole image. The definition is shown in formula (20), in which width and height are the dimensions of the panoramic image. Info_proportion also approximately reflects the degree of panorama distortion.
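A minimal computation of Info_proportion, under the assumption that unfilled border pixels left by warping are zero-valued; this is an illustrative convention, not necessarily the exact form of formula (20).

```python
import numpy as np

def info_proportion(pano, border_value=0):
    """Ratio of useful (non-border) pixels to width * height."""
    useful = np.count_nonzero(pano != border_value)
    h, w = pano.shape[:2]
    return useful / (w * h)

# Hypothetical 100x200 panorama whose warped corner left 1500 empty pixels.
pano = np.full((100, 200), 128, np.uint8)
pano[:30, :50] = 0              # 1500 empty border pixels
print(info_proportion(pano))    # 0.925
```

A heavily tilted panorama leaves large empty corners after warping, so a lower Info_proportion indicates stronger distortion.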
From Table 2 and Table 3, it can clearly be seen that, as the number of stitched images and the image resolution increase, the panorama produced by the Song algorithm suffers severe distortion and tilt, which results in a low information proportion. The other three methods, i.e., the splicing algorithm that splices from the middle to the two sides, the autostitch-based method, and the camera-calibration-based algorithm, show similar results: as the number of stitched images increases, the distortion of the panorama grows, leading to a decrease in its information ratio. The algorithm proposed in this paper, however, is almost unaffected by the number of images and their resolution. Therefore, our algorithm obtains high-quality panoramic images and greatly reduces panoramic distortion.

Conclusions
We present an image stitching method based on a binary tree, which alleviates the problems of blurred boundaries and detail loss. In addition, the improved method accelerates panoramic stitching and obtains a high-resolution, high-quality panorama. Unlike the traditional stitching method, the improved method changes the process of selecting the reference image and puts forward a reference selection method based on the binary tree model, which takes the input image set as the leaf node set of a binary tree; a complete binary tree is then constructed bottom-up, and the root node image of the binary tree is the ultimate panorama obtained by stitching. Meanwhile, the improved method proposes an automatic image straightening model to rectify the panorama, which further reduces panoramic distortion. The experimental results show that the proposed method improves splicing efficiency, enhances the robustness of feature point matching, and greatly reduces panoramic distortion.