Robust 3D point cloud registration based on bidirectional Maximum Correntropy Criterion

This paper presents a robust 3D point cloud registration algorithm based on bidirectional Maximum Correntropy Criterion (MCC). Comparing with traditional registration algorithm based on the mean square error (MSE), using the MCC is superior in dealing with complex registration problem with non-Gaussian noise and large outliers. Since the MCC is considered as a probability measure which weights the corresponding points for registration, the noisy points are penalized. Moreover, we propose to use bidirectional measures which can maximum the overlapping parts and avoid the registration result being trapped into a local minimum. Both of these strategies can better apply the information theory method to the point cloud registration problem, making the algorithm more robust. In the process of implementation, we integrate the fixed-point optimization technique based on the iterative closest point algorithm, resulting in the correspondence and transformation parameters that are solved iteratively. The comparison experiments under noisy conditions with related algorithms have demonstrated good performance of the proposed algorithm.


Introduction
The development of scanning equipment makes the acquisition of 3D point cloud possible [1,2]. Point cloud registration problem has attracted considerable attentions in computer vision [3,4]. The goal of registration is to find the correspondence and estimate the optimal rigid transformation between two point clouds. The conventional methods based on the iterative closest point (ICP) [5,6] algorithm have been widely used for the advantages of high speed and accuracy. Usually, such methods build a global hard correspondence which are efficient but not robust enough with the presence of various noise, outliers and partial difference.
To overcome the above disadvantage, two kinds of improvements have been applied by researchers. One is to replace the hard assignment with soft assignment, which uses one-tomore correspondences instead of one-to-one mapping [7,8]. For example, Liu et al. [9] added the expectation maximization principle to the ICP, and used the SoftAssign for registration of overlapping point sets. Jian et al. [10] modeled point sets as probability density functions (PDF) by Gaussian mixture model. The distance between two global distributions is minimized for fitting two point sets with some smooth transformation constraints. Similar idea a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 was used by Zhou et al. [11] for non-rigid point set registration. These methods are accurate and robust. But they need to estimate the correspondence from each point to all points. The computational complexity is very high. The other way is to select part of reliable point-pairs to estimate the transformation. The straightforward solution is to add a penalty term for the corresponding points with greater distance [12]. In order to balance the penalty intensity automatically, a parameter related with the overlapping ratio was applied into a series of methods [13][14][15] to adjust the trimming and matching ratio. These methods are effective to overcome the challenge of partial difference. But in reality, when varied kinds of non-ideal conditions occur together, they are hard to deal with them jointly.
Most traditional registration algorithms build the distance measure based on the mean square error (MSE). In information theoretic learning (ITL) [16,17], researchers have noticed the poor performance of MSE to process non-Gaussian data. Instead, correntropy, a nonsecond statistical measure, has shown significant potential in the general robust learning and signal processing [18][19][20][21]. The core reason for its robustness is that the correntropy directly estimates the probability of how similar the two random variables are. The adverse effects of noise and outliers are automatically eliminated by assigning small weights. The related work is a shape matching method [22] based on the Maximum Correntropy Criterion(MCC), which is equivalent to minimizing C-Loss between two point sets. This work has shown some superior in dealing with non-Gaussian noise and large outliers. But the algorithm assumes the correspondences between two point sets should be assigned in advance. Moreover the result's stability should be improved. More recently, Xu et al. [23] employed the correntropy and Yang et al. [24] employ a non-second statistical measure in kernel space under the ICP framework, which can establish the correspondence and do the matching concurrently. However their algorithms are easily converge to partial noisy points. Later, we will present some registration results in [23] in the experiments for comparison. Different from the above methods, this paper presents a robust point cloud registration algorithm based on the bidirectional Maximum Correntropy Criterion (BiMCC). Since the MCC is considered as a probability measure which weights the correspondence of point sets for registration, the noisy points are penalized. Moreover, using bidirectional measure can maximum the overlapping parts and avoid the registration result being trapped into a local minimum. Both of these two strategies can better apply the information theory method to the registration problem, making the algorithm more robust. In the process of algorithm implementation, we integrate the fixed-point optimization technique into the ICP framework, resulting in the correspondence and transformation parameters that are solved jointly. The comparison experiments under noisy conditions with related algorithms have demonstrated good performance of the proposed algorithm.
The rest of the paper is structured as follows. In Section 2, we give a brief review of correntropy. Following that, Section 3 proposes a novel formulation and optimization algorithm for noisy point cloud registration based on the BiMCC. In Section 4, we present experimental results tested on the benchmark data set to demonstrate the performance. Section 5 concludes this paper.

Brief review of correntropy
In real world, signals are generally captured from unconstrained scenarios by sensors, which brings big challenges for computer processing. The point cloud captured from 3D space has unpredictable nature errors. Some errors may be with large magnitude as outliers, some of them may be small but with arbitrary distributions. Therefore, a robust cost function should have the ability to filter out all kinds of errors. Today, the MSE is the most widely used in cost function for registration. However, as a standard second order statistics, it is a good solution only for processing Gaussian noise. Recently, correntropy is proposed in ITL to handle non-Gaussian noises and large outliers, which has widely and successfully applications in signal processing and machine learning. Essentially, correntropy is a second order statistical measure in kernel space, which corresponds to a non-second order measure in original space.
Given two arbitrary random variables X and Y, the correntropy is defined as where κ(Á) stands for a kernel function, satisfying κ(x) ! 0 and denotes the mathematical expectation. In this work, without mentioned otherwise, the kernel function is selected as a Gaussian kernel, given by where σ denotes the kernel bandwidth. In a practical situation, the joint probability density function is usually unknown, and one has to estimate it from a finite number of data pair ðx i ; y i Þ N i¼1 . Based on the Parzen window approach, the sample estimator of correntropy is simply estimated aŝ We can see from the Eq (3) that correntropy is directly related to the probability of how similar two random variables along the line X = Y and in a neighborhood of the joint space controlled by the kernel bandwidth. This is the reason that correntropy weights the correspondence of points.

Proposed algorithm
In this section, we first define the bidirectional correntropy for two point cloud data, and propose the rigid registration formulation for noisy point cloud based on the bidirectional Maximum Correntropy Criterion (BiMCC). And then we develop the optimization method to estimate optimal parameters. Finally, the theory analysis is presented with the algorithm property.

Problem formulation
Given two point clouds in 3D space, denoted as a model point cloud ðỹ j 2 R 3Â1 Þ, the goal for registration is to build up the correspondence and calculate the rigid transformation to achieve the best alignment with two point clouds. We assume the corresponding point forx i in the opposite point cloud isỹ cðiÞ , using c(Á) to denote the corresponding index. The registration process is to rigid transform the pointx i by a rotation matrix R and translation vectort toỹ cðiÞ , then the error between two matched points is For each point in X, it can be found one corresponding point in Y by the nearest neighbor searching. These corresponding points compose a new data set denoted asỸ , which has the same number and order with X. If model point set and its corresponding point set are considered as two random variables with noise and outliers, we measure their correntropy as: It is easy to know thatỸ & Y. So, when we only consider one corresponding direction, part of the points in Y will be ignored in the formulation. Without a good initial position, most of the Y are wrong at the beginning. Especially with the presence of noise, the distance between model points and noisy points may be smaller compared with other true correspondences. In that case, these noisy points are trusted for registration which causes the result being stick to partial noisy region.
To avoid this situation, we define the bidirectional correntropy for two point clouds. Correspondingly, for each point in Y, it can be found one corresponding point in X by the nearest neighbor searching. These corresponding points compose a new data set denoted asX, which has the same number and order with Y. It can be understood as we construct two big point sets fX;Xg and fỸ ; Yg, respectively, with the same point number N = N x + N y . With the help of more correspondence between fX;Xg and fỸ ; Yg, the weight of local registration will be reduced. The result leans to a global registration.
Then, the bidirectional correntropy is formulated as: The two terms in Eq (6) are the sum of the forward distance and backward distance between two point sets respectively. For each term, it is formulated as a non-second order manner. This formulation also denotes a distance measure between two point sets. To minimum the distance measure is equal to maximum the bidirectional correntropy, call as BiMCC. Thus, the proposed the cost function for the point cloud registration based on BiMCC is maxV s ðfX;Xg; fY;Ỹ g; R;t Þ s:t: Note that since R is a rotation matrix, it should satisfy the constrained conditions R T R = I 3 and det(R) = 1. The constraint about rotation matrix R should be added, when we maximize the cost function.

Parameter estimation
In the above cost function Eq (7), there are two kinds of parameters to be optimized: 1) the reconstructed point setsX andỸ . It is equal to find the one-to-one corresponding index c(i) and d(j) between two point sets, 2) the rigid transformation parameters R andt. Actually, these two kinds of parameters are close related with each other. When one of them has been known, the other is easily to be obtained. Moreover, the parameter estimation can be solved like the ICP algorithm under an iteration framework. We call the proposed algorithm as ICP-BiMCC, which is presented as follows: Step 1: Given the (k − 1) th transformation parameters ðR kÀ 1 ;t kÀ 1 Þ, the bidirectional corresponding points between two point-sets are found by the nearest neighbor searching: The above Step 1 finds the closet point in another point set for each point. Some conventional searching strategies like Delaunay triangulation and k-d tree are able to be used in advance to speed up the computation.
Step 2: According to the current correspondence fi; d k ðjÞ N y j¼1 g and fc k ðiÞ Then the transformation parameters are computed by maximizing the following function: Firstly, taking the derivative of Eq (9) with respect tot, we have: where e i ¼ Rũ i þt Àṽ i . To simplify the cost function, we substitutet Ã into Eq (10) and let Intuitively, point sets P and Q are normalized by weighted centroid. Therefore, It is difficult to solve the constraint optimization problem by simple technique. But, Arun et al. [25] has presented in the ICP algorithm to solve for the rotation matrix with respect to the MSE, by calculating the matrix H and its singular value decomposition (SVD). The difference is that our cost function is based on the MCC, the modified matrix H is And we make the SVD of H, it becomes: H = UΛV. Then we get the optimal R Ã : where, detðHÞ > 0; diagð1; 1; À 1Þ; detðHÞ < 0: Finally, with the known parameters R Ã , we can calculate the translation parametert Ã according to Eq (10). R Ã andt Ã are the current transformation under the last k − 1 time result. Thus, the transformation on the original template should be updated. Update R k andt k by: The above Steps 1 and 2 are repeated until satisfying the stop criteria.
We can see that both the optimal parameters R Ã andt Ã depend on the kernel function κ σ (e i ). Whereas, e i is related to the parameters R andt. So it is a fixed-point equation which can be solved by a fixed-point iterative algorithm. For the solution of the algorithm, we assume that the kernel function is fixed as the last iteration k s ðe i Þ ¼ k s ðR kÀ 1ũi þt kÀ 1 Àṽ i Þ. Since after a few iterations, there is no much difference of parameters between two continuous iterations, this approximation will not affect the final registration result. So the complete algorithm only has one iterative layer. Algorithm 1 summarizes the process of the proposed ICP-BiMCC algorithm for rigid point cloud registration. There are some effective ways [26] to set optimal initialization, which are not discussed in this paper. Here, we use the identity matrix as the initial parameter. Output: transformation R andt, correspondences fi; dðjÞ N y j¼1 g and fcðiÞ N x i¼1 ; jg 1: Initialization: Set R 0 = I 3 andt 0 ¼ 0, and randomly initialized σ 0 2: for k = 1, 2, . . ., K do 3: Set up the correspondence c k (i) and d k (j) by nearest neighbor searching; 4: Compute the kernel κ σ (e k ) by R k−1 andt kÀ 1 ; 5: Reconstruct the point sets P and Q; 6: Solve the rotation matrix R Ã by Eq (14); 7: Solve the translation vectort Ã by Eq (10); 8: Update the transformation parameters R k andt k ; 9: Compute the MSE e k ; 10: Set the kernel width σ k ; 11: Until ke k − e k−1 k < ε. return

Theory analysis
Compared with the conventional ICP algorithm, there are two main improvements of the proposed ICP-BiMCC algorithm. We firstly use a bidirectional measure to build the correspondence, and then compute the rigid transformation by a non-second order measure: correntropy. In this section, we analysis the reasons why the ICP-BiMCC algorithm is more robust by these two improvements.

Computing the transformation.
In each iteration, when correspondence has been established, the one-to-one point-pairs are selected as candidates for computing the rigid transformation. The ICP algorithm uses all of them equally to estimate transformation parameters. But when there are some noise or outliers, the wrong correspondences will drag the global transformation to a bad position. Instead, the ICP-BiMCC algorithm utilises the correntropy measure. As we mentioned, the correntropy is directly related to the probability of how similar two random variables are. Intuitively, we can see the distribution of Gaussian function κ σ (e). When e locates outside of the space around 0 vector, the value reduces quickly. That means only when the two corresponding points are close enough, they could be used reliably for estimating transformation parameters. Otherwise, they have little influence for the result. Since the noisy points and outliers are usually far away from the model points, the proposed algorithm can avoid the bad effects caused by noise and outliers in theory.
Here, the kernel bandwidth σ plays an important role in controlling the weight of each corresponding point's contribution. It affects the range which distinguishes the inliers and noisy points. Since the majority correspondence are not accurate at the beginning, all points should be used without obvious differences. Here we set a big bandwidth. Gradually, after a few iterative steps one-to-one correspondences are built accurately. Noisy points should be filtered out by small bandwidth. Therefore, the kernel bandwidth is varied from large to small. It is corresponding to the coarse-to-fine registration strategy. We employ the Silvermans rule [18] in Eq (15) to adjust the kernel width automatically.
where σ E is the standard deviation of k e i k 2 2 and D is the interquartile range. We can see that σ is adjusted according to the current registration's error, which conforms to the above analysis. Without mentioned otherwise, we apply this chosen principle.

Building bidirectional correspondence.
In each iteration, the model point cloud is transformed by the estimated parameters to a new position. The correspondences are built according to the nearest neighbor principle in 3D space. Now, we assume that target point cloud is noisy. Some of points are disturbed by a non-zero mean noise but with small variance. It's can be imagined that these noisy points are similar with the local shape of model point cloud, such as Fig 1. We consider this situation when a certain percentage of model points is matched to noisy data, other points are mismatched. In this case, if we only build a single directional correspondence for every model point, some of them still find the noisy data with very short distance values, and others find the wrong correspondences with big distance values. Based on the above analysis of MCC, the latter will be considered as noisy points which has no contribution for updating the rigid transformation. Thus, the iterative process terminate, and the registration is trapped into a local minimum as Fig 1(c). Whereas, if we use bidirectional measure to build the correspondence, the most points in target point set will find the right correspondences relatively with model points, which weaken the bad influence caused by noise. Therefore, using bidirectional measures between two point sets can maximum the matching overlaps to reach a global optimization as Fig 1(d).

Experimental results
In this section, we compare the performance among the ICP algorithm [5], ICP-MCC algorithm [23] and our ICP-BiMCC algorithm. The registration precision is reported by the MSE between corresponding points. An alternative way to measure the performance is the errors between estimated values and the ground truth simulation parameters. For rigid transformation, they are defined as The experiments are conducted on the public and standard datasets [27] and [28][29][30]. The following experiments are tested on the noisy data and partial overlapping data, respectively.

Testing on noisy data
In this part, the 3D point clouds 'Bunny', 'Dragon' and 'Happy Buddha' data from S1 Dataset in [27] are tested. The original data is used as model point cloud, and the target point cloud is simulated as the following ways. The model data is firstly rotated by 25 degree around three axis and translated by [0.1, 0.1, 0.1] T . Then 20% points are selected randomly to be added with Gaussian noise N(0, 0.02). 10% points are selected randomly to be added with non-zero mean Gaussian noise N(0.003, 0.018). Since the noise is randomly added to the testing data, we repeat every experiment condition for 10 times. The statistical quantitative results: MSE, E R and Et of three point clouds are shown in Table 1. Since the MSE can only deal with global Gaussian noise, the performance is poor of the ICP algorithm. And the ICP-MCC algorithm can filter out some noise and outiers. But they are misled by those non-zero mean noise, so it achieves a local registration. It can be seen that all registration errors of our ICP-BiMCC algorithm are the smallest. The ICP-BiMCC algorithm has the highest robustness and accuracy to noisy data. Fig 2 compares the convergence curves of 'Bunny', 'Dragon' and 'Happy Buddha' among three algorithms. It can be seen all three algorithms are converged and the ICP-BiMCC algorithm achieves the smallest registration error.
To clearly and intuitively demonstrate the results, Fig 1 displays the experiments on the point cloud 'Bunny'. Blue and red points present the model point cloud and the simulated target point cloud, respectively. In the result of Fig 1(b), we can see that the ICP algorithm uses all corresponding points for registration, whose result is inevitably influenced by the noise. The result of Fig 1(c), the ICP-MCC algorithm mismatches at noisy points, since it is difficult to distinguish the noise and outliers in a local minimum region. However, from Fig 1(d) we can see that the registration result of the ICP-BiMCC algorithm is very accurate. And the noise is filtered out clearly.

Testing on partial overlapping data
We design two kinds of experiments to test whether the result will be trapped into local minimum on partial overlapping data. In the first experiment, we cut off part of point set. Specifically, the 3D point cloud 'Dragon' from S1 Dataset is firstly rotated by 10 degree in each axis and translated by [0.1, 0.1, 0] T . And then, the upper part of the body, which refers to those points with the first dimension being smaller than -0.03, is trimmed. The differences between the initial point cloud and the target point cloud are shown in Fig 3(a). It can be seen all three algorithms are converged and the ICP-BiMCC algorithm achieves the smallest registration error.
In the second experiment, we add some local similar shapes to the original data to test the robustness of the algorithm. During the iterative process, since their local similarity, when the result reaches the partial registration, the other part will be easily considered as outliers. But more robust algorithm can avoid it. Specifically, one more leg and one more tail of 3D point cloud 'T-rex' from from S1 Dataset in [30] are copied, and given a rotation and a translation transformation, then added to the original point cloud. The differences between the initial show the registration results by three algorithms. We can see that the results of ICP and ICP-MCC algorithm are mismatched because of the additional part. However, the ICP-BiMCC algorithm can find the maximum overlapping parts and register two point clouds accurately. Fig 5(b) demonstrate their convergence curves. It can be seen all three algorithms are converged and the ICP-BiMCC algorithm achieves the smallest registration error.

Conclusions
This paper presents a robust point cloud registration algorithm based on the BiMCC. The proposed formulation replaces the similarity measure to the BiMCC loss, which is more robust to the noise and outliers in practice and avoid the registration result being trapped into a local minimum caused by bad initializations. The fixed-point optimization technique under the ICP framework is adopted to solve the proposed formulation. Qualitative and quantitative experimental results under noisy conditions show that our approach outperforms other related methods.