Brain MR image segmentation based on an improved active contour model

It is often a difficult task to accurately segment brain magnetic resonance (MR) images with intensity in-homogeneity and noise. This paper introduces a novel level set method for simultaneous brain MR image segmentation and intensity inhomogeneity correction. To reduce the effect of noise, novel anisotropic spatial information, which can preserve more details of edges and corners, is proposed by incorporating the inner relationships among the neighbor pixels. Then the proposed energy function uses the multivariate Student's t-distribution to fit the distribution of the intensities of each tissue. Furthermore, the proposed model utilizes Hidden Markov random fields to model the spatial correlation between neigh-boring pixels/voxels. The means of the multivariate Student's t-distribution can be adaptively estimated by multiplying a bias field to reduce the effect of intensity inhomogeneity. In the end, we reconstructed the energy function to be convex and calculated it by using the Split Bregman method, which allows our framework for random initialization, thereby allowing fully automated applications. Our method can obtain the final result in less than 1 second for 2D image with size 256 × 256 and less than 300 seconds for 3D image with size 256 × 256 × 171. The proposed method was compared to other state-of-the-art segmentation methods using both synthetic and clinical brain MR images and increased the accuracies of the results more than 3%.


Introduction
Brain disease has become one of the most talked-about diseases in the world. Scientists mainly depend on medical imaging technologies to analyze brain diseases. The use of magnetic resonance imaging has become a preferred choice of method because it is generally painless, harmless and can provide very informative diagnostic images of most of the relevant organs and tissues. Among the MRI (Magnetic resonance image) data analysis, precise measurement of the distribution of tissues of interest (TOI), including gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF), plays an important role for brain studies. In order to obtain the precise measurement, accurate image segmentation is a crucial step. Although many segmentation methods have been reported [1], automated and accurate segmentation still remains a difficult task due to the noise and intensity inhomogeneity (also named as bias field), which can easily be found in MRI.
when the intensity inhomogeneity level was high, an accurate bias field was still difficult to estimate. In order to obtain the estimated bias field, Li et al. [31][32][33] used a linear combination of orthogonal basis functions to fit the bias field, which resulted in the estimated bias field being smooth and accurate. However, these methods are based on clustering methods and have not considered spatial information, which makes the segmentation results be inaccurate.
In this paper, we propose a novel region-based level set model for brain MR image segmentation and bias correction. To integrate the spatial information, we propose a novel treatment of Hidden Markov Random field to construct the data term. Since the Student's t-distribution has one more parameter than that of Gaussian distribution, the proposed method uses the multivariate Student's t-distribution to fit the intensity distribution. In order to further reduce the effect of noise, novel anisotropic spatial information is proposed by incorporating the inner relationships among the neighborhood pixels. The anisotropic spatial information can preserve more details of the edges and corners when it is utilized into the multivariate Student's t-distribution. Following the idea of [31][32][33], we estimate the bias field by using a linear combination of orthogonal basis functions, which can guarantee bias field smooth, and introduce it to the data term, which makes the proposed method estimate the bias field meanwhile segmenting images. In order to make the framework more robust, we reconstructed the energy function to be convex and calculated it by using the Split Bregman theory [34,35,39], which allows our framework for random initialization and thereby allowing fully automated applications. The proposed method has been compared to other state-of-the-art segmentation methods in both synthetic and clinical brain MR images to show that our method can obtain results that are more accurate.

Materials
The synthetic brain MR images are generated from the Brain Web (http://brainweb.bic.mni. mcgill.ca/). The Brain Web provides full 3-D simulated brain data sets with three modalities: T1, T2 and PD; and can produce brain data sets with a variety of slice thicknesses, noise levels and intensity inhomogeneity levels; and provides the ground truths. The clinical brain MR images are downloaded from the Internet Brain Segmentation Repository (IBSR, http://www. cma.mgh.harvard.edu/ibsr/ which can provide full 3-D clinical brain MR image data sets and segmentation results to permit a standardized mechanism for evaluation of the sensitivity of a given analysis method for signal to noise ratio, contrast to noise ratio, shape complexity, degree of partial volume effect, etc.

Proposed method
2.2.1 Hidden markov random field models. We consider an alphabet K ¼ f1; 2; Á Á Á ; Kg. Let S ¼ f1; 2; Á Á Á ; Ng be the set of indexes, H be a random field, whose state space is K, so far for 8x 2 S we have HðxÞ 2 K. Let H be the set of all possible configurations of H so that H ¼ P N x¼1 HðxÞ ¼ K S . For HðxÞ x2S 2 H on the product space, the positive probability distribution p(H(x)) satisfies: pðHðxÞÞ > 0; 8HðxÞ 2 H. Let N ¼ fN ðxÞ : x 2 Sg be a neighborhood system on S, such as x= 2N ðxÞ and y 2 N ðxÞ if and only if x 2 N ðyÞ for 8x; y 2 S. Then the neighborhood system N can be introduced by using MRF into the previously considered random field:

pðhðxÞjhðS À fxgÞÞ ¼ pðhðxÞjhðN ðxÞÞÞ ð1Þ
According to the Hammersley-Clifford theorem, the MRF can be characterized by a Gibbs distribution: Where Z is a normalizing constant given by and U(x) is the energy function with the form C is a class of subsets of the indexes that are all neighbors, V c is the clique potential associated with the clique c and depends on the local configuration of c.
Let I be an observable random field with state spaces I, which is also indexed by the supposed set of sites S and is given as: I ¼ Y N x¼1 I ðxÞ; I ðxÞ ¼ fIðxÞ 2 R w g. Given any H 2 H, the random variables I are independent: Then, the joint probability of (I,H) can be written as: Given the neighborhood configuration HðN ðxÞÞ of H(x), the joint probability of any pair of (I (x),H(x)) can be written by using the local characteristics of MRF [40,41] as:

pðIðxÞ; HðxÞjN ðxÞÞ ¼ pðIðxÞjHðxÞÞpðHðxÞjHðN ðxÞÞÞ ð7Þ
Then the marginal probability distribution of I(x), Θ, and N ðxÞ can be written as: Where Θ is the set of parameters and in this case Θ is treated as a random variable. Compared with MRF, which is only defined with respect to H, HMRF is defined with respect to the pair of random variable families (I,H).

2.2.2
The level set method. The snake model evolves a curve/contour from an initial position in the direction normal to the boundary of the object. One limitation of the original snake model is the explicit representation of the curve, thus topological changes (such as merging and breaking of the curve) may be hard to handle. In order to address this problem, a level set model was introduced in [42].
Later, Chan and Vese introduced a level set model (PC model) [7], which is based on the general Mumford-Shah formulation [18], for active contour segmentation. For two-phase segmentation, the minimization in [7] is defined as: where ϕ is the level set function satisfying: C is the contour, H is the Heaviside function: [36]. ν and μ are the weighting parameters. Z O Hð0Þdx is the area term and Z O rjHð0Þjdx is the length term.
In order to segment images into more regions, Vese and Chan [43] improved PC model into multiphase model. For segmenting brain MR images into four classes: WM, GM, CSF and the background, the improved PC model can be written as: Where M i (ϕ 1 , ϕ 2 ) are functions of ϕ, which are designed such that The improved PC model assumes that image intensities are statistically homogeneous in each disjoint region, which makes it sensitive to intensity inhomogeneity. To deal with the effect of intensity inhomogeneity, the LGD method used the Gaussian distribution to describe local region information and the energy function is defined as follows: Where Θ is the parameter of Gaussian distribution, ω(x−y) is a non-negative weighting function such that ω(x−y) = 0 for |x−y| > r and  [4]. The LGD method can reduce the effect of Gaussian noise by using Gaussian distribution; however, when the images have strong noise and severe intensity inhomogeneity, the method is hard to obtain accurate results. In order to reduce the effect of intensity inhomogeneity, our previous work [38] introduced the bias field into LGD: WhereĨ ¼ logðIÞ;B ¼ logðBÞ. B is the bias field, which satisfies: log(I) = log((J + n)ÁB) = log(J + n) + log(B). J is true signal, n is noise. The proposed method can estimate a bias field when segmenting images, however, as analyzed above, the method still sensitive to noise without any spatial information.

Anisotropic spatial information.
Baudes et al. [44] have proved that a non-local neighbor patch contains more information than that of signal pixel in image. Follows this idea, many proposed methods [45][46][47] used the non-local neighbor patch information to reduce the effect of noise. In non-local neighbor patch information based methods, the distance between patches is defined as: Where P x and P y are the neighbor patches centered at x and y, respectively. |P(x)| is the total number of pixels in the neighbor patch. From Eq (14), we can find that each pixel in the patch has same weight, which makes the patch information be isotropic. Fig 1(A) shows a part of simulated brain MR image. The points, marked with red plus (set as A), green plus (set as B) and blue plus (set as C), belong to GM, WM and GM, respectively. Fig 1B-1D shows the intensities information of the square neighbor patches centered at the corresponding pixels with size 7 × 7. By using Eq (14), the distance between P A and P B is 97.73, and the distance between P A and P C is 182.84, means that that the point B is more similar to A than that of C. However, the point B belongs to WM, which makes that the neighbor patch based method hard to find accurate results. In order to deal with this problem, we proposed an anisotropic neighbor patch inner-relationship: where I 0 (x) is the mean intensities of P x without x. β is a nonnegative constant, which is dependent on the standard deviation of the image noise σ, which can be estimated by using the method in [48]. From Eq (15), it can be found that the pixels in the neighbor patch with similar intensities to the center pixel will have higher weights, which make the inner-relationship based neighbor patch be anisotropic and contains more details.
Due to the effect of the noise, the intensities of some pixels are much larger or less than all other pixels in the neighbor patch and we refer to these pixels as sole points. For a we regard x as a sole point. T is non-negative constant (the default value is 0.75 in this paper). The distance, calculated by using dðP x ; P y Þ ¼ X jPðxÞj i¼1 ðR x;i ðIðP x;i Þ À IðP y;i ÞÞÞ 2 , between P A and P B is 4.73, and between P A and P C is 4.68, which means that the point C is more similar to A than that of B. From the definition in Eq (15), it can be found that the neighbor patch is anisotropic, which can contain more detail information.

2.2.4
Level set-type treatment of the HMRF model. As analyzed in our previous work [38], the energy function (Eq 13) is non-convex, which makes the method easily trap into a local minimum. In order to obtain global mini-mum, we improved the energy function by using the Split Bregman method [49], which is a technique for solving a variety of L1-regularized optimization problems, and is particularly effective for problems involving total-variation regularization. In this section, we use two segmentation variables u 1 2 [0,1] and u 2 2 [0,1] to represent the membership functions of four regions: M 1 (u 1 ,u 2 ) = u 1 u 2 , M 2 (u 1 ,u 2 ) = u 1 (1−u 2 ), M 3 (u 1 ,u 2 ) = (1−u 1 )u 2 and M 4 (u 1 ,u 2 ) = (1−u 1 )(1−u 2 ). The HMRF based energy function can be written as: When the observed data follows the Gaussian distribution, the Gaussian function can fit the distribution accurately. However, in brain MR images, the intensities are not always following the Gaussian distribution, which makes Gaussian distribution based methods hard to find satisfied results. By way of contrast, the Student's t-distribution [50] has one more parameter, named as degree of freedom υ. When υ is set as 1, the Student's t-distribution reduces to be the Cauchy distribution. The Student's t-distribution becomes closer to the Gaussian distribution as υ increases. Hence, Student's t-distribution can model the observed data more powerfully and flexibly than Gaussian distribution. Then, p(I(x)|H(x) = k) can be defined by using multivariate Student's t-distribution: where μ, S and υ are mean vector, covariance matrix and degree of freedom, respectively. In order to improve the robustness, we introduce the anisotropic spatial information into the energy function: where P 0 x and R 0 x are vectors generated from the neighbor patch P x and the inner-relationship R x , respectively. b is the bias field.
In order to obtain a smooth bias field, Li et al. [31][32][33] model the bias field to be a linear combination of L smooth basis functions s 1 ,s 1 Á Á Ás L : Where q l 2 R, l = 1,Á Á Á,L, are the combination coefficients. Orthogonal polynomials are usually used as the basis functions, which satisfy: Where δ i,j = 1 for i = j and δ i,j = 0 for i 6 ¼ j. Following the idea shown in [31][32][33], we use the Legendre orthogonal polynomials as the basis functions. In 2D case, the size of the parameter is L = (m + 1)(m + 2)/2, where m is the degree of the Legendre polynomials. In 3D case, the size of the parameter is L = (m + 1)(m + 2)(m + 3)/6. The choice of m depends on prior knowledge of the coil and the expected type. To simplify, the bias field can be written as: Where Remark 1: Although the local patch information has been used to improve segmentation methods to reduce the effect of the noise; however, most of the methods use isotropic neighbor patch information, which makes them easily lose details [51]. Our method can preserve more detail information by using anisotropic neighbor patch information. Furthermore, we use HMRF to model the spatial correlation between neighboring pixels/voxels and further improve the robustness of our method.
Remark 2: Compared to the method in [13], we use a multivariate Student's t-distribution to fit the intensity distribution of each region in the image, which can improve the ability to identify the class for each pixel. In our method, we use the local neighbor patch information to reduce the effect of noise, which also makes our method can only use one integration, instead two integrations in the LBF [4] and LGD [19,20], to construct the data term. Our method is more efficient than LBF and LGD.
Remark 3: In LBF and LGD, the smoothness of the bias field is ensured by using a Gaussian convolution; however, the parameter of the Gaussian convolution needs to be changed when the methods segmenting different images. In our method, we use orthogonal polynomials as the basis functions, which makes our method does not need any Gaussian convolution and can estimate the bias field, even when the intensity inhomogeneity is severe.
Remark 4: Although Student's t-distribution has been widely used in segmentation methods [48] and obtained more accurate results than Gaussian distribution based methods. However, the Student's t-distribution is sensitive to noise. Furthermore, the Student's t-distribution based methods are sensitive to initialization. In our method, we use the anisotropic patch information to improve the robustness of the Student's t-distribution on noise. Furthermore, in our method, the energy function can be rewritten as convex and calculated by using Split Bregman Method, which makes our method can obtain accurate results with randomly initialization.
Remark 5: In our method, the bias field is modeled by using Legendre orthogonal polynomials, which can be found in [31][32][33]. The methods in [31][32][33] are based on clustering theories and have not considering spatial information, which makes the segmentation results be inaccurate and reduce the accuracy of the estimated bias field. Our method uses the anisotropic information and Student's t-distribution to model local information to improve the accuracy of the segmentation results and makes the estimated bias field more accurate.

Split Bregman method for minimization of energy.
The split Bregman technique is used to minimize the energy function in a more efficient way and obtain the global minima. The proposed model thus can improve the robustness and efficiency, while inheriting the desirable ability to estimate the bias field when segmenting images.
When given Θ and b, we minimize E(u 1 ,u 2 ,Θ,b) with respect to u 1 using the gradient descent method by solving the gradient flow equation as: Where In the same manner, minimizing the energy functional E with respect to u 2 , we derive the gradient descent flow: Without loss of generality, we take ν = 1 and the stationary solution of (22) and (23) coincides with the stationary solution of [38]: The simplified flow represents the gradient descent for minimizing the energy: Where It has been proved [52] that when u 2 is fixed, ifû 1 is any minimizer of E 1 , for 8T 2 ð0; 1Þ we have that the characteristic function where C is the boundary of the set O C , is a global minimization of E 1 . It can also be proved that for fixed u 1 , the characteristic function 1 O CðT Þ ¼ fyjû 2 ðyÞ > T g is a global minimization of E 2 .
Following the idea of [52], we introduce a new vectorial function di = ru i and rewrite Eq (26) as: For simplicity, we only consider the minimization of the energyẼ 1 , while the minimization of the energyẼ 2 can be solved in the same manner. Adding a new vector p 1 ru 1 into a quadratic penalty function, we obtain the following optimization function:

> < > :
For the fixed d 1 , we can derive the Euler-Lagrange equation of the optimization problem Eq (28) with respect to u 1 : Where Δ is the Laplacian operator. In 3D case, by using central discretization for the Laplacian operator and backward difference for the divergence operator, a fast approximated solution for Eq (28) is: where (i, j, k) is the position in image coordinate and a i;j;k ¼ For the fixed u 1 , the minimization solution d t+1 is performed by using the following formula: Before update u 1 and u 2 , we first need calculate other parameters of the energy function. For fixed u 1 , u 2 and b, the parameter μ, S and υ can be calculated by taking the derivative of E with respect to μ, S and υ and setting the results to zero, respectively. For updating μ k , we have: It is hard to calculate μ k directly from Eq (32). Fortunately, it has been proved that if I follows the Student's t-distribution p(μ,S,υ), it can be considered following a Gaussian distribution N(μ,S/t), where t is a scaling factor following a Gamma-distribution [50,53] and can be calculated by: So p(I(P x )|H(x) = k) can be written as: Substituting Eq (34) into Eq (32), we can obtain: Then, we can obtain: Where ./ is point division. Similarly, we calculate S k as: Setting the partial derivative of E with respect to υ and setting the results to zero, we have where cðxÞ ¼ d dx lnGðxÞ. For fixed M, μ, S and υ, taking the derivative of E with respect to Q and setting the result to zero, we have: Then we can obtain: is inverse-able and the similar proof of the stability can be seen in [31]. W ¼ For a deep understanding of our method, the computation process of our algorithm is summarized as follows: Step.1 Initialize u 1 , u 2 , Θ and b. In our method, u 1 and u 2 can be initialized randomly and b is a matrix of ones.
Step.5 If the distance between the newly obtained level sets and old ones is less than a userspecified small threshold (in this paper, we set ε = 0.001), stop the iteration; otherwise, go to Step 2.

Results
In this section, we apply the proposed method to segment the brain MR images into GM, WM, CSF and background. Unless otherwise specified, the parameters used in our experiments are set as follows: The u 1 and u 2 are initialized randomly. The size of neighbor patch is set as 3 × 3. The nonnegative constant β is set as 0.02. The degree of basis functions is set as m = 4 and hence the number of the basis functions L is 15. ν is set as 100 and γ is set as 1. In this paper, we conduct the image segmentation task by imposing HMRF on the image pixel labels. In our experiments, pðHðxÞ ¼ kjHðN ðxÞÞÞ is: where δ(Á) stands for the Kronecker's delta function and is given as: ( In this section, we compared our method with other methods on synthetic and clinical brain MR images.

Evaluation with 3T Brain MR images
We first test our method on three 3-Tesla brain MR images (show n in the 1st column of Fig  2), which is corrupted with severe intensity inhomogeneity. The segmentation results, bias corrected images and estimated bias fields are shown in Fig 2. It can be found that the intensities in each brain tissue of the bias corrected images become quite homogeneous and our method can obtain satisfied results even on weak edges. It demonstrates that the results of our method are consistent with the expected tissue regions.
Non-brain tissues usually affect the accuracy of the segmentation methods [36], in order to demonstrate the ability of our method, we test our method on three 3-Tesla brain MR images with skulls. The initial images, segmentation results, bias corrected images and the estimated bias fields are shown form left to right in Fig 3. It is clear that our method can still obtain satisfactory results without being influenced by non-brain tissues.

Quantitative comparison
In this section, we quantitatively compared the proposed method to three existing segmentation approaches, including the improved LBF method [37], the improved LGD method [17] and Zhang's method [27]. Generally, the parameters for each method are set with the default values specified in the papers. Please refer to the corresponding references for more details. To make a fair comparison, the curves are initialized by using k-means method.
All these methods can reduce the impact of noise and intensity inhomogeneity. Therefore, we first apply all the methods on the synthetic brain MR images selected from Brain Web

Fig 2. Illustration of (1st column) three 3-Tesla brain MR images, (2nd column) segmentation results of the proposed method, (3rd column) bias corrected images, and (4th column) their estimated bias fields.
https://doi.org/10.1371/journal.pone.0183943.g002 Brain MR segmentation based on IACM containing different levels of noise and the same intensity inhomogeneity. In our experiments, we use the T1-weighted 1mm brain MR images. Fig 4 shows the segmentation results on the 92th transaxial image of a synthetic image data set from Brain Web. The first column shows the initial images parameters: noise level 1%, 3%, 5% and 7% (from top to the bottom), respectively. The images have the same intensity inhomogeneity level: 30%. The second column to the right column show the segmentation results of the improved LBF method, improved LGD method, MICO (Multiplicative intrinsic component optimization method), the Zhang's method and our method, respectively.
The improved LBF method uses the local intensity mean to fit the intensity of local intensity distribution and thereby achieves much better segmentation results than those of traditional active contour methods, i.e. PS method. However, this method has not considered any spatial information, which makes the method still sensitive to noise when the noise level is high. From the results shown in the second column of Fig 4, we can find that the accuracy of the improved LBF method decreases when the noise level is increasing. In order to reduce the effect of noise, the method can change the control parameter of the length term; however, when the control parameter increases, the length term will makes the method lose detail information. The improved LGD method uses a Gaussian distribution to fit the intensities in each local region and can achieves much better segmentation results than the improved LBF method. Similar drawbacks of the improved LBF method still exist for the improved LGD method, which has not any spatial information been considered. It can be seen from the results shown in the third column of Fig 4 that the improved LGD method is less sensitive to the noise than that of the improved LBF method, but still sensitive to noise. The Zhang's method sets the variances of Gaussian distribution to be a piecewise constant in each region to improve the accuracy of the improved LGD method. Furthermore, the Zhang's method uses a constant kernel to indicate the local region, which is based on a solid theoretical foundation and can reduce the effect of noise. However, the constant kernel easily makes the method lose details. From the results of Zhang's method shown in the fourth column of Fig 4, we can find that the method can reduce the effect of noise but at the same time it loses some details. The MICO method [33] uses the Legendre orthogonal polynomials to model the bias field, however, the method has not considered any spatial information, which makes the method sensitive to noise and the results can be found in the fifth column of Fig 4. From the results, we can found that the accuracy is decreasing when the noise level is increasing.
In order to illustrate the problem more clearly, we magnified part of the segmentation results of all the four methods on the image with noise level 5% and intensity inhomogeneity level 30%. The details are shown in Fig 5. From left to right show the details of the ground truth, the result of the improved LBF method, the improved LGD method, the Zhang's method, MICO and our method, respectively. It can be found that the improved LBF method, the improved LGD method and MICO are sensitive to noise, Zhang's method has mis-segmented some pixels belong to CSF into GM. Comparing with ground truth and segmentations obtained with other algorithms, the proposed method can visually obtain the best results.
In order to show the robustness on the images with the intensity inhomogeneity, we compared our method with the above three methods on the 92th transaxial image of a synthetic image data set from BrainWeb. The initial images with parameters: intensity inhomogeneity level 40%, 60%, 80% and 100% (from top to the bottom in Fig 6), respectively. The images have the same noise level: 3%. The improved LBF method has considered the bias field when utilizing the local intensity means to fit the intensity of local intensity distribution and can estimate the bias field when segmenting images. The improved LBF method uses a Gaussian kernel as the local spatially weighted function to control the radius of local region, which also preserves the smoothness of the bias field. In order to obtain more accurate result, the radius of the Gaussian kernel cannot be set too large, which makes the estimated bias field inaccurate when the intensity inhomogeneity in the image is severe. Form the results of the improved LBF shown in the second column of Fig 6, we can find that the accuracy decrease s when the intensity inhomogeneity level increases. Similar drawbacks exist for the improved LGD method (see the third column of Fig 6). Compared with the Gaussian kernel used in the improved LBF method and improved LGD method, the Zhang's method uses a constant kernel to indicate the local region. The constant kernel can reduce the effect of noise; however, as analyzed above, the constant kernel also may lose details. The estimation of the bias field in all these three methods are based on local region information, which makes these methods easily be affected by the intensities of the pixels in each local region and cannot be smoothed enough. MICO uses the Legendre orthogonal polynomials to model the bias field, which makes it possible for the method to obtain more smoothly bias fields. From the result, we can still find that the method is sensitive to noise.
From the results shown in Fig 6, it can be found that the accuracies of all these three are affected when the intensity inhomogeneity level increases. In order to illustrate the problem more clearly, we magnified part of the segmentation results o f all the four methods on the image with noise level 3% and intensity inhomogeneity level 80% and the details are shown in Fig 7. From left to right show the details of the ground truth, the result of improved LBF method, improved LGD method, Zhang's method, MICO and our method, respectively. It can be found that comparing with the ground truth and segmentations obtained with other algorithms, the proposed method can visually obtain the best results. For quantitative comparison, we use Js values as a metric to evaluate the performance of these methods. The Js is defined as the ratio between intersection and union of two sets S_1 and S_2 where S_1 is the segmentation result and S_2 is the ground truth. A more accurate result should have a higher Js value. In order to show the robustness and accuracy of our proposed method, we applied the above three methods and our method on whole simulated MRI data sets from BrainWeb with different noise levels and intensity inhomogeneity levels. The accuracy is measured by using the means and standard deviations of Js values on WM, GM and CSF. The average results are listed in Table 1. In this table, we set N x F y as image data with noise level x% and intensity inhomogeneity level y%. From the values, it can be found that our proposed method has the best means of Js values, which indicates that our method is more accurate than other methods. It also can be found that the standard deviations of our method are much lower than the others, which proved that our method has the best robustness. We also find that our method has the highest mean Js values of CSF, which illustrates that our method can preserve more details.
In the next experiment, we compare the proposed method to above three segmentation methods on clinical T1-weighted brain MR images selected from IBSR. Fig 8(A) shows the 18th image from IBSR (1_24]) and the corresponding ground truth is shown in Fig 8(B). It can be found that Fig 8(A) has low contrast and severe intensity inhomogeneity. Due to the effect of the severe intensity inhomogeneity, the improved LBF method (ILBF) trapped into local optima and mis-segmented some pixels belong to GM into WM, which can be found in Fig 8(C). The improved LGD method can also obtain more accurate result by using local Gaussian distribution to fit the local intensity distribution. However, the improved LGD method used local variance information, which is unstable [27] and makes the method mis- segmented some pixels belong to GM into WM. The Zhang's method used the piecewise constant variances of the Gaussian distributions in each region and can preserve some details. However, the method uses constant kernel to convolution, which makes the method hard to find accurate results when segmenting images with low contrast. Form the result shown in Fig  8(E), we can find that Zhang's method has lost some details. Fig 8(F) shows the segmentation result of MICO. The MICO uses global means information of the intensities, which makes the method sensitive to low contrast and mis-segmented some pixels belong to GM into WM. Fig 8(H) shows the 30th image from IBSR (2_4]), which has intensity inhomogeneity and severe low contrast. The ground truth is shown in Fig 8(I). The improved LBF method only uses the local mean information, which makes the method sensitive to low contrast. Similar drawbacks can be found in the improved LGD, zhang's method and MICO. In our method, we use anisotropic patch information and multivariate Student's t-distribution to fit the intensity distribution to improve the accuracy. Furthermore, the bias field is estimated by using basis functions, which makes our method more robust to severe intensity contrast. The average mean of JS values of the eight methods on the whole data sets (1_24] and 2_4]) form IBSR are listed in Table 2. Because there are only small pixels be-long to CSF are contained in IBSR data sets, we only list the Js values for WM and GM. From the average Js values shown in Table 2, we can find that our method obtains the most accurate results.
In the next experiment, we use coefficient of variance (CV) as a measure to evaluate the performance of the algorithms for intensity inhomogeneity correction [35]. The CV is defined as a percentage and calculated from the average and standard deviation of selected tissue. A good algorithm can obtain low CV values for the bias corrected intensities within each segmented region. We compared our method with improved LBF method, improved LGD method and Zhang's method on two 3-T brain MR images with intensity inhomogeneity (first one is from IBSR 2_4] and another one is from 15_3]) and a 7-T brain MR image, which has severe intensity inhomogeneity. Fig 9 shows the bias field corrected images and the corresponding bias field. The second column to right column shows the results of improved LBF method, improved LGD method, Zhang's method, MICO and our method, respectively. Form the results, it can be found that the bias fields estimated by using the improved LBF method, improved LGD method and Zhang's method are not smooth enough. That is because all three method use convolution to preserve the smoothness of the methods. MICO and our method can obtain more smooth and accurate bias filed by using the basis functions to model t he bias field. To make a fair comparison, the segmentation results are obtained by using Fuzzy Cmeans Clustering method when calculating the CV values. The values of CV are listed in Table 3. It can be seen that our method can obtain the smallest CV values, which indicates that the bias corrected images obtained by using our method are more homogeneous than others.
Our model is also superior in terms of computational efficiency. The mean CPU times for 20 simulated MR image data sets and 20 standard sets of real brain MR data are listed in Table 4, which were recorded from our experiments with Matlab code run on a Lenovo computer, with i7 processor, 2.40 GHz, 7.88G RAM, with Matlab 7.1 on Windows10. The sizes of these images are also shown in this table. The computation time of our method was less than 1 sec on 2D images and less than 4 minutes on 3D images, which is much faster than the other methods. This demonstrates the significant advantage of our model in terms of computational efficiency.

Segmentation on 3D image data
In this section, we test our method on 3D image data. Fig 10 shows the segmentation results of our method for the BrainWeb Data with noise level 3%, intensity inhomogeneity level 80%. In this experiment, the initial surfaces are initialized randomly, which can be seen in Fig 10(A).  In order to illustrate the evolutions of the surfaces, we presented the corresponding contour evolution of three slices in different axis for the 3rd, 5th and 7th iteration results. From the results, we can find that our method can obtain the satisfactory result in 7 iterators even with randomly initialization. Fig 11 shows the segmentation of our method on a 3D clinical brain MR image from ISBR 2.0(7#), which has intensity inhomogeneity and low contrast. The left column shows the ground truth of GM and WM. The right column shows the segmentation results of our method. From the results, we can find that our method can obtain the satisfactory result.

Discussion and conclusion
In this paper, we proposed a novel automatic variational level set based segmentation and bias field estimation method for human brain MR images. This method successfully overcomes the drawbacks of existing active contour methods, including limited robustness to weak edges, noise, intensity inhomogeneity and limited accuracy to details, by proposing improved anisotropic spatially information to reduce the effect of noise and preserve more details; utilizing multivariate Student's t-distribution to fit the intensity distribution of the image to improve  the robustness; using the HMRF to construct data term to incorporate more spatial information. In order to obtain more accurate and smoothed bias field, we use a linear combination of orthogonal basis functions to model the bias field. To find the global optimal and make the results independent of the initialization of the algorithm, we reconstructed the energy function to be convex and calculated it by using the Split Bregman algorithm. Our statistical results on both synthetic and clinical images show that the proposed method can overcome the difficulties caused by noise, weak edges and intensity inhomogeneity, and outperforms other several state-of-the-art methods. The degree of basis function determines the accuracy and stability of the bias field correction. A lower degree will make the estimated bias field inaccurate when the intensity inhomogeneity is severe. A large degree will make our method inefficient, unstable and easily trap into local optima. Our experiments showed that the degree of basis functions up to 4 sufficiently models the bias field.
One challenge of the method presented in this paper is how to set the parameter β in Eq (15). The β depends on the size of neighbor patch. A much bigger β will make the weights of the other pixels in neighbor patch be smaller than that of the center pixel, which will decrease the robustness to noise. On the other side, a much smaller β will make the weights of all pixels in the neighbor patch be similar, which makes the neighbor information isotropic and lose details. In most regions, the intensity distances between the neighbor pixels to the center are always less than a certain threshold. After experimented on more than 100 data sets, we found that the results are accurate enough for the β = 0.02 when the patch size is 3×3. One possible extension of this work is to optimize β throughout the image automatically. Moreover, the selection of the patch size should be set based on the amount of noise and the details in the image. Improving the method with adaptive patch size selection will be another direction in the future of work.
Since the magnetic resonance imaging technology has been proposed, it has been widely used to analyze the pathologies. For brain pathologies, tumors, Alzheimer disease, Parkinson, etc., can be treated, the important open questions for brain MR image segmentation methods are how to find the boundaries of the special tissues, such as the hippocampal, amygdala and basal ganglia, which have similar intensities with their neighbor tissues. In order to deal with this problem, many approaches have been suggested to utilize multi-modality information [54], registration methods [55], etc. However, the use of multi-modality or atlas information makes the methods more complex. Thus, addressing such questions is out of the scope of this paper and subjects of future research.