Color image segmentation using adaptive hierarchical-histogram thresholding

Histogram-based thresholding is one of the most widely applied techniques for color image segmentation. The key to such techniques is the selection of a set of thresholds that can discriminate objects and background pixels. Many thresholding techniques have been proposed that use the shape information of histograms and identify the optimum thresholds at valleys. In this work, we introduce the novel concept of a hierarchical-histogram, which corresponds to a multigranularity abstraction of the color image. Based on this, we present a new histogram thresholding algorithm, Adaptive Hierarchical-Histogram Thresholding (AHHT), which can adaptively identify the thresholds from valleys. The experimental results demonstrate that the AHHT algorithm obtains better segmentation results than the histon-based and the roughness-index-based techniques, with drastically reduced time complexity.


Introduction
Image segmentation plays a crucial role in image analysis, pattern recognition and computer vision-related applications. In segmentation, an image is partitioned into different nonoverlapping regions that are homogeneous with respect to certain properties, such as color information, edges, and texture [1,2]. Although many techniques for image segmentation have been proposed, it is still a very challenging research topic due to the variety and complexity of images. Moreover, color images can provide richer information than grayscale images, and natural color image segmentation is receiving increasing attention from researchers.
One of the most widely applied techniques for image segmentation is histogram-based thresholding, which assumes that homogeneous objects in the image manifest themselves as clusters. The key to the histogram-based technique is the selection of a set of thresholds that can discriminate objects and background pixels. Numerous histogram-based thresholding methods have been proposed over the years. These methods can be broadly classified into two categories. The first category contains thresholding techniques that determine the optimal thresholds by optimizing a certain objective function [19-25]. Among these thresholding techniques, entropy-based approaches are the most popular, and many algorithms have been proposed in this direction. Examples of these include Shannon entropy, Renyi's entropy [24,26], entropic correlation [5], and cross entropy [20]. However, the main problem associated with these algorithms is their large time complexity. For the multilevel thresholding problem in Minimum Cross Entropy Thresholding [25,27], the time complexity is O(mL^(m+1)), where m represents the number of threshold values and L indicates the number of gray levels. The second category contains approaches that determine the optimal thresholds by utilizing shape information of the histogram of a given image. The rationale for threshold determination implicitly relies on the assumption that the intensities of pixels, or data in a more general setting, should be similar within the same objects and different between different objects [16]. In this manner, the intensity-level histogram values of each object could appear as a bell-shaped mode [19]. The peak of the bell-shaped region and its adjacent intensities correspond to the main-body pixels of the object, while the boundary of the bell-shaped region corresponds to the edge pixels of the object.
Therefore, the peaks and valleys in the histogram are used to locate the clusters in the image, and the optimum thresholds must be located in the valley regions. For example, Rosenfeld et al. investigated histogram concavity analysis as an approach for threshold selection [28]. Lim and Lee presented a valley-seeking approach that smoothes the histogram and detects the valleys as thresholds by calculating the derivatives of the smoothed histogram [29]. Because the histogram only includes the information of intensity levels, these methods do not consider the spatial correlation of the same or similarly valued elements. To overcome this drawback, some variations of the histogram are presented. Mohabey and Ray [30,31] utilized rough set theory [32] to construct the concept of a histon. Different from a histogram, each bin of a histon is the pixel scale belonging to the corresponding intensity with uncertainty [2]. With the aid of rough set theory, the histogram and histon can be respectively considered as the lower and upper approximations. Mushrif and Ray then proposed using the roughness measure at every intensity level to extract the homogeneous regions of a color image [33]. For some images, however, it is difficult to obtain the significant peaks and valleys of the roughness measure; Xie et al. used local polynomial regression to smooth the histogram and histon and then calculated the roughness measure, which enabled their approach to find the real peaks and valleys more easily [34].
Similar to the histogram, both the histon and roughness indexes provide the global information of homogeneous regions in the image, and every peak and its adjacent position represent a homogeneous region. Theoretical analysis shows that the histon pays little attention to small homogeneous regions, and the roughness index can effectively indicate the degree of region homogeneity and avoid the disturbance of imbalanced color distribution. As two variants of the histogram, the histon and roughness index were demonstrated to achieve better segmentation results. Both histon-based and roughness-index-based algorithms, however, need to calculate the color difference between every pixel and its neighborhood, which means both require significant time. The steps of the above techniques involve some smoothing of the histogram (histon or roughness-index) data, searching for significant modes, and placing thresholds at the minima between them.
In this paper, we propose an original segmentation scheme named AHHT (Adaptive Hierarchical-Histogram Thresholding), which uses a structure called the hierarchical-histogram to adaptively identify the thresholds at valleys for thresholding. A hierarchical-histogram includes a group of histograms that corresponds to a multigranularity abstraction of the image. The lower a histogram is in the hierarchical-histogram, the finer the image details it captures. Each histogram in the hierarchical-histogram serves to generate the next-level histogram, and the top-level histogram is applied to segment the image. To verify the effectiveness of AHHT, experiments are performed on the Berkeley Segmentation Data Set and Benchmark, and a comparison with the histon-based technique and the roughness-index-based technique is made in terms of both visual and quantitative evaluations. This paper is organized as follows. Section 2 reviews the related work. Section 3.1 describes the main idea of the proposed AHHT algorithm. Section 3.2 presents the AHHT algorithm in detail. Section 3.3 analyzes the complexity of the AHHT algorithm. Section 4 analyzes the experimental results. Section 5 concludes the paper.

Related work
RGB is the most commonly used model for television systems and pictures acquired by digital cameras. As in other related works, this paper focuses on color image segmentation in the RGB color space. Consider I to be an RGB image of size M × N, consisting of three primary color components: red R, green G, and blue B. The classic histogram of the image for each color component is defined as
$$h_i(l) = \sum_{m=1}^{M} \sum_{n=1}^{N} \delta(I(m,n,i) - l), \quad \text{for } 0 \le l \le L-1 \text{ and } i \in \{R,G,B\}, \tag{1}$$
where δ(·) is the indicator function and L is the intensity scale in each of the color components. The value h_i(l) is the number of pixels having intensity l in color component i. Let c_1 and c_2 be color vectors in the RGB color space. The Euclidean distance between the two vectors is given by
$$d(c_1, c_2) = \sqrt{\sum_{i \in \{R,G,B\}} (c_1(i) - c_2(i))^2}. \tag{2}$$
For a P × Q neighborhood around a pixel I(m, n), the color difference between I(m, n) and its surrounding pixels in the neighborhood is defined as [33]:
$$d_T(m,n) = \sum_{p \in P} \sum_{q \in Q} d(I(m,n), I(p,q)). \tag{3}$$
If the color difference d_T(m, n) is less than a threshold T_0, the surrounding pixels in the neighborhood fall in the sphere of a similar color. For an RGB image I of size M × N, a matrix I' of size M × N is defined such that an element I'(m, n) is given by
$$I'(m,n) = \begin{cases} 1, & d_T(m,n) < T_0 \\ 0, & \text{otherwise.} \end{cases} \tag{4}$$
Then, the histon is defined as follows [31]:
$$h'_i(l) = \sum_{m=1}^{M} \sum_{n=1}^{N} (1 + I'(m,n))\, \delta(I(m,n,i) - l), \quad \text{for } 0 \le l \le L-1 \text{ and } i \in \{R,G,B\}. \tag{5}$$
The histogram and the histon can be associated with the concept of approximation space in rough set theory [32,35]. For intensity class l, the value of h_i(l) is the number of pixels that have intensity value l and therefore can be viewed as the lower approximation, and the value of h'_i(l) can be considered as the upper approximation. Mushrif and Ray then proposed the roughness measure as follows [33]:
$$\rho_i(l) = 1 - \frac{|h_i(l)|}{|h'_i(l)|}, \quad \text{for } 0 \le l \le L-1 \text{ and } i \in \{R,G,B\}. \tag{6}$$
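The definitions above can be illustrated with a minimal sketch in plain Python. It assumes a 3 × 3 neighborhood clipped at the image border, an image stored as a nested list of (R, G, B) tuples, and the usual rough-set form of the roughness measure (one minus the ratio of the lower to the upper approximation). The function names and the toy image are hypothetical and are not taken from the cited works.

```python
from math import sqrt


def histogram(image, channel, levels=256):
    """Classic histogram: pixel count per intensity of one color component."""
    h = [0] * levels
    for row in image:
        for pixel in row:
            h[pixel[channel]] += 1
    return h


def histon(image, channel, t0, levels=256):
    """Histon: like the histogram, but a pixel counts twice when the summed
    color distance to its neighborhood is below t0.

    Assumption: a 3 x 3 neighborhood, clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    out = [0] * levels
    for m in range(rows):
        for n in range(cols):
            d_total = 0.0
            for p in range(max(0, m - 1), min(rows, m + 2)):
                for q in range(max(0, n - 1), min(cols, n + 2)):
                    d_total += sqrt(sum((a - b) ** 2
                                        for a, b in zip(image[m][n], image[p][q])))
            similar = 1 if d_total < t0 else 0
            out[image[m][n][channel]] += 1 + similar
    return out


def roughness(h, hn):
    """Assumed rough-set form: 1 - lower/upper approximation per intensity."""
    return [1 - a / b if b else 0.0 for a, b in zip(h, hn)]


# Toy 2 x 2 image: three identical pixels and one outlier in the red plane.
image = [[(10, 20, 30), (10, 20, 30)],
         [(10, 20, 30), (200, 20, 30)]]
h = histogram(image, 0)
hn = histon(image, 0, t0=200)
rho = roughness(h, hn)
```

The nested neighborhood loop is what makes the histon and the roughness index expensive: every pixel requires several Euclidean distance evaluations before a single bin can be updated.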
Like the histogram, the histon and the roughness index for all intensity values also give the global information of homogeneous regions in the image, and every peak and its adjacent position represent a homogeneous region. Therefore, the histon and the roughness index are two variations of the histogram. The histogram, the histon and the roughness index are collectively called histogram-based techniques in this paper. The segmentation process of such histogram-based techniques is divided into three stages [33], as shown in Fig 1. We take the roughness-index-based technique [33] as an example to illustrate the flowchart.
PLOS ONE | https://doi.org/10.1371/journal.pone.0226345 | January 10, 2020
First, two criteria are used to obtain the significant peaks of the roughness indexes: (1) the height of the peak is greater than 20% of the average value of all peaks; and (2) the distance between two peaks is greater than 10. After the significant peaks are selected, the thresholds are identified at the minima between every two adjacent significant peaks. Second, all selected thresholds are applied to split the image into multiple clusters. The color representing each cluster is obtained by averaging all the pixels within the cluster. At this point, the initial segmentation is completed. This process usually results in over-segmentation. Lastly, the Region-Merging process uses the algorithm proposed by Cheng et al. [36] to deal with small regions and similar regions. Concretely, the following two steps are carried out. (1) The clusters with fewer pixels than a predefined threshold T_n are merged with the nearest clusters. (2) The two closest clusters are combined to form a single cluster if the distance between the two clusters is less than a predefined threshold T_d.
The basic thresholding procedure consists of analysis of an image histogram and subsequent threshold selection from the values located in the valleys between peaks. However, the determination of peaks and valleys in a multimodal histogram is a nontrivial problem. In general, there are many local peaks and local valleys in the histogram of each color plane of an RGB color image. The steps of the above techniques involve some smoothing of the histogram data, searching for significant peaks, and then identification of thresholds at the minima between two adjacent significant peaks. This means that the selection of significant peaks determines the thresholds, which consequently determine the final segmentation result of the image. As such, the above histogram-based thresholding techniques mainly focus on how to identify the significant peaks in the histogram, and then identify the valleys for thresholding. As two variants of the classic histogram, the histon and roughness index were demonstrated to achieve better segmentation results. Both histon-based and roughness-index-based algorithms, however, need to calculate the distance between every pixel and its neighborhood, which means that significant time is required to calculate the histograms. In addition, as mentioned above, both algorithms also need to determine the significant peaks to identify the thresholds. Moreover, it is difficult for these techniques to find the exact threshold point if the valley is flat.
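The peak-selection stage described above can be sketched as follows. This is only an illustration under stated assumptions: local peaks are taken as strict rises followed by a non-rise, and when two candidate peaks lie within the minimum distance the taller one is kept (the source does not state how such ties are resolved). The function names and the toy data are hypothetical.

```python
def significant_peaks(values, height_ratio=0.2, min_distance=10):
    """Select peaks taller than height_ratio times the mean peak height
    and more than min_distance apart (the taller peak wins a conflict)."""
    peaks = [i for i in range(1, len(values) - 1)
             if values[i - 1] < values[i] >= values[i + 1]]
    if not peaks:
        return []
    mean_height = sum(values[i] for i in peaks) / len(peaks)
    tall = [i for i in peaks if values[i] > height_ratio * mean_height]
    kept = []
    for i in tall:
        if kept and i - kept[-1] <= min_distance:
            if values[i] > values[kept[-1]]:
                kept[-1] = i  # replace the shorter neighbor
        else:
            kept.append(i)
    return kept


def thresholds_between(values, peaks):
    """Place one threshold at the minimum between each pair of adjacent peaks."""
    return [min(range(a, b + 1), key=lambda i: values[i])
            for a, b in zip(peaks, peaks[1:])]


# Toy roughness curve with two dominant peaks and a valley at index 15.
values = [1] * 40
values[5], values[8], values[15], values[25] = 10, 6, 0, 8
peaks = significant_peaks(values)
cuts = thresholds_between(values, peaks)
```

In this toy curve the peak at index 8 is discarded because it lies within 10 positions of the taller peak at index 5, which shows how strongly the two fixed criteria shape the resulting thresholds.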

Color image segmentation based on adaptive hierarchical-histogram thresholding
In this section, we propose a segmentation technique that uses a hierarchy structure of histograms to adaptively obtain the thresholds for color image segmentation. Our method does not need to find the significant peaks; it adaptively identifies the thresholds from valleys, and it is highly efficient.

Main idea of adaptive hierarchical-histogram thresholding
Based on experiments performed on hundreds of RGB color images, we found that each image yields dozens of local valleys and local peaks in each histogram of the R, G, and B planes. As presented above, in the histogram, the peak of the bell-shaped region and its adjacent intensities correspond to the main-body pixels of the object, while the boundary of the bell-shaped region corresponds to the edge pixels of the object. Therefore, in the histogram, the intensities between every pair of adjacent local valleys correspond to a narrow bell-shaped region. All pixels that fall in a small bell-shaped region can also be regarded as a small homogeneous region. If we use all local valleys in the histogram of each color plane to segment an image, the image will be divided into a large number of small homogeneous regions. The colors of these small homogeneous regions will be very close to the corresponding colors in the original image because such a segmentation is extremely fine. Although the segmentation is this fine, the segmented image can still be viewed as an abstract version of the original image. We noticed that a more abstract version of the original image, with a relatively small number of homogeneous regions, can be generated from the segmented image. This finding is what inspired us to propose the AHHT algorithm for color image segmentation. The main idea of AHHT is to build a group of hierarchical histograms that corresponds to a multigranularity abstraction of the original image.
For each color plane, AHHT adopts a bottom-up approach to generate a group of histograms that form a hierarchy graph, and the obtained top-level histograms are applied to segment the image. The rough process of AHHT is as follows. In each plane of R, G, and B, according to Eq 1, the histogram is calculated as the first (bottom)-level histogram. From the first-level histogram, each small bell-shaped region is merged into a bin expressed by the count (the number of pixels within the intensity range of the small bell-shaped region) and the weighted average intensity (the average intensity of all pixels within the small bell-shaped region), and then the second-level histogram is obtained. Next, a similar action is applied to generate the third-level histogram, that is, from the second-level histogram, each bell-shaped region is merged into a bin expressed by the count and the weighted average intensity of all pixels within the bell-shaped region. This process continues until the last-generated histogram has no valleys or the difference of every adjacent pair of bins is larger than a threshold w. Each bin of the top-level histogram corresponds to a group of pixels in the image. The information of all bins in the top-level histogram in each plane of R, G, and B is applied to split the image into multiple clusters. The color representing each cluster is obtained by averaging all the pixels within the cluster. At this point, the initial segmentation is completed. In the process of Region-Merging, the AHHT algorithm adopts an approach identical to that used in the roughness-index-based algorithm.
In each color plane of R, G, and B, AHHT generates a group of histograms in a hierarchical fashion. Hereafter, such a group of histograms is called a hierarchical-histogram. The lower a histogram is in the hierarchical-histogram, the finer the image details it encodes. Each histogram serves to generate the next-level histogram. The experiments performed on hundreds of color images show that AHHT commonly generates four to five histograms for each color plane of the image. The AHHT segmentation of the image is based on the top-level histograms.
The hierarchical-histograms for each plane of R, G, and B of the image Moon, generated by the AHHT algorithm, are shown in Fig 2(a)-2(c), respectively. In the experiment, the parameter w = 20, which means that the top-level histogram is reached once the difference between every adjacent pair of bins in the histogram is larger than 20. As shown in Fig 2, AHHT generates four histograms for each color plane. The first-level histogram is generated from the original image Moon, and the next-level histogram is generated from the prior-level histogram. In each histogram of Fig 2, each dashed line marks a valley's position. From Fig 2, we can see that the first-level histogram (of each plane of R, G, and B) has many local valleys, which means that there are many small bell-shaped regions in the first-level histogram. Each small bell-shaped region in the first-level histogram is expressed by a bin in the second-level histogram, and so on. The fourth-level (top-level) histogram of each color plane is applied to segment the image. In the Region-Merging process of this experiment, the regions with fewer than 0.1% of the pixels are merged with the nearest region, and two regions with a distance of less than 70 are combined to form a single region.

Algorithm
In a histogram, a valley corresponds to a local minimum, which is present near a local lowest point or a local lowest horizontal line. All valleys in the histogram of a color plane can be identified by the following rule. If H is generated from the original image, which means that H is the first-level histogram, then GetValleys(H) can find all local valleys that will be used to generate the second-level histogram. If H is the kth (k > 1)-level histogram, then the result of GetValleys(H) will be used to generate the (k + 1)th-level histogram. Function 1 - GetNextHist is used to generate the next-level histogram of H, and the pseudocode is as follows. In lines 9 and 12, a function named GetMergeBin returns a bin or a group of bins for a bell-shaped region. In line 9, the parameters left and right of GetMergeBin are the left-end index and right-end index of a bell-shaped region in the histogram H. If the difference between the right-end intensity and left-end intensity is less than w, then GetMergeBin returns a bin corresponding to the bell-shaped region; otherwise, at most ⌊(l_right − l_left)/w⌋ + 1 bins will be returned. For simplicity, the pseudocode of GetMergeBin is omitted. Note that the function GetMergeBin only merges adjacent bins within a bell-shaped region. This mechanism makes the next-level histogram match the original intensity distribution of the image well.
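A minimal sketch of one level-to-level step is given below. It is not the authors' pseudocode: the valley rule (a strict drop followed, possibly after a flat run, by a strict rise), the convention that a valley bin opens the region to its right, and the way a wide region is split into chunks spanning less than w are all assumptions made for illustration. A histogram is represented as a list of (intensity, count) pairs, and the names get_valleys, merge_region and get_next_hist are hypothetical stand-ins for GetValleys, GetMergeBin and GetNextHist.

```python
def get_valleys(bins):
    """Indices of local minima of the counts; a flat valley yields its first index."""
    counts = [c for _, c in bins]
    valleys = []
    for i in range(1, len(counts) - 1):
        if counts[i - 1] > counts[i]:
            j = i
            while j + 1 < len(counts) and counts[j + 1] == counts[i]:
                j += 1  # walk across a flat valley floor
            if j + 1 < len(counts) and counts[j + 1] > counts[i]:
                valleys.append(i)
    return valleys


def merge_region(region, w):
    """Merge one bell-shaped region into a single bin, or into several bins
    when its intensity span is w or more (each chunk spans less than w)."""
    groups, current = [], []
    for intensity, count in region:
        if current and intensity - current[0][0] >= w:
            groups.append(current)
            current = []
        current.append((intensity, count))
    if current:
        groups.append(current)
    merged = []
    for group in groups:
        total = sum(c for _, c in group)
        if total:  # skip chunks that contain no pixels
            mean = sum(i * c for i, c in group) / total
            merged.append((mean, total))
    return merged


def get_next_hist(bins, w):
    """Build the next-level histogram by merging each region between valleys."""
    cuts = [0] + get_valleys(bins) + [len(bins)]
    nxt = []
    for left, right in zip(cuts, cuts[1:]):
        nxt.extend(merge_region(bins[left:right], w))
    return nxt


bins = [(0, 1), (1, 5), (2, 2), (3, 6), (4, 1)]
level2 = get_next_hist(bins, w=20)
```

Each merged bin keeps the pixel count and the count-weighted mean intensity of its region, so the total number of pixels is preserved from one level to the next.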
On the basis of the above functions, Function 2 - GetAHH (Get Adaptive Hierarchical-Histograms) is used to generate a hierarchical-histogram for each plane of R, G, and B of the image, and the pseudocode is as follows.
    Hists_i = ∅; // stores a hierarchical-histogram
    Calculate w_i according to Eq 8;
    Generate the first-level histogram H_i;
    Append H_i to Hists_i;
    [...]
    Append H'_i to Hists_i;
    [...]
For each color plane of the image, a hierarchical-histogram is generated starting with the first-level histogram, and the next-level histograms are iteratively generated until the new-level histogram has no change.
For an RGB color image, the width of valid intensity differs between the R, G, and B planes. Taking Fig 2's image of Moon as an example, the (first-level) histogram of the Blue plane has a wide distribution of valid intensity, and the (first-level) histogram of the Green plane has a relatively narrow distribution of valid intensity. Therefore, different threshold values w should be set for different color planes. A reasonable threshold w should be relatively large for a color plane with a wide range of valid intensity and relatively small for a color plane with a narrow range of valid intensity. For different color planes, the threshold w_i can be calculated as follows.
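The iteration "until the new-level histogram has no change" can be sketched as a simple fixed-point loop. In the sketch below the next-level step is passed in as a function; merge_close_bins is only a simplified stand-in (it greedily fuses adjacent bins closer than w) and is not the valley-based step of the paper. All names, the max_levels guard and the toy data are assumptions.

```python
def build_hierarchy(first_level, next_hist, max_levels=32):
    """Collect histograms level by level until the next level equals the last one."""
    hists = [first_level]
    while len(hists) < max_levels:  # guard against a step that never settles
        nxt = next_hist(hists[-1])
        if nxt == hists[-1]:
            break
        hists.append(nxt)
    return hists


def merge_close_bins(bins, w=15):
    """Stand-in next-level step: greedily fuse adjacent bins closer than w."""
    out = []
    for intensity, count in bins:
        if out and intensity - out[-1][0] < w:
            prev_i, prev_c = out[-1]
            total = prev_c + count
            out[-1] = ((prev_i * prev_c + intensity * count) / total, total)
        else:
            out.append((intensity, count))
    return out


first = [(10, 4), (12, 6), (40, 5), (45, 5), (200, 1)]
hists = build_hierarchy(first, merge_close_bins)
top = hists[-1]
```

The last element of the returned list plays the role of the top-level histogram that is later used to segment the image.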
$$w_i = w \times \frac{SPAN_i}{L}, \quad \text{for } i \in \{R,G,B\}; \qquad SPAN_i = \arg\max_{0 \le l \le L-1}(h_i(l) \ne 0) - \arg\min_{0 \le l \le L-1}(h_i(l) \ne 0). \tag{8}$$
According to Eq 8, w_i is calculated as the value of w multiplied by the scale factor SPAN_i / L, where SPAN_i is the difference between the max valid intensity and min valid intensity of the color plane i. In this manner, a relatively larger threshold w_i is applied for a color plane with a wide width of intensity. However, for some images, only a very small number of pixels are distributed at the lower or higher intensities of a color plane, which causes an unnecessarily large threshold w_i to be applied to that plane. To avoid this sensitivity to noise, we use the SPAN_i of Formula (9) to replace the SPAN_i of Formula (8).
In expression 9, the threshold value of 0.01 means that SPAN_i is the difference between the max valid intensity and min valid intensity, excluding the top 1 percent and bottom 1 percent of pixels, which improves the robustness of the calculation of SPAN_i. After this process, the initial segmentation is completed. It is worth noting that the obtained top-level histograms correspond to the histograms of the R, G, and B planes of the segmented image. On the basis of the above, the AHHT (Adaptive Hierarchical-Histogram Thresholding) algorithm for color image segmentation is described as follows.
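The per-plane threshold can be sketched as follows. The exact form of Formula (9) is not reproduced here, so the trimming rule below (skip intensities until the cumulative count from each end exceeds 1 percent of the pixels) is one plausible reading and should be treated as an assumption; robust_span and plane_threshold are hypothetical names.

```python
def robust_span(hist, trim=0.01):
    """Span of valid intensities after discarding a `trim` fraction of the
    pixels at each end of the histogram (trim=0 gives the plain span)."""
    cut = trim * sum(hist)
    low, acc = 0, 0
    for level, count in enumerate(hist):
        acc += count
        if acc > cut:
            low = level
            break
    high, acc = len(hist) - 1, 0
    for level in range(len(hist) - 1, -1, -1):
        acc += hist[level]
        if acc > cut:
            high = level
            break
    return high - low


def plane_threshold(hist, w, trim=0.01):
    """Scale the global threshold w by SPAN / L for one color plane."""
    return w * robust_span(hist, trim) / len(hist)


# One stray pixel at intensity 0 would widen the plain span from 100 to 150.
hist = [0] * 256
hist[0], hist[50], hist[150] = 1, 500, 499
w_plane = plane_threshold(hist, w=20)
```

With the trimmed span, the single stray pixel at intensity 0 no longer inflates the threshold for this plane.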

Algorithm 1. AHHT(Adaptive Hierarchical-Histogram Thresholding)
Input: an RGB color image; bin merge threshold w; small region threshold T_n; distance T_d for merging close regions.
Output: the segmented image.
Step 1: Calculate the hierarchical-histogram for each R, G, and B plane of the image by calling Function 2 - GetAHH.
Step 2: Segment the image by using the top-level histograms obtained by Step 1.
Step 3: Merge small regions and close regions.
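Step 2 can be illustrated with a short sketch. It assumes that each top-level histogram is summarized by the list of its bin intensities and that a pixel value is assigned to the nearest bin intensity in each plane; the paper assigns pixels by the intensity range of each bin, so the nearest-center rule is a simplification. The names nearest_bin and segment and the toy image are hypothetical.

```python
from collections import defaultdict


def nearest_bin(value, centers):
    """Index of the bin whose intensity is closest to value."""
    return min(range(len(centers)), key=lambda k: abs(value - centers[k]))


def segment(image, centers_rgb):
    """Label each pixel by its (R, G, B) bin indices, then paint every
    cluster with the mean color of the pixels it contains."""
    labels = [[tuple(nearest_bin(px[c], centers_rgb[c]) for c in range(3))
               for px in row] for row in image]
    sums = defaultdict(lambda: [0, 0, 0, 0])  # R sum, G sum, B sum, pixel count
    for row, label_row in zip(image, labels):
        for px, label in zip(row, label_row):
            acc = sums[label]
            for c in range(3):
                acc[c] += px[c]
            acc[3] += 1
    means = {label: tuple(acc[c] / acc[3] for c in range(3))
             for label, acc in sums.items()}
    return [[means[label] for label in label_row] for label_row in labels]


image = [[(10, 10, 10), (20, 20, 20)],
         [(200, 200, 200), (210, 210, 210)]]
centers = [[15, 205], [15, 205], [15, 205]]
result = segment(image, centers)
```

Because a cluster is defined by a triple of bin indices, the number of initial clusters is bounded by the product of the bin counts of the three top-level histograms.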
The AHHT algorithm has three main advantages: (1) AHHT adopts a bottom-up strategy to build the structure of the hierarchical-histogram, which can adaptively identify the thresholds from valleys; (2) In the process of identifying the thresholds, AHHT does not need to determine peaks, and only one parameter, w, is involved; and (3) AHHT finds the thresholds with high efficiency.

Complexity analysis
The computational complexity of the AHHT algorithm is analyzed as follows. An RGB color image I with n pixels and an intensity scale L for each color space is given. The total computation time includes that consumed in each of three major steps.
In the first step, the hierarchical-histogram for each color plane is computed. The complexity of generating all three first-level histograms is O(3n). The complexity of generating all three second-level histograms is O(3L). The complexity of generating all three mth-level histograms is O(3L_m), where L_m is the average number of bins in the three (m − 1)th-level histograms. Because a hierarchical-histogram only includes a limited number of histograms, the time required to generate the first-level histograms is far greater than the time required to generate the others. The complexity of step 1 can thus be considered as O(3n). In the second step, every pixel is distributed into the corresponding bin and assigned the intensity value of the bin by using the three top-level histograms. Let k be the number of bins in every top-level histogram. This process requires approximately 3kn operations, and the complexity of step 2 can be considered as O(3kn). The third step is the Region-Merging process. Suppose that r_1 is the number of regions before merging and that r_2 is the number of regions merged. The complexity of calculating the difference between regions is O(3r_1^2), and the complexity of merging the regions is O(3r_2 n). Therefore, the complexity of step 3 is O(3(r_1^2 + r_2 n)). To summarize, the expected time complexity of the AHHT algorithm is O(3(kn + r_1^2 + r_2 n)). It is worth mentioning that the histon-based and roughness-index-based algorithms need to calculate the Euclidean distance 24n times to find the thresholds. By contrast, the AHHT algorithm substantially reduces the time consumption.

Experimental results
As two variations of histogram-based techniques, the histon-based and roughness-index-based techniques have been demonstrated to achieve better segmentation results. In this study, the performance of the proposed AHHT technique is compared with them. The experiments are performed on Berkeley Segmentation Data Set 300 (BSDS300) as well as Berkeley Segmentation Data Set 500 (BSDS500). Each image is 481 × 321 pixels. For each image, a set of ground truths compiled by the human observers is provided. All the images are normalized to have the longest side equivalent to 320 pixels.
All of these techniques include three major steps, and each one of the steps offers similar functionality. For the histon-based and roughness-index-based techniques, all parameters involved are set the same as those used in the original papers [31,33]. Concretely, in step 1, two parameters are involved for finding the significant peaks: (1) the peak is greater than 20% of the average value of all peaks; and (2) the distance between two peaks is greater than 10. In the post-processing step (step 3), two parameters T_n and T_d for region merging are involved. Unless otherwise stated, T_n is set as 0.1% and T_d is set as 20. For the proposed AHHT algorithm, only one parameter, the bin merge threshold w, is involved in step 1. In our experiments, w is set as 15, which means that any adjacent pair of bins cannot be merged if the difference between the two bins is larger than 15. In the same post-processing step, the two involved parameters are identical to those of the histon-based and roughness-index-based algorithms to make a fair comparison.

Visual evaluation of segmentation results
In this section, the segmentation results for compared algorithms are visually evaluated by using 6 of all the segmented images. The segmentation results for the images Birds(#135069), Church(#126007), Mountain(#14037), Marsh(#92059), Boating(#147021) and Snake(#196073) are shown in Figs 4-9, respectively. Considering that all of the compared techniques adopt the same Region-Merging processing, in Figs 4-9, we present the initial segmented result and the result after region merging for each technique. In Table 1, columns 3-5 present the number of bands in each plane of R, G, and B of the initial segmented result, and columns 6-7 present the color number in the initial segmented result and the color number in the postmerging result. Generally, based on visual evaluation, the AHHT technique produces better segmentation results.
For the image Birds, Fig 4 shows the initial segmented result and the result after region merging for the histon-based, roughness-index-based and AHHT techniques. For the histon, roughness-index-and AHHT techniques, the numbers of colors in the initial segmented results are 18 (Fig 4b), 73 (Fig 4d) and 308 (Fig 4f), respectively; the numbers of colors in the final segmented results are 6 (Fig 4c), 10 (Fig 4e) and 13 (Fig 4g), respectively. For the histonbased technique (Fig 4b and 4c), we can see that there are fewer colors in the segmented results, which leads to larger homogenous regions in the results. However, the white feathers of the birds have been mistakenly assigned to the sky by the histon technique. For the roughness-index-based technique (Fig 4d and 4e), the white feathers of the birds have been assigned to a color close to that of the sky. By contrast, the AHHT technique has successfully avoided this classification error. Therefore, although the initial segmented results based on the histon and roughness index produced a lower number of colors, they lose many details of small distinct regions. It is worth noting that, although all three techniques adopt the same region merging process, the number of colors in the final segmented result by the histon technique is obviously less than that of the other two techniques. The reason for this is that, for the histon technique, there are small color differences between the different regions in the initial segmented result, which in turn cause these different regions to be further merged.
For the image Church, Fig 5 shows the initial segmented result and the result after region merging for the histon-based, roughness-index-based and AHHT techniques, respectively. For the histon-based, roughness-index-based and AHHT techniques, the numbers of colors in the initial segmented result are 210 (Fig 5b), 245 (Fig 5d) and 274 (Fig 5f), respectively; the numbers of colors in the final segmented results are 36 (Fig 5c), 35 (Fig 5e) and 40 (Fig 5g), respectively. In the histon-based segmentation of Fig 5b and 5c, the red mountain ridge in the distance and the light green of the exterior wall of the building do not match with those in the original image. In addition, in the histon-based and roughness-index-based segmentations of Fig 5b-5e, we observed that the very dark color of the gentle mountain slope at the left bottom corner does not match those in the original image. One can see from the original image (Fig  5a) that there is a clearly green boundary between the gentle mountain slope at the left bottom corner and the middle mountain. However, this green boundary is almost gone in the segmented results (Fig 5b-5e). Whereas in the AHHT technique of segmentation, as shown in Fig  5f and 5g, we observe that the colors of buildings, mountains, the sky and clouds match exactly with colors of the corresponding regions in the original image. For the image Mountain, Fig 6 shows the initial segmented result and the result after region merging for the histon-based, roughness-index-based and AHHT techniques. For the histonbased, roughness-index-based and AHHT techniques, the numbers of colors in the initial segmented result are 128 (Fig 6b), 201 (Fig 6d) and 181 (Fig 6f), respectively; the numbers of colors in the final segmented results are 22 (Fig 6c), 27(Fig 6e) and 22 (Fig 6g), respectively. 
In the histon-based segmentation of Fig 6b and 6c, the color of the distant mountain in the middle of the image is a slight violet, and the color of the cloud on the top of the statue is assigned to the color of pink, which do not match those in the original image. For the roughness-index-based segmentations of Fig 6d and 6e, the color of the middle sky is violet, and the color of the distant mountain is assigned to the color of the top sky, which also do not match those in the original image. However, the AHHT technique prevents the above classification errors. We can see from Fig 6f and 6g that the AHHT technique yields much better segmentation results, results in which the colors of clouds, the sky and the distant mountains match those in the original image.
For the image Marsh, Fig 7 shows the initial segmented result and the result after region merging for the histon-based, roughness-index-based and AHHT techniques, respectively. For the histon-based, roughness-index-based and AHHT techniques, the numbers of colors in initial segmented result are 125 (Fig 7b), 323 (Fig 7d) and 245 (Fig 7f), respectively; and the numbers of colors in final segmented results are 29 (Fig 7c), 31(Fig 7e) and 33 (Fig 7g), respectively. In histon-based segmentation of Fig 7b and 7, we can see that the color of the inner surface is pink, which does not match the white color of the corresponding regions in the original image. In addition, although both the histon-based and roughness-index-based techniques produce more homogenous water surface, there are considerable pixels of the nearshore that are assigned as part of the water surface. By contrast, in the segmented results (Fig 7f and 7g) of the AHHT technique, the border between the water surface and the nearshore is nicely retained. As shown in Fig 7f and 7g, we observe that the colors of the boat, water surface, and nearshore match the colors of the corresponding regions in the original image.
For the image Boating, Fig 8 shows the initial segmented result and the result after region merging for the histon-based, roughness-index-based and AHHT techniques. For these three techniques, the numbers of colors in the initial segmented results are 290 (Fig 8b), 373 (Fig 8d) and 322 (Fig 8f), respectively; the numbers of colors in the final segmented results are 27 (Fig 8c), 35 (Fig 8e) and 46 (Fig 8g), respectively. In the histon-based segmentation (Fig 8b and 8c), the red and yellow on the top of the umbrella and the white on the side of the bow in the original image are changed to pink. In the roughness-index-based segmentation (Fig 8d and 8e), the yellow and green stripe on the surface of the bow shows slight confusion. By contrast, the AHHT technique obtains better segmentation results; see Fig 8f and 8g.
For the image Snake, Fig 9 shows the initial segmented result and the result after region merging for all compared techniques. For the histon-based, roughness-index-based and AHHT techniques, the numbers of colors in the initial segmented results are 10 (Fig 9b), 35 (Fig 9d) and 197 (Fig 9f), respectively; the numbers of colors in the final segmented results are 5 (Fig 9c), 17 (Fig 9e) and 11 (Fig 9g), respectively. In the histon-based segmentation (Fig 9b and 9c), there are fewer colors in the segmented results, which leads to larger homogeneous regions. However, the snake and the desert are assigned the same color, which does not match their colors in the original image. In addition, the snake's shadow appears light green instead of black. In the roughness-index-based segmentation (Fig 9d and 9e), the texture of the sand surface is not clearly visible, although the snake is. In contrast, in the AHHT-based segmentation (Fig 9f and 9g), the colors of the snake, the snake's shadow and the desert, as well as the texture of the sand surface, match those in the original image.
The analysis of the above experiments illustrates that the initial segmentation plays a decisive role in the full process of segmentation. AHHT obtains the best visual features in the initial segmentation, which in turn allows it to produce the best visual features in the end.
In addition, to better analyze the characteristics of the compared techniques, Table 2 presents the mean number of colors in the initial segmented results and in the results after region merging over all images of the BSDS300 and the BSDS500. Taking BSDS500 as an example, for the histon-based, roughness-index-based and AHHT techniques, the mean numbers of colors in the initial segmented results are 338, 377 and 333, respectively; the mean numbers of colors in the final segmented results are 44, 47 and 49, respectively. This illustrates that, although all compared techniques adopt the same Region-Merging processing, the AHHT technique obtains a slightly larger number of colors in the final segmented results. The reason is that, for the AHHT technique, the differences between the colors in the initial segmented result are relatively larger, which restricts those colors from being merged further and thus helps preserve segmentation quality.
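The color counts reported above can be reproduced for any segmented image by counting its distinct RGB triples. The following is a minimal sketch, not the authors' code; it assumes an image is stored as a list of rows of (R, G, B) tuples, and the function names `count_colors` and `mean_color_count` are hypothetical.

```python
def count_colors(image):
    """Return the number of distinct RGB colors in a segmented image.

    Assumption: `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    return len({tuple(pixel) for row in image for pixel in row})


def mean_color_count(images):
    """Mean number of distinct colors over a collection of images."""
    return sum(count_colors(img) for img in images) / len(images)


# Toy 2 x 3 segmented image containing three distinct colors.
segmented = [
    [(255, 0, 0), (255, 0, 0), (0, 0, 255)],
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
]
print(count_colors(segmented))
```

Averaging `count_colors` over a dataset with `mean_color_count` gives the kind of per-dataset mean listed in Table 2.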

Quantitative evaluation of segmentation results
In this section, the results of each image segmentation technique are compared using quantitative evaluation measures: the mean square error (MSE), F(I) [37], and Q(I) [38]. The MSE evaluation function can be described as
$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{i \in \{R,G,B\}} \left( I(m,n,i) - I'(m,n,i) \right)^{2}$$
where I is the original RGB color image, M × N is the image size, and I' is the segmented image of I. In general, a lower MSE indicates better segmentation quality, provided that the numbers of regions in the compared segmented results are close. The evaluation function F(I) is defined in [37], and Q(I) in [38]. The values of these indices for the test images are tabulated in Table 3. The AHHT technique outperforms the roughness-index-based and histon-based techniques by obtaining relatively small mean values of the indices in segmenting all of these images.
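As an illustration of the MSE measure, the sketch below sums the squared per-channel differences between the original and segmented images and divides by the number of pixels M × N. This normalization is an assumption based on the description of the image size in the text; the list-of-rows image representation and the function name `mse` are likewise hypothetical.

```python
def mse(original, segmented):
    """Mean square error between an RGB image and its segmented version.

    Assumptions: both images are lists of rows of (R, G, B) tuples with the
    same shape, and the squared differences summed over the three channels
    are normalized by the pixel count M x N.
    """
    rows = len(original)
    cols = len(original[0])
    total = 0
    for m in range(rows):
        for n in range(cols):
            for i in range(3):  # R, G, B channels
                diff = original[m][n][i] - segmented[m][n][i]
                total += diff * diff
    return total / (rows * cols)


# Toy 1 x 2 image: squared errors are 2**2 = 4 and 3**2 = 9 over 2 pixels.
original = [[(10, 20, 30), (40, 50, 60)]]
approx = [[(12, 20, 30), (40, 47, 60)]]
print(mse(original, approx))  # 6.5
```

As noted in the text, MSE values are only comparable between segmentations with a similar number of regions, since a segmentation with more regions can trivially reduce the error.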
To better support the abovementioned findings, the mean values of MSE, F(I) and Q(I) are tabulated in Table 4 for all images of the BSDS300 and images of the BSDS500. From the results in Table 4, it is clear that the proposed AHHT technique outperforms the other techniques based on the MSE, F(I) and Q(I) measures.
The above benchmark indices are used to estimate the empirical accuracy of the segmentation results. They encode human characterizations of the properties of an ideal segmentation and require no prior knowledge of the correct segmentation.
For each image in the BSDS, a set of ground truths compiled by human observers is provided. Therefore, we also compare the segmentation results against external criteria. The following image segmentation indices were used. The Probabilistic Rand Index (PRI) counts the fraction of pairs of pixels whose labels are consistent between the computed segmentation and the ground truth, averaging across multiple ground truth segmentations to account for scale variation in human perception [39]. The Variation of Information (VOI) quantifies the loss and gain of information between two clusterings belonging to the lattice of possible partitions [40]. The Boundary Displacement Error (BDE) evaluates the average displacement error of boundary pixels between two segmented images by computing the distance between each boundary pixel and the closest boundary pixel in the other segmentation [41]. The Global Consistency Error (GCE) quantifies the extent to which one segmentation can be viewed as a refinement of the other [42]. These four measures must be considered together to evaluate the performance of a given segmentation algorithm. Higher values of PRI indicate greater similarity between the segmented image and the ground truth, whereas for the rest of the indices, lower values indicate closer similarity between the segmentation obtained and the ground truth. Table 5 presents the average performance indices (PRI, BDE, GCE and VOI) obtained by the proposed AHHT algorithm compared with the Histon and Roughness-index algorithms. As mentioned above, these three algorithms are histogram-based algorithms. In addition, Table 5 lists the results of some other popular algorithms. The results of Mean-Shift [43], NCuts [44], FH [45], CTM [12], and MCET_DE [27] were obtained from literature sources [12,27].
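To make the PRI definition concrete, the following sketch computes the Rand index between two labelings from their contingency counts and averages it over several ground truths. It is an illustrative implementation under stated assumptions, not the benchmark code used for Table 5: segmentations are assumed to be flattened into equal-length lists of region labels, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two labelings agree.

    A pair agrees if both labelings put the two pixels in the same region,
    or both put them in different regions.
    """
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("labelings must have equal length of at least 2")
    total_pairs = comb(n, 2)
    # Pairs placed in the same region by both labelings.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed in the same region by each labeling on its own.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    agreeing = total_pairs + 2 * same_both - same_a - same_b
    return agreeing / total_pairs


def probabilistic_rand_index(segmentation, ground_truths):
    """Average the Rand index over all available ground-truth labelings."""
    scores = [rand_index(segmentation, gt) for gt in ground_truths]
    return sum(scores) / len(scores)


seg = [0, 0, 1, 1]
truths = [[1, 1, 0, 0], [0, 0, 0, 0]]
print(probabilistic_rand_index(seg, truths))
```

Region labels only need to be consistent within one labeling; the first ground truth above uses swapped label values but describes the same partition as `seg`, so its Rand index is 1.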
For the three histogram-based techniques (Histon, Roughness-index and AHHT), a larger value of T_d improves the results in terms of GCE and VOI but decreases the PRI. Compared with the Histon and Roughness-index algorithms, the AHHT algorithm obtains better values of BDE, GCE and VOI when the same value of T_d is used. Table 5 also shows that the three histogram-based techniques obtain superior BDE values compared with the other algorithms.

Runtime comparison
The computational efficiency of an algorithm is a key factor in its practical application. In this section, the efficiencies of the three techniques are compared in terms of execution time in seconds. Because the execution time of every compared technique consists of two parts, Table 6 presents the mean time spent on the initial segmentation, the mean time spent on the Region-Merging process, and the mean total time (the sum of the two) on BSDS300 as well as BSDS500. From Table 6, we can see that the mean time of initial segmentation of an image is approximately 9 seconds for the histon-based and roughness-index-based techniques. By contrast, it is approximately 0.07 seconds for the AHHT technique, which means that AHHT outperforms the histon-based and roughness-index-based techniques by up to two orders of magnitude in the efficiency of initial segmentation. The complexity analysis shows that the major reason for this difference is the time required to find the thresholds: the histon-based and roughness-index-based techniques need to calculate the Euclidean distance 24n times, whereas the AHHT technique mainly needs 3n pixel accesses with no complicated calculation involved. Therefore, the AHHT technique has a large efficiency advantage in initial segmentation.
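One way to read the 3n figure is that each of the n pixels is visited once per color channel to accumulate the per-channel histograms from which thresholds are later selected. The sketch below illustrates only that counting step under this assumption; it is not the authors' implementation, and the function name `channel_histograms` and the list-of-rows image representation are hypothetical.

```python
def channel_histograms(image, levels=256):
    """Build one intensity histogram per RGB channel.

    Assumption: `image` is a list of rows of (R, G, B) tuples with integer
    values in [0, levels). Each pixel is read once per channel, for a total
    of 3n simple accesses and no distance computations.
    """
    hists = [[0] * levels for _ in range(3)]
    for row in image:
        for pixel in row:
            for channel in range(3):
                hists[channel][pixel[channel]] += 1
    return hists


# Toy 2 x 2 image: each channel histogram must sum to n = 4 pixels.
image = [
    [(0, 10, 255), (0, 10, 255)],
    [(5, 10, 0), (5, 20, 0)],
]
hists = channel_histograms(image)
print([sum(h) for h in hists])  # [4, 4, 4]
```

By comparison, a neighborhood-based measure that evaluates a Euclidean color distance for each of 24 neighbors of every pixel performs 24n distance computations, which is consistent with the gap in initial segmentation time reported in Table 6.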
The time spent on the Region-Merging process mainly depends on the number of merged regions. In this process, the differences between the compared techniques are not noticeable. The full execution time to segment an image mainly depends on the initial segmentation for the histon-based and roughness-index-based techniques. In contrast, for the AHHT algorithm, the full execution time to segment an image largely depends on the merging process. From Table 6, we can see that the AHHT technique obtains significantly faster running speeds.

Conclusion
This paper presents a novel histogram thresholding algorithm, Adaptive Hierarchical-Histogram Thresholding (AHHT), which is an adaptive thresholding algorithm for color image segmentation. The contributions of the paper include the following. (1) A structure called the hierarchical-histogram is proposed. With the aid of the hierarchical-histogram, the AHHT algorithm can adaptively identify the thresholds at valleys. (2) AHHT does not need to find the significant peaks. (3) The experimental results show that the AHHT algorithm can obtain better results for color image segmentation. (4) Owing to its simple implementation, the AHHT algorithm runs fast. The experimental results show that AHHT outperforms the compared algorithms by up to two orders of magnitude in the efficiency of initial segmentation.