Salient region detection through salient and non-salient dictionaries

Low-rank representation-based frameworks are becoming popular for saliency and object detection because of their simplicity. These frameworks rely only on global features to extract salient objects, while local features are compromised. To deal with this issue, we regularize the low-rank representation through a local graph-regularization term and a maximum mean-discrepancy regularization term. Firstly, we introduce a novel feature space that is built by combining four feature spaces: CIELab, RGB, HOG and LBP. Secondly, we combine a boundary metric, a candidate objectness metric and a candidate distance metric to compute the low-level saliency map. Thirdly, we extract salient and non-salient dictionaries from the low-level saliency. Finally, we regularize the low-rank representation through the Laplacian regularization term, which preserves the structural and geometrical features, and the mean-discrepancy term, which reduces the distribution divergence and connections among similar regions. The proposed model is tested against seven recent salient region detection methods using the precision-recall curve, the receiver operating characteristics curve, the F-measure and the mean absolute error. The proposed model remains consistent in all the tests and outperforms the selected models with a higher precision value.


Introduction
Salient Region Detection (SRD) is a procedure that localizes the parts of an image that attract human visual attention and discovers the most useful and informative portion of the image. It tries to approximate the probability that the image region attracting the most attention stands out as a salient object. It is also a very helpful step, because many computer vision applications apply it to reduce computational complexity by focusing only on the salient parts of the image. The conventional saliency methods are separated into two groups: bottom-up [1] and top-down [2]. The first category is the bottom-up method, which is a data-driven approach and it only depends on the prior knowledge of the object and the [...]
To preserve the local and global image information, and to prevent unrelated background regions from merging with the salient object, we exploit a Graph Regularization Term (GRT) and a Maximum Mean-discrepancy Regularization Term (MMRT). The GRT preserves the locality of the salient object and treats all parts of the salient object homogeneously, while the MMRT decreases the distribution divergence between similar regions. In this way, background regions that are mistakenly highlighted are suppressed by the MMRT. On the other hand, salient object regions that are erroneously concealed during pre-processing are highlighted again by the MMRT. In short, these regularization terms transform the low-rank representation into a new form, which produces saliency results that are smoother and locally and globally homogeneous. The contributions of our method are as follows:
• The proposed framework captures more structural information of the image: we concatenate four feature spaces to construct a new feature space that consists of CIELab, RGB, HOG and LBP features.
• We propose a discriminative background dictionary that is constructed through the background and the foreground templates. This discriminative dictionary has more distinctive power because in our method the salient object is projected through both the background and the foreground templates instead of the background template alone.
• We regularize the low-rank representation to preserve the similarity and locality of the regions. We introduce a GRT and an MMRT for the representation coefficients and the representation errors, respectively. With these terms, the representation coefficients and the representation errors of similar regions take similar saliency values when sparsely encoded with the discriminative dictionary.
• Our framework has more discerning power and a more effective representation than current low-rank representation-based methods, because the low-rank representation is presented in a new way through the MMRT.
The remaining sections of this paper are organized as follows: existing schemes related to SRD are discussed in Section II. The details of our method, such as the feature extraction, the salient and non-salient template construction, and the regularization of the low-rank representation through the Laplacian and mean-discrepancy terms, are given in Section III. The evaluation metrics, the evaluation of our method, and its comparison with state-of-the-art methods are given in Section IV. The conclusion and future work are discussed in Section V.

Related work and background
Several computational methods have been proposed for SRD. The majority of the preceding schemes are appearance-based models, which mainly depend on global or local contrast for their saliency map computation.

Dictionary learning-based SRD
The dictionary-based approaches [2,11-14] facilitate learning multifaceted labeling procedures and represent the image in a space where it can be easily processed. In [11], the basis vectors are computed on the assumption that frequently activated bases contain less energy than rare bases. This model works selectively because the unpredicted bases are selected as salient cues. In [12], a dictionary for an image patch is constructed from a repository of natural images; the sparse representation is then used to find the contrast between image patches. Shen et al. [13] optimize the objective of feature transformation and low-rank decomposition for training the dictionary. However, these methods train their dictionaries manually in a top-down way. In [1,14], the authors construct the dictionary using only the center-surround patches, without any training. However, the saliency results are not satisfactory because the inner region of the salient object is not detected properly. In recent dictionary-based methods [6,15], the authors use the boundary information to extract the background dictionary. The saliency computed through this background dictionary is not clear, because boundary information alone is insufficient for background dictionary construction. More recently, some methods adopt the center-remaining strategy [16], while others use more background regions [17] to construct their background dictionary. However, the background templates often contain limited information, which leads to incorrect SRD.

Low rank representation-based SRD
Low-rank representation finds the lowest-rank representation among all the candidates that are available as a dictionary basis. It works as an effective tool and captures the global feature information. Sparse representation, on the other hand, seeks the sparsest representation of the available data vector and captures the local features related to the salient objects. In [13], the authors optimize the objective of feature transformation and low-rank decomposition for training the dictionary. However, this method trains its dictionaries manually in a top-down way. The authors in [7] combine the low-rank representation and the sparse representation to extract the local and global features related to the salient object. However, due to the inadequate dictionary, the computed results are not persuasive. A salient object cannot be properly described by a single cue; more cues are needed to capture it properly [8]. There, the authors generalize the low-rank representation as a multitask sparsity pursuit and effectively combine multiple features for salient object detection. However, all of the captured features are global and the saliency results are not as significant as expected. Dual low-rank pursuit [18] decomposes the image into a low-rank part and a sparse part. This method uses low-rank and sparse measures to characterize the global information and deals robustly with noise and background occlusions. However, it fails to detect the inner part of the salient object, as it places more emphasis on the global image features.

Sparse representation-based SRD
The image boundary usually belongs to the background, so it can be very helpful in constructing the background template set [6,15]. The authors compute the sparse representation error through this background template set. However, the computed results are not significant when the salient object touches the image boundary. The center-surround strategy is helpful in detection, so the authors in [16] adopt the center-remaining procedure to extract the dictionary. The sparse reconstruction error is then calculated through this dictionary, and the computed saliency results are averaged and improved through a multi-label inference process. To enhance the difference between the salient object and the background, a sparse coding-based generative model is discussed in [17]. To capture all the information related to the image, a superpixel sparse reconstruction-based model is defined in [3,4]. However, the results generated by these models are not very clear, because these methods use only the local image information for SRD. Consequently, all these methods improve their results through an enhancement process that recovers the lost information.

Global or local measures-based SRD
The previously designed SRD techniques are broadly divided into two categories: local and global methods. The local methods compute saliency from the rarity of a region with respect to its neighbors or surrounding regions, while the global methods extract saliency using the uniqueness of features over the entire scene. In [14], the authors compute the saliency as the center-remaining difference of many features. The graph-based SRD method [9] exploits the rarity of different local features to compute the saliency map. A fuzzy growing approach is used to compute the saliency from the contrast of neighboring superpixels [19]. Ming Lin et al. [20] compute the saliency of superpixels by incorporating the global features, namely spatial distribution and uniqueness. They use PCA to incorporate color and pattern distinctness for SRD. In [21], the authors compute the saliency from the global contrast between the image patches and their spatial position. They perform sampling based on the conventional three-color cue maps and PCA to extract the main features of the image patches. To extract a high-resolution saliency map that depends on color contrast, a Histogram Contrast (HC) method is defined in [10]. In [22], a non-local histogram approach is used to improve the efficiency of the method, and a smoothing procedure is applied to remove quantization artifacts. However, these techniques are only suitable for simple natural images and lose accuracy on highly patterned and textured images.

Multiple feature-based SRD
The existing approaches for SRD mainly focus on color features while ignoring other features such as texture, structure, and orientation. Therefore, these methods are not successful when dealing with an image that contains rich textural features. Many approaches for SRD use the RGB color model, and a few of them depend on the LAB or YCbCr color space for their result calculation. The authors of [23] consider the near-infrared region together with the RGB color model for SRD, as the near-infrared region provides more cues for recognition and categorization than the RGB color model alone. An SRD method using a sparsity-based and graph-based model is proposed in [3]; the authors combine multiple color features with a sparse representation model to compute the saliency. A method for SRD that combines multiple features of color distribution and contrast is proposed in [24]; the authors exploit a multi-feature color difference measure, a multi-feature color distribution measure, and a multi-feature salient object measure to compute the saliency. To exploit multiple features constructed through the image manifolds of the different features, a multi-feature enhancement procedure is discussed in [16]. However, these methods attach some high-contrast pixels to the salient object, which leads to poor detection.

Foreground or background-based SRD
The discriminative schemes are also very important because they help in enhancing the contrast between the background and foreground regions for SRD [24]. A number of models based on discriminative strategies have appeared in recent years. Shuang Li et al. [25] suggest that the saliency of a region is computed from its distance to the most confident background and foreground seeds. Hongyang Li et al. [26] propose that the saliency of an object is estimated by propagating the cues extracted mainly from certain object regions and the background. The graph-based methods can capture more grouping features in the scene through graph similarity, and graph similarity typically controls the performance of a graph-based method [27]. Some of them use semi-supervised learning to approximate the similarities by incorporating local grouping features deduced from the whole image. The foreground exhibits appearance consistency and uniformity, while the background often reveals global or local connectivity with each of the four image boundaries [5]. In [17], a two-stage saliency scheme is defined that is based on relevance to the given query; a graph-based manifold ranking procedure is then used to rank the foreground and background cues. However, if the contrast between the foreground and the background is very low, the computed saliency results are not accurate. Furthermore, it is very difficult to choose the position and the number of salient queries, because these cues are generated through random walks on the graph, especially for images that contain dissimilar salient objects.

Deep Convolutional Neural Networks-based SRD
Since Deep Convolutional Neural Network (DCNN) based methods [28-30] were adopted for SRD, tremendous progress has been achieved, owing to the availability of large visual datasets and GPU computing resources. These resources enabled the development of deeper and larger DCNNs [28-30] that can automatically learn increasingly powerful feature representations with multiple levels of abstraction from big data. Significant progress has been made in the past few years to boost the accuracy of SRD [28-30], but existing solutions often rely on computationally expensive feature representations and learning approaches, which are too slow for numerous applications. In addition to the opportunities they offer, the large visual datasets also pose the challenge of scaling up while retaining the efficiency of learning approaches and representations for both handcrafted and deeply learned features. In addition, given a sufficient amount of annotated visual data, some existing features, especially DCNN features [28-30], have been shown to yield high accuracy for visual recognition. However, there are many applications where only limited amounts of annotated training data are available or where collecting labeled training data is too expensive. Such applications pose great challenges to many existing features.
The proposed method differs from the current SRD models in the following ways: it uses a combination of different color and texture models to accurately detect the salient object; its discriminative dictionary is constructed using local as well as global information, which encodes the salient object at its best level; and the low-rank representation is presented in a new way that is more effective than before.

Proposed SRD framework
The proposed model is discussed in detail in this section. First, the input image is over-segmented to extract the visual and textural features. In the second step, the boundary metric, the candidate objectness metric, and the candidate distance metric are combined to compute the low-level saliency, which is later used to construct the salient and non-salient templates. In the final step, the low-rank representation is regularized through the GRT and the MMRT to compute the final saliency map, as shown in Fig 2.

Visual and textural feature extraction
To encode and capture the structural features of the given image, the image is over-segmented into N superpixels through SLIC [31]. SLIC adapts a k-means clustering approach to generate superpixels efficiently. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation. The SLIC algorithm groups pixels into perceptually meaningful atomic regions, which can be used to replace the rigid structure of the pixel grid. Existing methods use the RGB color model or the CIELab space to compute the saliency. We believe that a framework that uses only visual features is not capable of detecting the saliency persuasively, as shown in Fig 3. However, we cannot deny that the major portion of an image consists of visual features. To capture the salient objects precisely, both the visual and the textural features are essential. Therefore, we combine the boundary, texture, geometry and spatial information to obtain our saliency results. Different attributes from four feature spaces (RGB, CIELab, HOG, LBP) are combined to form a feature vector $f_i$. The feature vectors are stacked in columns in the feature matrix $F = [f_i^1, \ldots, f_i^{73}] \in \mathbb{R}^{m \times N}$, where m represents the dimension of the feature vector. The mean of the color features is extracted from each superpixel and used after normalization, while the textural features, HOG and LBP, are also extracted from the superpixels and used after normalizing their histograms.
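As a rough illustration of how such a feature matrix could be assembled, the following minimal sketch builds one column per superpixel from a normalized mean color and normalized texture histograms. The helper names, the division by 255, and the L1 histogram normalization are assumptions made for illustration; only a single color space is shown instead of both RGB and CIELab, and the HOG and LBP histograms are taken as given inputs.

```python
def l1_normalize(hist):
    """Scale a histogram so that its bins sum to 1 (unchanged if all bins are zero)."""
    total = float(sum(hist))
    return [h / total for h in hist] if total > 0 else [float(h) for h in hist]


def superpixel_feature(pixel_colors, hog_hist, lbp_hist, color_max=255.0):
    """Build the feature vector of one superpixel.

    pixel_colors: list of per-pixel color tuples inside the superpixel.
    hog_hist, lbp_hist: raw texture histograms of the superpixel (assumed given).
    """
    count = len(pixel_colors)
    channels = len(pixel_colors[0])
    # Mean color per channel, scaled to [0, 1] (assumed normalization).
    mean_color = [sum(p[c] for p in pixel_colors) / (count * color_max)
                  for c in range(channels)]
    return mean_color + l1_normalize(hog_hist) + l1_normalize(lbp_hist)


def feature_matrix(feature_vectors):
    """Stack feature vectors as columns, giving an m x N matrix (list of rows)."""
    return [list(row) for row in zip(*feature_vectors)]


# Two toy superpixels with 3 color channels and 2-bin texture histograms.
f1 = superpixel_feature([(255, 0, 0), (255, 0, 0)], [1, 3], [2, 2])
f2 = superpixel_feature([(0, 0, 255), (0, 255, 255)], [4, 0], [1, 3])
F = feature_matrix([f1, f2])
print(len(F), len(F[0]))  # number of rows m, number of columns N
```

Storing superpixels as columns matches the $m \times N$ layout of the feature matrix, so each row collects one feature dimension across all superpixels.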

The discriminative dictionary
The discriminative dictionary is constructed using local and global features to encode the salient object at its best level. We use the boundary metric, the candidate objectness metric, and the candidate distance metric to construct the dictionary. The details of these metrics are given in the next subsections.

Boundary metric.
Current SRD methods explicitly exploit background features to compute their salient region map, on the assumption that the background contrast, background connectivity, boundary information, and boundary connectivity information are essential for complete SRD. The salient object usually appears near the center of the image, and the boundary superpixels are likely to be part of the background. We compute the distance of the i-th superpixel from the j-th boundary superpixel using the following expression: where $f_j^{Bnd}$ is the boundary feature vector and B represents the number of background

Candidate objectness metric.
To compute the candidate objectness, we compute a series of window priors that contain the probability of a salient object and choose more than 3000 trial windows. The candidate objectness map $C_{Obj}$ is computed by summing up these chosen samples. The candidate objectness metric for the i-th superpixel is computed using the following expression: where $N(h_i)$ is the number of pixels in $h_i$ and (a, b) are the coordinates of the pixels in $h_i$.
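One plausible reading of the symbols above is that the per-superpixel objectness is the pixel-level objectness map averaged over the $N(h_i)$ pixels of superpixel $h_i$. The sketch below implements only that averaging step under this assumption; the function name, the label-map input, and the toy values are hypothetical, and the pixel-level objectness map is assumed to be already computed from the trial windows.

```python
def superpixel_objectness(objectness_map, labels, num_superpixels):
    """Average a pixel-level objectness map inside each superpixel.

    objectness_map: 2-D list of per-pixel objectness scores.
    labels: 2-D list of superpixel indices with the same shape.
    """
    sums = [0.0] * num_superpixels
    counts = [0] * num_superpixels
    for score_row, label_row in zip(objectness_map, labels):
        for score, label in zip(score_row, label_row):
            sums[label] += score
            counts[label] += 1
    # Divide by the pixel count N(h_i); empty superpixels get a score of 0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Toy 2 x 2 image split into two superpixels (top row and bottom row).
scores = superpixel_objectness([[0.2, 0.4], [0.6, 1.0]], [[0, 0], [1, 1]], 2)
print(scores)
```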

Candidate distance metric.
The candidate distance metric is very helpful in highlighting the salient object. The candidate distance metric $G(i)$ is given as: where $s_c$ is the middle point of the salient object, computed as: where $\sigma_x = x_c$ and $\sigma_y = y_c$ are the image center coordinates, $x_i$ and $y_i$ are the superpixel coordinates, and $s_i$ and $s_j$ are the i-th and j-th superpixels of the image. Finally, the three low-level features are exploited to compute the low-level saliency map $SM_{LL}$ of the i-th superpixel as: Here, we combine the low-level features to construct our dictionaries, so that the image representation is locally similar and smooth. A dictionary compiled on the basis of the background template alone is not enough to compute a precise saliency map. Therefore, we compute the salient and non-salient dictionaries $(D_{ST}, D_{NT})$ based on $\gamma_1$ and $\gamma_2$ as: The values of the parameters, $\gamma_1 = 0.35$ and $\gamma_2 = 1.05$, are set according to the experiments. The current dictionary-based SRD methods collect the boundary information [6,15], the center-remaining difference [13], and the center-surround difference [16] for constructing the dictionary. Their performance remains satisfactory if the salient objects stay in the center of the scene, but they perform poorly when the salient objects appear near the image boundaries. With this issue in mind, we design a discriminative dictionary that depends on the salient object position. Moreover, we combine the boundary information, the high-contrast background information, and the salient position information to compute the discriminative dictionary. The proposed discriminative dictionary is very helpful in salient object detection and very effective in suppressing the background without disturbing the salient object, as demonstrated in Fig 4.

Locality preservation method
In the recent SRD literature, sparse reconstruction-based methods are widely applied to detect the salient object, and they achieve average results. They use the classical method without any global information related to the image. Therefore, they sometimes emphasize the intensity of the same regions, and the obtained results have marginally less contrast compared to the background regions. We exploit a GRT and an MMRT to preserve the local and global features of the image and to prevent the unrelated background from merging with the salient object. The GRT preserves the local features of the salient object and treats all parts of the salient object homogeneously, while the MMRT decreases the distribution divergence between the regions. In this way, background regions that are mistakenly highlighted are suppressed by the MMRT. On the other hand, salient object regions that are erroneously concealed during pre-processing are highlighted again by the MMRT. In short, these regularization terms transform the low-rank representation into a new form that is smoother and locally and globally homogeneous. First, we exploit the following GRT: where $Z = [z_1, z_2, z_3, \ldots, z_N] \in \mathbb{R}^{K \times N}$ is the representation coefficient matrix and the affinity matrix $W = [w_{ij}]$ is employed to compute the weights between the connected regions. The Laplacian matrix L is defined as $L = C - W$, where $C \in \mathbb{R}^{N \times N}$ is the diagonal degree matrix with $C_{ii} = \sum_j w_{ij}$. We compute the weights using the color, LBP, and HOG features. As discussed earlier, more accurate results are computed through the visual and textural features. Therefore, we assign more weight to the visual features, as the colors carry the major portion of the image structure. If the weights are assigned according to the above-discussed limitation, then the constructed graph is connected marginally.
A major portion of the regions has zero weights because similar segments contain similar saliency values. So, we apply the K-adjacent graph model to fully utilize the visual and textural information. After exploiting the GRT with the low-rank representation [7], the expression can be written as: A combination of the basis vectors of the discriminative dictionary is used to represent a superpixel, while $\alpha_1$ and $\alpha_2$ are the Graph Regularization (GR) and sparsity parameters, respectively. The designed framework contains all the locality and adhesiveness of the background and foreground features to obtain the sparse coefficients for all the probability distributions. Even though the K-adjacent neighbor model is used to construct the graph, if we fail to obtain all the intrinsic features for the probability distributions of the sparse reconstruction coefficients, the distribution divergence between the same regions (from similar to similar and dissimilar to dissimilar) remains unchanged, which can affect the obtained results. To deal with this issue, the distribution divergence between the same regions of the foreground and the background in the low-rank representation needs to be decreased. Appropriate distributions of the sparse coefficients are obtained by expressing the data points through the empirical maximum mean-discrepancy matrix [32], which is applied as a non-parametric distance measure to keep the balance between similar regions. We compute the spatial distance between consecutive regions (from similar to similar and dissimilar to dissimilar) using K-dimensional coefficients as follows: where $n_r$ and $n_u$ correspond to the example foreground and background regions, respectively.
where M is computed as follows: where $D_r = \{(e_1^i, e_2^j), \ldots, (e_{r_n}^i, e_{r_n}^j)\}$ and $D_u = \{(e_1^i, e_2^j), \ldots, (e_{u_n}^i, e_{u_n}^j)\}$ are the foreground regions and background regions, respectively. After exploiting this regularization term, the discriminative dictionary is constructed and the distribution divergence from similar to similar and dissimilar to dissimilar regions is decreased by developing Z. The following function is obtained after regularizing the low-rank representation: After applying the GRT and the MMRT to the low-rank representation, it is transformed into a new form that has a more effective representation, as shown in Fig 5.
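To make the two regularizers more concrete, the following minimal sketch evaluates a graph term of the form $\operatorname{tr}(Z L Z^T)$ and a mean-discrepancy term of the form $\operatorname{tr}(E M E^T)$ on toy data. It assumes the degree matrix $C_{ii} = \sum_j w_{ij}$ and the standard empirical MMD matrix, whose entries are $1/n_r^2$ for two foreground regions, $1/n_u^2$ for two background regions, and $-1/(n_r n_u)$ otherwise, with $n_r$ and $n_u$ read as region counts. The function names and toy values are hypothetical, and the exact form of M used in this work may differ.

```python
def graph_laplacian(W):
    """Return L = C - W, with C the diagonal degree matrix (C_ii = sum_j w_ij)."""
    n = len(W)
    degrees = [sum(row) for row in W]
    return [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def mmd_matrix(is_foreground):
    """Empirical MMD matrix in its standard form (an assumption, see lead-in)."""
    n_r = sum(1 for flag in is_foreground if flag)  # foreground regions
    n_u = len(is_foreground) - n_r                  # background regions
    if n_r == 0 or n_u == 0:
        raise ValueError("need at least one foreground and one background region")
    # Each region gets weight 1/n_r (foreground) or -1/n_u (background);
    # M is the outer product of this weight vector with itself.
    weights = [1.0 / n_r if flag else -1.0 / n_u for flag in is_foreground]
    return [[a * b for b in weights] for a in weights]


def trace_quadratic(X, A):
    """Compute tr(X A X^T) for X given as a list of rows of length N."""
    n = len(A)
    return sum(row[i] * A[i][j] * row[j]
               for row in X for i in range(n) for j in range(n))


# Graph term: 3 regions, K = 2 coefficients per region (columns of Z).
W = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
Z = [[1, 2, 4], [0, 1, 1]]
grt = trace_quadratic(Z, graph_laplacian(W))

# Mean-discrepancy term: 2 foreground and 3 background regions.
labels = [True, True, False, False, False]
E = [[1, 3, 0, 3, 6], [2, 2, 1, 1, 1]]
mmrt = trace_quadratic(E, mmd_matrix(labels))
print(grt, mmrt)
```

For a symmetric W, the graph term equals $\frac{1}{2}\sum_{i,j} w_{ij}\|z_i - z_j\|^2$, so it is small when strongly connected regions have similar coefficients. With the assumed M, the second term equals the squared distance between the mean foreground column and the mean background column of E. In the toy data these evaluate to 10 and, up to rounding, 2.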

Optimization
The above optimization problem is convex and is solved through [33,34]; to do so, we minimize the following augmented Lagrangian function: where $\mu$ and Y are the penalty parameter and the Lagrangian multiplier, respectively. The notation $\langle \cdot, \cdot \rangle$ indicates the inner product of two matrices. This unconstrained problem is minimized with respect to Z and E.

Update Z
where $g(Z) = \alpha_2 \operatorname{tr}(Z L Z^T) + \frac{\mu}{2} \left\| F - DZ - E + \frac{1}{\mu} Y \right\|_F^2$. The solution of the above equation can be found through [33,34] as: where $\nabla g(Z^{(t)})$ is the partial derivative of $g(Z)$ with respect to Z, calculated as: The solution of the above-defined equation is given as: where the threshold function $S_\tau(w)$ is defined as:
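As an illustration, the following minimal sketch assumes that $S_\tau$ is the standard soft-thresholding (shrinkage) operator used in proximal-gradient solvers, $S_\tau(w) = \operatorname{sign}(w)\max(|w| - \tau, 0)$, applied elementwise. The function names are hypothetical and the operator used in this work may differ in detail.

```python
def soft_threshold(w, tau):
    """Shrink w toward zero by tau; values with |w| <= tau become 0."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0


def soft_threshold_matrix(matrix, tau):
    """Apply the operator elementwise to a matrix stored as a list of rows."""
    return [[soft_threshold(value, tau) for value in row] for row in matrix]


shrunk = soft_threshold_matrix([[3.0, -0.5], [-2.0, 1.0]], 1.0)
print(shrunk)
```

Entries whose magnitude does not exceed $\tau$ are set exactly to zero, which is how such an operator promotes sparse coefficients.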

Update E
where $g(E) = \alpha_3 \operatorname{tr}(E L E^T) + \frac{\mu}{2} \left\| F - DZ - E + \frac{1}{\mu} Y \right\|_F^2$. The solution of the above equation can be found through [33,34] as: where $\nabla g(E^{(t)})$ is the partial derivative of $g(E)$ with respect to E, calculated as: The solution of the above-defined equation is given as: The complete summary is given in Algorithm 1 and the details of the calculations are given below as: where $S_{ST}$ and $S_{NT}$ represent the reconstruction errors due to the salient and non-salient parts, respectively. We obtain the final saliency results after merging the $S_{ST}$ and $S_{NT}$ maps using the following expression: We initialize with the low-rank representation method [7]; after the optimization procedure, the values are updated accordingly. We also note that our method converges fast and takes only a small number of iterations to converge, as shown in Fig 6.

Experimental results and analysis
We analyze and evaluate our model on five large benchmark datasets against existing methods. For performance assessment, four evaluation criteria are selected to thoroughly analyze the proposed algorithm against the preceding schemes. In the next section, we discuss the details of the selected benchmark datasets for performance evaluation.

Benchmark datasets
To analyze the computed saliency results, many datasets are available that differ from each other in size, the objects in the scene, the image background (simple or complicated), and the ground truth (GT). We employ different datasets to assess the performance of our proposed algorithm. We assess our SRD model on five standard datasets: 1) ASD [35], 2) ECSSD [36], 3) DUT-OMRON [5], 4) SED2 [37], and 5) MSRA [38]. We prefer these databases for the following reasons: 1) the background nature, 2) the complexity level, 3) the number of images, 4) the potential benchmark databases, and 5) the number of objects in the scene.

Evaluation metrics
Numerous techniques are applied to evaluate the agreement between the obtained results and the GT. Before computing the evaluation metrics, the produced salient region maps should be converted to binary form. We apply the adaptive threshold discussed in [46]; the thresholding is used to get the binary mask of the salient region map S, and is calculated as: where w and h represent the width and height of the saliency map, respectively.
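A common form of adaptive threshold in the saliency literature is twice the mean saliency of the map, $\frac{2}{w \cdot h}\sum_{x,y} S(x, y)$. The sketch below assumes that form purely for illustration; the factor of 2, the function names, and the toy map are assumptions and are not taken from [46].

```python
def adaptive_threshold(S, factor=2.0):
    """Return factor times the mean saliency of S (S is a list of h rows of w values)."""
    h = len(S)
    w = len(S[0])
    return factor * sum(sum(row) for row in S) / (w * h)


def binarize(S, threshold):
    """Binary mask with 1 where the saliency reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in S]


S_toy = [[0.0, 0.25], [0.25, 0.5]]
t_a = adaptive_threshold(S_toy)
mask = binarize(S_toy, t_a)
print(t_a, mask)
```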

Precision-Recall.
The saliency map S is converted to the binary mask M, and M is evaluated against the given ground truth T. The PR-curve is computed using this expression:

F-score.
The F-score is calculated from the precision and recall; the evaluation of SRD is not complete without it. The F-score is computed using the following expression: All of the compared methods use ν = 0.3, so we also take ν = 0.3 for a fair comparison.
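The sketch below shows how these quantities are typically computed, assuming the usual definitions: precision $= |M \cap T| / |M|$, recall $= |M \cap T| / |T|$, and the weighted F-score $(1 + \nu) \cdot P \cdot R / (\nu \cdot P + R)$. The function names and the toy masks are hypothetical.

```python
def precision_recall(M, T):
    """Precision and recall of binary mask M against binary ground truth T."""
    true_pos = sum(1 for m_row, t_row in zip(M, T)
                   for m, t in zip(m_row, t_row) if m and t)
    detected = sum(1 for row in M for m in row if m)
    actual = sum(1 for row in T for t in row if t)
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def f_score(precision, recall, nu=0.3):
    """Weighted F-score; nu < 1 gives precision more weight than recall."""
    denom = nu * precision + recall
    return (1.0 + nu) * precision * recall / denom if denom else 0.0


M_toy = [[1, 1], [0, 0]]
T_toy = [[1, 0], [1, 0]]
p, r = precision_recall(M_toy, T_toy)
print(p, r, f_score(p, r))
```

Sweeping the binarization threshold over the range of S and recording one (precision, recall) pair per threshold traces out the PR-curve.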

Receiver operating characteristics.
The ROC-curve is obtained using the binary mask M with a fixed threshold as: where $\bar{T}$ is the complement of T and $\bar{M}$ is the complement of M. The ROC-curve is obtained from the TPR and FPR by varying the value of the fixed threshold.
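A minimal sketch of this procedure follows, assuming the usual definitions TPR $= |M \cap T| / |T|$ and FPR $= |M \cap \bar{T}| / |\bar{T}|$, with M obtained by thresholding S. The function names, the toy data, and the threshold grid are hypothetical.

```python
def roc_point(S, T, threshold):
    """Return (TPR, FPR) for the mask obtained by thresholding S at a fixed value."""
    tp = fp = positives = negatives = 0
    for s_row, t_row in zip(S, T):
        for s, t in zip(s_row, t_row):
            detected = s >= threshold
            if t:
                positives += 1
                tp += 1 if detected else 0
            else:
                negatives += 1
                fp += 1 if detected else 0
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


def roc_curve(S, T, thresholds):
    """One (TPR, FPR) pair per fixed threshold."""
    return [roc_point(S, T, th) for th in thresholds]


S_roc = [[0.9, 0.6], [0.4, 0.1]]
T_roc = [[1, 0], [1, 0]]
curve = roc_curve(S_roc, T_roc, [0.0, 0.5, 0.95])
print(curve)
```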

Mean absolute error.
The continuous SRD map may carry more information than its binary mask. We therefore also compute the MAE between the continuous SRD map S and the ground truth T, both normalized to the range [0, 1]. The MAE value is defined as:
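The sketch below assumes the usual definition, the mean over all $w \times h$ pixels of $|S(x, y) - T(x, y)|$; the function name and the toy maps are hypothetical.

```python
def mean_absolute_error(S, T):
    """Mean absolute difference between saliency map S and ground truth T in [0, 1]."""
    h = len(S)
    w = len(S[0])
    total = sum(abs(s - t) for s_row, t_row in zip(S, T)
                for s, t in zip(s_row, t_row))
    return total / (w * h)


mae = mean_absolute_error([[1.0, 0.5], [0.0, 0.25]], [[1, 0], [0, 0]])
print(mae)
```

Lower values indicate a saliency map that is closer to the ground truth without any binarization step.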

Impact of parameters and features on the performance of our model
In this section, we discuss the impact of different parameters on the performance of our proposed model. These parameters play a key role in obtaining the optimal results. They are the GR parameter, the MMR parameter, and the sparsity parameter. We perform a series of experiments to adjust the values of these parameters. The details are discussed in the next subsections.

Parameter settings.
The performance of the proposed model is strongly affected by the GR parameter, the MMR parameter, the sparsity parameter and the number of superpixels. Therefore, for optimal results, the values of these key parameters are set to $\alpha_1 = 1$, $\alpha_2 = 0.1$, $\alpha_3 = 10^5$, and N = 300.

Impact of GR parameter.
Normally, this parameter is used to penalize the discontinuities and geometrical features in the representation coefficients. We perform a series of simulations, varying the value of the Laplacian parameter. The results reveal that with higher values the geometrical structures of the salient objects are lost, while with lower values of the Laplacian parameter the structural and geometrical features are preserved, as shown in Fig 7(a). The Laplacian regularization parameter preserves the structural and geometrical features, and the mean-discrepancy parameter reduces the distribution divergence and connections among similar regions.

Effect of sparsity factor.
To check the impact of the sparsity parameter, we run the simulation with different values of the sparsity parameter in the range from 0.1 to 0.5. Generally, the sparsity parameter is used to avoid degeneration and over-fitting. From the simulation, we note that with a smaller value of $\alpha_2$, that is, as it moves toward zero, only a few iterations are required. In contrast, if the value of the sparsity parameter moves toward infinity, the computed saliency maps do not remain significant due to the dominance of the sparsity term, as demonstrated in Fig 7(b).

Impact of MMR parameter.
To check the impact of the MMR parameter, we perform the simulation with values ranging from 10^3 to 10^7. From the experimental results shown in Fig 7(c), we find that with a larger value of α3 the distribution connection between the regions is assured while the sparsity is badly affected. In contrast, with a smaller value of the MMR parameter the geometrical structure of the salient object and the sparsity are preserved while the distribution connection between the regions is lost. In both cases, the robustness of the approach suffers. For simplicity and efficiency, we select α3 = 10^5.
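
As a rough illustration of what a mean-discrepancy term measures, the sketch below computes the squared maximum mean discrepancy with a linear kernel, which reduces to the squared distance between the mean feature vectors of two sets of regions. The linear kernel, the function names, and the toy feature vectors are assumptions made for illustration; the kernel and the exact term used in the proposed model are not specified in this section.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def linear_mmd2(xs, ys):
    """Squared MMD with a linear kernel: ||mean(xs) - mean(ys)||^2."""
    mx, my = mean_vector(xs), mean_vector(ys)
    return sum((a - b) ** 2 for a, b in zip(mx, my))


group_a = [[1.0, 2.0], [3.0, 4.0]]   # mean is [2, 3]
group_b = [[2.0, 3.0], [2.0, 3.0]]   # same mean as group_a
group_c = [[4.0, 3.0], [4.0, 3.0]]   # mean shifted along the first feature

print(linear_mmd2(group_a, group_b))
print(linear_mmd2(group_a, group_c))
```

Two groups with the same mean have zero discrepancy under this kernel, while a shifted group is penalized. A larger MMR weight therefore pushes similar regions toward matching distributions more strongly.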

Impact of number of superpixels.
We also check the impact of the number of superpixels on the performance of the proposed model, as shown in Fig 7(d). Increasing the number of superpixels improves the performance up to a certain value, beyond which the performance is largely unaffected. We select N = 300, and at this value the proposed model performs well.

Effect of visual and textural features on the performance of our proposed model.
We perform simulations and use the PR curve to check the effect of features on the efficiency of our model. The comparison between the different features of our model is shown in Fig 8(a). It is clear from the results that the PR curve obtained with only textural features is slightly lower than the other two curves, and the middle curve is obtained using only the visual features. In both cases, the obtained results are not accurate and lose a lot of image information. Consequently, both curves lie below the curve that combines the visual and textural features. Hence, both the visual and textural features are necessary to obtain the final saliency results, and precise results cannot be obtained without this information.

Effect of regularization terms on the final saliency.
We also analyze the impact of the regularization terms, namely the Laplacian regularization term and the mean discrepancy term, on the final saliency map. Regularizing the low-rank representation through these terms preserves the similarity as well as the locality of the regions. Here, we exploit the Laplacian term for the representation coefficients and an MMRT for the representation errors. With these terms, the representation coefficients and representation errors of similar regions receive similar saliency values when sparsely encoded with the discriminative dictionary. We can note from Fig 8(b) that adding these regularization terms yields remarkable improvements in the final saliency results.
The performance of LC [42], DS [6], HS [36], and MC [40] is satisfactory in all these images. LC [42] and MC [40] perform well for centered salient objects, but when the salient objects touch any side of the image they lose accuracy and start merging background parts with the salient objects. MC is not good at dealing with cluttered images. DS [6] performs well; however, it loses parts of the salient object when the object appears near the image boundaries, due to an incompetent background dictionary. HS [36] totally loses the salient objects when the contrast between the background and foreground is low, as shown in Fig 9. We can note that for all types of images the performance of the proposed method remains consistent and does not lose image information as compared to the existing schemes.

ASD dataset.
We assess the performance of our SRD scheme against state-of-the-art methods on the ASD database [35], as shown in Fig 10. The reason for selecting the ASD database is to investigate the behavior of our scheme on images with different complexity levels and diversified patterns. We examine the proposed method against fourteen well-known SRD schemes: AC [39], CH [44], FT [35], GB [9], HC [10], HS [36], MC [40], GM [5], RB [45], RC [10], SR [43], UF [41], DS [6], and LC [42]. Using the above-discussed metrics for evaluation, we find that the proposed model outperforms the chosen methods with higher accuracy, as shown in Tables 1, 2 and 3. However, HC, DS, and MC also perform convincingly. On this dataset, we note that our method remains extremely reliable and accurate in dealing with the salient objects.
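
Two of the evaluation metrics, MAE and F-measure, can be sketched in a few lines for a saliency map flattened to a list of values in [0, 1] and a binary ground-truth mask. This is a minimal sketch under assumptions: the function names are hypothetical, the fixed threshold of 0.5 is chosen only for the example, and β² = 0.3 is the weighting commonly used in the saliency literature rather than a value confirmed by this section.

```python
def mean_absolute_error(saliency, truth):
    """Average absolute difference between saliency values and the binary mask."""
    pairs = list(zip(saliency, truth))
    return sum(abs(s - g) for s, g in pairs) / len(pairs)


def f_measure(saliency, truth, threshold=0.5, beta2=0.3):
    """Weighted F-measure of the map binarized at `threshold` (beta2 is assumed)."""
    predicted = [s >= threshold for s in saliency]
    true_pos = sum(1 for p, g in zip(predicted, truth) if p and g == 1)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(predicted)
    recall = true_pos / sum(truth)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


saliency = [0.9, 0.8, 0.2, 0.1]  # flattened toy saliency map
truth = [1, 1, 0, 0]             # flattened toy ground-truth mask

print(mean_absolute_error(saliency, truth))
print(f_measure(saliency, truth))
```

MAE rewards maps whose values are close to the mask everywhere, including the background, whereas the F-measure only looks at the binarized foreground. The two metrics can therefore rank methods differently.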

DUT-OMRON dataset.
We also use the DUT-OMRON dataset [5] to analyze the performance of our proposed approach. The motive for selecting the DUT-OMRON database [5] is that it contains a large number of images with different complexity levels of the background. We verify the performance of our proposed model graphically using PR and ROC curves. The resulting graphs are illustrated in Fig 11. DS [6], GM [5], and RB [45] also demonstrate convincing results, as described in Tables 1, 2 and 3. Nevertheless, we notice from our analysis that our approach is more effective in highlighting the salient objects than the other discussed methods.
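
PR and ROC curves of this kind are typically obtained by binarizing the saliency map at a sweep of thresholds and recording precision, recall, and false-positive rate at each one. The sketch below shows that procedure on a toy example. The function name and the data are hypothetical, and reporting a precision of 1.0 when nothing is predicted salient is a convention assumed here, not one taken from the paper.

```python
def pr_roc_points(saliency, truth, thresholds):
    """Return (threshold, precision, recall, false-positive rate) per threshold."""
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, g in zip(saliency, truth) if s >= t and g == 1)
        fp = sum(1 for s, g in zip(saliency, truth) if s >= t and g == 0)
        precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
        recall = tp / n_pos if n_pos else 0.0
        fpr = fp / n_neg if n_neg else 0.0
        points.append((t, precision, recall, fpr))
    return points


saliency = [0.9, 0.6, 0.4, 0.1]  # flattened toy saliency map
truth = [1, 0, 1, 0]             # flattened toy ground-truth mask

points = pr_roc_points(saliency, truth, [0.5, 0.3])
for point in points:
    print(point)
```

Plotting precision against recall gives the PR curve, and plotting recall (the true-positive rate) against the false-positive rate gives the ROC curve. In practice the sweep usually covers all 256 grey levels of an 8-bit saliency map.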

ECSSD dataset.
Moreover, we use the ECSSD dataset [36] to assess and validate our SRD mechanism visually and graphically. The ECSSD database contains more natural images with diversified patterns in both the foreground and the background. The reason for selecting the ECSSD database is to investigate the behavior of our scheme on images with different complexity levels and diversified patterns. We examine the proposed method against fourteen well-known SRD schemes: AC [39], CH [44], FT [35], GB [9], HC [10], HS [36], MC [40], GM [5], RB [45], RC [10], SR [43], UF [41], DS [6], and LC [42] on the ECSSD database to demonstrate the strength of our algorithm. We pick four criteria that are commonly used in the literature to assess the performance of SRD methods: PR curve, ROC curve, F-score, and MAE. From the series of experiments given in Tables 1, 2 and 3, we find that our proposed method achieves very good results as compared to the above-mentioned approaches. DS [6], GM [5], and RC [22] also achieve good results on all four SRD metrics. Our approach remains consistent in all defined evaluation measures and demonstrates significant performance, as shown in Fig 12.

SED2 dataset.
Additionally, we employ the SED2 dataset [37] to evaluate and validate the proposed method graphically. The motive for selecting the SED2 database is to assess the performance of our scheme on images containing two objects. We analyze and compare the proposed method against fourteen well-known state-of-the-art approaches: AC [39], CH [44], FT [35], GB [9], HC [10], HS [36], MC [40], GM [5], RB [45], RC [10], SR [43], UF [41], DS [6], and LC [42] on the SED2 database to confirm the validity of our algorithm. We choose four criteria, PR curve, ROC curve, F-measure, and MAE, to estimate the strengths and limits of our SRD approach as revealed in Tables 1, 2

Limitation
Our designed model performs well against the current salient region detection methods. However, for some typical images the results computed by our model are poor, as demonstrated in Fig 14. In these images, the salient object has almost the same color as the background. During SRD, some background parts are incorrectly considered as foreground and merged with the salient object, leading to poor performance. This problem is well known in object detection: when the contrast is very low, the objects are not detected properly.

Saliency cut
In the current literature, some methods employ a shape prior to segment the salient object, which is later used to produce the saliency cut. Others use a rectangular region to capture the salient object and then convert this captured region into a fuzzy region. These methods thus use different strategies to compute the saliency cut. The proposed method also segments the image first and then produces a precise saliency cut. We use different datasets and find that the proposed model produces quite promising saliency cut results, as demonstrated in Fig 15.

Conclusion
In this paper, we propose a novel SRD method based on salient and non-salient dictionaries. Initially, a new feature space is constructed by concatenating four feature spaces: CIELab, RGB, HOG, and LBP. Then, we combine a boundary metric, a candidate objectness metric and a candidate distance metric to compute a low-level saliency map. After that, we extract a salient dictionary and a non-salient dictionary from that low-level saliency. We regularize the low-rank representation through a GRT that preserves the structural and geometrical features and an MMRT that reduces the distribution divergence and connections among similar regions. The proposed model is tested against more than a dozen recent SRD methods using four evaluation metrics. The proposed model remains consistent in all the tests and outperforms the selected models with higher precision.