No-reference quality assessment for image-based assessment of economically important tropical woods

Image Quality Assessment (IQA) is essential for the accuracy of systems that automatically recognise tree species from wood samples. In this study, a No-Reference IQA (NR-IQA) metric for wood images, Wood NR-IQA (WNR-IQA), was proposed. Support Vector Regression (SVR) was trained using Generalized Gaussian Distribution (GGD) and Asymmetric Generalized Gaussian Distribution (AGGD) features measured from wood images, together with the Mean Opinion Score (MOS) obtained from a subjective evaluation. The proposed WNR-IQA was then compared with three established NR-IQA metrics, namely Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE), deepIQA, and Deep Bilinear Convolutional Neural Networks (DB-CNN), and with five Full-Reference IQA (FR-IQA) metrics, namely MSSIM, SSIM, FSIM, IWSSIM, and GMSD. The performance of each automatic metric was evaluated by comparing its scores with the MOS values. The WNR-IQA metric exhibited higher performance than BRISQUE, deepIQA, DB-CNN, and the FR-IQA metrics. High-quality images may not be routinely available in the timber industry owing to environmental factors such as dust, poor illumination, and heat. Moreover, motion blur can occur because of relative motion between the camera and the wood slice. Therefore, an advantage of WNR-IQA is its independence from a "perfect" reference image for image quality evaluation.


Introduction
Wood is a plant tissue with a porous and fibrous structure. It is widely used as a source of energy and for furniture making, millwork, flooring, building construction, and paper production [1]. Thousands of wood-producing tree species exist, yielding materials with distinct physical characteristics in terms of structure, density, colour, and texture [2]. These characteristics define the preferred uses and monetary values of the trees [3]. Furthermore, although timber production at high latitudes is based on a small number of species, tropical forests contain a wide range of species. For example, conifers of the genus Pinus are widespread in the Northern Hemisphere and produce moderately priced wood of high resin content, which is widely used for making indoor furniture. Bocote (Cordia gerascanthus), native to Central America, is a high-cost hardwood suitable for high-quality furniture and cabinetry due to the glossy finish created by the oily surface of the wood. Meanwhile, rosewood (Dalbergia sp.) is another high-cost wood, which is sought after for instrument making and flooring due to its high strength and density. Given that wood species differ in price and characteristics, misclassification of wood could lead to financial losses. Therefore, correct identification of wood species is essential.
Although the recognition of wood species is traditionally performed by humans [4], the process is time-consuming and incurs a high cost to the lumber industry. Therefore, various algorithms have been developed for automatic recognition of wood samples [1,2,5,6]. The accuracy of automatic wood recognition systems can be improved by using high-quality microscopy images, which are sometimes pre-processed to enhance recognition. However, image enhancement requires additional time and may impart a checkerboard artefact to the wood images [7]. Besides, timber factories are characterised by dust, poor illumination, and heat [8], which degrade image quality. Therefore, a suitable Image Quality Assessment (IQA) metric is essential to evaluate the captured images before they are passed to the recognition pipeline.
IQA may be divided into two categories, namely subjective and objective evaluation. In subjective evaluation, images are evaluated by human observers, who provide scores based on their perception of the image quality, while objective evaluation uses mathematical algorithms to calculate a quality score for the images [9]. Although subjective evaluation is regarded as the gold standard in IQA, it is not practical in an industrial setting due to its high cost and long duration. Therefore, there is an incentive to develop objective evaluation procedures whose results are comparable to those of subjective evaluation [9].
The objective evaluation consists of three categories, namely Full-Reference-IQA (FR-IQA), Reduced Reference-IQA (RR-IQA), and No-Reference/Blind-IQA (NR-IQA) [10,11]. Specifically, FR-IQA evaluates an image by comparing the image with its reference image, while NR-IQA evaluates an image without involving reference images. Meanwhile, RR-IQA assesses an image using partial information from reference images [12]. Notably, NR-IQA is the most suitable metric used to assess wood images due to the impediments (dusty environment and poor illumination) to the achievement of high-quality images in the environment of lumber mills.
Several NR-IQA metrics have been proposed, such as Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [13], deepIQA [12], and Deep Bilinear Convolutional Neural Networks (DB-CNN) [14]. Specifically, BRISQUE [13] models images in the spatial domain: a Support Vector Regression (SVR) model is trained on Generalised Gaussian Distribution (GGD) and Asymmetric Generalised Gaussian Distribution (AGGD) features. In contrast, deepIQA and DB-CNN are CNN-based NR-IQAs. deepIQA is trained end-to-end and comprises 10 convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression [12]. Meanwhile, DB-CNN combines two sets of features, one from a CNN for synthetic distortions (S-CNN) and one from VGG-16, which are bilinearly pooled to measure the quality of the image [14]. However, because a limited amount of labelled training data often leads to overfitting in CNNs, CNN-based NR-IQA models require a larger training database [14].
Accordingly, this study investigated an NR-IQA procedure based on a widely used NR-IQA model, BRISQUE. BRISQUE is not a distortion-specific model. Instead, it considers the luminance and image features of natural images [13]. Furthermore, BRISQUE is trained with subjective scores so that it emulates human judgement of image quality. Because BRISQUE was trained to evaluate natural images, it is not optimal for the assessment of wood images. Therefore, an NR-IQA metric was proposed specifically for wood images. The proposed metric, Wood NR-IQA (WNR-IQA), was then compared with BRISQUE [13], deepIQA [12], DB-CNN [14], and five established FR-IQA metrics, namely Structural Similarity Index (SSIM) [15], Multiscale SSIM (MS-SSIM) [15], Feature Similarity (FSIM) [16], Information Weighted SSIM (IW-SSIM) [17], and Gradient Magnitude Similarity Deviation (GMSD) [18]. The relative performances of WNR-IQA, BRISQUE, deepIQA, DB-CNN, and the FR-IQAs were determined from the agreement between the human mean opinion scores (MOS) and the metric scores, measured using the Pearson Linear Correlation Coefficient (PLCC) [19] and Root Mean Squared Error (RMSE) [20].
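The two agreement measures named above can be illustrated with a minimal sketch. The function names and the example values are hypothetical and are not taken from the study; the sketch only shows the textbook definitions of PLCC and RMSE applied to a list of MOS values and a list of metric scores.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(x, y):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical MOS values and metric scores, for illustration only.
mos = [4.8, 4.1, 3.0, 2.2, 1.1]
scores = [4.5, 4.0, 3.2, 2.0, 1.3]
print(round(plcc(mos, scores), 3), round(rmse(mos, scores), 3))
```

A PLCC close to 1 together with a small RMSE would indicate that a metric tracks the subjective scores closely.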

Training and testing database
To cater to image quality assessment specifically for wood images, Generalized Gaussian Distribution (GGD) and Asymmetric Generalized Gaussian Distribution (AGGD) features were calculated for wood images, and subjective MOS values were obtained from a subjective evaluation of the same images. The GGD and AGGD features and the MOS were used as the training and testing database for the SVR model.
Wood images. Ten wood images of ten wood species used in the lumber industry, namely Turraeanthus africanus (Avodire), Ochroma pyramidale (Balsa), Cordia spp. (Bocote), Juglans cinerea (Butternut), Tilia americana (Basswood), Vouacapoua americana (Brownheart), Cornus florida (Dogwood), Cordia spp. (Laurel Blanco), Swartzia cubensis (Katalox), and Dipterocarpus spp. (Keruing), were obtained from a public wood database: https://www.wooddatabase.com/ [21]. The ten wood images are presented in Fig 1. The images were converted to grayscale, and the pixel values were normalised to the range 0-255 so that the same levels of distortion could be applied across all the reference images. Each image consisted of 600 x 600 pixels, which corresponded to an image area of 9525 cm². The ten reference wood images were then distorted by Gaussian white noise and motion blur to represent the image distortions encountered in the industrial setting. Specifically, Gaussian white noise often arises during the acquisition of wood images due to sensor noise [22] caused by poor illumination and high ambient temperature in the lumber mill [8]. Meanwhile, wood images are subject to motion blur when there is relative motion between the camera and the wood slice [6]. Because these distortions lower the quality of the wood image, the features of the pores in the wood texture can no longer be distinguished from one another. As a result, the wood species may be misclassified, as the feature extractor cannot effectively extract distinctive features from the wood texture images [23].
Gaussian white noise with a standard deviation of $\sigma_{GN}$ and motion blur with a standard deviation of $\sigma_{MB}$ were applied to the reference images at five levels of distortion. Specifically, $\sigma_{GN}$ was set to 10, 20, 30, 40, and 50, while $\sigma_{MB}$ was set to 2, 4, 6, 8, and 10. As a result, 110 wood images were produced: ten reference images, 50 images distorted by Gaussian white noise, and 50 images distorted by motion blur. The GGD and AGGD features were then measured for these images and used to train the SVR.
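The construction of the 110-image training set can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the text does not specify the blur kernel, so a horizontal one-dimensional Gaussian-weighted kernel with standard deviation σ_MB is assumed here, and the reference images are replaced by random stand-in arrays. All function names are hypothetical.

```python
import numpy as np

SIGMA_GN = (10, 20, 30, 40, 50)  # noise levels reported in the text
SIGMA_MB = (2, 4, 6, 8, 10)      # blur levels reported in the text


def add_gaussian_noise(img, sigma_gn, rng):
    """Add zero-mean Gaussian white noise and clip to the 0-255 range."""
    noisy = img.astype(float) + rng.normal(0.0, sigma_gn, size=img.shape)
    return np.clip(noisy, 0, 255)


def motion_blur(img, sigma_mb):
    """ASSUMPTION: horizontal blur with a 1-D Gaussian-weighted kernel."""
    radius = int(3 * sigma_mb)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t ** 2 / (2.0 * sigma_mb ** 2))
    kernel /= kernel.sum()
    # Repeat edge pixels so that the output keeps the input width.
    padded = np.pad(img.astype(float), ((0, 0), (radius, radius)), mode="edge")
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )


def build_dataset(references, seed=0):
    """Return (distortion, level, image) tuples for every reference image."""
    rng = np.random.default_rng(seed)
    images = []
    for ref in references:
        images.append(("reference", 0, ref.astype(float)))
        for s in SIGMA_GN:
            images.append(("gaussian_noise", s, add_gaussian_noise(ref, s, rng)))
        for s in SIGMA_MB:
            images.append(("motion_blur", s, motion_blur(ref, s)))
    return images


# Stand-in references (random pixels), used only to check the image count.
demo_rng = np.random.default_rng(1)
references = [demo_rng.uniform(0, 255, size=(64, 64)) for _ in range(10)]
dataset = build_dataset(references)
print(len(dataset))  # 10 references x (1 + 5 + 5) = 110
```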
PLOS ONE

The local mean, μ(m,n), and local standard deviation, σ(m,n), were calculated using Eqs (2) and (3), respectively [12]:

$$\mu(m,n) = \sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l}\, I(m+k, n+l) \quad (2)$$

$$\sigma(m,n) = \sqrt{\sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l}\, \left[ I(m+k, n+l) - \mu(m,n) \right]^{2}} \quad (3)$$

where I(m,n) denotes the image intensity at pixel (m,n), and $w = \{ w_{k,l} \mid k = -K, \ldots, K,\ l = -L, \ldots, L \}$ denotes a two-dimensional (2D) circularly symmetric Gaussian weighting function, sampled out to three standard deviations and rescaled to unit volume, in which K and L define the window size. It could be seen in Fig 2 that [12]. Therefore, the mean subtracted contrast normalised (MSCN) coefficients were plotted for the reference image and for the images distorted with Gaussian white noise and motion blur to illustrate the resultant changes in the coefficients, as shown in Fig 3. Based on Fig 3, the coefficients of the reference image follow a Gaussian-like distribution, while the distributions for the images distorted with Gaussian white noise and motion blur exhibit different tail behaviours.
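The local statistics and the MSCN coefficients can be sketched in a few lines. The normalisation step itself (Eq (1)) is not reproduced in this text, so the usual BRISQUE form, (I − μ) / (σ + C) with C = 1, is assumed here. The window size K = L = 3 and the Gaussian width of 7/6 pixels are also assumptions, as are all function names.

```python
import numpy as np


def gaussian_window(K=3, L=3, sigma=7.0 / 6.0):
    """2D circularly symmetric Gaussian weights w[k, l], rescaled to sum to one."""
    k = np.arange(-K, K + 1)[:, None]
    l = np.arange(-L, L + 1)[None, :]
    w = np.exp(-(k ** 2 + l ** 2) / (2.0 * sigma ** 2))
    return w / w.sum()


def weighted_sum(img, w):
    """Sum over k, l of w[k, l] * img[m + k, n + l], with reflected borders."""
    K, L = w.shape[0] // 2, w.shape[1] // 2
    rows, cols = img.shape
    padded = np.pad(img, ((K, K), (L, L)), mode="reflect")
    out = np.zeros((rows, cols), dtype=float)
    for k in range(-K, K + 1):
        for l in range(-L, L + 1):
            out += w[k + K, l + L] * padded[K + k:K + k + rows, L + l:L + l + cols]
    return out


def mscn(img, C=1.0):
    """MSCN coefficients, assuming the usual form (I - mu) / (sigma + C)."""
    img = np.asarray(img, dtype=float)
    w = gaussian_window()
    mu = weighted_sum(img, w)
    # Because the weights sum to one, sum(w * (I - mu)^2) = sum(w * I^2) - mu^2.
    variance = weighted_sum(img * img, w) - mu * mu
    sigma = np.sqrt(np.maximum(variance, 0.0))
    return (img - mu) / (sigma + C)


# Stand-in image (random pixels), for illustration only.
demo = np.random.default_rng(0).uniform(0, 255, size=(64, 64))
print(mscn(demo).shape)
```

A histogram of the returned array (for example with `np.histogram`) would give the kind of distribution plot described for Fig 3.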
Two types of distribution functions were used in this study to accommodate the diverse characteristics of the MSCN coefficients, namely the Generalized Gaussian Distribution (GGD) and the Asymmetric Generalized Gaussian Distribution (AGGD) [12].
Two parameters were computed for the GGD: α, which represents the shape of the distribution, and $\sigma^2$, which represents the variance. These two parameters were estimated for the wood images using the moment-matching principle. The GGD is given by (4) [24]:

$$f(x; \alpha, \sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\left( -\left( \frac{|x|}{\beta} \right)^{\alpha} \right) \quad (4)$$

where $\beta = \sigma \sqrt{\Gamma(1/\alpha) / \Gamma(3/\alpha)}$ and $\Gamma(a) = \int_0^{\infty} t^{a-1} e^{-t}\, dt$ for $a > 0$.

Next, the pairwise products of neighbouring MSCN coefficients were computed along eight orientations, namely horizontal ($H_1$ and $H_2$), vertical ($V_1$ and $V_2$), and diagonal ($D_1$, $D_2$, $D_3$ and $D_4$). The pairwise products along the eight orientations are defined by the equations starting from Eq (8). The histograms of the pairwise products of MSCN coefficients along the eight orientations are presented in Fig 5. The differences between the pairwise products along $H_1$ and $H_2$, $V_1$ and $V_2$, $D_1$ and $D_3$, and $D_2$ and $D_4$ were calculated, which indicated that $H_1 = H_2$, $V_1 = V_2$, $D_1 = D_3$, and $D_2 = D_4$. Hence, four orientations, namely $H_1$, $V_1$, $D_1$ and $D_2$, were chosen for the AGGD calculations.
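Returning to the GGD parameters, the moment-matching principle mentioned above can be sketched as follows. The sketch assumes the estimator commonly used in BRISQUE-style implementations: the ratio of the second moment to the squared mean absolute value is matched against Γ(1/α)Γ(3/α)/Γ(2/α)² over a grid of candidate shapes. The grid range, the function name, and the synthetic samples are assumptions for illustration.

```python
import math
import random


def estimate_ggd(samples):
    """Estimate the GGD shape alpha and variance sigma^2 by moment matching."""
    n = len(samples)
    sigma_sq = sum(x * x for x in samples) / n
    mean_abs = sum(abs(x) for x in samples) / n
    rho_hat = sigma_sq / mean_abs ** 2
    best_alpha, best_err = None, float("inf")
    for i in range(200, 10001):  # candidate shapes 0.200 ... 10.000
        alpha = i / 1000.0
        rho = (math.gamma(1 / alpha) * math.gamma(3 / alpha)
               / math.gamma(2 / alpha) ** 2)
        err = abs(rho - rho_hat)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, sigma_sq


# Synthetic Gaussian samples: the fitted shape is expected to be close to 2.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
alpha, sigma_sq = estimate_ggd(samples)
print(round(alpha, 2), round(sigma_sq, 2))
```

For a Gaussian input the matched ratio equals π/2, which corresponds to α = 2 in the GGD family.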
Four parameters were computed for the AGGD, namely η, ν, $\sigma_l^2$ and $\sigma_r^2$. Specifically, ν represents the shape of the distribution, $\sigma_l^2$ and $\sigma_r^2$ represent the left- and right-scale parameters, and η represents the mean of the distribution. The four parameters were calculated using the formula in (16) [25] for each of the $H_1$, $V_1$, $D_1$ and $D_2$ orientations, as shown in Eq (17), which yields 16 AGGD parameters.
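A possible way to estimate the four AGGD parameters is sketched below. Formula (16) is not reproduced in this text, so the sketch assumes the moment-matching estimator commonly used with BRISQUE-style features; whether it coincides exactly with (16) is an assumption. The function name, the grid of candidate shapes, and the synthetic samples are hypothetical.

```python
import math
import random


def estimate_aggd(samples):
    """Return (eta, nu, sigma_l^2, sigma_r^2) by moment matching.

    Assumes the samples contain both negative and non-negative values.
    """
    left = [x for x in samples if x < 0]
    right = [x for x in samples if x >= 0]
    sigma_l = math.sqrt(sum(x * x for x in left) / len(left))
    sigma_r = math.sqrt(sum(x * x for x in right) / len(right))
    gamma_hat = sigma_l / sigma_r
    n = len(samples)
    r_hat = (sum(abs(x) for x in samples) / n) ** 2 / (sum(x * x for x in samples) / n)
    big_r_hat = (r_hat * (gamma_hat ** 3 + 1) * (gamma_hat + 1)
                 / (gamma_hat ** 2 + 1) ** 2)

    def rho(v):
        return math.gamma(2 / v) ** 2 / (math.gamma(1 / v) * math.gamma(3 / v))

    # Grid search over candidate shapes 0.200 ... 10.000.
    nu = min((i / 1000.0 for i in range(200, 10001)),
             key=lambda v: abs(rho(v) - big_r_hat))
    scale = math.sqrt(math.gamma(1 / nu) / math.gamma(3 / nu))
    beta_l, beta_r = sigma_l * scale, sigma_r * scale
    eta = (beta_r - beta_l) * math.gamma(2 / nu) / math.gamma(1 / nu)
    return eta, nu, sigma_l ** 2, sigma_r ** 2


# Synthetic symmetric samples: eta should be near 0 and nu near 2.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
print([round(v, 2) for v in estimate_aggd(samples)])
```

Applying such an estimator to the pairwise products along the four chosen orientations would give the 4 x 4 = 16 AGGD parameters described above.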

Fig 5. Histogram of pairwise products of MSCN coefficients in eight directions: (a) $D_1$ (b) $D_2$ (c) $D_3$ (d) $D_4$ (e) $H_1$ (f) $H_2$ (g) $V_1$ (h) $V_2$ for the reference image and images distorted with Gaussian white noise (GWN) and motion blur (MB).
In total, 18 GGD and AGGD parameters were calculated for each wood image: two GGD parameters (α, $\sigma^2$) and 16 AGGD parameters (the four AGGD parameters η, ν, $\sigma_l^2$, $\sigma_r^2$ × four orientations $H_1$, $V_1$, $D_1$, and $D_2$), as shown in Table 1. According to Mittal et al., image quality can be assessed more accurately when the IQA incorporates multi-scale information of an image [12]. Therefore, the 18 parameters were computed at two scales (the original image scale and the image reduced by a factor of 0.5), giving 36 parameters to represent the features of each wood image, all of which were used to train the SVR. Only two scales were used, in line with Mittal et al.'s observation that the performance of the metric did not improve when more scales were incorporated [12]. The computation time would also increase with the number of scales.
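The arithmetic behind the 36-element feature vector can be laid out explicitly. The sketch below only enumerates feature names; the naming scheme and the ordering are hypothetical and chosen for readability.

```python
from itertools import product

GGD_PARAMS = ("alpha", "sigma_sq")
AGGD_PARAMS = ("eta", "nu", "sigma_l_sq", "sigma_r_sq")
ORIENTATIONS = ("H1", "V1", "D1", "D2")
SCALES = (1.0, 0.5)  # original image and image reduced by a factor of 0.5


def feature_names():
    """List one name per feature: (2 GGD + 4 x 4 AGGD) parameters per scale."""
    names = []
    for scale in SCALES:
        names += [f"scale{scale}:ggd:{p}" for p in GGD_PARAMS]
        names += [f"scale{scale}:aggd:{o}:{p}"
                  for o, p in product(ORIENTATIONS, AGGD_PARAMS)]
    return names


print(len(feature_names()))  # (2 + 16) x 2 = 36
```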
MOS. Ten students from the Department of Electrical and Electronics Engineering at Manipal International University (MIU), Nilai, Malaysia, aged 20 to 25 years, volunteered to evaluate the wood images. The evaluation was performed in an office environment using a 21-inch LED monitor with a resolution of 1920 x 1080 pixels, following the procedures recommended in Rec. ITU-R BT.500-11 [26]. The uncorrected near visual acuity of every subject was checked using the Snellen chart prior to the subjective evaluation to confirm their fitness to perform the evaluation task.
After the visual acuity check, the subjective evaluation was conducted. The evaluation took 15 to 20 minutes and was based on the Simultaneous Double Stimulus for Continuous Evaluation (SDSCE) methodology [26,27]. The reference and distorted images were displayed side by side on the monitor, with the reference image on the left and the distorted image on the right. Each subject evaluated the distorted image by comparing its quality with that of the reference image and rated it as Excellent (5), Good (4), Fair (3), Poor (2), or Bad (1). However, the numerical scores were not revealed to the subjects to avoid potential bias between the subjects [25]. The ratings obtained from the subjects were used to calculate the MOS using (21) [28]:

$$\mathrm{MOS}_p = \frac{1}{N} \sum_{i=1}^{N} S_{ip} \quad (21)$$

where $S_{ip}$ refers to the score given by the i-th subject for the p-th image, and N represents the number of human subjects (N = 10). The MOS values obtained for the wood images were also used to train the SVR.
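The MOS calculation amounts to averaging, for each image, the scores given by all subjects. A minimal sketch follows; the ratings shown are hypothetical and the function name is not from the study.

```python
# Category labels shown to the subjects and the scores recorded for them.
LABEL_TO_SCORE = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Bad": 1}


def mean_opinion_scores(ratings):
    """ratings[i][p] is the score S_ip given by subject i to image p."""
    n_subjects = len(ratings)
    n_images = len(ratings[0])
    return [sum(ratings[i][p] for i in range(n_subjects)) / n_subjects
            for p in range(n_images)]


# Hypothetical ratings from N = 10 subjects for three images.
ratings = [
    [5, 3, 1], [5, 4, 1], [4, 3, 2], [5, 3, 1], [4, 4, 1],
    [5, 3, 1], [5, 3, 2], [4, 3, 1], [5, 4, 1], [5, 3, 1],
]
print(mean_opinion_scores(ratings))  # [4.7, 3.3, 1.2]
```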
Regression module. An epsilon-SVR (ε-SVR) model was used in this study [29]. As previously mentioned, the ε-SVR was trained using the MOS and the 36 GGD and AGGD features of the wood images. After the 36 image features had been calculated, they were mapped to the MOS values of the respective wood images. The feature vectors and MOS values were then divided randomly into two sets: 80% were used to train the SVR model, and the remaining 20% were used to test the system. The training and testing datasets were permuted randomly to avoid any bias during the training and testing of the system [12].
BRISQUE and WNR-IQA differ in that BRISQUE is a general-purpose IQA designed to obtain quality scores for natural images, while WNR-IQA was created specifically for wood images. Natural images are images captured in natural light by an optical camera without any pre-processing [12], whereas wood images are captured using a portable camera with a ten-times magnification lens [3]. The differences between the BRISQUE and WNR-IQA flowcharts are presented in Fig 6. Pearson's Linear Correlation Coefficient (PLCC) [19] and the Root Mean Square Error (RMSE) [20] between the MOS values and the quality scores obtained from the WNR-IQA were calculated to evaluate the performance of the system. Higher PLCC and lower RMSE values indicate a more accurate system, as the quality scores obtained from the WNR-IQA are then closer in magnitude to the MOS values. The training and testing of the system were iterated 100 times, and the PLCC and RMSE values were recorded for every iteration. The medians of PLCC and RMSE were 0.935 and 0.361, respectively. Moreover, the optimised cost parameter (C) and width parameter (g) of the SVR model, 512 and 0.25, respectively, were selected based on the median PLCC and RMSE values. These parameters were then used to form the optimised SVR model.
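The repeated 80/20 evaluation protocol can be sketched independently of the regressor. In the sketch below the ε-SVR is replaced by a simple least-squares line on the first feature, purely as a stand-in so that the example runs with the standard library; an SVR with the reported parameters would be passed in as `fit_predict` instead. All names and the synthetic data are hypothetical.

```python
import math
import random
import statistics


def plcc(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def repeated_holdout(features, mos, fit_predict, n_iter=100, train_frac=0.8, seed=0):
    """Median PLCC and RMSE over n_iter random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(mos)))
    n_train = int(round(train_frac * len(indices)))
    plccs, rmses = [], []
    for _ in range(n_iter):
        rng.shuffle(indices)  # random permutation before every split
        train, test = indices[:n_train], indices[n_train:]
        predicted = fit_predict([features[i] for i in train],
                                [mos[i] for i in train],
                                [features[i] for i in test])
        actual = [mos[i] for i in test]
        plccs.append(plcc(actual, predicted))
        rmses.append(rmse(actual, predicted))
    return statistics.median(plccs), statistics.median(rmses)


def linear_stand_in(train_x, train_y, test_x):
    """Placeholder regressor (NOT an SVR): least-squares line on feature 0."""
    xs = [row[0] for row in train_x]
    mx, my = sum(xs) / len(xs), sum(train_y) / len(train_y)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, train_y))
             / sum((x - mx) ** 2 for x in xs))
    return [my + slope * (row[0] - mx) for row in test_x]


# Synthetic data: 110 samples, so each split is 88 training and 22 test samples.
rng = random.Random(1)
levels = [float(i % 11) for i in range(110)]
features = [[lv] for lv in levels]
mos = [5.0 - 0.4 * lv + rng.gauss(0.0, 0.2) for lv in levels]
print(repeated_holdout(features, mos, linear_stand_in))
```

Reporting the median over the 100 splits, as in the text, makes the summary less sensitive to an occasional unfavourable split than a mean would be.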

Performance evaluation
A second dataset was created specifically to evaluate the performance of WNR-IQA and was used in place of the wood images described in the Wood images sub-section. Because those images had been used to train the SVR, a separate dataset was needed to avoid bias in the performance evaluation. This dataset was generated using ten 'perfect' reference images obtained from ten different wood species, including Julbernardia pellegriniana. These images were obtained from the same wood image database [21] and distorted with Gaussian white noise ($\sigma_{GN}$ = 10, 20, 30, 40, and 50) and with motion blur ($\sigma_{MB}$ = 2, 4, 6, 8, and 10). In addition, the images distorted by Gaussian white noise were further distorted by motion blur: the images distorted with $\sigma_{GN}$ = 10 were further distorted with $\sigma_{MB}$ = 2, 4, 6, 8, and 10, and the procedure was repeated for the images distorted with $\sigma_{GN}$ = 20, 30, 40, and 50. Overall, 360 wood images were generated in the dataset.
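The count of 360 images can be verified by enumerating the distortion conditions. The sketch assumes that the total includes the ten undistorted reference images, which is the reading under which the stated levels add up to 360; the names used are hypothetical.

```python
from itertools import product

SIGMA_GN = (10, 20, 30, 40, 50)
SIGMA_MB = (2, 4, 6, 8, 10)


def distortion_conditions():
    """All (type, sigma_gn, sigma_mb) conditions applied to one reference image."""
    conditions = [("reference", None, None)]
    conditions += [("gaussian_noise", gn, None) for gn in SIGMA_GN]
    conditions += [("motion_blur", None, mb) for mb in SIGMA_MB]
    # Mixture: every noise level followed by every blur level.
    conditions += [("mixture", gn, mb) for gn, mb in product(SIGMA_GN, SIGMA_MB)]
    return conditions


n_references = 10
print(len(distortion_conditions()) * n_references)  # (1 + 5 + 5 + 25) x 10 = 360
```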

The relationship between MOS and different distortion levels
The relationship between MOS and the distortion levels of Gaussian white noise, motion blur, and the mixture of Gaussian white noise and motion blur is presented in Fig 8a-8g. Higher MOS values indicate higher image quality, while higher distortion levels correspond to lower image quality. Therefore, lower MOS values are expected for images with higher distortion levels. Based on the scatter plots in Fig 8a-8e, the MOS value decreased as the distortion level increased. This indicates that the human subjects were able to differentiate between images distorted with different levels of Gaussian white noise, motion blur, and the mixture of both distortions. The scatter plots in Fig 8f and 8g show that the MOS value was 1 for images distorted with Gaussian white noise at $\sigma_{GN}$ = 40 and 50, at all levels of motion blur, due to the poor quality of the images.

Relationship between MOS and proposed WNR-IQA, BRISQUE, deepIQA, DB-CNN, FR-IQAs
The calculated PLCC and RMSE values between MOS and the WNR-IQA, BRISQUE, deepIQA, DB-CNN, and the five FR-IQA metrics are presented in Table 2. PLCC values close to 1 indicate a close correlation of the IQA metric with MOS, while lower RMSE values indicate closer agreement between the metric scores and MOS. Table 2 shows that WNR-IQA recorded the highest PLCC values for Gaussian white noise, motion blur, the mixture of Gaussian white noise and motion blur, and the overall database, compared with BRISQUE and the other metrics evaluated. Table 2 also shows that the lowest PLCC values were recorded for BRISQUE, indicating that BRISQUE was not well suited to the assessment of wood images. This was also reflected in the highest RMSE values being recorded for BRISQUE. WNR-IQA had a higher performance than BRISQUE, deepIQA, DB-CNN, and the FR-IQAs because it was adapted for wood images: the model was trained with GGD and AGGD features and with the MOS obtained for wood images, unlike BRISQUE, deepIQA, DB-CNN, and the FR-IQAs, which were designed around the features, luminance, contrast, and structure of natural images. Additionally, WNR-IQA had a higher performance than the FR-IQAs and, unlike them, does not require a perfect reference image.

Conclusion
In this article, a Wood No-Reference Image Quality Assessment (WNR-IQA) metric was proposed for the evaluation of wood images prior to species classification. Because the established NR-IQA metrics BRISQUE, deepIQA, and DB-CNN were designed for the assessment of natural images, they are not optimal for the assessment of wood images. Therefore, the WNR-IQA was trained using MOS and a set of features calculated specifically for wood images. The performance of the WNR-IQA was then evaluated by comparing how closely WNR-IQA, BRISQUE, deepIQA, DB-CNN, and five FR-IQA metrics correlated with MOS, using PLCC and RMSE. The PLCC and RMSE values indicated that WNR-IQA exhibited higher performance than BRISQUE, deepIQA, DB-CNN, and the five FR-IQAs. Furthermore, the proposed WNR-IQA assessed the quality of wood images accurately, which should help in selecting suitable images for input to the wood recognition algorithm. Acquiring a perfect image is rarely possible in the timber industry owing to dust, poor illumination, heat, and motion blur caused by relative motion between the camera and the wood slice. However, the WNR-IQA does not require a perfect reference image to evaluate the quality of the test wood images.