Adaptive Tensor-Based Principal Component Analysis for Low-Dose CT Image Denoising

Computed tomography (CT) has revolutionized diagnostic radiology but involves large radiation doses that directly impact image quality. In this paper, we propose an adaptive tensor-based principal component analysis (AT-PCA) algorithm for low-dose CT image denoising. Each pixel in the image is represented together with its nearby neighbors and modeled as a patch. Adaptive searching windows are calculated to find similar patches as training groups for further processing. Tensor-based PCA is used to obtain transformation matrices, and the coefficients are sequentially shrunk by the linear minimum mean square error. Reconstructed patches are obtained, and a denoised image is finally achieved by aggregating all of these patches. The experimental results on the standard test image show that the best results are obtained with two denoising rounds according to six quantitative measures. For the experiments on clinical images, the proposed AT-PCA method can suppress the noise, enhance the edges, and improve image quality more effectively than the NLM and KSVD denoising methods.


Introduction
Since the inception of X-ray computed tomography (CT) in the 1970s, it has revolutionized diagnostic radiology and increased rapidly in usage [1]. CT utilizes computer-processed X-rays to produce tomographic images of specific areas of the scanned object. A 3D image of the interior of an object is generated from a large series of 2D radiographic images taken around a single axis of rotation. Thus, CT involves larger radiation doses than conventional X-ray imaging procedures. Moreover, X-rays directly or indirectly ionize DNA and cause strand breaks that are less easily repaired. Misrepair can occasionally induce point mutations, chromosomal translocations, and gene fusions, all of which are linked to cancer development [1]. Therefore, the use of diagnostic X-rays involves a low risk of developing cancer, and low individual risks applied to an increasingly large population may become a public health issue.
CT-related risks can be reduced in two ways. The first is by substituting CT with magnetic resonance imaging (MRI) or ultrasound. MRI has better descriptive power than CT, but it cannot be used if metal is implanted in the body of the patient, and it produces less detailed images in examinations of bony structures. MRI is also more expensive than CT and ultrasound. The second way is to lower the radiation dose, which in turn increases the noise in the reconstructed images and motivates denoising.

As shown in Fig 1, the proposed AT-PCA algorithm has two stages. The steps within each stage are similar, except that the noise level is updated after the first stage is finished, because the initial estimate from the first stage is further refined in the second stage. Each stage consists of the following five steps: (1) similar patches are grouped within an adaptive searching window; (2) adaptive tensor-based PCA bases are calculated for each group of similar patches; (3) the linear minimum mean square error (LMMSE) is used to obtain the clean coefficients of the local principal components; (4) the shrunk clean coefficients are employed for patch reconstruction; and (5) all reconstructed patches are aggregated, and the denoised image is finally obtained.

Related Work
Principal component analysis is a fundamental linear subspace learning technique that uses an orthogonal transformation to convert a set of observations into a set of linearly uncorrelated variables. PCA mainly aims to reduce the dimension of the observations using uncorrelated variables and retain information that characterizes the variation in the observations as much as possible.
If we assume that $X = [x_1, x_2, \ldots, x_N]$ denotes a set of $N$ training samples, we obtain the mean vector $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$. To ensure that all samples are centered, we set $\tilde{x}_i = x_i - \bar{x}$. Then, a set of orthonormal basis vectors $[u_1, u_2, \ldots, u_M]$ spanning an $M$-dimensional subspace is sought. The eigenvectors $u_m$ ($1 \le m \le M$) of the covariance matrix $C_x = E\{\tilde{x}\tilde{x}^T\}$ are the orthonormal basis vectors required by PCA. To reduce the dimension of the data, the eigenvectors $U = [u_1, \ldots, u_m]$ ($1 \le m < M$) that correspond to the largest $m$ eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m$ are retained to minimize the mean-square error between $\tilde{x}_i$ and its reconstruction $\hat{x}_i = \bar{x} + U y_i$. Here, $y_i$ is the $m$-dimensional uncorrelated representation of the centered sample $\tilde{x}_i$ and is called the principal component. Thus, the original sample is represented by a low-dimensional vector obtained by projecting it into the PCA subspace:

$$y_i = U^T \tilde{x}_i = U^T (x_i - \bar{x})$$

As a linear subspace learning method, PCA requires the input data to be unfolded into vectors, which may cause computational problems and destroy the structure of the input data. Furthermore, the number $m$ of retained eigenvalues is usually decided by experience or by a Q-based method, and is therefore neither specific nor adaptive to different denoising situations.
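As a concrete illustration, the centering, eigendecomposition, and projection steps above can be sketched in a few lines of NumPy (the function names are ours, not from the paper):

```python
import numpy as np

def pca_basis(X, m):
    """Return the sample mean and the m leading eigenvectors of the
    covariance of the columns of X (one training sample per column)."""
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean                        # centered samples x_i - mean
    C = Xc @ Xc.T / X.shape[1]           # covariance estimate C_x
    vals, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]       # sort descending
    U = vecs[:, order[:m]]               # keep m leading eigenvectors
    return mean, U

def project_reconstruct(X, mean, U):
    """Principal components y_i = U^T (x_i - mean) and the
    reconstruction x_hat_i = mean + U y_i."""
    Y = U.T @ (X - mean)
    Xhat = mean + U @ Y
    return Y, Xhat
```

With `m` equal to the full dimension the reconstruction is exact; choosing a smaller `m` discards the directions of least variance.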

Methodology
The proposed AT-PCA for CT image denoising contains four parts: adaptive searching window design, similar patch grouping, adaptive tensor shrinkage, and patch aggregation. A reference patch centered on a pixel is first decided. Searching windows are adaptively designed to find patches similar to the reference patch and to exclude patches with very different structures.
In each searching window, tensor-based PCA transformation matrices are calculated by using the grouped similar patches as a training tensor. All the patches are projected into the tensor-based PCA subspace. In this subspace, LMMSE is used to shrink the transformed components. Then, the shrunk components of the patches are reconstructed back into the image space, and high-frequency noise is removed. After all the patches surrounding each pixel are denoised, we aggregate the processed patches and obtain the denoised CT image. The flowchart of AT-PCA for low-dose CT denoising is shown in Fig 1.

Adaptive Searching Window
Patch-based image denoising has been widely used in recent research. Performing noise reduction on a patch (considering neighboring pixels) instead of a single pixel can preserve edges, which constitute important semantic information of an image. The patch size is decided empirically and investigated in the experimental results of this study. In general, pixel denoising estimates the value of a pixel from its noisy observations within similar patches, which could be searched over the entire image. However, this procedure is time consuming. To reduce the calculation time, we can search for similar patches within an L×L window centered on the specific patch. This procedure is also based on the fact that similar patches tend to be located near each other. A CT image has a complex structure, so the size L of the searching window must adapt to the location of the specific patch.

In Fig 2, the underlying pixel x_0 to be denoised is represented by a P×P patch, shown in red and denoted as X_0 ∈ R^{P×P}. Patches [X_1, X_2, ..., X_{N−1}] similar to X_0 are searched within a reasonable range of size L×L rather than in the whole image of size Row×Col, where L is significantly smaller than both Row and Col. To decide the reasonable range for a searching window B ∈ R^{L×L}, two types of evaluation standards are investigated over a set of candidate ranges up to K×K (K > L): the median absolute deviation (MAD) and the inter-quartile range absolute deviation (IQRAD). Candidate ranges for deciding the optimal size of B are limited between P+1 and K. Each candidate searching window can be unfolded into a vector, denoted as b_i. MAD is a robust scale estimator:

$$\mathrm{MAD}(b_i) = \mathrm{median}\big(|b_i - \mathrm{median}(b_i)|\big)$$

where median(b_i) is simply the middle order statistic when L×L is odd and the average of the order statistics with ranks (L×L)/2 and (L×L)/2 + 1 when L×L is even.
IQRAD is a robust statistic that measures the difference between the upper and lower quartiles, as shown in Eq 3:

$$\mathrm{IQRAD}(b_i) = |b_{iQ_3} - b_{iQ_1}|$$

where b_{iQ_1} is the first quartile and b_{iQ_3} is the third quartile of b_i. The adaptive size of the searching window is then determined by

$$L = \arg\min_i \big|\,\mathrm{median}(b_i) - \mathrm{median}(X_0)\,\big|$$

where median(X_0) is the value of the centered pixel x_0 to be denoised, and L = i is the adaptive size of the searching window.
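A minimal sketch of the two window statistics, together with one plausible reading of the window-selection rule (choosing the candidate size whose window median best matches the center pixel; the paper's exact rule may differ, and the function names are ours):

```python
import numpy as np

def mad(b):
    """Median absolute deviation of a flattened window."""
    return np.median(np.abs(b - np.median(b)))

def iqrad(b):
    """Inter-quartile range absolute deviation."""
    q1, q3 = np.percentile(b, [25, 75])
    return np.abs(q3 - q1)

def adaptive_window_size(image, r0, c0, P, K):
    """Pick a window size L in (P, K] whose neighborhood median best
    matches the center pixel value. For simplicity each candidate is
    taken as the (2*(L//2)+1)-sized neighborhood around (r0, c0)."""
    best_L, best_diff = P + 1, np.inf
    for L in range(P + 1, K + 1):
        h = L // 2
        b = image[max(r0 - h, 0):r0 + h + 1,
                  max(c0 - h, 0):c0 + h + 1].ravel()
        diff = abs(np.median(b) - image[r0, c0])
        if diff < best_diff:
            best_diff, best_L = diff, L
    return best_L
```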

Patch Grouping
An image pixel to be denoised is described by both its spatial location and its intensity; by exploiting both, the noise over the whole image can be further removed. The semantic information of an image is contained in its edge structures, which produce high-frequency coefficients similar to noise. In general, the local edge structure can be represented by a pixel x_0 and its nearest neighbors, denoted as a patch X_0 ∈ R^{P×P} centered on x_0. Denoising is performed on the patch instead of the single pixel. In the searching window with the adaptive size L, excluding the target patch, we can find (L−P+1)^2 − 1 patches X_i, i = 1, ..., (L−P+1)^2 − 1, some of which may differ significantly from the target patch X_0. Patch grouping is implemented to select the patches that are similar to the target patch X_0. Methods such as clustering and matching can achieve patch grouping. The most popular clustering method is K-means, which aims to partition observations into k clusters. This is a hard clustering: each observation belongs to exactly one cluster. In a soft clustering, such as fuzzy c-means [14], observations can belong to more than one cluster. Clustering requires recursive procedures, which are rather time consuming.
In our study, block matching is employed to select and group similar patches because of its simplicity and effectiveness. Block matching aims to find observations similar to a given reference, which meets our purpose. The patches around the underlying pixels are the references, and similar observations located at different spatial locations can be found within the adaptive searching windows. If the distance between an observation and the reference is smaller than a given threshold, the observation is considered similar to the reference and subsequently grouped. Distance measures the similarity between observations: a smaller distance suggests a higher similarity. To measure the similarity of two observations numerically, various distance measures can be used, such as the Minkowski distance and the Mahalanobis distance [15]. In the Euclidean space R^N, which is the space of image patches, the distance between observations and the reference is usually given by the Euclidean distance (2-norm distance). The distance between the reference patch and the others is

$$d(X_0, X_i) = \sqrt{\sum_{p=1}^{P \times P} \big(X_0(p) - X_i(p)\big)^2}$$

where P×P is the number of dimensions, and X_0(p) and X_i(p) are the p-th elements of X_0 and X_i, respectively. We select the (N−1) patches [X_1, ..., X_{N−1}] with the shortest distances from the reference patch X_0 and group all of them as a three-order tensor χ ∈ R^{P×P×N}, in which the patch size is exhibited in modes 1 and 2, and the training samples are stacked in mode 3.
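The block-matching step can be sketched as follows, assuming the searching window has already been extracted and the reference patch sits at its center (function and variable names are illustrative):

```python
import numpy as np

def group_similar_patches(window, P, N):
    """Group the N patches in `window` closest (in Euclidean distance)
    to the reference patch at the window centre into a P x P x N tensor.
    Assumes (L - P) is even so the reference patch is exactly centered."""
    L = window.shape[0]
    c = (L - P) // 2
    ref = window[c:c + P, c:c + P]
    patches, dists = [], []
    for r in range(L - P + 1):
        for s in range(L - P + 1):
            patch = window[r:r + P, s:s + P]
            patches.append(patch)
            dists.append(np.sqrt(np.sum((patch - ref) ** 2)))
    order = np.argsort(dists)[:N]          # N smallest distances
    # mode-3 stacks the training samples, modes 1 and 2 hold the patch
    return np.stack([patches[i] for i in order], axis=2)
```

The reference patch itself has distance zero, so it is always the first slice of the returned tensor.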

Adaptive Tensor Shrinkage
Based on a reasonable number of training patches similar to the reference patch X_0, tensor-based PCA can be used to remove the noise from X_0. Patch grouping eliminates the large differences among training patches that may lead to an inaccurate estimation of the tensor-based PCA transformation matrix and, further, to noise residual. Unlike PCA, which has to unfold each patch into a vector, tensor-based PCA directly processes the three-order tensor to obtain a transformation matrix for each mode. To this end, the elements of the three-order tensor are stacked into matrices to fit the algorithm of traditional PCA. In accordance with the definition of the matrix representation of a three-order tensor χ ∈ R^{I_1×I_2×I_3}, the matrix unfolding X_(n) ∈ R^{I_n×(I_{n−1} I_{n+1})} on mode n contains the element x_{i_1 i_2 i_3} at row i_n and column (i_{n+1} − 1) I_{n−1} + i_{n−1}, where the tensor dimensions I_1, I_2, and I_3 are treated in a cyclic way. In our case, X_(1) ∈ R^{P×(PN)} and X_(2) ∈ R^{P×(PN)} (with I_1 = P, I_2 = P, I_3 = N) are used. This overcomes the limitation that similar patches for training data are insufficient.
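A mode-n unfolding and its inverse can be sketched as below; note that NumPy's row-major `reshape` yields a column ordering that differs from the cyclic convention quoted above, but the matrix shapes and the fold/unfold round trip are the same:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization of a 3-order tensor T (modes 0, 1, 2):
    the mode-n fibres become the columns of the resulting matrix."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1))

def fold(M, mode, shape):
    """Inverse of `unfold` for the given original tensor shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)
```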
In accordance with the three-order singular value decomposition, χ can be decomposed as the product

$$\chi = \mathcal{Y} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$$

where U^(1) ∈ R^{P×P}, U^(2) ∈ R^{P×P}, and U^(3) ∈ R^{N×N} are unitary matrices, and 𝒴 ∈ R^{P×P×N} is the core tensor. Thus, the orthogonal transformation of each mode of χ can be found as

$$\mathcal{Y} = \chi \times_1 U^{(1)T} \times_2 U^{(2)T} \times_3 U^{(3)T}$$

and 𝒴 is all-orthogonal and ordered [16]. In the tensor-based PCA transformed domain of χ, i.e., 𝒴, the energy of the noiseless useful data is mostly concentrated in the few most significant components, whereas the energy of the noise is uniformly distributed. To suppress the noise of different target patches with an adaptive shrinkage technique instead of a uniform dimension reduction, we use the linear minimum mean square error (LMMSE) estimation technique in the tensor-based PCA transformed domain. Let y_i^(n), n = 1, 2, be the i-th column of Y_(n), the matrix unfolding in mode n of 𝒴. The LMMSE of y_i^(n) is obtained as

$$\tilde{y}_i^{(n)}(p) = w_n(p)\, y_i^{(n)}(p), \qquad w_n(p) = \frac{E\big[(x_i^{(n)}(p))^2\big]}{E\big[(x_i^{(n)}(p))^2\big] + E\big[(v_i^{(n)}(p))^2\big]}$$

where E[(x_i^(n)(p))^2] and E[(v_i^(n)(p))^2] are the signal and noise variances, respectively, and x_i^(n) is the i-th column of X_(n), the matrix unfolding in mode n of χ. Using the robust median estimator on the highest sub-band of a Daubechies-2 wavelet transform, we can estimate the noise variance as follows [17]:

$$\hat{\sigma}_v = \frac{\mathrm{median}(|HH|)}{0.6745}$$

where HH denotes the high-high band wavelet coefficients. The signal variance is estimated by the maximum likelihood estimator

$$E\big[(x_i^{(n)}(p))^2\big] = \max\!\Big(\frac{1}{M}\sum_{i=1}^{M} \big(y_i^{(n)}(p)\big)^2 - E\big[(v_i^{(n)}(p))^2\big],\; 0\Big)$$

where M is the number of columns of Y_(n). In flat zones, the difference inside the max(·) is often less than zero, so that w_n(p) becomes 0 and the noise in y_i^(n) is removed, yielding ỹ_i^(n).
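The decomposition and shrinkage above can be sketched as follows. `hosvd3` computes the factor matrices from the left singular vectors of each mode unfolding, `lmmse_weight` applies the max(·, 0) signal-variance estimate, and `noise_std_haar` substitutes a one-level Haar transform for the paper's Daubechies-2 wavelet (an assumption made to keep the sketch dependency-free):

```python
import numpy as np

def hosvd3(T):
    """HOSVD of a 3-order tensor: factor matrices U(n) from the left
    singular vectors of each unfolding, and the core
    Y = T x1 U1^T x2 U2^T x3 U3^T."""
    def unfold(T, mode):
        return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1))
    Us = [np.linalg.svd(unfold(T, n), full_matrices=True)[0]
          for n in range(3)]
    Y = np.einsum('abc,ai,bj,ck->ijk', T, Us[0], Us[1], Us[2])
    return Us, Y

def lmmse_weight(y_sq_mean, noise_var):
    """LMMSE shrinkage factor w = s_x^2 / (s_x^2 + s_v^2), with the
    signal variance estimated as max(mean(y^2) - s_v^2, 0)."""
    sig = np.maximum(y_sq_mean - noise_var, 0.0)
    return sig / (sig + noise_var)

def noise_std_haar(img):
    """Robust noise estimate sigma = median(|HH|) / 0.6745 from the
    HH band of a one-level Haar transform (Daubechies-2 in the paper)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 1::2]; d = img[1::2, 0::2]
    HH = (a - b - d + c) / 2.0
    return np.median(np.abs(HH)) / 0.6745
```

Because the factor matrices are orthogonal, transforming the core back with the untransposed matrices recovers the original tensor exactly; shrinking the core first yields the denoised estimate.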

Tensor Reconstruction
Let 𝒲 ∈ R^{P×P×N} hold the weights w_n in each column. The shrunk core 𝒴̃ is expressed as

$$\tilde{\mathcal{Y}} = \mathcal{W} \odot \mathcal{Y}$$

where ⊙ denotes the element-wise product of tensors. The denoised result of χ can be obtained by transforming 𝒴̃ from the tensor-based domain back to the image domain:

$$\hat{\mathcal{X}} = \tilde{\mathcal{Y}} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$$

The denoised estimate of the reference patch X_0, denoted as X̂_0, can be obtained from 𝒳̂.
The final denoised result of the underlying central pixel x_0, denoted as x̂_0, can be extracted from X̂_0. For a whole image I ∈ R^{M_1×M_2}, all M_1×M_2 pixels are denoised by applying the aforementioned procedure, which leads to the fully denoised image Î ∈ R^{M_1×M_2}. Each pixel in the image receives many denoised values because the reference patches representing the underlying central pixels overlap. The final value of each denoised pixel is obtained by averaging all of the overlapping denoised patches:

$$\hat{I}(x) = \frac{\sum_i \hat{X}_i(x_R)}{\sum_i E_i(x_R)}$$

where E_i has the same size as X̂_i with all elements equal to one, and x_R is the coordinate iterated within the patches. This process generates the denoised image.
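The aggregation step can be sketched as a pair of accumulators, one for patch values and one (the all-ones E_i) for coverage counts (function and variable names are illustrative):

```python
import numpy as np

def aggregate_patches(patches, coords, shape, P):
    """Average all overlapping denoised P x P patches into one image:
    `num` accumulates patch values, `den` counts coverage (the E_i)."""
    num = np.zeros(shape)
    den = np.zeros(shape)
    for patch, (r, c) in zip(patches, coords):
        num[r:r + P, c:c + P] += patch
        den[r:r + P, c:c + P] += 1.0   # E_i: all-ones weight
    den[den == 0] = 1.0                # leave uncovered pixels at 0
    return num / den
```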
Evaluation Metrics

Six quantitative measures are used to evaluate the denoising performance:

1. STD: the standard deviation of a selected region of interest (ROI); in a flat region, a lower STD indicates stronger noise suppression.

2. MSE:

$$\mathrm{MSE} = \frac{1}{M_1 M_2}\sum_{x}\big(\tilde{I}(x) - \hat{I}(x)\big)^2$$

where Ĩ ∈ R^{M_1×M_2} is the pure image without any disturbing noise.

3. SNR:

$$\mathrm{SNR} = 10\log_{10}\frac{\sum_x \tilde{I}(x)^2}{\sum_x \big(\tilde{I}(x) - \hat{I}(x)\big)^2}$$

4. PSNR:

$$\mathrm{PSNR} = 10\log_{10}\frac{N_{\max}^2}{\mathrm{MSE}}$$

where N_max is the maximum fluctuation in the input image.

5. SSIM: the structural similarity index, which measures perceived changes in structural information between two images.
6. Parameter β, given by [23], is intended to assess the ability of the denoising method to preserve the sharp details of the images:

$$\beta = \frac{\big\langle \Delta\tilde{I}, \Delta\hat{I} \big\rangle}{\sqrt{\big\langle \Delta\tilde{I}, \Delta\tilde{I} \big\rangle \big\langle \Delta\hat{I}, \Delta\hat{I} \big\rangle}}$$

where ΔĨ and ΔÎ are the high-pass filtered versions of Ĩ and Î, and ⟨·,·⟩ is the standard inner product. The closer β is to 1, the better the ability of the denoising method to retain the image edges.
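Under the assumption of a discrete Laplacian as the high-pass filter (the paper does not specify which filter it uses), the reference-based metrics can be sketched as:

```python
import numpy as np

def mse(ref, out):
    return np.mean((ref - out) ** 2)

def snr_db(ref, out):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - out) ** 2))

def psnr_db(ref, out, n_max=255.0):
    return 10 * np.log10(n_max ** 2 / mse(ref, out))

def beta_edge(ref, out):
    """Edge-preservation index: normalized inner product of the
    high-pass filtered reference and result (Laplacian assumed)."""
    def hp(x):
        d = np.zeros_like(x)
        d[1:-1, 1:-1] = (x[:-2, 1:-1] + x[2:, 1:-1] +
                         x[1:-1, :-2] + x[1:-1, 2:] - 4 * x[1:-1, 1:-1])
        return d
    a, b = hp(ref), hp(out)
    return np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b))
```

Perfect denoising gives MSE = 0 and β = 1; stronger residual noise lowers the SNR and PSNR values.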

Experimental Results
Two types of image data are utilized to investigate the effect of denoising: one standard test image and a group of clinical images. The "House" image, which is widely used in the image processing literature, is used for quantification; it has a size of 256×256. All the clinical images used in this study were provided by the Laboratory of Image Science and Technology, Southeast University, and the study on these data was approved by the institutional ethical review board of the university [24][25][26]. The patients involved in our study (two females and one male aged 63, 38, and 53, respectively) provided written consent. All patients suffered from cancer as confirmed by biopsy examinations. All of the CT images were acquired on a 16-channel multi-detector row CT scanner (Somatom Sensation 16; Siemens, Forchheim, Germany). A reduced tube current of 50 mAs and a routine tube current of 260 mAs were used to acquire the LDCT and SDCT images, which were exported as DICOM data with a size of 512×512.
Two-round denoising has recently become more popular than one-round processing. In our study, the number of denoising rounds is investigated on the standard test image based on the six metrics, and the best setting is then used for the clinical images. All processing is performed on a standard PC with MATLAB as the development language.

Standard Test Image Experiments
The denoising procedure described in Sections 3.1-3.3 can remove most of the noise. However, the signal variances calculated directly from the noise-disturbed image may not be evaluated accurately because of the strong noise contained in the original image. This leads to an estimation bias in the shrinkage coefficients, and several visually unpleasant noise residuals remain. Therefore, a second round of the aforementioned denoising procedure is necessary to further enhance the result obtained from the first round. The noise level in the second stage should be renewed because most of the noise has been eliminated in the first stage. In our study, the noise variance of the input image in the second stage is estimated as follows [27]:

$$\hat{\sigma}_2 = \gamma \sqrt{\max\!\Big(\sigma^2 - \frac{1}{M_1 M_2}\sum_x \big(I(x) - \hat{I}_1(x)\big)^2,\; 0\Big)}$$

where σ is the noise level of the original input, Î_1 is the output of the first stage, and γ < 1 is a scaling factor.

Table 1 indicates the STDs of different ROIs for the original image, the noisy image, and the denoised images in the four stages. The values of STD decrease as the number of rounds increases. A low STD indicates that the data points are very close to the mean, whereas a high STD indicates that the data points are spread over a large range of values. In Fig 3, region 1 is smooth, but region 2 contains texture information. Thus, the value of STD is lower in region 1 than in region 2. In the first denoising round, the values of STD are already dramatically reduced; the remaining three rounds reduce the STD only subtly. Table 2 shows the MSE, SNR, PSNR, SSIM, and β results for the noisy "House" image and the denoised images in the four stages. After the first stage, the MSE dramatically decreases, and the SNR, PSNR, SSIM, and β obviously increase. Thus, most of the noise has been removed using only one denoising round. These five metrics are further improved in the second round, which enhances both the denoising ability and the edge preservation.
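The between-round noise update can be sketched as below; the scaling constant `gamma` is an assumption on our part (values around 0.35 are a typical choice in the refinement literature cited as [27]):

```python
import numpy as np

def second_stage_sigma(noisy, denoised, sigma, gamma=0.35):
    """Update the noise level before the second round: subtract the
    energy already removed in round one from the initial variance.
    `gamma` is an assumed scaling constant, not taken from the paper."""
    removed = np.mean((noisy - denoised) ** 2)
    return gamma * np.sqrt(max(sigma ** 2 - removed, 0.0))
```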
Fig 4b to 4d show the same trend: the first round turns the noisy image in Fig 4b into the markedly cleaner result in Fig 4c, and the second round refines Fig 4c into Fig 4d. Furthermore, the MSE decreases while the SNR and PSNR slightly increase in the third and fourth stages (Table 2). By these measures, the denoising result is better in the third and fourth stages than in the first. However, the values of SSIM and β gradually decrease. SSIM measures the similarity between two images and treats image degradation as a perceived change in structural information; β measures edge preservation, which is significant information for a denoising investigation. The decreases in both SSIM and β indicate that additional rounds may not improve the denoising effect.

Clinical Experiments
Quantification of Patch Size. Most of the parameters in our algorithm are set automatically in accordance with the character of the specific image. However, one parameter is left for adjustment: the size P of the patch representing the underlying pixel to be denoised. We investigate three values of P (P = 2, P = 3, and P = 4) to compare the denoising results and select the parameter that best removes visually unpleasant noise residual while retaining tumor information. In Fig 6, the first row displays the LDCT and SDCT images directly obtained from the CT scanner. Considering that LDCT and SDCT are produced by two scans with unavoidable displacements between them, we select the SDCT slice visually most similar to the LDCT slice, which is shown in Fig 6b. The second to fourth rows show the results for the LDCT image obtained in the first stage in the left column (Fig 6c, 6f and 6i) and the corresponding second-stage results in the middle column (Fig 6d, 6g and 6j). After the refinement in the second stage, visually unpleasant noise residual is noticeably removed. The zoomed regions specified in Fig 6e, 6h and 6k are shown in the third column. The tumors indicated by the arrows are presented most clearly in the second stage with P = 3. Thus, with P = 3, noise is suppressed and significant tumor information is retained in the second stage. Quantitative Assessment. As described in Section 3, six metrics can be used to evaluate the quantitative performance of different denoising algorithms when noise-free reference images are available. However, in the clinical case, no exact spatial correspondence exists between the LDCT and SDCT images acquired in two different scans because of the unavoidable displacements caused by the breathing or body movements of the patients. Therefore, the metrics based on Euclidean distance cannot be used for the quantitative evaluation of CT image quality.
In this study, only the STD of the ROIs is used to compare the effect of denoising in LDCT, processed LDCT, and selected corresponding SDCT.
Two types of CT images are used for the denoising comparison: those from the 53-year-old (Fig 7) and 61-year-old (Fig 8) female patients with liver tumors. Three ROIs are highlighted, in which the tumors are submerged in intense noise. After the denoising algorithms are applied, the noise is effectively reduced and quantitatively assessed in Table 3. In Table 3, the highest STDs are obtained from the LDCT image for both the tumors and the background, which is consistent with the fact that a low dose leads to high noise and degrades image quality. The proposed AT-PCA obtains the lowest STDs for both tumors and background. It removes only the redundant noise: unlike NLM, it introduces no additional artifacts (Fig 7d'), and unlike KSVD, it does not blur the edges (Fig 7e'). The same experimental results are achieved in Fig 8 and Table 4. Seven ROIs, including five tumor regions and two background regions, are assessed in Table 4. Equally, the lowest STDs for both tumors and background are obtained by the proposed method. Since NLM introduces artifacts and KSVD blurs the edges, all of the STDs for these cases are higher than those for the AT-PCA case.
Visual Assessment. Five different unprocessed and processed CT images were scanned from a 56-year-old male patient with multiple hepatic metastases, and two discrete slices of the same patient are displayed in Figs 9 and 10. NLM denoises by weighted averaging among several similar patches and thereby reduces the redundancy of these patches [28], whereas K-SVD finds the best dictionary to represent the image as sparse compositions [29]. These two methods have achieved efficient denoising results in many image processing fields. The second columns of Figs 9 and 10 are the zoomed regions corresponding to (a) to (e). The hepatic metastases are clearly shown by the AT-PCA based method; the noise is suppressed, and the edge of the organ is enhanced. NLM can indeed reduce the noise, but new artifacts are introduced, as seen in the zoomed regions (Figs 9d' and 10d'). The reason may be that weights decided by distance are not suitable for the similarity estimation of pixels in the image space, because not all the pixels near the pixel being filtered are similar to the target pixel. For KSVD, the denoising effect is not as good as that of AT-PCA; meanwhile, the edges are blurred and artifacts are drawn close to the edges of the organs (Figs 9e' and 10e'). The reason may be that only one dictionary is utilized to remove the noise of the whole image, which is not adaptive to specific detail information. Fig 11 shows the denoising results for a 61-year-old female patient with a liver tumor; this is a different slice from that in Fig 7. The first column, (a) to (e), shows the SDCT image, the LDCT image, and the images denoised with AT-PCA, NLM, and KSVD, respectively. The second column, (a') to (e'), shows the zoomed regions corresponding to (a) to (e). In particular, for the liver tumor in Fig 11, the AT-PCA processed image allows a better discrimination of the lesions (arrows in the figures) than the original LDCT image and the NLM and KSVD processed images.

Conclusions and Discussion
This paper described and applied a novel image denoising framework to suppress the pixel noise of low-dose CT images. An adaptive tensor-based principal component analysis algorithm is proposed in which the corresponding parameters can be automatically decided for each specific noise-disturbed image. Local image structure is retained by representing each pixel with its nearby neighbors, which are modeled as a patch. Denoising of a pixel estimates its value from its noisy observations within similar patches. An adaptive searching window is calculated in which similar patches are selected and very diverse structures are excluded. Tensor-based PCA is applied directly to each group of similar patches to adaptively train the transformation matrices of each patch mode. Clean coefficients of the local principal components are adaptively shrunk by the linear minimum mean square error method and then employed for patch reconstruction to obtain denoised patches. After all the reconstructed patches are aggregated, the denoised image is finally obtained, and refinement is performed by iterating the denoising procedure. Two types of experiments are implemented for investigation. For the standard test image, the proposed AT-PCA effectively removes the noise and enhances the edges, and two denoising rounds are necessary for refinement. For clinical images, AT-PCA is compared with the NLM and KSVD methods and shows a significantly improved denoising effect: the noise is clearly removed, edges are strengthened, and no extra artifacts are introduced.
The current method sequentially reduces the noise in each pixel represented by a patch, and the whole denoised image is obtained only after all pixels are denoised. To accelerate the denoising process, a parallel processing architecture for the proposed method should be developed. Thus, a graphics processing unit (GPU) with a parallel processing architecture will be used in our future work: we will formulate the adaptive denoising process on the GPU to increase computational speed.