Low-dose CT reconstruction using dataset-free learning

Low-dose computed tomography (LDCT) is an ideal alternative for reducing radiation risk in clinical applications. Although supervised deep-learning-based reconstruction methods have demonstrated superior performance compared to conventional model-driven reconstruction algorithms, they require collecting massive numbers of paired low-dose and normal-dose CT images for neural network training, which limits their practical application in LDCT imaging. In this paper, we propose an unsupervised, training-data-free learning reconstruction method for LDCT imaging. The proposed method is a post-processing technique that enhances an initial low-quality reconstruction: it reconstructs high-quality images through neural network training that minimizes the ℓ1-norm distance between the CT measurements and the sinogram data simulated from the reconstructed image, together with the total variation (TV) of the reconstructed image. Moreover, the proposed method does not require setting weights for the data fidelity term and the penalty term. Experimental results on the AAPM challenge data and the LoDoPaB-CT data demonstrate that the proposed method effectively suppresses noise and preserves tiny structures. These results also demonstrate the rapid convergence and low computational cost of the proposed method. The source code is available at https://github.com/linfengyu77/IRLDCT.


Introduction
X-ray computed tomography (CT) is an essential imaging modality for clinical purposes, as it provides high-resolution images of the internal structure of the human body. However, X-ray radiation is known to be harmful to healthy tissues. In some major clinical tasks, a single CT scan can expose patients to radiation doses of up to 43 mSv [1], which may increase the risk of cancer. Consequently, reducing radiation dose while obtaining high-resolution images has become a significant area of research in CT scanning.
Currently, there are two primary strategies for reducing CT radiation dose: (1) decreasing the number of projection views and (2) lowering the X-ray tube current. Imaging under either strategy is commonly referred to as LDCT. LDCT algorithms can be broadly categorized into three groups: sinogram domain filtering, iterative reconstruction, and deep learning-based reconstruction.
Sinogram domain filtering methods exploit the distinct distributions of desired signals and noise in the sinogram domain to reconstruct CT images. This technique involves filtering out components corresponding to artifacts or noise in the sinogram domain and then inverting the filtered sinogram data into the image domain using analytic algorithms. Numerous analytic filtering methods have been proposed based on the distribution of noise. For instance, filtered back projection (FBP) is a classical reconstruction method for CT images that performs high-pass filtering in the sinogram domain before back-projection. Sinogram domain filtering can produce high-quality CT images when the noise distribution is accurately characterized. However, determining this distribution can be challenging, particularly since artifacts or noise often correlate with image structures.
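The filtering step of FBP can be illustrated with a short NumPy sketch. It applies a ramp filter tapered by a Hann window to each projection of a sinogram, with a cutoff expressed as a fraction of the Nyquist frequency (the experiments reported later use a Hann filter with frequency scaling 0.8). The function names, the exact window parametrization, and the absence of zero-padding are assumptions made for illustration; back-projection is not included, and this is not necessarily how any particular toolbox implements the filter.

```python
import numpy as np


def hann_ramp_filter(n_detectors, frequency_scaling=0.8):
    """Frequency response of a ramp filter tapered by a Hann window.

    frequency_scaling is the cutoff as a fraction of the Nyquist frequency;
    this parametrization is an assumption made for illustration.
    """
    freqs = np.fft.fftfreq(n_detectors)      # cycles per sample, in [-0.5, 0.5)
    cutoff = 0.5 * frequency_scaling         # 0.5 is the Nyquist frequency
    ramp = np.abs(freqs)                     # ideal ramp |f|
    hann = 0.5 * (1.0 + np.cos(np.pi * freqs / cutoff))
    hann[np.abs(freqs) > cutoff] = 0.0       # suppress everything above the cutoff
    return ramp * hann


def filter_sinogram(sinogram, frequency_scaling=0.8):
    """Filter each projection (row) of a sinogram along the detector axis."""
    response = hann_ramp_filter(sinogram.shape[-1], frequency_scaling)
    spectrum = np.fft.fft(sinogram, axis=-1) * response
    return np.real(np.fft.ifft(spectrum, axis=-1))
```

Because the response is zero at zero frequency, a constant projection is mapped to (numerically) zero, which is the high-pass behaviour described above; the Hann taper limits how strongly high-frequency noise is amplified.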
Compared with sinogram domain filtering methods, iterative reconstruction approaches are more flexible and stable. Iterative reconstruction approaches can be further divided into hybrid iterative reconstruction methods and model-based iterative reconstruction methods. Hybrid iterative reconstruction methods produce an image by adjusting the statistical characteristics of the sinogram domain and the image domain. Model-based iterative reconstruction methods alternate between forward-projection (i.e., sinogram data generation) and back-projection (i.e., CT image reconstruction) to achieve iterative filtering in the sinogram domain and the image domain. Furthermore, the cost function of a model-based iterative reconstruction method usually consists of a fidelity term with the noise model in the sinogram domain and a regularization term with the prior model in the image domain. The regularization term plays a vital role in reconstruction, and many regularizations have been proposed, such as total variation (TV) [2,3], low-rank [4], non-local means (NLM) [5,6], and dictionary learning [7]. Model-based iterative reconstruction usually performs better than hybrid iterative reconstruction, but it is also computationally expensive. Additionally, model-based iterative reconstruction requires manually designing a proper regularization and choosing its weight to obtain satisfactory reconstruction results.
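In recent years, deep learning techniques have been widely employed in LDCT reconstruction, and they have demonstrated better performance than conventional LDCT reconstruction methods. Deep learning-based LDCT reconstruction methods can be categorized into five groups: sinogram domain processing (SDP), image domain processing (IDP), dual-domain processing (DDP), sinogram-image direct mapping (SIDP), and model-based deep learning (MBDL).
The SDP reconstruction algorithm aims to use a pre-trained neural network to inpaint the LDCT measurements into sinogram data that is very close to normal-dose CT (NDCT) measurements. For instance, [8] proposed a sinogram domain denoising approach using a convolutional neural network (CNN) with a filter loss function. Compared with image domain denoising methods, these approaches can easily estimate the noise level in the projection. Reference [9] proposed a sinogram data interpolation method by leveraging a conditional generative adversarial network (GAN). Although sinogram domain processing can correct errors in the sinogram domain, errors produced by the shortcomings of conventional methods can still negatively affect the final reconstructions.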
In contrast to the SDP algorithm, IDP produces high-quality CT images by using a neural network to denoise the initial reconstructed images with artifacts. Most deep learning methods employ IDP to improve the quality of reconstructed images obtained using existing methods such as FBP [10,11]. Reference [12] introduced a collaborative technique to train multiple Noise2Noise [13] generators simultaneously and learn the image representation from LDCT images. Reference [14] proposed Noise2Self, which does not require any additional clean or noisy data. IDP is more straightforward than the SDP algorithm. Reference [15] proposed a framework for sparse-view tomographic image reconstruction that combines an early-stopped rapid iterative solver with a subsequent pre-trained neural network to complete the missing iterations of the rapid iterative solver. One main disadvantage of IDP is that it is difficult to recover information lost from the initial reconstructed images, which serve as inputs to the neural network.
DDP is a method that combines SDP and IDP. It leverages the advantages of both SDP and IDP to achieve higher-quality images than single-domain processing reconstruction methods. Reference [16] combined a deep convolutional neural network (CNN) with the directional wavelet transform to extract the directional component of artifacts in low-dose CT images and exploit intra- and inter-band correlations. Reference [17] proposed a deep learning-based function optimization method for LDCT imaging, which incorporated the Radon inverse operator and disentangled each slice. To address the limitation of Noise2Noise [13], which requires an independent noisy reference image, [18] proposed a method to generate both training inputs and training labels from the existing CT scans for the count domain and the image domain, which does not require any additional high-dose CT images or repeated scans. Although DDP can achieve good inversion results, it requires a larger training dataset due to its two training procedures: sinogram domain and image domain.
SIDP is an end-to-end reconstruction algorithm that directly transforms sinogram data into CT images. This method has the lowest complexity, as it only requires training a neural network without extra processing such as sinogram data correction and inversion. For example, [19] presented a unified framework for image reconstruction called Automated Transform by Manifold Approximation (AUTOMAP), which directly converts sinogram data into CT images. Reference [20] proposed a direct reconstruction framework exclusively using deep learning architectures, which consists of denoising, reconstruction, and super resolution (SR). SIDP is a highly efficient reconstruction method but demands massive memory, as the entire sinogram data needs to be fed into the neural network.
MBDL, also known as the optimization unrolling scheme or plug-and-play, is an effective approach that replaces the parameters or regularization of conventional iterative schemes with learnable/pre-trained neural networks. Reference [21] unrolled the proximal gradient descent algorithm for iterative image reconstruction to a finite number of iterations and replaced terms related to the penalty function with a trainable CNN to reduce memory requirements and training time. Reference [22] incorporated the benefits of analytical reconstruction methods, iterative reconstruction methods, and DNNs. They unrolled proximal forward-backward splitting into iterative reconstruction updates of CT data fidelity and DNN regularization with residual learning. Reference [23] developed a unified reconstruction framework combining supervised and unsupervised learning with physics and statistical models to enhance the accuracy and resolution of LDCT reconstruction images. By leveraging the advantages of deep learning and
Recently, training-dataset-free methods have drawn much attention in LDCT imaging. Such methods do not need to pre-train a neural network; they work on a single image by utilizing the consistency between the CT measurements and the sinogram data modeled on the reconstructed image. For instance, the deep image prior (DIP) [24], originally proposed for natural image denoising by using early stopping to fit the noisy image, has been widely exploited in medical imaging [25][26][27]. However, DIP treats noise as i.i.d. random noise rather than as artifacts correlated with the entries of CT images. Reference [28] proposed a dataset-free reconstruction method based on Bayesian inference, which takes the J-invariant transform of the FBP reconstructed image as the initial value. This method can reconstruct high-quality images from measurements; however, its reconstruction time is significantly higher than that of its competitors.
In this paper, we propose an iterative LDCT reconstruction method that utilizes a neural network to improve the CT images reconstructed by the FBP method without training data. During the iterative LDCT reconstruction, we minimize a loss that consists of two components: the ℓ1-norm distance between the CT measurements and the sinogram data modeled on the post-processed image, and the TV value of the post-processed image. We achieve this by training a neural network. The proposed method requires neither collecting training data nor balancing the contributions of data fidelity and TV regularization in the loss. Once the network training is complete, the high-quality reconstruction is output immediately.
The rest of the paper is organized as follows: the Methodology section describes how to build and solve the optimization problem. The Experimental Results section presents the experimental setup and results using the 2016 Low-dose CT Grand Challenge data and the LoDoPaB-CT data [29]. The Discussion and Conclusion section discusses the method and concludes the paper.

Methodology
In this section, we introduce the proposed method for reconstructing LDCT images from noisy measurements. This method utilizes a DNN to enhance the CT image reconstructed by the FBP method, without the need for training data.

Problem Setup
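The forward model of LDCT can be formulated as

$$y = Ax + \epsilon, \tag{1}$$

where $y$ represents the CT measurements, $A$ is the projection matrix of CT imaging, $\epsilon$ denotes the background contributions of scatter and electrical noise, and $x$ represents the ground-truth CT image. Typically, we can solve the inverse problem of Eq. 1 by using the FBP method $\mathcal{F}$,

$$x_f = \mathcal{F}(y), \tag{2}$$

where $x_f$ is the image reconstructed by FBP. However, due to the low source intensity of the X-ray and/or the random noise, the quality of $x_f$ is unsatisfactory, often suffering from noticeable streaky artifacts, random patterns, and low resolution.

Consider a DNN $\mathrm{NN}$ with parameters $\theta$ that enhances the image quality through $\mathrm{NN}(x_f;\theta)$, so that Eq. 1 can be re-formulated as

$$y = A\,\mathrm{NN}(x_f;\theta) + \epsilon. \tag{3}$$

According to Bayes's rule, we can obtain the posterior density of $\mathrm{NN}(x_f;\theta)$ by

$$p(\mathrm{NN}(x_f;\theta) \mid y) = \frac{p(y \mid \mathrm{NN}(x_f;\theta))\, p(\mathrm{NN}(x_f;\theta))}{p(y)}. \tag{4}$$

Supposing that $p(y \mid \mathrm{NN}(x_f;\theta))$ is a Gaussian distribution,

$$p(y \mid \mathrm{NN}(x_f;\theta)) = \frac{1}{\sqrt{|2\pi\Sigma_\epsilon|}} \exp\left(-\frac{1}{2}\,(y - A\,\mathrm{NN}(x_f;\theta))^{\top} \Sigma_\epsilon^{-1} (y - A\,\mathrm{NN}(x_f;\theta))\right), \tag{5}$$

where $\Sigma_\epsilon$ represents the covariance of the noise. Taking the logarithm on both sides of Eq. 5, we obtain

$$\ln p(y \mid \mathrm{NN}(x_f;\theta)) = -\frac{1}{2}\,(y - A\,\mathrm{NN}(x_f;\theta))^{\top} \Sigma_\epsilon^{-1} (y - A\,\mathrm{NN}(x_f;\theta)) - \frac{1}{2}\ln|2\pi\Sigma_\epsilon|. \tag{6}$$

Taking the logarithm on both sides of Eq. 4 and substituting $\ln p(y \mid \mathrm{NN}(x_f;\theta))$ with Eq. 6, we obtain

$$\ln p(\mathrm{NN}(x_f;\theta) \mid y) = -\frac{1}{2}\,(y - A\,\mathrm{NN}(x_f;\theta))^{\top} \Sigma_\epsilon^{-1} (y - A\,\mathrm{NN}(x_f;\theta)) - \frac{1}{2}\ln|2\pi\Sigma_\epsilon| + \ln p(\mathrm{NN}(x_f;\theta)) - \ln p(y). \tag{7}$$

Therefore, dropping the terms that do not depend on $\theta$, we can obtain the maximum a posteriori (MAP) objective,

$$\theta^{*} = \arg\min_{\theta}\ \frac{1}{2}\,(y - A\,\mathrm{NN}(x_f;\theta))^{\top} \Sigma_\epsilon^{-1} (y - A\,\mathrm{NN}(x_f;\theta)) - \ln p(\mathrm{NN}(x_f;\theta)). \tag{8}$$

Assume the noise is Gaussian independent and identically distributed (iid), i.e., $\Sigma_\epsilon = \sigma_\epsilon^{2} I$. Furthermore, regularizing $\mathrm{NN}(x_f;\theta)$ with $R(\cdot)$, Eq. 8 can be rewritten as

$$\theta^{*} = \arg\min_{\theta}\ \frac{1}{2\sigma_\epsilon^{2}}\,\|y - A\,\mathrm{NN}(x_f;\theta)\|_2^2 + \eta\, R(\mathrm{NN}(x_f;\theta)), \tag{9}$$

where $\eta$ denotes the weight. To linearize this problem, we reformulate Eq. 9 as

$$\theta^{*} = \arg\min_{\theta}\ \|y - A\,\mathrm{NN}(x_f;\theta)\|_2^2 + \eta\, R(\mathrm{NN}(x_f;\theta)), \tag{10}$$

where the constant factor $2\sigma_\epsilon^{2}$ has been absorbed into $\eta$. In fact, the artifacts in LDCT images are highly correlated with the entries of the CT images rather than being random noise, and the results inverted through the ℓ2-norm loss tend to be over-smoothed, which is not beneficial for preserving tiny structures and/or sharp edges. Hence, we propose to optimize $\theta$ by minimizing the ℓ1-norm misfit,

$$\theta^{*} = \arg\min_{\theta}\ \|y - A\,\mathrm{NN}(x_f;\theta)\|_1. \tag{11}$$

Furthermore, we add the TV term of the reconstructed CT image $\hat{x} = \mathrm{NN}(x_f;\theta)$ into Eq. 11 as a smoothness penalty to overcome the potential over-fitting induced by the noise in the CT measurements. Eq. 11 thus becomes

$$\theta^{*} = \arg\min_{\theta}\ \|y - A\hat{x}\|_1 + \mathrm{TV}(\hat{x}), \tag{12}$$

where $\hat{x} \in \mathbb{R}^{N \times M}$. Eq. 12 can be solved by NN training, and we can derive $x^{*}$ with the forward propagation of $\mathrm{NN}$ once $\theta$ has been optimized, by

$$x^{*} = \mathrm{NN}(x_f;\theta^{*}). \tag{13}$$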
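As a concrete reading of the unweighted objective in Eq. 12, the following NumPy sketch evaluates the ℓ1 sinogram misfit plus the TV of a candidate image. It is a minimal illustration under stated assumptions: the projector is a small dense matrix standing in for the CT projection matrix A, TV is taken in its anisotropic form (sum of absolute forward differences), both terms are plain sums, and the function names are hypothetical. The source does not specify which TV variant or reduction it uses.

```python
import numpy as np


def total_variation(image):
    """Anisotropic TV (an assumption): sum of absolute forward differences."""
    vertical = np.abs(np.diff(image, axis=0)).sum()
    horizontal = np.abs(np.diff(image, axis=1)).sum()
    return vertical + horizontal


def l1_tv_loss(image, projector, measurements):
    """Unweighted l1 misfit between measurements and A @ image, plus TV(image)."""
    simulated = projector @ image.ravel()          # sinogram modeled on the image
    misfit = np.abs(measurements - simulated).sum()
    return misfit + total_variation(image)
```

In the proposed method the image passed to such a loss is the network output NN(x_f; θ), so the loss becomes a function of θ; note that neither term carries a weight.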

Solving the MAP
The proposed method can be considered a kind of NN training-based reconstruction method, which optimizes the NN's parameters by minimizing a loss from both the sinogram domain and the image domain. The proposed LDCT reconstruction method can be divided into two steps: (1) Reconstructing the initial CT image: The initial CT image is reconstructed using the FBP method. Although this initial CT image may contain many artifacts due to the low intensity of the X-ray, FBP provides fundamental information about the internal structure of the human body, which helps enhance the reliability of the inversion result produced by the NN. Moreover, FBP performs much faster than iterative reconstruction approaches such as compressive sensing; (2) Post-processing the initial reconstruction result: Once the initial CT image is obtained, it is fed into a pre-defined NN and improved through NN training. To obtain θ*, we establish the loss function for NN training based on Eq. 12, and we use gradient descent-based optimization algorithms such as stochastic gradient descent to optimize θ to minimize the loss function. It is worth noting that the proposed method does not require setting weights for the data fidelity term and the regularization term, which significantly reduces the difficulty of manually setting the weights.
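The two-step procedure above can be sketched on a toy problem. To keep the example dependency-free and easy to check by hand, it makes several loudly stated substitutions: the CNN is replaced by a hypothetical two-parameter affine map x = w·x_f + b with θ = (w, b), the optimizer is plain subgradient descent with a decaying step rather than a deep-learning optimizer, the projector is a tiny dense matrix, and TV is anisotropic. It only illustrates the structure of the loop (start from the FBP-like image x_f, update θ on the ℓ1 + TV loss, keep the best iterate); it is not the authors' implementation.

```python
import numpy as np


def tv(x):
    """Anisotropic TV (an assumption): sum of absolute forward differences."""
    return np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()


def loss(theta, x_f, A, y):
    """l1 sinogram misfit plus TV of the stand-in network output w * x_f + b."""
    w, b = theta
    x = w * x_f + b
    return np.abs(y - A @ x.ravel()).sum() + tv(x)


def subgradient(theta, x_f, A, y):
    """A subgradient of the loss with respect to theta = (w, b)."""
    w, b = theta
    s = np.sign(y - A @ (w * x_f + b).ravel())       # sign of each residual
    # TV(w * x_f + b) = |w| * TV(x_f), hence the sign(w) * tv(x_f) term.
    g_w = -s @ (A @ x_f.ravel()) + np.sign(w) * tv(x_f)
    g_b = -s @ (A @ np.ones(x_f.size))
    return np.array([g_w, g_b])


def fit(x_f, A, y, steps=500, lr=0.01):
    """Subgradient descent from theta = 0, keeping the lowest-loss iterate."""
    theta = np.zeros(2)
    best_theta, best_loss = theta.copy(), loss(theta, x_f, A, y)
    for k in range(steps):
        theta = theta - lr / np.sqrt(k + 1) * subgradient(theta, x_f, A, y)
        current = loss(theta, x_f, A, y)
        if current < best_loss:
            best_theta, best_loss = theta.copy(), current
    return best_theta, best_loss
```

A usage example with a 2 × 2 image whose "sinogram" consists of its two row sums and two column sums: set `A = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]], dtype=float)`, `y = A @ np.array([1., 2., 3., 4.])`, and `x_f = np.array([[1.2, 1.8], [3.1, 3.9]])`, then call `fit(x_f, A, y)`. Subgradient steps need not decrease the loss monotonically, which is why the sketch tracks the best iterate.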

NN Architecture
To enhance the quality of the initial CT image, we have designed a DNN with a straightforward structure. As depicted in Fig. 1, the network primarily consists of 2-D convolution, batch normalization, and LeakyReLU layers. The first layer is a convolution layer, followed by a LeakyReLU layer and a block composed of convolution, batch normalization (BN), and LeakyReLU. The convolution layers are employed for feature extraction, the BN layers for enhancing the stability of network training, and the LeakyReLU layers to ensure non-linearity throughout the network. LeakyReLU is defined as

$$\mathrm{LeakyReLU}(x) = \max(0, x) + \phi \cdot \min(0, x). \tag{14}$$

In the subsequent experiments, we set the value of $\phi$ to 0.01.
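The LeakyReLU definition above translates directly into a one-line NumPy function. The function name and the array-based signature are choices made for this illustration; the default slope matches the value ϕ = 0.01 stated in the text.

```python
import numpy as np


def leaky_relu(x, phi=0.01):
    """LeakyReLU(x) = max(0, x) + phi * min(0, x), applied element-wise."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.0, x) + phi * np.minimum(0.0, x)
```

Positive inputs pass through unchanged, while negative inputs are scaled by ϕ, so the gradient never vanishes completely for negative activations.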

Experimental Results
In this section, we evaluate the performance of the proposed method by comparing it with four representative methods: FBP, TV (post-processing and unsupervised method), DIP (unsupervised and data-free method), and RED-CNN (post-processing and supervised model).
For TV reconstruction, the weight for the ℓ1-norm term is set to $2.15 \times 10^{-7}$, the number of iterations is set to 200, and we utilize the Douglas-Rachford Primal-Dual method as the solver. In addition, the initial reconstruction results for the TV method are obtained by the FBP method, and the parameters for the FBP method are the same as those for the proposed method, which means that the TV method and the proposed method start from the same input. For DIP reconstruction, we use a learning rate of 0.0005, 6 scales, 1000 iterations for the AAPM challenge data and 2000 iterations for the LoDoPaB-CT data, and 128 channels for the U-Net at every scale. We adopt the mean square error (MSE) as the loss function for both TV and DIP reconstruction. In the proposed method, we set the number of iterations to 2000 and save the result with the highest peak signal-to-noise ratio (PSNR). For all reconstruction methods, the filter and frequency scaling of the FBP reconstruction are set to Hann and 0.8, respectively.
For RED-CNN training, we use the AAPM Challenge Data as the training dataset. We train the RED-CNN using full-dose CT scans from nine patients, reserving one patient (L067) for evaluation. In the training data generation process, we use a patch size of 64. The batch size for RED-CNN training is set to 32, the number of training epochs is 100, the loss function is the MSE loss, and we use the Adam optimizer with a learning rate of $10^{-5}$. We train three models for different low-dose levels by using pairs of FBP reconstructions of low-dose simulations and corresponding full-dose CT images.
There are 30 convolution layers in our NN. The size of the filter kernels of the first convolution layer is 64 × 1 × 3 × 3, where the format is number of filters × number of channels × width × height. From the second to the penultimate convolution layer, we set the size of all filter kernels to 64 × 64 × 3 × 3. For the last convolution layer, the size of the filter kernels is set to 1 × 64 × 3 × 3. We minimize the loss defined by Eq. 12 by using the AdamW method with a learning rate of $10^{-3}$.
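The kernel sizes listed above determine the number of convolution weights in the network, which a few lines of plain Python can tally. The helper names are hypothetical, and the count deliberately covers convolution kernels only: whether the layers use bias terms, and how many batch-normalization parameters there are, is not stated in the text and is therefore left out.

```python
def conv_weight_shapes(n_layers=30, width=64, kernel=3):
    """Kernel shapes as (filters, channels, width, height) for each conv layer."""
    shapes = [(width, 1, kernel, kernel)]                        # first layer
    shapes += [(width, width, kernel, kernel)] * (n_layers - 2)  # middle layers
    shapes.append((1, width, kernel, kernel))                    # last layer
    return shapes


def count_weights(shapes):
    """Total number of kernel weights (biases and BN parameters excluded)."""
    total = 0
    for filters, channels, k_w, k_h in shapes:
        total += filters * channels * k_w * k_h
    return total


shapes = conv_weight_shapes()
print(len(shapes), count_weights(shapes))  # prints: 30 1033344
```

The total, roughly one million weights, is dominated by the 28 middle layers with 64 × 64 × 3 × 3 = 36,864 weights each.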

Data Specification
To evaluate the effectiveness of the proposed method, we test its performance on two datasets: the AAPM challenge data and the LoDoPaB-CT data [29]. The AAPM challenge data consists of reconstructed simulated data from human abdomen CT scans provided by Mayo Clinic for the AAPM Low Dose CT Grand Challenge (https://www.aapm.org/GrandChallenge/LowDoseCT/). We use 1-mm slice thickness reconstructions with dimensions of 512 px × 512 px for RED-CNN training and performance comparison. The CT images from the LoDoPaB-CT data are sampled from the AAPM challenge data and have been cropped to dimensions of 362 px × 362 px. Additionally, these images have been subjected to dequantization noise uniformly distributed in [0,1] for each pixel.
For sinogram data simulation, we construct a 2-D fan-beam geometry with 1000 angles, 1000 pixels, a source-to-axis distance of 500 mm, and an axis-to-detector distance of 500 mm [32]. The LDCT measurements are simulated by adding Poisson noise with $I_i$ = [1e3, 1e4, 5e4], following the Poisson distribution according to the process of photon generation, attenuation, and detection, which can be expressed as

$$y_i \sim \mathrm{Poisson}\left\{ I_i \exp\left(-[Ax]_i\right) + \sigma_i \right\}, \tag{15}$$

where $I_i$ denotes the source intensity of the $i$-th X-ray, $y_i$ represents the CT measurement produced by the $i$-th X-ray, $A$ is the projection matrix of CT imaging, $\sigma_i$ denotes the background contributions of scatter and electrical noise, and $x$ represents the full-dose CT image. Additionally, the full-dose CT images $x$ are normalized before sinogram simulation.
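The Poisson measurement model described above, with source intensity I_i, projection [Ax]_i, and background σ_i, can be sketched with NumPy's random generator. The function name and arguments are hypothetical, the line integrals [Ax]_i are assumed to be precomputed by some projector, and any subsequent conversion of counts back to line integrals is outside this sketch.

```python
import numpy as np


def simulate_low_dose_counts(line_integrals, source_intensity, background=0.0, seed=None):
    """Draw photon counts y_i ~ Poisson(I_i * exp(-[Ax]_i) + sigma_i).

    line_integrals: array of [Ax]_i values (assumed precomputed).
    source_intensity: I_i, scalar or array broadcastable to line_integrals.
    background: sigma_i, scatter and electronic-noise contribution.
    """
    rng = np.random.default_rng(seed)
    attenuation = np.exp(-np.asarray(line_integrals, dtype=float))
    expected = source_intensity * attenuation + background
    return rng.poisson(expected)
```

Since the standard deviation of a Poisson variable is the square root of its mean, the relative noise grows as the source intensity drops, which is why I_i = 1e3 is the hardest of the three simulated dose levels.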

Quantitative Indices
We adopt two quantitative indices, PSNR and the structural similarity index (SSIM), to quantify the quality of the reconstructed CT images. The PSNR expresses the ratio between the maximum possible power of a signal and the power of the corrupting noise, which is measured by the mean squared error (MSE),

$$\mathrm{PSNR}(\hat{x}, x) = 10 \log_{10} \frac{\max(x)^2}{\mathrm{MSE}(\hat{x}, x)}, \qquad \mathrm{MSE}(\hat{x}, x) = \frac{1}{n} \sum_{i=1}^{n} (\hat{x}_i - x_i)^2,$$

where $x$ and $\hat{x}$ denote the ground-truth image and the reconstruction, respectively, and $n$ is the number of pixels in the reconstructed image. A higher PSNR value indicates better reconstruction quality. The SSIM, which lies in the range [0, 1], is used to measure the similarity between the ground-truth image and the reconstructed image,

$$\mathrm{SSIM}(\hat{x}, x) = \frac{1}{J} \sum_{j=1}^{J} \frac{(2 \hat{\mu}_j \mu_j + C_1)(2 \Sigma_j + C_2)}{(\hat{\mu}_j^2 + \mu_j^2 + C_1)(\hat{\sigma}_j^2 + \sigma_j^2 + C_2)},$$

where $J$ is the number of local windows, $\hat{\mu}_j$ and $\mu_j$ are the average pixel intensities, $\hat{\sigma}_j^2$ and $\sigma_j^2$ represent the variances, and $\Sigma_j$ is the covariance of $\hat{x}$ and $x$ at the $j$-th local window. The constants $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$ are small values that avoid instability in the division. Following [33] and [34], we choose $K_1 = 0.01$, $K_2 = 0.03$, $L = \max(x) - \min(x)$, and a window size of 7 × 7. A higher SSIM value indicates better reconstruction quality.
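Both indices can be computed in a few lines of NumPy. The sketch below follows the definitions in this section (peak value max(x), K1 = 0.01, K2 = 0.03, L = max(x) − min(x), 7 × 7 windows), but some details are assumptions made for illustration: windows are uniform and slide with stride 1 over the valid region, and variances use the population (biased) estimator. Library implementations may differ in these choices, so values need not match published numbers exactly. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def psnr(recon, truth):
    """PSNR in dB with the ground-truth maximum as the peak value."""
    mse = np.mean((recon - truth) ** 2)
    return 10.0 * np.log10(truth.max() ** 2 / mse)


def ssim(recon, truth, k1=0.01, k2=0.03, win=7):
    """Mean SSIM over all win x win windows (uniform weights, stride 1)."""
    dynamic_range = truth.max() - truth.min()
    c1, c2 = (k1 * dynamic_range) ** 2, (k2 * dynamic_range) ** 2
    a = sliding_window_view(recon, (win, win))    # shape (H-win+1, W-win+1, win, win)
    b = sliding_window_view(truth, (win, win))
    mu_a, mu_b = a.mean(axis=(-2, -1)), b.mean(axis=(-2, -1))
    var_a = (a * a).mean(axis=(-2, -1)) - mu_a ** 2
    var_b = (b * b).mean(axis=(-2, -1)) - mu_b ** 2
    cov = (a * b).mean(axis=(-2, -1)) - mu_a * mu_b
    score = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )
    return score.mean()
```

For example, adding a constant offset of 0.1 to an image with values in [0, 1] gives an MSE of 0.01 and hence a PSNR of 20 dB, while its SSIM drops below 1 because the local means no longer agree.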

Reconstruction Results
AAPM challenge data: We randomly select three full-dose CT images from the AAPM challenge data to evaluate the effectiveness of the proposed method with X-ray source intensities $I_i$ of 1e3, 1e4, and 5e4. From Fig. 2, Fig. 3 and Fig. 4, we can observe that the quality of the FBP reconstructions degrades significantly as the X-ray source intensity decreases, resulting in amplified noise and artifacts distributed throughout the entire image. As a post-processing method, TV achieves higher-quality images by post-processing the FBP reconstructions. RED-CNN, another post-processing method that is supervised, can effectively remove noise and artifacts, but it tends to smooth out some tiny structures. Although DIP is unsupervised and takes random noise as input, it can effectively remove noise while producing images with higher resolution than RED-CNN. Comparing the results reconstructed by the different methods, we can see that the proposed method achieves the best performance in terms of noise and artifact attenuation and preservation of tiny structures.
To better illustrate the effectiveness of the proposed method, we further show the zoomed-in results corresponding to the red box in each ground truth. As shown in Fig. 2, Fig. 3 and Fig. 4, the results reconstructed by FBP and TV are contaminated by noise and artifacts. Although RED-CNN and DIP can suppress the noise, many valuable details are smoothed out. In comparison, the proposed method achieves better reconstruction accuracy than the competing methods. It is worth noting that although the ground-truth images are normal-dose CT images, slight noise and artifacts still remain in them. Furthermore, the results reconstructed by the proposed method outperform the ground-truth images in terms of resolution, particularly with $I_i$ = 1e4 and 5e4.
LoDoPaB-CT data: For the LoDoPaB-CT data, the reconstruction results are shown in Fig. 5. From Fig. 5, we can observe that the performance of each reconstruction method is similar to its performance on the AAPM challenge data above. The results reconstructed by FBP and TV suffer from noise and artifacts, although TV can suppress a lot of noise. The textures and edges in the results reconstructed by RED-CNN are smoothed out, whereas DIP can remove noise and preserve tiny structures more effectively. The proposed method achieves the best performance with regard to noise suppression and preservation of tiny structures. Furthermore, the reconstruction errors (Fig. 6) demonstrate that the FBP method sacrifices a lot of useful information. TV and RED-CNN can effectively improve the results reconstructed by FBP; however, TV cannot preserve edges well, and RED-CNN tends to smooth edges and textures. DIP has smaller residual errors in terms of edges and textures. Compared with the competitive methods, the proposed
To quantitatively analyze the performance of our method, we calculate the PSNR and SSIM values of the above reconstruction results, including the AAPM challenge data and the LoDoPaB-CT data. As shown in Table 1, our method achieves the highest PSNR and SSIM among the five approaches, except for the reconstruction of AAPM-2 with respect to the SSIM under $I_i$ = 1e3 and of AAPM-3 with respect to the PSNR under $I_i$ = 1e3. Specifically, in these two cases the SSIM and PSNR of DIP are 0.06 and 0.24 dB higher, respectively, than those of the proposed method.
In addition, we take the evolution curves of PSNR and SSIM versus iteration for the LoDoPaB-CT data reconstruction as an example to illustrate the convergence of the proposed method. As shown in Fig. 7, the PSNR and SSIM increase while the loss decreases rapidly, which reveals that the proposed method can converge quickly. Specifically, the curves of PSNR and SSIM begin to converge after about 250 iterations, and the curves of loss start to converge after about 100 iterations. Although there are some fluctuations in these curves since the measurements contain noise, they converge quickly again, which indicates the good robustness of our method. Table 2 lists the computation time of the different methods on a single GPU (Nvidia Tesla K80). It can be seen that FBP, TV and the proposed method have great advantages in terms of reconstruction time. Although RED-CNN only needs one inference to reconstruct the high-quality image, the process of NN training is time consuming. Therefore, owing to the rapid convergence and low computational cost of the proposed method, one can set a larger number of iterations to ensure that good reconstruction results are obtained.

Discussion and Conclusion
For the initial LDCT reconstruction, we utilize the results reconstructed by FBP as the initial model for the proposed method. FBP can extract fundamental information about the internal structure of the human body, despite potential contamination from artifacts caused by the low intensity of the X-ray. This is crucial for neural network-based LDCT imaging, as the black-box nature of these networks can significantly decrease the reliability of LDCT reconstruction results. It is also important to note that the quality of the initial reconstructed image can impact the performance of the proposed method. One could substitute the FBP input with a higher-quality image to further improve resolution. Additionally, FBP often performs much faster than iterative reconstruction approaches such as compressive sensing, which aids in enhancing inversion efficiency.
Although our proposed method can converge rapidly, fluctuations due to noise in measurements might negatively impact the reconstruction efficiency.In future work, we aim to investigate better regularization techniques to promote convergence stability.
In this work, we propose an unsupervised and training-data-free method for LDCT imaging. The proposed method aims to improve low-quality initial reconstruction results: it reconstructs the high-quality image by DNN training without any training samples. We implement the DNN training by minimizing the ℓ1-norm distance between the CT measurements and the sinogram data simulated from the reconstructed image, together with the TV value of the reconstructed image. Notably, the proposed method does not need to set weights for the data fidelity term and the regularization term, which significantly reduces the difficulty of manually setting the weights. Experimental results on the AAPM challenge data and the LoDoPaB-CT data demonstrate that the proposed method achieves better performance than the representative non-learning methods and the supervised method, with higher resolution and lower computational cost. The proposed method can be implemented flexibly and has the potential to be applied to other medical image reconstruction problems, including sparse-view CT reconstruction and image reconstruction from sparse samples in MRI. These applications are particularly useful when collecting training samples is either expensive or difficult.

Fig 1. Schematic diagram of the proposed method.

Fig 2. Reconstruction results of case AAPM-1 at different dose levels by different methods. Zoomed parts show the region of interest (ROI) marked by the red box in the ground-truth image.

Fig 3. Reconstruction results of case AAPM-2 at different dose levels by different methods. Zoomed images show the ROI marked in the ground-truth image.

Fig 4.

Fig 5. Reconstruction results of LoDoPaB-CT data at different dose levels by different methods. Zoomed images show the ROI marked in the ground-truth image.

Fig 6. Reconstruction errors of LoDoPaB-CT data at different dose levels by different methods.

Table 2. Computation time of different algorithms for LoDoPaB-CT data reconstruction.