Fast alternating projection methods for constrained tomographic reconstruction

The alternating projection algorithms are easy to implement and effective for large-scale complex optimization problems, such as constrained reconstruction of X-ray computed tomography (CT). A typical method is to use projection onto convex sets (POCS) for data fidelity, nonnegative constraints combined with total variation (TV) minimization (so called TV-POCS) for sparse-view CT reconstruction. However, this type of method relies on empirically selected parameters for satisfactory reconstruction and is generally slow and lack of convergence analysis. In this work, we use a convex feasibility set approach to address the problems associated with TV-POCS and propose a framework using full sequential alternating projections or POCS (FS-POCS) to find the solution in the intersection of convex constraints of bounded TV function, bounded data fidelity error and non-negativity. The rationale behind FS-POCS is that the mathematically optimal solution of the constrained objective function may not be the physically optimal solution. The breakdown of constrained reconstruction into an intersection of several feasible sets can lead to faster convergence and better quantification of reconstruction parameters in a physical meaningful way than that in an empirical way of trial-and-error. In addition, for large-scale optimization problems, first order methods are usually used. Not only is the condition for convergence of gradient-based methods derived, but also a primal-dual hybrid gradient (PDHG) method is used for fast convergence of bounded TV. The newly proposed FS-POCS is evaluated and compared with TV-POCS and another convex feasibility projection method (CPTV) using both digital phantom and pseudo-real CT data to show its superior performance on reconstruction speed, image quality and quantification.


Introduction
Iterative image reconstruction methods have shown many advantages over the conventional analytic reconstruction methods, such as sophisticated imaging system modeling and easy incorporation of prior information for increased image quality and/or reduced radiation dose. In X-ray computed tomography (CT), to deal with data inconsistence caused by noise or insufficient data such as compressed-sensing sampling, two major constrained iterative reconstruction frameworks are extensively investigated: unconstrained optimization [1][2][3][4][5][6][7][8][9] and constrained optimization [10][11][12][13][14][15][16]. a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 The unconstrained optimization is in line with maximum a posteriori (MAP) estimation, which combines the data fidelity and regularization terms together for a single unconstrained objective function and balances their contributions using a weighting parameter. For example, the Gauss-Seidel algorithm was applied to solve the reconstruction with either Gaussian or non-Gaussian priors [1,3]. A conjugate-gradient preconditioning method was proposed to achieve faster convergence rate for gradient-based iterative reconstruction [2]. Recently, the advance in applying variable splitting approaches and Nesterov's momentum techniques in image processing has resulted in novel fast algorithms for image reconstruction [6][7][8][9]. There is also active interest in developing advanced regularization terms (priors) beyond the pioneer total variation (TV) [11,14], such as normal-dose image induced TV [17], total generalized variation [18], TV-stokes [19], l 0 gradient [20], Haralick texture measures [21]. However, the unconstrained optimization methods usually face the challenge of choosing the appropriate weighting parameter to balance the data fidelity and regularization terms.
In contrast, constrained optimization uses either data fidelity or image properties, such as TV, as the objective function and puts others into constraints. It is often more straightforward in such a way that reconstruction parameters can have physical meanings and be determined from the data, e.g. the error bound of the data fidelity term in constrained TV minimization. For constrained optimization, alternating projection methods have pioneered successful reconstruction for few-view and limited-angle CT data [11,14]. These methods typically combine projection onto convex sets (POCS) for data fidelity and nonnegative constraints with TV minimization, a.k.a. TV-POCS. Although TV-POCS is easy to implement and effective, there are several drawbacks: 1) it may converge to the mathematical optimum, but not the physical optimum (see Section "Rationale of using convex sets of feasibility instead of the constrained optimization" for more details); 2) the convergence of the method is lack of analysis and is sensitive to empirically selected parameters; and 3) the gradient descent method for (smoothed) TV minimization is generally slow [11][12][13][14].
In this work, we focus on addressing the issues with TV-POCS and propose a more efficient and accurate solution for constrained optimization of CT image reconstruction by using 1) the concept of convex feasibility; 2) alternating projection/POCS; and 3) a primal-dual hybrid gradient descent (PHGD) method [22] for TV projection. Convex feasibility approaches are to find feasible solutions of CT reconstruction that fall in convex sets [15] in case that the imaging system is highly ill-posed and the single optimum is elusive. In this work, these convex sets are defined to satisfy the conditions of 1) data fidelity, 2) non-negativity, and 3) bounded TV function, although other proper definitions can be used as well. The data fidelity is usually enforced by algebraic reconstruction techniques (ART) and the non-negativity by forcing negative values as zero. For the bounded TV function, we use the PHGD method [22], which was shown to be faster than state-of-the-art methods for the TV-l 2 image denosing problem, such as fast iterative shrinkage thresholding (FISTA), Nesterov, alternating direction method of multiplier (ADMM), and Chambolle-Pock primal-dual algorithms [23]. It is noted that the Chambolle-Pock method has been adapted to solve the convex feasibility projection problem with the TV constraint (CPTV) for image reconstruction [15,16] that is closely related the proposed method. Thus, we focus on comparing our method with TV-POCS using gradient descent [11,14] and CPTV using projection onto l 1 ball in the image gradient space [16] since all three methods can be generalized under the hood of vector space projection and solved by a full sequential alternating projection or POCS (FS-POCS). We also show that gradient-based methods can converge under a certain condition in the framework of FS-POCS and provide a theoretical way to safeguard parameters to meet the convergence condition. These developments lead to a theoretically sound and efficient iterative reconstruction. Furthermore, the proposed FS-POCS method can be easily extended to CT reconstruction with other types of data fidelity and image constraints as long as the convexity is satisfied.
The paper is organized as follows: the principle of the proposed method is described in section Methods, along with the rationale of convex feasibility projection and the introduction of the closely-related methods for comparison. The experiments and results are presented in the following section. The last two sections are devoted to Discussion and Conclusions, respectively.

Methods
CT imaging model and reconstruction with constrained TV minimization X-ray CT imaging can be represented by a simplified linear model as, where p∊R M is projection data (i.e. logarithm of inversely normalized detected X-ray photon fluence), M∊R M×N is the system matrix that describes CT imaging, x∊R N is the image of linear attenuation coefficients to be reconstructed, and e is noise. The CT reconstruction can be expressed as constrained optimization as follows: subject to :x ¼ arg min , which is denoted as TV(x) as well in the following context, and ε is the squared error caused by the noise and imperfect image modeling.

Rationale of using convex sets of feasibility instead of the constrained optimization
The minimization of TV function in Eq 2 under the constraints of the data fidelity in Eq 3 and non-negativity in Eq 4 originates from the compressed sensing theory [24,25], where the sparsity of image gradient space represented by TV function is enforced to achieve high-quality image reconstruction with few-views or limited-angels of projection data [11,14]. As these three functions are convex, a global optimal solution is existed and can be obtained through convergent iterative methods, such as efficient first order methods [5,23]. As shown in Fig 1  (for an easy conceivable 2-D case), the global optimal solution of constrained TV minimization, x Ã (the red dot in Fig 1), will lie on the boundary of kMx − pk 2 = ε, which produces the minimal TV value in a non-trivial case (In a trivial case, the minimal TV value is zero when the uniform image is inside the ellipsoid region of kMx − pk 2 ε).
Although the constrained optimization leads to a unique solution with the minimized TV value, the underlying true image may not attain such a low TV value. If the true image indeed has a TV value τ ! TV(x Ã ), the optimal solution would be at the blue dotx in Fig 1. As the TV value of the true image is unknown, a relaxed TV value may be used either based on some prior information or derived from large data sets and lead to a feasible solution set shown in the shaded region in Fig 1 (Note that in this case τ represents estimated TV upper bound that is greater than the true TV value, and thus the physical optimal solutionx is in the shaded area). With this treatment, two advantages can be achieved: 1) the solution is close to the physical optimum (inside the shaded area); and 2) POCS can be very fast to converge to a convex set, while converging to a single point is often slow due to collinearity, where the convex sets' boundaries are often tangent to each other.

Convex sets of feasibility and proximal mapping
For the reasons stated in 2.2, we replace the TV minimization by the following constraint, The constrained CT reconstruction is then converted to find a solution that satisfies the three constraints of Eqs 3, 4 and 5, which define three convex sets in R N space. The data fidelity term, Eq 3, defines points in R N that are transformed to the R M space and have a summed Euclidean distance to all hyperplanes no greater than ffiffi ffi ε p . It can be understood as an l 2 spherical ball with a radius of ffiffi ffi ε p in R M space. Eq 4 is the nonnegative constraints confining in positive orthant and boundaries. Eq 5 is the TV constraints representing in an l 1 diamond ball of image gradient with a radius of τ. The reconstruction goal is to find the image that falls into the intersection of the three convex sets (in consistent cases, i.e. C 0 = C 1 \ C 2 \ C 3 is not empty) or somewhere closest to them (in inconsistent cases, i.e. C 0 = C 1 \ C 2 \ C 3 is empty) [16]. It can be expressed as the following optimization: where the sets C 1 , C 2 , and C 3 satisfying the constraints defined in Eqs 3, 4 and 5, respectively, are convex in R N space and F i are affine transform functions. The indicator function is defined Given any arbitrary initial image or intermediate image x, the proximal mapping to the convex set C i is defined as: where P C i ðxÞ represents the Euclidean projection onto each convex set C i , i.e. from any start point, a feasible solution is a point in set C 0 that has the closest Euclidean distance to the start point. Such a solution can be solved using POCS, i.e. enforcing Euclidean projections on C 1 , C 2 and C 3 either sequentially or simultaneously. The POCS will converge to the solution as the R N Euclidean space is a Hilbert space [26].

Projection onto convex sets (POCS)
Using convex sets of feasibility, a solution of the set C 0 can be achieved by the sequential projections onto C 1 , C 2 , and C 3 , respectively, denoted as P C1 , P C2 and P C3 [26]. The details of these projections are given as follows.

1) Projection onto C 1 -P C1
Due to the huge size of M, the projection onto C 1 is achieved using iterative methods. The noise free data fidelity set, C 0 1 ¼ fx : ðMx À pÞ T ðMx À pÞ ¼ 0g, is a subset of set C 1 . It is equivalent to the linear equation Mx = p defining M hyperplanes in R M . The projection onto the i th hyperplane can be achieved by the following equation, where m i is the i th row of M and p i is the i th element of p. Eq 8 is well-known algebraic reconstruction techniques (ART). The ART can also be performed in parallel to speed up the reconstruction, such as simultaneous algebraic reconstruction technique (SART) [27]. The projection onto set C 1 can be obtained by conducting ART until the summed Euclidean distance in R m is less or equal to ε. In real implementation, due to the computationally intense forward and backward projections, each projection onto one hyperplane is treated as one step of POCS and P C1 is defined as P C11 P C12 . . .P C1M of M projections before moving to P C2 and P C3 . Note that the ART in Eq 8 is used for projection P C1 in this work although other iterative methods can be used as well.

2)Projection onto C 2 -P C2
The projection onto a nonnegative set is straightforward and given in the following equation,

3)Projection onto C 3 -P C3
Given a starting point v that is outside of C 3 (i.e. the image after P C1 and P C2 in the proposed algorithm), the projection onto set C 3 to satisfy the TV constraint is the nearest point to v on where TVðxÞ kxk TV . Eq 10 is equivalent to the classical TV denoising problem although here the image TV value needs to be no greater than τ instead of approaching the minimum. It can be solved using either the projection onto the diamond ball in the gradient space of the reconstructed image [28] or first order methods for TV minimization [13,22,23]. In this work, the Lagrange method is used to derive P C3 so that first order methods can be applied accordingly. The projection in Eq 10 can be wrote as minimizing the Lagrange function where the Lagrange multiplier α is a positive constant, whose range can be determined as follows. Assuming x p is the projection of v onto C 3 , i.e. TV(x p = τ and J(x p ) = min J(x), this leads to where δ = kx − vk. Since TV(x p ) − τ = 0, Eq 12 can be simplified as If the TV function is replaced by it smooth version (for gradient based optimization, see Eq 17 below), the smoothed TV function satisfies the Lipschitz inequality [5], where L TV is the Lipschitz constant of the smoothed TV, which can be determined by Eq 3.5) in [5] and was set as 80 in this work. Combining Eqs 13 and 14, we have It means that the Lagrange multiplier α must satisfy the following condition, (On the other hand, a is strongly convex with strong convexity parameter μ.) We thus set α to be the lower bound of Eq 16, and determine P C3 by minimizing Eq 11.
The minimization of Eq 11 is a standard Rudin, Osher, and Fatemi (ROF) TV denoising problem with a given constant of α [29]. It can be solved iteratively using existing algorithms such as the gradient descent [5,16], the conjugate gradient [30] and the first order primal dual methods [22,31,32]. When the conventional gradient needs to be calculated for the projection onto the TV set (e.g. TV-POCS), the smooth version of the original TV norm can be represented by the Huber function, ( where γ is a small number [5,14]. In the proposed FS-POCS, we enforce P C3 using a first order primal-dual hybrid gradient (PDHG) method in this study [22]. Specifically, the minimization problem Eq 11 can be rewritten as a primal-dual formulation: where a dual variable y is introduced that satisfies ky j k 1 for j ¼ 1; 2; . . .; Ng) and the constant term −ατ is omitted from the optimization process. Given any intermediate solution (x k , y k ) at iteration step k, the PDHG updates the solution following alternated dual and primal steps: i) Dual step: fix x = x k , apply one step of gradient ascent method to the maximization problem max y2 Y Fðx k ; yÞ along the ascent direction r y Fðx k ; yÞ ¼ A T x k , y k is updated as where P Y is the projection onto the set Y and β is the dual step size. ii) Primal step: fix y = y k+1 , apply one step of gradient descent method to the minimization problem along the descent direction À r x Fðx; y kþ1 Þ ¼ À ðAy kþ1 þ 2 a ðx k À vÞÞ, x k is then updated as where θ is the primal stepsize. Note that the convergence of PDHG is insensitive to the choice of step sizes (β, θ). Simple fixed values of (β, θ) can give satisfactory results although adaptive ones may further increase the convergence rate. In this work, we used fixed values: 2 for β and 0.2 for θ. Please refer to [22] for more details of PDHG. It was shown in [16] that PDHG is among the top performers in solving the ROF problem. It is worth noting that the gradient descent method used in TV-POCS [14] can also lead to the projection onto C 3 if certain conditions are satisfied (see S1 Appendix).
Once we defined projections P C1 , P C2 and P C3 , x = (P C3 P C2 P C1 )x can be conducted repeatedly. Based on the theory of vector space projection, the sequence converges weakly to a point in convex set C 0 if C 0 is non-empty (or strongly in a finite-dimensional space) [26].

FS-POCS reconstruction algorithm
We thus propose a reconstruction algorithm by enforcing projections onto set C 1 , C 2 , and C 3 sequentially. We call this method as full sequential POCS (FS-POCS). The details of the algorithm are listed below: 1. Initialize image x, data fidelity error bound ε, and TV bound τ 2. Conduct projection P C1 (Eq 7) and P C2 (Eq 8)

Conduct projection P C3
a. Determine α using Eq 16 b. Conduct projection P C3 (Eq 11) using the PDHG method 4. Repeat step 2) and 3) until the change of images in two successive iteration is less than a predefined value.
The method proposed in [33] is used to determine the parameter ε as the sum of the projection noise variance among all projection bins. The parameter τ can be determined either from prior images, as the same type of images will generally have similar TV values, or from less accurately reconstructed images with a scaling factor for desired TV values. These parameters are physical meaningful and can be determined independently from reconstruction algorithms, especially when large data sets exist, while the parameter used in unconstrained optimization methods leveraging the balance of data fidelity and prior knowledge is algorithmdependent and hard to determine beforehand. [11]. TV-POCS [11] falls in the framework proposed in this work. It solves the constrained CT reconstruction problem in two stages: POCS for the two constraints in Eqs 3 and 4, and TV minimization in Eq 2. Similar to FS-POCS, the two constraints in TV-POCS are achieved through projections P C1 and P C2 represented by Eqs 8 and 9, respectively. The TV minimization is achieved by using the gradient descent of TV function,

TV-POCS
where s k ¼ rTVðx k Þ krTVðx k Þk is the normalized gradient of smoothed TV function and η is the step size to assure non-increasing of TV function. Denoting ρ as a small positive number and d A as the sum of the absolute difference between images of POCS and TV minimization stages, η is determined by η = ρd A .
The step size is a key parameter to achieve desired solution in TV-POCS methods, which was empirically selected in [14]. To better balance the data fidelity and TV terms, an adaptive steepest descent method was proposed, in which ρ becomes a variable and is updated according to image changes during iterations [11]. Another adaptive approach to determine this parameter based on image iterations were proposed in [12]. As shown in the S1 Appendix, the gradient descent of TV minimization can lead to a non-increasing J(x) if the step-size satisfies the following condition: CPTV [16]. In this work, we form CT reconstruction problem as a convex feasibility problem similar to the CPTV method [16]. In [16], the reconstruction problem was casted as a primal minimization problem, where the objective function is composed of the Euclidean distance between the start image and the reconstructed image and two indicator functions of data fidelity and l 1 norm of the gradient tensor image. Instead of directly solving the primal problem, the dual problem was formed and the solution was found by minimizing the primal and dual gap. The major difference of our work from CPTV [16] is: 1) POCS is used in this work to get the solution sequentially, whereas in [16] the Chambolle-Pock primal-dual method was used to solve a unified primal minimization and dual maximization problem, where the constraints were expressed as indicator functions; 2) the projection onto the TV constraint set is achieved by PDHG in this work, whereas in [16] the TV constraint was satisfied by projection onto the l 1 diamond ball of the gradient tensor image; and 3) the proposed method is remarkably simpler than CPTV, yet achieves faster convergent reconstruction, as shown in the following results. It is worth noting that CPTV may not be computationally efficient, but is useful for algorithm prototyping.

Experiments and results
We investigated the proposed method using digital phantom and pseudo-real CT data. The performance of the proposed FS-POCS method was then compared to the existing methods, filtered backprojection (FBP), TV-POCS and CPTV. We also validated that the TV-POCS method satisfies the convergence condition of gradient descent methods shown in Eq 22. All iterative reconstruction methods ran 1,000 times and the reconstructed image size was 256x256, unless otherwise stated. The same reconstruction parameters as the original publications were used for TV-POCS [11] and CPTV [16], while for FS-POCS ε was calculated as the sum of the projection noise variance among all projection bins (1.79 for the Shepp-Logan data; 0.8717 for the pelvis phantom data; and 1.4682 for the cadaver head data) and τ was calculated from the ground-truth images (152 for the Shepp-Logan data; 298 for the pelvis phantom data; and 723 for the cadaver head data), unless otherwise stated.

Numerical experiment settings
Shepp-Logan phantom. In the simulation experiments, a 256x256 Shepp-Logan phantom with 1mm 2 pixels is used. Its background is filled with water equivalent material with a linear attenuation coefficient of 0.02 mm -1 at 80 KeV. The projection data were collected using the fan beam scanner with 720 bins under the Poisson noise assumption. The pitch of bins was set to 1mm. The source to detector distance (SDD) was 800mm, and the source to iso-center distance (SAD) was 400mm. The projection data of the phantom were generated with different number of projection views evenly distributed along a circular orbit. The radiation intensity was set to 5.0x10 5 counts per incident ray. Quantitatively, we use two metrics for reconstruction image quality: root mean square error (RMSE) between the reconstructed image and the ground truth image and contrast to noise ratio (CNR) of the region of interest (ROI). The Shepp-Logan phantom image (using linear attenuation coefficient mm -1 ) is shown in Fig 2. The regions marked with a red rectangle are the ROIs selected for CNR measure = (mean of the bright region-mean of the dark region)/(standard deviation of the dark region). These two measures reflect the global and local reconstruction accuracy, respectively.
Although the digital Shepp-Logan phantom data provide benchmark evaluations for FS-POCS, it is instructive and practically meaningful to assess its capability of reconstructing more complex structures of physical phantoms and real human anatomy. In this regard, we tested different reconstruction methods on two sets of pseudo-real data as described below.
Pseudo-real data 1 -An anthropomorphic pelvis phantom. The slice images of an anthropomorphic pelvis phantom (CIRS 801-P) (Computerized Imaging Reference Systems Inc., Norfolk, VA) were acquired by a multi-detector CT (MDCT) (Phillips Brilliance Big Bore, Philips Healthcare, Nevada, US). The images (512x512x198) at 1 mm 3 resolution were reconstructed from 1,000 projection views per rotation using the vendor built-in reconstruction algorithm. The first set of pseudo-real data was then generated by projecting one slice image into 60 views evenly over 360 degrees using a fan-beam scan with 1.0x10 5 photon counts per incident ray.
Pseudo-real data 2 -Cadaver head. The second set of pseudo-real data was generated using the real human male cadaver head CT data from the Visible Human Project (http:// www.nlm.nih.gov/research/visible/getting_data.html). The selected head volume has a dimension of 256 x 256 x 245 and a voxel size of 1 mm 3 . Its slices were re-projected into 60 views over 360 degrees using a fan-beam scan with 1.0x10 5 photon counts per incident ray.

The influence of TV priors for reconstruction
The TV value of the original Shepp-logan phantom is τ = 152. In this experiment, we chose TV bounds equal to [0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5] times of τ, and then applied FS-POCS to reconstruct images, respectively. The RMSE and CNR in the ROI were calculated and plot   versus the TV bounds in Fig 3. As can be seen, the best RMSE and CNR values are achieved at the TV bound equal to the ground truth 1.0 τ. It is also noted that the behavior of the global measure RMSE is not exact as that of the local measure CNR. This experiment demonstrates Fast alternating projection for tomographic reconstruction that the TV value does not necessarily to attain the lower value in order to recover the image closest to the ground truth image. The quantitative measures of RMSE between the ground truth image and the reconstructed images and CNR in selected ROIs changing along the iteration numbers are plotted in Figs 5 and 6, respectively, for TV-POCS, CPTV and FS-POCS (with the optimal prior 1.0 τ value). As can be seen, FS-POCS converges either comparable to or much faster than other iterative methods and usually attains better reconstruction results (lower RMSE and higher CNR). This reflects that the mathematic optimal (approached by TV-POCS) may not be the physical optimal as shown in Fig  1 and FS-POCS solutions could be closer to the physical optimal than TV-POCS and CPTV. Again, the behavior of the global measure RMSE is not exact as that of the local measure CNR.

Comparison of different methods for Shepp-Logan phantom
The computational time for 1,000 iterations is shown in Table 1 for different reconstruction methods. Each method was implemented in MATLAB on a computer using an Intel Core I7- Fast alternating projection for tomographic reconstruction 770 CPU with 32GB memory. FS-POCS takes much shorter time than TV-POCS and CPTV due to the fewer TV minimization steps. TV-POCS suffers from the same 20 TV minimization sub-iterations. Combined with the faster convergence rate (i.e. fewer iterations for convergence shown in Fig 5), FS-POCS can significantly shorten the total reconstruction time.

Numerical behavior of TV-POCS and FS-POCS
In this part, we investigate the convergence of TV-POCS by analyzing the change of J(x) during TV minimization and the corresponding step size change along the threshold T k . The reconstruction with 60 views is conducted using TV-POCS described in [14] while keeping step-size of gradient descent satisfying Eq 22.
The changes of the distance function J(x) with TV-POCS iterations are shown in Fig 7A, during 10 th to 12 th main iterations (left) and during 101 st to 103 rd iterations (right). Between two successive main iterations, the TV gradient descent sequence is shown to demonstrate the change of J(x). In all these cases, J(x) decreases in each TV gradient-descent subroutine. Thus, Eq 21 is effective to achieve P C3 . FS-POCS is also shown for comparison in Fig 7B. First, the iteration number of P C3 in FS-POCS is controlled by τ value and much smaller than fixed twenty TV minimization iterations used in TV-POCS. Second, the iteration number of FS-POCS decreases along the iteration (eight for 10 th to 12 th main iterations and seven for 101 st to 103 rd main iterations). Finally, the J(x) values are generally lower for FS-POCS compared to TV-POCS at the same main iteration.
It is also noted that the oscillation between P C3 and other two projections, even though the amplitude of the oscillation decreases along the iteration. The small size oscillation may be caused by the numerical precision, but the non-negligible oscillation is likely due to the tight selection of the parameters ε and τ, which results in an empty intersection. To avoid the oscillation, either the parameters ε and τ can be adjusted to be have a non-empty intersection, or simultaneous POCS can be used to reach the unique solution even for an empty intersection, which is the closest point to all constraint sets. One example using FS-POCS is shown in Fig 8, where the TV bound is gradually relaxed from the top to the bottom. It is observed that when the TV bound is sufficient large (1.5 τ), the oscillation diminishes as the iteration goes on. However, the caveat is that such a solution without oscillation may not necessarily be the closest to the physically optimal solution.
As shown in Fig 9, the step-size η is always smaller than T k during the whole iteration and satisfied the convergence condition of Eq 22. Therefore, we observed the convergence behavior in Fig 7A. However, the smaller value of the step-size than T k may lead to slower convergence.

Discussion
In this work, we treat the constrained CT reconstruction as a convex-set feasibility problem and propose an alternating projection algorithm, FS-POCS, to find the solution. Due to the uncertainty of the data fidelity error bound and the TV value of the original image, the rationale behind convex-set feasibility is that the mathematical optimal point of the constrained optimization may not be the same as the physical optimal point as shown in Fig 1 theoretically as well as the experimental results in Fig 3. In this regard, the proposed FS-POCS framework may find a solution of closer to the ground truth image than the constrained optimization problem. In addition, the advantage of two-stage methods is that the optimization problem is composed of two (major) separate parts: 1) P C1 for data fidelity constraint and 2) P C3 for the TV constraint. The improvement of the computational complexity can be greatly simplified through this separation and lead to faster computation. For instance, due to use of the projection and backprojection operations, ART is generally more computational intense than TV minimization. Two-stage algorithms usually perform more TV minimization steps than ART steps for a good balance of reconstruction performance and computational speed [11][12][13][14]. Furthermore, the relaxed projection algorithms (P relaxed = (1 − λ)x + λPx for λ 2 (0, 2)) may be used for faster POCS convergence. Finally, the initial starting point may affect the final solution and the convergence speed in theory since the convex feasibility approaches seek solutions falling into the union of a set of convex constraints, instead of a unique point. For the phantom and pseudo-real data used in this study, we conducted further investigations on FS-POCS using different starting points (all zeros, all 0.2, random (uniformly distributed in [0 1]), and all ones) at the noise level of 5x10 4 photons per incident ray. Although the initial convergence   Fast alternating projection for tomographic reconstruction speed is different, after 400 iterations, all of them converge to the very similar result. It demonstrates that FS-POCS is robust to the choice of the initial points, at least for the data used in this study. The solution of FS-POCS is determined by the parameters ε and τ as shown in Fig  1. Since 03C1ε is determined by the noise level of data, prior knowledge of the physical constraining parameters, such as τ, shall be included as much as possible, either from the patient previous images [17,34] or from a large repository of high-quality images with similar anatomical and pathologic structures [35].
We also derive the convergence condition if the gradient descent of TV minimization is used. If this condition is satisfied, the TV-POCS method can be adapted to find the solution similar to that of the convex-set feasibility problem even though it is not as efficient as FS-POCS, which uses an adaptive number of iterations to satisfy the TV constraint. A specific convergence condition on the step size of gradient descent can be exploited to assure the convergence for two-stage algorithms instead of heuristic parameter tuning [11][12][13].
The optimization problems of image reconstruction can take various forms that take into account the imaging model, the noise model, and desired physical properties of image objects. In this work, we focus our effort on developing an efficient and accurate reconstruction framework FS-POCS, which resolve the complicated optimization problem of image reconstruction into simple projections onto convex sets. Not only can the fast convergence be achieved, but also the reconstruction performance can be well controlled by physically meaningful parameters instead of the weighting parameters used in unconstrained optimization, where trial-anderror is usually needed to choose the suitable ones. As more advanced regularization constraints have been vigorously pursued [17][18][19][20][21], FS-POCS can provide a powerful and easy tool to efficiently reconstruct images with desired properties if the constraints are convex or can be well approximated by a convex projection.
Finally, as the early stage of the development of FS-POCS, we focused on high X-ray photon counts cases, i.e. low noise cases, since the least-squares criterion (Eq 3) works well in these cases, where Gaussian distribution is a good approximation of noise. To further shed light on its performance at different noise levels, we tested FS-POCS for 1x10 4 photons per incident ray (high noise), 5x10 4 photons per incident ray (medium noise), and 1x10 5 photons per incident ray (low noise) using Shepp-Logan phantom. The proposed algorithm can reconstruct satisfying images for all case, with only a slight deterioration at extremely few views (24 views) and high noise (1x10 4 photons per incident ray). Nevertheless, the caveat is that for high noise data, the weighted least-squares criterion [3,13] or even more advanced noise models may need to be used as well as more advanced regularization constraints [17][18][19][20][21].

Conclusions
In this work, we proposed a convergent convex feasibility based reconstruction framework, FS-POCS. It relies on physically meaningful parameters instead of the arbitrary weighting parameter used in unconstrained optimization. The experimental results on digital phantom and pseudo-real CT data demonstrated the superior image quality, quantitative results and convergence speed. FS-POCS can potentially shift the dependence on empirical parameters to that on physical parameters for iterative CT reconstruction. Further investigation is in plan to demonstrate the effectiveness of FS-POCS on real patient data.
Supporting information S1 Appendix. Convergence of J(x) using the gradient descent of TV function. (DOCX)