Sparse/Low Rank Constrained Reconstruction for Dynamic PET Imaging

In dynamic Positron Emission Tomography (PET), an estimate of the radio activity concentration is obtained from a series of frames of sinogram data taken at ranging in duration from 10 seconds to minutes under some criteria. So far, all the well-known reconstruction algorithms require known data statistical properties. It limits the speed of data acquisition, besides, it is unable to afford the separated information about the structure and the variation of shape and rate of metabolism which play a major role in improving the visualization of contrast for some requirement of the diagnosing in application. This paper presents a novel low rank-based activity map reconstruction scheme from emission sinograms of dynamic PET, termed as SLCR representing Sparse/Low Rank Constrained Reconstruction for Dynamic PET Imaging. In this method, the stationary background is formulated as a low rank component while variations between successive frames are abstracted to the sparse. The resulting nuclear norm and l 1 norm related minimization problem can also be efficiently solved by many recently developed numerical methods. In this paper, the linearized alternating direction method is applied. The effectiveness of the proposed scheme is illustrated on three data sets.


Introduction
Positron emission tomography (PET) holds one of the most important applications in nuclear medical imaging device as a biomedical research technique and clinical diagnostic procedure. A fundamental characteristic of the biological system is that its metabolism is inherently timedependent. Thus, the ability of PET to observe physiological and biochemical processes in living subjects in a dynamic mode has potential to enhance our understanding of drug activity during preclinical drug development and diseases like kinds of tumors or cancers.
Dynamic PET imaging is usually performed with the collection of a series of frames of sinogram data taken at ranging in duration from 10 seconds to minutes. Earlier dynamic image reconstruction approaches largely fall into two groups. The first one attempts to reconstruct the activity maps in the same manner as static PET imaging. Iterative statistical methods have been a primary focus on many efforts, including notable examples of maximum likelihoodexpectation maximization (ML-EM) [1][2][3], maximum a posteriori (MAP) [4][5][6], penalized weighted least-squares [7][8][9][10], and penalized-likelihood (SAGE) algorithms [11][12][13]. With the continuing progresses of PET imaging, much attention has also been paid on 3D PET reconstruction [14][15][16][17] and TOF-PET reconstruction [18,19].
The second group attempts to improve the signal-to-noise ratio (SNR) by integrating the iterative statistical methods with prior temporal knowledge as reconstructing tokens and some recent works use the noise reduction technique. It includes the use of temporal voxel smoothing [20,21] and temporal basis function [22]. On the other hand, there have been considerable efforts aimed at using time-varying filters. Some of the most interesting ideas include the use of wavelet filter [23,24], the use of principal components transformation (also called as Karhunen CLove transform) [25,26], and the use of a tensor product spline basis [27,28].
Although of great progresses achieved, there are still something to be improved. In general, specific assumptions on the measurement distribution (Poisson or Shifted Poisson) is required in conventional methods. It results in a relative long acquisition time for each frames. Otherwise, when the acquisition time is not sufficient, the proportions of the scatter and random events in all events would increase, which generally leads to a poor visual image quality and a poor contrast of the target region in reconstruction images. In addition, the correlation between different frames was usually ignored in the prior techniques, and it leads to all of them are powerless to extract the useful motion or shape deformation information during reconstruction.
Inspired by the recently developed sparse and low rank representation, we develop a novel dynamic PET reconstruction model that aims at making full use of the information of the adjacent frames to achieve a high quality reconstruction without any specific assumptions on the measurement distribution. Following the robust principal component analysis [29,30] paradigm, the background is formulated as a low rank component while variations between successive frames are abstracted to the sparse. Then, the linearized alternating direction method is applied to tackle the optimization problem with affine constraint of the PET imaging. To demonstrate the effectiveness and robustness of the proposed method, three experiments are designed and shown in this paper.
The rest of this paper is organized as follows. In section 2, problem formulations and solutions are presented. Section 3 provides experiments along with results compared with the previous methods.

Notations
In this section, a brief summary of the notation used in the following paper is given. Matrices are all capital, vectors are lowercase. For instances, Y is a matrix and y is a vector. The lowercase i represents the number of the frames of PET sinogram or images. Nuclear norm of matrix is denoted by kXkÃ, defined as the sum of the input matrix singular values, it represents the rank of matrix. kXk denotes Frobenius (or Euclidean) norm and hX, Yi is the standard inner product, and k X k 2 F ¼ hX; Xi. kXk 1 is l 1 -norm which represents the sum of the absolute value of all elements in a matrix, kXk 0 is l 0 -norm which represents the number of the non-zero elements in matrix.

Basic Data Model for Dynamic PET Imaging
For non-dynamic PET imaging, the signal and reconstructed image are related by the following equation: Where y represents the sinogram (available data), G represents the detection probability, x is the reconstruction image and n is the noise. In dynamic imaging, a series of the temporal sinogram data are acquired. Denoted by y i the sinogram from the i − th scan, and by x i the image of the i − th frame. Stack each x i and y i as a column vector of matrix X and matrix Y respectively, that is: X ¼ x 1 ; x 2 ; :::; x i ; :::; x n ½ : ð2Þ Y ¼ y 1 ; y 2 ; :::; y i ; :::; y n ½ ¼Gx 1 þ n 1 ; :::; Gx n þ n n ½ ð 3Þ with G the system matrix. Then the data model for dynamic PET imaging can be written as follows: where N = [n 1 , n 2 , . . ., n i , . . ., n n ].

Sparse and Low rank Representation
Sparse and low rank representation, also widely known as robust principle component analysis techniques [29,30] in image analysis or processing, is a novel concept in the medical imaging community. In recent years, some researchers have proposed some related methods in CT and MRI [31,32]. However, it is the first time to be used in PET imaging. The separated background and dynamic information are useful for PET preclinical/clinical application like Image-guided radiation therapy (IGRT) [33]. But it is difficult to find the strict stationary or background component in PET imaging directly. Because PET images reveal the radio-activity distribution in body which means that all the regions in PET images are not constant. However, compared to the target region, the varying rate of the background component in PET images is relative low. Therefore, this component could be considered as an invariant component. However, this assumption would result in the sparsity of the dynamic PET data is not pronounced in some situations. To achieve high quality reconstruction images, a framelet domain transformation is used to constrain the sparsity during the reconstruction. This transformation not only ensures the mathematical constraint required by sparse and low rank representation, also enhances the tolerance of noise and data loss for reconstruction. It is quite helpful to obtain high contrast reconstruction images in the low count or under-sampling situation.

Model description
Hence, we stack each frames in the sequence as a column to form a matrix X, and decompose it into two disjoint parts: where X 1 is the low-rank component of X, which models the stationary background (or reference state) over time, and X 2 is the sparse component of X, which represents the variation in intensity from one frame to another. Now, rewriting the Eq (4) as: Based on the idea about decomposing the images matrix X into a low rank matrix and a sparse matrix, and the model Eq (6), the following matrix minimization problem is a natural choice for sparse and low rank representation model in Dynamic PET imaging: Where A is a tight framelet transform operator, the sparsity of X 2 is enhanced in the wavelet domain. λ > 0 and τ > 0 are parameters balancing the weights of the low rank matrix, sparse matrix, and reconstruction error in the decomposition. In the model Eq (7), kÁk F is the Frobenius norm. kAX 2 k 0 is defined as the total number of non-zero elements in matrix AX 2 . By minimizing the first two terms, X 1 and X 2 are forced to be low rank and sparse respectively. The accuracy of the reconstruction could be improved by minimizing the last term in Eq (7). However, the model Eq (7) is non-convex and hard to solve. As suggested in the literatures [34] to make problem tractable, we consider the following relaxed problem of Eq (7): Where kÁkÃ is the nuclear norm. For a matrix M of dimension m × n, k M k Ã ¼: P min ðm;nÞ i s i , and σ i is the i-th largest singular value of M.

Alogrithm
In this section, we give a description about the solution of the proposed model. To solve model Eq (8), Linearized Alternating Direction Method (LADM) is applied. Denoted by LðX 1 ; X 2 Þ the objective function in the proposed model Eq (8), i.e., Then alternating direction scheme iterates as follow Where k is the number of the iterations. To solve the X 1 subproblem we linearize the data fidel- where g k X 1 ¼ 2G > ðGðX k 1 þ X k 2 Þ À YÞ is the gradient of the linearized data term at X k 1 and β 1 > 0 is a constant.
As for X 2 , by using the fact that A −1 A = I, we linearize k GðX kþ1 After above linearization, the subproblems of X 1 and X 2 can be transformed to the following standard forms with closed form solutions as follows respectively: Here, W is the remaind terms of optimal function, for example, when X 1 is the desired matrix, terms including X 2 will be the W, USV T is the Singular Value Decomposition(SVD) of W and for a matrix W S ½W is also a matrix, and ðS ½WÞ i;j ¼ max f0; jw i;j j À gsgnðw i;j Þ: We alternatively solve Eqs (12) and (13), but only do one iteration for each sub-problem. The program should be stopped when the relative stopping criterions (based on empirical estimations) are reached: The convergence of the scheme can be proved similarly as that in [35] if β i ! kGk 2 , (i = 1, 2).

Parameters and convergence
Since constrain parameters have a great influence on the final results, they should be chosen carefully. In this work, λ is used to balance the low rank and sparse decomposition, its value will influence the proportion of the stationary and dynamic components in decomposition. Candes [29] has recommended that the most appropriate value of λ is expressed as l ¼ 1 ð max ðn;mÞÞ 1=2 , where n(m) is the number of rows (columns) of sinogram. The constants β 1 , β 2 are viewed as the stepsizes for iterations, and affect the speed of convergence. In this work, the boundary of β is set from 0.1 to 10. The maximum number of iteration is set 1000 and iteration will be stopped if the stopping criterions Eq (15) are met.

Results
Three experiments were designed to evaluate the effectiveness of the SLCR in this work. Dynamic PET data corresponding to Zubal-thorax, brain and cardiac were used in these experiments respectively. Monte Carlo simulation (using a toolbox GATE) was used to create the experimental data sets. All experiments are well designed and focus on distinguishing target region boundary (Zubal-thorax), accurate and high contrast and clear boundary image reconstruction for low count data (brain) and extracting dynamic and structural information respectively when the organ has a large deformation (cardiac).
Maximum likelihood expectation maximization (ML-EM, code is based on image reconstruction toolbox by Fessler) was used as the comparison in this work. The maximum iterative number of ML-EM is set to 100 in all three experiments. All codes in the three experiments are implemented in Matlab R2011a (MathWorks Corporation, USA) and run in a desktop computer with i3 Intel Core CPU and 4 GB memory.
In order to analyze the reconstruction results quantitatively, we define the measurements as follows: where x i is the ith pixel of ground truth x,x i is the ith pixel of the reconstructed images x, since the decomposition of SLCR results in the values of pixel of both stationary and time-varying components are less than the ground truth. For a fair comparison,x i is defined as the sum of the value of the pixels in the stationary and time-varying components. Furthermore, we also compute the contrast recovery coefficient (CRC), which is defined as follow: where S is the mean activity of the region of interest and B is the mean activity of the white matter region (background) in the reconstructed image. CRC is used to indicate the contrast of the region of interest in reconstructed images.

Zubal-thorax Experiment
In the first experiment, the schematic representation of the Zubal-thorax phantom is given in Fig 1. It includes four main regions of interest (ROIs) with different colors. Yellow, red, deep red and soft blue were used to indicate ROI1, ROI2, ROI3 and ROI4 respectively. The deep red region is the target region (marked in black rectangle in phantom and red rectangle in reconstructed images) and this region is the major dynamic part in zubal phantom. The simulated PET scanner was Hamamatsu SHR74000 from Hamamatsu Photonics K.K. The radioactivity tracer was C11-acetate, total scanning time was 38 mins, and divided into 53 frames (only results of # 1, 17, 33, 49 frames are shown in figures), all the device settings are the same as in application, including dead time, energy resolution, time resolution and energy window et.al. The images reconstructed by ML-EM (Fig 2(1)) and SLCR method (Fig 2(2) is the stationary component of SLCR and Fig 2(3) is the time-varying component) are given in Fig 2. In Fig 2(1), there are aliasing artifacts in the reconstruction images of the ML-EM method. It is difficult to distinguish the target region and detail boundary information from adjacent ROIs, even worse, the differences between different frames (from #1 to 20) are not distinctive. Compared with the results of the ML-EM, the results of the SLCR present an obvious reduction of the aliasing artifacts and improve the contrast in the target region (Fig 2(2) and Fig 2(3)). In the meantime, the results of time-varying component indicate the variations between adjacent frames, and it extracts successfully the target region information from the dynamic data set (Fig 2(3)). Moreover, in time-varying images, the radio-activity variation between different frames could be observed easily. It is obvious that the SLCR could provide more helpful information than the ML-EM method.

Brain Experiment
The data set based on Hoffman brain phantom (Fig 3) was simulated by Monte Carlo simulation in the second experiment. This phantom contains complicated physical structural information and eight highlight areas, the areas marked by red and green rectangle are target areas which present two tumors in human brain, and the reconstruction pixel value of these regions are used for quantitative analysis. The blue line marks the lateral displacement profile. The spatial resolution of the simulated scanner was 3.5 mm full width at half maximum (FWHM) in sagittal or coronal plane and 3.2 mm FWHM in axial plane. The radioactivity tracer was fluorodeoxyglucose (FDG), and the concentration was 333-444 MBq (9-12 mCi/cc). The total scanning time was 30 mins. The data set was divided into 20 frames. The total count of the recorded event was 1.92 × 10 7 (count level 1) in this data set, the proportion of the scatter   Table 1. The calculated biases and variances shows that the SLCR provides a more accurate reconstruction than ML-EM. The values of CRC shows that the decomposition of SLCR results in improving the contrast in time-varying component.   To evaluate the robustness of the SLCR in low count data, two extra count levels were simulated and the corresponding proportions of the random and scatter events were recorded. Level 2: the total count was 1.32 × 10 6 , the proportion of the scatter events was 19.7%, and the proportion of the random events was 1.63%); Level 3: the total count was 6 × 10 5 , the proportion of the scatter events was 39.75%, and the proportion of the random events was 3.363%. Data sets in both level 2 and level 3 are considered as the low-count data. The images reconstructed by ML-EM and SLCR methods for these two levels are shown in Fig 6. Though all images reconstructed by ML-EM and SLCR go worse when the data count decrease, the SLCR method provide more accurate and less aliasing artifacts reconstructions than ML-EM in both count Table 1. Comparative studies of estimated activity distribution in the red and green rectangle region in brain on synthetic data.

Red rectangle region
Green rectangle region   Table 2.
The quantitative analysis shows that the bias and variance for ML-EM go up faster than SLCR when the data count decrease. All of these demonstrate that the SCLR is better and more robust reconstruction method.

Cardiac Experiment
In the third experiment, a series of the cardiac phantoms with respected to the short axis of cardiac in stress (only five frames were shown in the results due to the limitation of the number of pages) were simulated. And the region selected for quantitative analysis is marked by the black rectangle in the 5th picture. Such a highlight area always indicates a potential lesions or abnormal tissue in clinical situation. These cardiac phantoms were based on a 61-year-old patient with arterial hypertensionand type 2 diabetes mellitus. A distinctive shape deformations (volume variation and myocardial wall motion) were included in these data sets. The main purpose of this experiment was to evaluate the effectiveness of the motion extraction of the SLCR method. The radioactivity tracer was 13N − Ammonia. In addition, to evaluate the robustness of the SLCR method, two count levels were simulated. The proportions of the scatter and random events in these two levels were recorded, level 1: the total count is 5.4 × 10 6 , and it contains 0.1% scatter events and 0.06% random events. level 2: the total count is 1.2 × 10 5 (low count  2. The SLCR also has a good reconstruction result even for the low count data. When data count is not sufficient, the results of the ML-EM method are heavily corrupted by the random/scatter events. There are aliasing artifacts and noise points in the reconstruction images of the ML-EM method. The SLCR method solved this problem. Though the image quality of the SLCR method also deteriorates in the low count data, the results of both stationary and time-varying components show a more clear outline and detailed information than ML-EM. In addition, the sum of these two components of SLCR provides the better reconstruction. Therefore, the SCLR could provide a better reconstruction results than ML-EM in any count levels.
The bias, variance and CRC in the target region are calculated and shown in Table 3. The quantitative analysis results shows that the bias and variance for ML-EM go up faster than SLCR when the data count decreases. And the SLCR could provide more accurate and robust reconstruction images than ML-EM in different count levels data. All of these demonstrate that the SCLR is better and more robust reconstruction method. In addition, the values of CRC also shows that the decomposition of SLCR results in the improvement of the contrast in timevarying component. It is easy to locate the position of the target regions in the SLCR especially in the time-varying component.

Discussion
Summarizing all results of the three experiments, the SCLR is capable of extracting the background and dynamic components during reconstruction, and producing a high contrast and accurate reconstruction images even in a low count data. Compared with the conventional ML-EM method, both of the accuracy and robustness of images are improved. However, the interference or information sharing between background and time-varying components caused by strong coherence between the adjoint frames exists in all experiments, especially when the difference between adjoint frames is not remarkable. One possible reason for this phenomenon is that the sparsity constraint in the framelet domain is not enough to force the time-varying component only contains dynamic information. In the future work, we will focus on how to solve this problem.

Conclusion
A novel method called SLCR is proposed, analyzed and tested for dynamic PET imaging in this work. The advantages of this method are that 1) it is capable of separating the background and dynamic components during reconstruction, 2) it is able to afford a high contrast and accurate  reconstruction images even in a low count data. Three experiments have been used to demonstrate the effectiveness and robustness of the SLCR method. However, for a real clinical application of dynamic PET imaging such as monitoring of radiotherapy, more work needs to be done. For example, the rate of convergence in SCLR is still too slow, the maximum number of iteration is too large for a clinical application. In addition, the interference between stationary background and time-varying components also needs to be solved. Therefore, we will cope with these problems in the future work.