Clustering-Based Linear Least Square Fitting Method for Generation of Parametric Images in Dynamic FDG PET Studies

Parametric images generated from dynamic positron emission tomography (PET) studies are useful for presenting functional/biological information in the 3-dimensional space, but usually suffer from their high sensitivity to image noise. To improve the quality of these images, we proposed in this study a modified linear least square (LLS) fitting method named cLLS that incorporates a clustering-based spatial constraint for generation of parametric images from dynamic PET data of high noise levels. In this method, the combination of K-means and hierarchical cluster analysis was used to classify dynamic PET data. Compared with conventional LLS, cLLS can achieve high statistical reliability in the generated parametric images without incurring a high computational burden. The effectiveness of the method was demonstrated both with computer simulation and with a human brain dynamic FDG PET study. The cLLS method is expected to be useful for generation of parametric images from dynamic FDG PET study.


INTRODUCTION
Positron emission tomography (PET) is a powerful quantitative tool for in vivo imaging compounds labeled with positron emitting radioisotopes that trace biological processes in the body. However, in many occasions, the biological parameters on the image voxel level as determined by conventional statistical estimation methods suffer from large statistical uncertainty. This paper addressed this problem to make the quantitative estimation of physiological parameters more reliable.
Parameter estimation methods [1][2][3][4][5][6][7][8], such as nonlinear least square (NLS), linear least square (LLS), and graphic analysis, are used in kinetic analysis of PET data. Due to its simplicity and computational efficiency, LLS-based parameter approaches are commonly used for estimation of macroparameters such as FDG uptake rate constant K i (= K 1 k 3 /(k 2 + k 3 )) and distribution volume (= (K 1 /k 2 )(1 + k 3 /k 4 )) for reversible ligand-receptor PET study. Regardless of whether the data is from ROI or from a single voxel, the noise-induced bias has been reported in previous stud-ies [9][10][11]. The application of conventional LLS method for generation of microparameter images of FDG kinetic model could be limited by high noise levels of pixel kinetics. The option of reducing the noise by increasing the injection dose is limited by clinical practice. Averaging over a larger volume or setting a big voxel size could reduce the noise, but it is limited by tissue heterogeneity and the partial volume effect. The clustering-based analyses developed recently [12][13][14][15][16][17] reduced the noise effectively because these methods averaged the data over a large volume that included many tissues with similar tracer kinetics or physiological characteristics. The clustering operation automatically segments the tissues into different clusters, within each the time activity curves (TACs) of all voxels have a similar shape. The combination of clustering analysis and the LLS method may give the required robustness and reliability in the parameter estimation and yet without significantly increasing the computation burden for generation of parametric images.
Comparing with previous methods, clustering analysis for kinetics (CAKS) method was originally developed based on the principle component analysis (PCA) [12,13] with only two principle components. A variation of the original CAKS method was based on the mixed-Gaussian model [14], and was applied at the voxel level. Nonlinear ridge regression with spatial constraint (NLRRSC) was used for nonlinear least square (NLS) estimation after hierarchical cluster analysis [15]. Simultaneous estimation (SIME) or simultaneous estimation with postestimation (SIMEP) method was a simultaneous estimation approach using a K-means-like cluster analysis [16]. In the present study, a combination of Kmeans and hierarchical cluster analysis is used for clustering dynamic FDG PET data, and the kinetics of clusters of high signal-to-noise ratio are applied to regression matrix for LLS to produce parametric images. The method was verified using computer simulated data and a human FDG PET data set to show its superior performance compared to the conventional LLS or Patlak graphic analysis.

Modeling theory
The following three-compartment FDG model [18] has been shown before to be appropriate for analysis of dynamic FDG PET data [1,19,20], where C e , C m , and C p are, respectively, the concentrations of free FDG, FDG-6-phosphate in tissue, and the FDG concentration in plasma. C t is the total radioactivity in tissue, C pet is total counts from the whole field of view (FOV). K 1 is a transfer constant for free FDG from plasma into tissue, k 2 is a rate constant for free FDG from tissue back to plasma, and k 3 is rate constant for FDG phosphorylated into FDG-6-PO 4 .V B is fractional volume of blood in tissue (0 ≤ V B ≤ 1). Then, the cerebral glucose metabolic rate can be calculated as where [Glc] is the glucose concentration in plasma. The lumped constant (LC) is usually a constant. Therefore, to get glucose metabolic rate, one only needs to calculate the uptake rate constant K i . The linear description of (1) can be written as follows: cLLS is a modified linear least square (LLS) fitting method that incorporates a clustering-based spatial constraint for generation of parametric images from dynamic PET data of high noise levels. The 3D autoclustering processing was applied to the smoothed dynamic PET data by a method that combines K-means and hierarchical clustering with average linkage. The number of clusters was determined by referring to previous work [15]. The average TAC and V B of each cluster (C pet cluster and V B cluster ) were then determined and were used to improve the LLS method as shown below in (4) and (6) (note that all the tissue TACs in the following equations are the measurements after being corrected for The conventional way of calculating the value of K i as K 1 k 3 /(k 2 + k 3 ) may have a large error propagation, because the estimates of k 2 and k 3 determined from high noise TAC usually have large variability. In order to obtain a more robust K i estimate, (3) can be rearranged for estimating Substituting C t with C t cluster (of lower noise level) on the right side of (5) is expected to improve further the estimate

Validation method (1) Computer simulation
Time activity curves for 100 pixels within a cluster were simulated to investigate the influence of noise on parameter estimation with conventional LLS and cLLS at various noise levels. The regular FDG model was used for the simulation. Parameters used were K 1 = 0.13 (mL/min/g), k 2 = 0.08 (1/min), k 3 = 0.05 (1/min). The dynamic PET scan time sequence (4 × 0.5 min, 4 × 2 min, 10 × 5 min) was the same as the one commonly used in human FDG study. A plasma TAC from a real human FDG study was used as the input function. The noise-free tissue TAC was generated according to the analytical solution of the model, that is, Xinrui Huang et al.

3
Pseudorandom Gaussian noise was added to the TAC according to the noise variance formula shown by Chen et al. [21] and Feng et al. [22] before. That is, the variance of the noise was proportional to the radioactivity concentration and inversely proportional to the scan duration (Δt i ), where σ 2 (t i ) is the variance of ith scan at its midtime (t i ), α is the proportionality constant that determines the overall noise level in a TAC, and λ is the physical half-life of FDG (= 110 min). In this simulation, α was set as 0.1, 0.2, 1.0, and 1.5 to yield noisy TACs at various noise levels. The simulated TACs were then processed using LLS and cLLS, separately to give the estimates of the two methods. For the cLLS, the cluster average tissue TAC in (4) and (6) used was the average TAC of the 100 simulated TACs. The estimated parameters for each of the 100 pixels were obtained for each noise level. Bias and root mean squared error (RMSE) were used as criteria to evaluate the reliability of cLLS. The percentages of RMSE and bias are defined as [15]: where p i is the parameter estimation result of the ith simulated pixel TAC at one noise level, p is the "true" parameter value for the simulation, and N is the number of pixels in the cluster (i.e., N = 100).

(2) Clinical data validation
A set of dynamic FDG PET data was acquired from a normal volunteer with an ECAT EXACT HR+ PET scanner (axial field of view = 15.5 cm; intrinsic full-width-at-halfmaximum (FWHM) at the center = 4.3 mm) in 3D acquisition mode. Before FDG administration, transmission scanning was performed using 68 Ge line sources for attenuation correction. Dynamic emission scans (4 × 0.5 min, 4 × 2 min, 10 × 5 min) were initiated simultaneously with an IV injection of 155 MBq FDG. For each PET scan frame, 63 transaxial images (128 × 128 pixel; pixel size 1.471 mm; 2.425-mm plane thickness) were reconstructed using a filtered back-projection algorithm with a Hanning filter (cutoff frequency of 0.3 cycle per projection element), resulting in an in-plane spatial resolution of ∼8 mm FWHM. Dead time, scatter, and measured attenuation corrections were applied. Arterial blood samples were collected via a catheter in the radial artery during the study. The acquired image data was processed with the following procedure.
(1) After the mean images were obtained by averaging all the frames of the time series, they were smoothed with a 6mm FWHM Gaussian filter to get an image mask of the brain using SPM99. All PET dynamic images were masked to zero out all the pixels outside of the head.
(2) The clustering of the 3D PET dynamic data follows the procedure described as follows. (a) 3D smoothing of the masked PET 3D data of each frame; (b) classifying pixel TACs of the masked and smoothed PET 3D dynamic data into 15 clusters with the K-means method; (c) classifying the 15 average TACs from the 15 clusters into 4 final clusters (white matter, gray matter, scalp, and vasculature) with the average linkage hierarchical clustering method.
(3) The cluster average TAC was obtained as the average of all the voxel TACs in each cluster. Fitting the cluster average TAC with the FDG model (1) using the Levenberg-Marquardt algorithm gave the estimates of the parameters, K 1 , k 2 , k 3 , V B , for each cluster.
(4) After correcting the dynamic PET data and the clustered average TAC of each cluster with V B cluster , the parametric images (K 1 , k 2 , and k 3 ) were estimated using (3) and (4). K i parametric image was estimated from K 1 k 3 /(k 2 + k 3 ) for the conventional LLS and according to (6) for cLLS.
(5) To validate the robustness of the cLLS method, the parameters estimated by cLLS were compared against those from the conventional LLS method and those from the Patlak graphical method.

(1) Results of the clustering
After the clustering processing, a brain image is classified into 4 clusters. Figure 1 shows the cluster image results at the 12th, 18th, and 24th slices. To investigate how the K-means' results effect the average linkage output, various numbers of clusters (10, 15, and 25) from the K-means clustering were tested and the results from the average linkage were shown in different rows in Figure 1. Some detail anatomic features were lost with 10 clusters. Although some voxels were assigned to the wrong cluster with the use of 25 clusters, the differences cannot be exactly distinguished between 15 clusters and 25 clusters. To get an idea of how well separated the resulting clusters are, we can make a silhouette plot using the final cluster indices output. The silhouette plot displays a measure of how close each point in one cluster is to points in the neighboring clusters. A quantitative way to compare the different solutions is to look at the average silhouette values for the different cases. The bigger average silhouette value indicates the better cluster result, for example, the average silhouette values of 12th slice are 0.8084, 0.8132, and 0.8177, respectively, for the final cluster results from 10, 15, and 25 clusters from the K-means clustering.
Considering the computation efficiency and simplicity simultaneously, the estimated parameters of each cluster, from the average linkage results with 15-cluster K-means results, are summarized in Table 1.

(2) Results of the simulation
The estimated parameters (by LLS and by cLLS) from the simulated data at different noise levels are listed in Table 2. The bias and RMSE of the estimates are shown in Figure 2 estimates are small for both methods when the TAC noise level is low. For higher noise levels, the results from the conventional LLS quickly deteriorate, while the estimated uptake rate constant k i from cLLS remains stable (i.e., the absolute values of bias and RMSE with the cLLS are both small).

(3) Results of clinical data validation
The parametric images of the 12th slice as estimated by LLS and by cLLS are shown in Figure 3. It is evident that the parametric images from cLLS have a better image quality and higher SNR.
To quantitatively compare the parametric images, we applied a number of small ROIs to the parametric images, and compared the mean and standard deviation of the pixel values within the ROIs. The results are shown in Table 3. Compared to LLS, cLLS gives higher means and lower SDs for the estimates.
To investigate noise effects on real data, big-to-small ROIs were selected and kinetically analyzed. The ROI parameter values were obtained with three methods separately: conventional nonlinear regression method fitting the average TAC of all ROI pixel TACs; LLS method fitting each ROI pixel TAC and giving out the mean of all pixel parameters; cLLS method fitting each ROI pixel TAC with the average ROI TAC substi-  (4) and (6). RMSE of estimates (K 1 , k 2 , k 3 , and K i ), with conventional nonlinear regression method as reference, the RMSE of cLLS estimate is significantly lower (22.7% for K 1 ; 12.6% for k 2 ; 16.5% for k 3 , and 91.2% for K i ) than that of LLS (paired t-test, P < .01).

(4) Correlation comparison of different methods
Squared correlation coefficients (R 2 ) between ROI averages of parametric images (by cLLS) and parameters estimated by ROI kinetic analysis (with LLS) are 0.94 for K 1 , 0.92 for k 2 , 0.88 for k 3 , and 0.99 for K i (Figure 4). The K i estimate for each pixel from cLLS also correlates well with that from the Patlak graphical analysis (R 2 = 0.99, y = 0.002 + 1.089 * x) ( Figure 5).
Xinrui Huang et al.  International Journal of Biomedical Imaging

DISCUSSION
(1) Advantage of the clustering processing in the cLLS method From the result shown in Figure 1, one sees that the clustering technique in our cLLS method clearly segmented the dynamic PET images into 4 clusters. Compared to previous works based on subjective ROI drawing that is labor intensive and error prone, our clustering technique performed well. In addition, the process of clustering is not subject to tissue heterogeneity, with which manual ROI method is difficult to deal. Comparing with the relevant studies that the clustering method has been used in dynamic PET FDG data analysis before, the novelty of our study is the integration of modified clustering method and LLS based on voxel level.

(2) High accuracy and reliability of cLLS
From the simulated results in Figure 2, one can see that the absolute values of percent bias and percent RMSE of the estimates from cLLS are usually smaller than those of LLS at all noise levels. The use of the cluster average TAC on the right-hand sides of (4) and (6) is considered to be the main factor that accounts for the improvement. This result is in agreement with those previously reported by Kimura et al. [12] which showed that direct parameter estimation using the conventional LLS would result in a large bias compared with the linearization method with cluster analysis.

(3) cLLS analysis saving more calculation time
For cLLS, the right-hand-sides of (4) and (6) are calculated only 4 times (the number of clusters) for each study, while the same computation (3) needs to be repeated for each voxel for LLS. The calculated time of cLLS is about 2 hours, less than about 6 hours of LLS, when dealing with the clinical data. The program was edited in Matlab 6.5 language under Windows XP system and run on the PC with 2.0 G RAM and P4 CPU.
In conclusion, the present study showed that the cLLS method is of high computational efficiency and provides estimates of high statistical reliability. The cLLS method is thus expected to be useful for generation of parametric images from dynamic brain FDG PET of high noise levels. In the paper, the data of a normal volunteer was used to validate the effectiveness of the method, because the anatomic structure of a normal volunteer is clear for us especially when we 8 International Journal of Biomedical Imaging understand the clustering result. In future, we also can try some dynamic PET FDG data of patients with brain disorder to validate the cLLS method.