Construction of Classifier Based on MPCA and QSA and Its Application on Classification of Pancreatic Diseases

A novel method is proposed to build a classifier that labels pancreatic images as normal or abnormal. First, the brightness feature is used to construct high-order tensors; then multilinear principal component analysis (MPCA) is used to extract the eigentensors; finally, the classifier is constructed with a support vector machine (SVM) whose parameters are optimized by the quantum simulated annealing algorithm (QSA). To verify the effectiveness of the proposed algorithm, the standard SVM method is chosen for comparison. The experimental results show that the proposed method can effectively extract the eigenfeatures and improve the classification accuracy of pancreatic images.


Introduction
Pancreatic carcinoma is a common tumor of the digestive tract. This kind of cancer is usually highly malignant and is difficult to diagnose and treat early. Because pancreatic carcinoma is often diagnosed at an advanced stage, very few pancreatic tumors can be removed surgically. Many well-known people have died of this disease. It is therefore necessary to diagnose pancreatic carcinoma as early as possible. Computer-aided diagnosis (CAD) [1] technology was established with the development of image processing and pattern recognition technology. Research on CAD shows that it can provide advisory opinions for the doctor and help to improve the diagnostic rate. With the development of medical imaging, it is important to represent the pancreas by a model and to distinguish the different appearances of the pancreas.
A tensor is a geometric quantity used to describe linear relations among vectors, scalars, and other tensors. In this paper, the pancreas CT images are treated as several third-order tensors, from which we extract features to obtain the eigentensors for classification.
Principal component analysis (PCA) [2] is a well-known subspace method for recognition and one of the classical methods based on statistical features. The core idea of PCA is to reduce the dimensionality of a dataset consisting of a large number of interrelated variables while retaining as much of the variation in the original dataset as possible [3]. However, this method has two problems.
(1) Applying PCA directly requires reshaping tensor objects into high-dimensional vectors (vectorization), which results in high processing and memory costs in the subsequent steps [4]. For example, a 640 × 640 gray image is transformed into a 409600 × 1 vector.
(2) The reshaping required by PCA breaks the natural structure and correlation in the original data [4], which may affect subsequent operations and lead to poor results.
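To make the cost in problem (1) concrete, the short sketch below counts the entries of the vectorized image and of the full covariance matrix that PCA would operate on. The 8 bytes per value is an assumption for illustration (64-bit floating point), not a figure from the text.

```python
# Size of a vectorized 640 x 640 image and of the covariance matrix over its pixels.
height, width = 640, 640            # gray image size from the example above
vector_length = height * width      # length of the vectorized image

# A full covariance matrix over the pixels is vector_length x vector_length.
cov_entries = vector_length ** 2
bytes_per_value = 8                 # assumption: 64-bit floating point storage
cov_gib = cov_entries * bytes_per_value / 2**30

print(vector_length)   # 409600
print(cov_gib)         # 1250.0 (GiB)
```

By contrast, a scatter matrix computed along one mode of the same image is only 640 × 640, which is the kind of saving a mode-wise method aims for.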
To solve these problems, this paper uses the multilinear principal component analysis (MPCA) described in [4]. MPCA follows the classical PCA paradigm within a multilinear framework: it reduces the dimensionality in all tensor modes and captures most of the variation in the original tensors through a tensorial mapping that keeps the original structure and correlation [4]. Support vector machine (SVM) [5] is commonly used to train a classifier. The main factor affecting its classification performance is the choice of SVM parameters. Many algorithms have recently been used for SVM parameter optimization, such as the ant colony optimization (ACO) algorithm [6], the simulated annealing (SA) algorithm, the genetic algorithm (GA) [7], and the quantum genetic algorithm (QGA) [8,9]. SA is a generic probabilistic algorithm that is good at locating the optimal solution in a large search space. The advantages of SA are as follows.
(1) The objective function can be nonlinear, discontinuous, and random.
(2) The objective function can have any boundary conditions and constraints.
(3) The programming workload of SA is low, so it is easy to implement.
(4) The optimal solution can be found in a statistical sense.
SA also has some problems. For example, cooling too rapidly leads to simulated quenching, which is not guaranteed to find the optimal solution. The quantum evolution algorithm (QEA) [10] is also a probabilistic optimization algorithm. QEA searches low-dimensional functions well but performs poorly on high-dimensional functions. Therefore, in this paper, we use the quantum simulated annealing algorithm (QSA) [11], which combines SA and QEA, to optimize the SVM parameters for training the classifier of pancreatic diseases.
This paper is organized as follows. Section 2 introduces the proposed method: first, we explain the construction of the high-order tensors; then we briefly introduce MPCA for feature extraction; finally, we introduce the support vector machine optimized by the quantum simulated annealing algorithm (QSA-SVM) for classification. Section 3 presents the construction of pancreas images after MPCA and the classification results and discusses the clinical implications of the results. Section 4 concludes the paper.

Materials and Method
In this section, we first introduce the whole procedure of the proposed method, which is shown in Figure 1, and then explain each process in detail.
The process of the proposed method is as follows.
(1) Image preprocessing: we segment the abdominal CT images to obtain the pancreas region and then normalize the segmented images.
(2) High-order tensor construction: we collect a group of pancreatic images and combine them into a new dataset.
(3) Feature extraction: we use MPCA to extract the eigentensors for classification.
(4) Pancreas disease classification based on QSA-SVM: after obtaining the eigentensors by MPCA, we treat them as samples and use an SVM optimized by QSA to classify pancreas diseases.

Construction of Tensors.
We treat the segmented pancreatic CT images as several third-order tensors with the column, row, and thickness modes. In this paper, we treat each CT image as one data sample. Hence, the input is a set of third-order tensors whose three modes are the spatial column space, the row space, and the thickness space, as shown in Figure 2. Each image has a standard size of 128 × 128, and the thickness of the CT image is 2.77 mm. Before the samples are passed to MPCA, the tensorial inputs need to be normalized to the same dimension in each mode, so the three modes of the tensor are normalized by default, and one sample can be written as $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$.
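As an illustration of the tensor construction, the following NumPy sketch stacks equally sized slices into one third-order sample. The helper name `build_sample_tensor`, the number of slices, the random pixel values, and the min-max intensity scaling are all assumptions made for the example; the text states only that the inputs are normalized to the same dimension in each mode.

```python
import numpy as np

def build_sample_tensor(slices):
    """Stack equally sized 2-D slices into a rows x columns x slices tensor."""
    tensor = np.stack([np.asarray(s, dtype=float) for s in slices], axis=-1)
    # Assumed intensity normalization to [0, 1]; not specified in the text.
    lo, hi = tensor.min(), tensor.max()
    if hi > lo:
        return (tensor - lo) / (hi - lo)
    return np.zeros_like(tensor)

rng = np.random.default_rng(0)
n_slices = 8  # hypothetical size of the thickness mode
slices = [rng.integers(0, 256, size=(128, 128)) for _ in range(n_slices)]
sample = build_sample_tensor(slices)
print(sample.shape)  # (128, 128, 8)
```

Every sample must end up with the same shape so that the samples can be stacked into a single array before feature extraction.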

Feature Extraction Based on MPCA.
In this paper, the MPCA [4] solution to the problem of dimensionality reduction for tensor objects is introduced, together with its analysis. Given a set of zero-mean $N$th-order tensors $\{\mathcal{X}_m \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}\}$, we seek a set of new $N$th-order tensors $\{\mathcal{Y}_m \in \mathbb{R}^{P_1 \times P_2 \times \cdots \times P_N}\}$ ($P_n < I_n$) that are as close as possible to the original tensors. The procedure of the MPCA algorithm is shown in Figure 3.
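In the preprocessing phase, we center the original input tensors as $\tilde{\mathcal{X}}_m = \mathcal{X}_m - \bar{\mathcal{X}}$, $m = 1, \ldots, M$, where $M$ is the number of samples and the tensor mean is
$$\bar{\mathcal{X}} = \frac{1}{M}\sum_{m=1}^{M}\mathcal{X}_m. \tag{1}$$
In the initialization phase, we calculate the eigendecomposition of $\Phi^{(n)*} = \sum_{m=1}^{M} \tilde{X}_{m(n)} \cdot \tilde{X}_{m(n)}^{T}$, where $\tilde{X}_{m(n)}$ is the mode-$n$ unfolding of $\tilde{\mathcal{X}}_m$, and set the $n$th mapping matrix $\tilde{U}^{(n)}$ to consist of the eigenvectors corresponding to the $P_n$ most significant eigenvalues, for $n = 1, \ldots, N$ ($N = 3$).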
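In the local optimization phase, we perform the local optimization to obtain the new $N$th-order tensors $\mathcal{Y}_m$; the detailed method is given in [4], and the pseudocode is shown in Pseudocode 1. In the pseudocode, $\Psi_{\mathcal{Y}}$ is the total tensor scatter of $\mathcal{Y}_m$, $m = 1, \ldots, M$, $M$ is the number of samples, and $\Phi^{(n)}$ is defined as in [4]:
$$\Phi^{(n)} = \sum_{m=1}^{M} \tilde{X}_{m(n)} \cdot \tilde{U}_{\Phi^{(n)}} \cdot \tilde{U}_{\Phi^{(n)}}^{T} \cdot \tilde{X}_{m(n)}^{T}. \tag{2}$$
In (2), $\tilde{U}_{\Phi^{(n)}}$ is the Kronecker product of the mapping matrices of all modes other than $n$, taken from the most recent update.
In the projection phase, we project the centered tensors using the mapping matrices $\tilde{U}^{(n)}$ obtained in the local optimization phase to get the new eigentensors $\mathcal{Y}_m$:
$$\mathcal{Y}_m = \tilde{\mathcal{X}}_m \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \cdots \times_N \tilde{U}^{(N)T}. \tag{3}$$
We use the feature vectors obtained from the projected eigentensors for classification [4].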
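The four phases described above (preprocessing, initialization, local optimization, projection) can be sketched in NumPy as follows. This is a minimal sketch following the standard MPCA formulation of [4], not the authors' implementation: the function names, the fixed iteration count `n_iter` (used in place of a convergence test on the total scatter), and the random test data are all assumptions.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def project(tensor, mats, skip=None):
    """Multiply every mode n of `tensor` by mats[n].T, optionally skipping one mode."""
    out = tensor
    for n, U in enumerate(mats):
        if n == skip:
            continue
        out = np.moveaxis(np.tensordot(U.T, out, axes=(1, n)), 0, n)
    return out

def top_eigvecs(scatter, k):
    """Eigenvectors of a symmetric matrix belonging to its k largest eigenvalues."""
    vals, vecs = np.linalg.eigh(scatter)
    return vecs[:, np.argsort(vals)[::-1][:k]]

def mpca(samples, ranks, n_iter=5):
    """samples: array of shape (M, I_1, ..., I_N); ranks: (P_1, ..., P_N)."""
    centered = samples - samples.mean(axis=0)          # preprocessing phase
    n_modes = centered.ndim - 1
    mats = []
    for n in range(n_modes):                           # initialization phase
        scatter = sum(unfold(x, n) @ unfold(x, n).T for x in centered)
        mats.append(top_eigvecs(scatter, ranks[n]))
    for _ in range(n_iter):                            # local optimization phase
        for n in range(n_modes):
            scatter = 0
            for x in centered:
                # Project all modes except n, then unfold along mode n.
                partial = unfold(project(x, mats, skip=n), n)
                scatter = scatter + partial @ partial.T
            mats[n] = top_eigvecs(scatter, ranks[n])
    features = np.stack([project(x, mats) for x in centered])  # projection phase
    return features, mats

rng = np.random.default_rng(0)
data = rng.normal(size=(10, 6, 5, 4))    # 10 hypothetical third-order samples
features, mats = mpca(data, ranks=(3, 2, 2))
print(features.shape)                    # (10, 3, 2, 2)
```

For classification, each projected tensor can be flattened into a feature vector, for example with `features.reshape(len(features), -1)`.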

Concept of Quantum Bit and Quantum Gate.
The term quantum comes from quantum mechanics. It is the general name for the microscopic particles of the microscopic world, which differ from macroscopic objects in that their movements obey statistical laws rather than deterministic laws. Whereas classical computing uses 0 and 1 to represent information, quantum computing uses $|0\rangle$, $|1\rangle$, and their superposition state to represent information. The superposition state is as follows:
$$|\varphi\rangle = \alpha|0\rangle + \beta|1\rangle, \quad |\alpha|^2 + |\beta|^2 = 1, \tag{4}$$
where $\alpha$ and $\beta$ are the probability amplitudes of the two basis states. Measuring a quantum state causes it to collapse, so that the final state is determined. The relationship among the quantum state, the superposition state, and the collapse caused by measurement is shown in Figure 4.
In quantum computing, a quantum state changes when a series of unitary transformations is applied to it. The device that carries out such a transformation, described by a unitary matrix, is called a quantum gate. We use a quantum gate to exchange the two probability amplitudes of a quantum bit.
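The gate operations can be illustrated with plain 2 × 2 matrices acting on a pair of amplitudes. The sketch below is an illustration under stated assumptions: the NOT gate is one standard unitary that exchanges the two amplitudes, and the rotation gate is a form commonly used to shift the phase of a quantum bit; neither matrix is confirmed by the text as the one the authors use, and all names are hypothetical.

```python
import math

def apply_gate(gate, qubit):
    """Multiply a 2x2 gate by a qubit given as an (alpha, beta) amplitude pair."""
    (a, b), (c, d) = gate
    alpha, beta = qubit
    return (a * alpha + b * beta, c * alpha + d * beta)

# NOT gate: exchanges the two probability amplitudes.
NOT_GATE = ((0.0, 1.0), (1.0, 0.0))

def rotation_gate(delta):
    """Assumed rotation gate: shifts the phase of the quantum bit by delta."""
    return ((math.cos(delta), -math.sin(delta)),
            (math.sin(delta), math.cos(delta)))

theta = math.pi / 6
qubit = (math.cos(theta), math.sin(theta))   # amplitudes written as [cos, sin]

swapped = apply_gate(NOT_GATE, qubit)                     # (sin(theta), cos(theta))
rotated = apply_gate(rotation_gate(math.pi / 12), qubit)  # phase theta + pi/12
```

Both gates are unitary, so the squared amplitudes of the result still sum to 1.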

Construction of the Classifier Based on QSA.
We use SVM to train the classifier. SVM copes well with problems such as small sample sizes, nonlinearity, high-dimensional pattern recognition, and local minima, but if the kernel function parameters, the penalty factor $C$, or other parameters are not chosen reasonably, the prediction accuracy of the SVM is greatly reduced. In this paper, QSA is used to optimize two SVM parameters: the penalty factor $C$ and the RBF kernel parameter $\gamma$.
We assume that there are $m$ chromosomes in the population and $n$ quantum bits in a chromosome. In QSA, the two probability amplitudes of each quantum bit are treated as chromosome genes. For a fixed population size, this doubles the search space covered, which speeds up convergence. In fact, the optimal solution is embodied in the optimal probability amplitudes of the quantum bits of the optimal chromosome. Suppose that the optimal probability amplitudes are $(\cos\theta_1, \cos\theta_2, \ldots, \cos\theta_n)$; when another chromosome has quantum bits with phases $(\pi/2 - \theta_1, \pi/2 - \theta_2, \ldots, \pi/2 - \theta_n)$, the sine terms of that chromosome also contain the optimal solution.
In the following description, $\Phi_i$ is the $i$th chromosome in the population and $\theta_{ij}$ is the phase of its $j$th quantum bit, with $1 \le i \le m$ and $1 \le j \le n$. The value of $C$ ranges from $2^{-10}$ (min c) to $2^{9}$ (max c), and the value of $\gamma$ ranges from $2^{-10}$ (min g) to $2^{10}$ (max g).
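The claim about the sine terms rests on the identity sin(π/2 − θ) = cos θ. The short check below uses hypothetical phase values to show that the sine amplitudes of the mirrored chromosome reproduce the cosine amplitudes of the optimal one.

```python
import math

optimal_phases = [0.3, 1.1, 2.0]   # hypothetical optimal phases

# Cosine amplitudes of the optimal chromosome.
cos_chain = [math.cos(t) for t in optimal_phases]

# A second chromosome whose phases are pi/2 minus the optimal phases.
mirror_phases = [math.pi / 2 - t for t in optimal_phases]
sin_chain = [math.sin(t) for t in mirror_phases]

# The sine amplitudes of the second chromosome match the optimal cosine amplitudes.
matches = all(math.isclose(c, s, abs_tol=1e-12) for c, s in zip(cos_chain, sin_chain))
print(matches)  # True
```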
The main flow of QSA-SVM is shown in Figure 5.
Step 3. Solution space transformation for chromosomes and fitness computation. For each quantum bit $[\cos\theta_{ij}, \sin\theta_{ij}]^{T}$, we use the linear transformation [12] in (7) to map it to the solution space. In (7), each gene on a chromosome is mapped to the interval between min and max, the minimum and maximum values of the corresponding parameter in practice; the cosine amplitude of the phase gives one solution and the sine amplitude gives another. In our method, a gene can represent the penalty factor $C$ or the RBF kernel parameter $\gamma$. We use the SVM prediction accuracy, evaluated by leave-one-out (LOO) cross-validation, as the fitness of a chromosome. Then, we keep all information of the optimal individual.
Step 4. Computing the annealing temperature $T$. In (8), gen is the iteration number, $T_0$ is the initial temperature, and the cooling ratio of the simulated annealing controls the rate of cooling.
Step 5. Position update of the new individual. We divide the neighborhood space of the phase and then generate a random update vector, which specifies the locations of the phases to be updated. One phase is selected randomly to obtain a new chromosome in the neighborhood space. If the fitness of the new chromosome is better than that of the old one, we replace the old chromosome with the new one. We update the designated quantum bits through the quantum gate described above.
Step 6. Updating the chromosomes according to the Metropolis criterion. The probability of accepting a new chromosome obeys the Boltzmann probability distribution. In (9), the fitness of the child chromosome is compared with the fitness of the parent chromosome. If the fitness of the child is greater than that of the parent, the new chromosome is accepted with probability 1. Otherwise, it is accepted with the probability given by (9).
Step 7. Implement the quantum variation operation using (10).
Step 8. Update the current individuals and execute Step 3 to get the global optimal individual.
Step 9. Check whether the end conditions are met. If so, stop and return the optimal parameters; otherwise, go to Step 4.
Step 10. Use the optimal parameters to train an SVM classifier.
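Three building blocks of the steps above can be sketched in a few lines: the mapping from a probability amplitude to a parameter value (Step 3), the cooling schedule (Step 4), and the Metropolis acceptance rule (Step 6). The exact forms of (7), (8), and (9) are not reproduced here; the linear map, the geometric cooling law, the Boltzmann factor, and every function and parameter name below are assumptions chosen to match the verbal descriptions.

```python
import math
import random

def decode(amplitude, lo, hi):
    """Assumed linear map from an amplitude in [-1, 1] onto the range [lo, hi]."""
    return 0.5 * (hi * (1.0 + amplitude) + lo * (1.0 - amplitude))

def temperature(t0, q, gen):
    """Assumed geometric cooling: T = T0 * q**gen, with cooling ratio 0 < q < 1."""
    return t0 * q ** gen

def accept(fit_parent, fit_child, temp, rng=random):
    """Metropolis rule for a fitness that is maximized (here, SVM accuracy)."""
    if fit_child > fit_parent:
        return True                      # a better child is always accepted
    # A worse child is accepted with an assumed Boltzmann probability.
    return rng.random() < math.exp((fit_child - fit_parent) / temp)

# One phase yields two candidate values of C: one from cos, one from sin.
theta = 0.4                              # hypothetical phase
c_min, c_max = 2.0 ** -10, 2.0 ** 9      # range of C given in the text
c_from_cos = decode(math.cos(theta), c_min, c_max)
c_from_sin = decode(math.sin(theta), c_min, c_max)

# The temperature falls as the iteration count grows.
schedule = [temperature(100.0, 0.9, gen) for gen in range(5)]
```

In a full loop, the fitness passed to `accept` would be the leave-one-out accuracy of an SVM trained with the decoded parameters. Whether the map is applied to the parameter itself or to its base-2 exponent is a design choice that the text does not settle.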
After we obtain the classifier using optimal parameters, we will use it to classify the testing samples. Then, we compare the classification labels with the known labels, so that we can get the classification accuracy for evaluating the performance of classifier. It is shown as (11). TN is the total number of

Results and Discussion
We select 114 groups of pancreas images, of which 81 groups are normal and 33 groups are abnormal. The resolution of each image is 128 × 128 and the thickness is 2.77 mm. Among these 114 groups, we select 40 groups of normal images and 16 groups of abnormal images as the testing samples and use the others as the training samples. Figure 6 shows two images, one from the normal data and one from the abnormal data. In the abnormal data, the pancreas appears thick and enlarged.
Figure 7 shows the mean of the samples for the 81 normal and the 33 abnormal pancreas groups.
In this paper, we use the SVM method to classify the pancreas data. Several results are listed in Table 1, where $\gamma$ is the parameter of the RBF kernel function and $C$ is the penalty factor. Table 1 shows that the results on centered data are significantly better than those on uncentered data, so capturing variation with respect to the data center is more powerful for classification than capturing variation with respect to the original data.
The experimental results of 5 groups of QSA-SVM are shown in Table 2. The mean running time is approximately 129.35 s, the mean classification accuracy is 94.64%, and the mean values of the RBF kernel parameter $\gamma$ and the penalty factor $C$ are 543.12728 and 108.20392, respectively.
Figure 8 shows that QSA-SVM is more accurate than the other classifiers, and Figure 9 compares the running times.
The classifier BPNN is a BP neural network, with an accuracy of 25% and a running time of 6.76 s; Fisher is the Fisher linear classifier, with an accuracy of 35% and a running time of 0.98 s; SVM is the common SVM, with an accuracy of 71.4286% and a running time of only 0.13 s; ACO-SVM [13,14] is an SVM optimized with the ant colony algorithm, which has the same accuracy as the common SVM but a running time of 544.51 s, much slower than the proposed method. As Figure 8 shows, QSA-SVM classifies more accurately than the others; as Figure 9 shows, it is slower than several methods but faster than ACO-SVM, which also optimizes the SVM parameters.
At present, radiologists usually diagnose pancreatic diseases based on their own experience and the morphological information in the image. However, missed diagnoses are sometimes inevitable because of individual differences among patients or the limits of a doctor's knowledge of image information. Hence, the proposed method can be used in CAD technology to support early diagnosis of pancreatic diseases within a time acceptable to doctors, helping them diagnose patients and improve the diagnosis rate.

Conclusions
In this paper, tensors have been used to represent the images, MPCA has extended linear PCA to multilinear subspace learning for tensor object analysis, and the QSA-SVM method has been proposed to classify the images. Applied to the classification of pancreatic images, the method combining MPCA and QSA-SVM achieved better classification accuracy, because MPCA preserves as much as possible the relationship among features in the original tensor and the structure of the original image, and QSA, which optimizes the SVM classification model, is able to find the optimal model parameters within an acceptable time. Therefore, the proposed method can improve the classification accuracy of pancreatic images and thereby assist doctors in diagnosing diseases.