Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the Multiple Kernel Learning Support Vector Machine (MKL-SVM) algorithm has achieved good results there. However, when its parameters are tuned by grid search, the MKL-SVM algorithm needs a long optimization time, and its recognition accuracy depends on the fineness of the grid. In this paper, swarm intelligence is introduced: Particle Swarm Optimization (PSO) is combined with the MKL-SVM algorithm to form the MKL-SVM-PSO algorithm, which realizes global optimization of the parameters rapidly. To obtain the global optimal solution, different inertia weights (constant, linear, and nonlinear) are applied to pulmonary nodule recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only about 1/7 of the training time of the MKL-SVM grid search algorithm, while achieving a better recognition effect. Moreover, the Euclidean norm of the normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. A statistical analysis of the results averaged over 20 runs with different inertia weights shows that the dynamic inertia weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. Among the dynamic inertia weights, the nonlinear inertia weight gives a shorter parameter optimization time and an average fitness value after convergence that is much closer to the optimal fitness value, so it is better than the linear inertia weight. In addition, a better nonlinear inertia weight is verified.

Lung cancer causes about 1.37 million deaths worldwide each year, accounting for 18% of all cancer deaths. Early surgical treatment is the most effective treatment for lung cancer, but most patients are diagnosed at a late stage of the disease. In 2015, the European Society of Radiology and the European Respiratory Society published a white paper on lung cancer screening in the European Respiratory Journal (ERJ) to guide clinical lung cancer screening for early detection and early treatment of lung cancer.

As an early manifestation of lung cancer in lung CT images, a pulmonary nodule is defined as a nearly spherical opacity with a diameter smaller than 3 cm. Computed Tomography (CT) is an important means of early detection of pulmonary nodules. According to their CT appearance, pulmonary nodules can be divided into solid nodules (such as solitary pulmonary nodules, nodules adhering to the pulmonary wall, and nodules adhering to vessels), ground glass nodules, and cavitary nodules.

The lung Computer-Aided Detection (CAD) system is an application of machine vision. It can reduce the radiologist's visual fatigue from overload, decrease the resulting risk of misdiagnosis or missed diagnosis, and provide auxiliary diagnostic results to the doctor as a “third party.” A lung CAD system usually includes the following modules: acquisition of lung CT image data, preprocessing of CT images, lung parenchyma segmentation, detection (mainly extraction or segmentation) of the VOI (Volume of Interest) or ROI (Region of Interest) of candidate nodules, calculation and selection of ROI or VOI features, and recognition of pulmonary nodules, of which pulmonary nodule recognition is the core module. The Support Vector Machine (SVM) algorithm has been widely used in the detection and recognition of pulmonary nodules (see, e.g., [

The Multiple Kernel Learning Support Vector Machine (MKL-SVM) algorithm has achieved good recognition accuracy not just in recognition of lung nodules in [

In this paper, the PSO and MKL-SVM algorithms are combined to optimize the parameters of MKL-SVM. On this basis, PSO algorithms with different inertia weights are compared and analyzed, with the aim of quickly finding a parameter set similar or superior to that found by the grid search algorithm, identifying a reasonable inertia weight, and thereby recognizing pulmonary nodules precisely.

SVM is a learning method suited to small sample sizes, which predicts or classifies unknown samples by structural risk minimization. The training sample is represented as follows:

When SVM is used for two-class classification problems, the original model can be written as the following nonlinear optimization problem:

In the feature space, SVM is used to map the input data

Different kernel functions have different advantages. One of the keys to improving the performance of SVM is to design an appropriate kernel function for a given problem. The common basic kernel functions are the polynomial kernel and the radial basis function (RBF) kernel, which are presented, respectively, as follows:

The convex combination form of the kernel function is still a kernel function:

Let

Let

It is proven that the kernel function expressed by (

The RBF kernel has strong learning ability, and the polynomial kernel has strong generalization ability; thus combining the two can balance learning and generalization. If we use only two basic kernel functions, the RBF kernel and the polynomial kernel, that is,
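As a minimal sketch of such a mixed kernel, the following Python code forms a convex combination of an RBF kernel and a polynomial kernel. The LIBSVM-style parameterization (`gamma`, `coef0`, `degree`), the default values, and the convention that the weight `lam` multiplies the RBF term are assumptions made for illustration; the function names are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2) (LIBSVM-style form, assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, gamma=1.0, coef0=1.0, degree=2):
    """Polynomial kernel (gamma * <x, y> + coef0)^degree (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def mixed_kernel(x, y, lam=0.5, rbf_gamma=0.5, poly_gamma=1.0,
                 coef0=1.0, degree=2):
    """Convex combination lam * K_rbf + (1 - lam) * K_poly, 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    return (lam * rbf_kernel(x, y, rbf_gamma)
            + (1.0 - lam) * poly_kernel(x, y, poly_gamma, coef0, degree))


def gram_matrix(samples, kernel):
    """Kernel (Gram) matrix of a list of feature vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

In this sketch the parameters a search procedure would tune are `lam`, `rbf_gamma`, `degree`, and the SVM penalty factor; setting `lam` to 1 or 0 recovers the pure RBF or pure polynomial kernel.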

Particle Swarm Optimization (PSO) is a typical heuristic algorithm based on swarm intelligence optimization theory. PSO was first proposed in 1995 by Kennedy and Eberhart in [

It is assumed that the population

The PSO algorithm is applied into MKL-SVM algorithm of (

After introducing the classic PSO algorithm, the cross-validation (CV) recognition accuracy (ACC) for pulmonary nodules is taken as the final target and used as the fitness function value of PSO. ACC is defined as follows:
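The following sketch computes ACC, together with the sensitivity (SEN) index reported in the experiments, from true and predicted labels. It assumes the usual confusion-matrix definitions, ACC = (TP + TN) / (TP + TN + FP + FN) and SEN = TP / (TP + FN); under cross-validation the value would be computed on the held-out folds. The function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN, treating `positive` as the nodule class."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif t != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(y_true, y_pred, positive=1):
    """ACC: fraction of all samples classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(y_true, y_pred, positive=1):
    """SEN: fraction of true nodules that were detected."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn)
```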

The diagram of MKL-SVM-PSO algorithm.
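The search loop in the diagram can be sketched roughly as follows. This is a generic PSO that maximizes a fitness function, not a reproduction of the MATLAB implementation: the swarm size, iteration count, acceleration constants `c1` and `c2`, and the position clamping are assumed values, and `toy_fitness` is a hypothetical stand-in for the cross-validated ACC of an MKL-SVM trained with the candidate parameters.

```python
import random


def pso_maximize(fitness, bounds, inertia, n_particles=20, n_iter=100,
                 c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over the box `bounds`; `inertia(t, n_iter)` gives w."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best so far
    best_curve, avg_curve = [], []
    for t in range(n_iter):
        w = inertia(t, n_iter)
        fits = []
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive term + social term.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = fitness(pos[i])
            fits.append(f)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        best_curve.append(gbest_fit)              # optimal fitness curve
        avg_curve.append(sum(fits) / n_particles)  # average fitness curve
    return gbest, gbest_fit, best_curve, avg_curve


def toy_fitness(p):
    """Hypothetical smooth objective with its maximum at (1, -2)."""
    return -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)
```

A call such as `pso_maximize(toy_fitness, [(-5, 5), (-5, 5)], lambda t, n: 0.7)` returns the best position, its fitness, and the per-generation optimal and average fitness curves, which are the two curves compared throughout the experiments.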

The experimental data were collected from 20 groups of patients at a Grade III, Class A hospital in Jilin province, with a total of about 700 images, and each group was diagnosed by a doctor according to clinical diagnostic criteria. The size of each CT image was 512 × 512, and the slice thickness was 5.0 mm. A total of 270 ROIs were extracted: 80 pulmonary nodules and 190 false positive nodules. After feature selection, the data samples were randomly divided into two groups: a training group of 170 samples and a testing group of 100 samples.

The simulation experiments were carried out in MATLAB with the LIBSVM toolbox. During model parameter optimization, 5-fold cross-validation is used to obtain the optimal parameter set corresponding to the highest ACC. Let the number

The fitness curve of the MKL-SVM-PSO algorithm with constant inertia weight.

The optimal individual fitness curve of MKL-SVM-PSO algorithm is obtained as shown in Figure

Compared with the grid search algorithm, the MKL-SVM-PSO algorithm requires less computational time, but as the number of iterations increases, the average fitness value in each generation oscillates more severely, and a certain gap remains between it and the optimal fitness value, which can be seen in Figure

In [

The fitness curve of MKL-SVM-PSO algorithm of (

It can be seen from Figure

The fitness curve using the MKL-SVM-PSO algorithm of (

To ensure that the global optimal solution is obtained, the following three kinds of nonlinear inertia weight are adopted to control convergence precision and convergence speed, so that the average fitness value reaches the optimal fitness value quickly and smoothly.

The fitness curve of the optimal parameter set search by the MKL-SVM-PSO algorithm.

The fitness curve corresponding to

The fitness curve corresponding to

The fitness curve corresponding to

In order to compare the influence of different kinds of inertia weight on the parameter optimization, Figure

The variation curves of several dynamic inertia weights.
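To make the difference between the weight families concrete, the sketch below defines a constant weight, a linearly decreasing weight, and one nonlinear (quadratically decreasing) weight as functions of the generation index. The start and end values 0.9 and 0.4 are commonly used defaults and are assumed here, and the quadratic schedule is only an illustrative nonlinear form, not necessarily one of the three nonlinear weights studied in this paper.

```python
def constant_inertia(t, t_max, w=1.0):
    """Constant inertia weight (the experiments use the constant 1)."""
    return w


def linear_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Weight decreasing linearly from w_start (t = 0) to w_end (t = t_max)."""
    return w_start - (w_start - w_end) * t / t_max


def quadratic_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Illustrative nonlinear weight: slow decay early, fast decay late."""
    return w_start - (w_start - w_end) * (t / t_max) ** 2
```

A large weight early in the run favors global exploration, and a small weight late in the run favors local refinement; a nonlinear schedule changes how long the search stays in each regime.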

In summary, the Particle Swarm Optimization algorithm with constant weight converges quickly, but in the later stages it tends to fall into a local optimal solution with low accuracy. The linear inertia weights of (

To compare the parameter optimization time and recognition results of the Particle Swarm Optimization algorithms with different inertia weights against those of the grid search algorithm, each algorithm is run 20 times, and the averages over the 20 runs are listed in Table

Comparison of various indexes in parameter optimization stage of different algorithms.

No. | Inertia weight algorithm | Average parameter optimization time (s) | Average optimal fitness value | Average ACC value obtained from test set | Average SEN value |
---|---|---|---|---|---|
1 | The constant is 1 | 370.7950 | 94.1176% | 90.45% | 86.85% |
2 | ( | 462.4134 | 94.1176% | 91% | 88.89% |
3 | ( | 457.0022 | 94.1176% | 91% | 88.89% |
4 | ( | 416.0204 | 94.1176% | 91% | 88.89% |
5 | ( | 448.1536 | 94.1176% | 91% | 88.89% |
6 | ( | 450.4456 | 94.1176% | 91% | 88.89% |
7 | Grid search algorithm | 3096.1427 | 94.1176% | 91% | 88.89% |
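The time ratio between grid search and each PSO variant can be checked directly from the average optimization times in the table. The snippet below is only this arithmetic; the row labels are placeholders that follow the row numbers of the table.

```python
# Average parameter optimization times in seconds, copied from the table.
avg_time_s = {
    "row 1 (constant weight)": 370.7950,
    "row 2": 462.4134,
    "row 3": 457.0022,
    "row 4": 416.0204,
    "row 5": 448.1536,
    "row 6": 450.4456,
}
grid_search_time_s = 3096.1427  # row 7

# Speedup factor of each PSO variant relative to grid search.
speedup = {name: grid_search_time_s / t for name, t in avg_time_s.items()}

for name, factor in speedup.items():
    print(f"{name}: grid search takes {factor:.2f} times as long")
```

The fastest dynamic-weight variant (row 4) is roughly 7.4 times faster than grid search, which is where the figure of about 1/7 of the training time comes from.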

From the experimental results in Table

Figure

Statistical indexes corresponding to Figure

No. | Inertia weight algorithm | Upper adjacent (s) | Lower adjacent (s) | Median value (s) | Number of outliers |
---|---|---|---|---|---|
1 | The constant is 1 | 369.049 | 364.151 | 366.343 | 3 |
2 | ( | 468.018 | 446.098 | 459.983 | 3 |
3 | ( | 468.646 | 446.225 | 459.6345 | 2 |
4 | ( | 443.129 | 394.638 | 148.49 | 0 |
5 | ( | 455.38 | 443.456 | 448.798 | 2 |
6 | ( | 461.342 | 451.524 | 455.122 | 3 |

The statistical boxplot of parameter optimization time.
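The boxplot indexes (upper adjacent value, lower adjacent value, median, and number of outliers) can be computed from a list of run times as in the sketch below. It assumes the common Tukey convention with whiskers at 1.5 times the interquartile range and uses the "inclusive" quartile method of Python's `statistics` module; other tools may use a slightly different quartile convention, so values can differ marginally. The function name is hypothetical.

```python
import statistics


def boxplot_stats(values, whisker=1.5):
    """Tukey boxplot summary: adjacent values, median, and outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Adjacent values are the most extreme points still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        "median": median,
        "n_outliers": len(outliers),
        "outliers": outliers,
    }
```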

From the data in Figure

Since the value of each inertial weight is different, the maximum number of convergence generations is different in various algorithms. Therefore, it is not reasonable to compare the Euclidean norm error between the optimal fitness value and the average fitness value after convergence, because it is difficult to express the merits and demerits of each algorithm. In order to reasonably express the Euclidean distance between the optimal fitness curve and the average fitness curve after convergence, we define the normalized Euclidean norm error as follows:
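As a rough illustration of the idea, the sketch below computes both the global Euclidean norm of the error vector and a normalized norm over the generations after convergence. It assumes that the normalization divides the norm by the number of post-convergence generations, so that runs converging at different generations become comparable; this is one plausible reading, and dividing by the square root of that number would be an equally plausible alternative. The function name and arguments are hypothetical.

```python
import math


def error_norms(best_curve, avg_curve, conv_gen):
    """Return (global norm, normalized post-convergence norm).

    best_curve, avg_curve: per-generation optimal and average fitness.
    conv_gen: index of the first generation counted as converged.
    """
    errors = [b - a for b, a in zip(best_curve, avg_curve)]
    global_norm = math.sqrt(sum(e * e for e in errors))
    tail = errors[conv_gen:]  # error vector after convergence
    # Assumed normalization: divide by the length of the tail vector.
    normalized = math.sqrt(sum(e * e for e in tail)) / len(tail)
    return global_norm, normalized
```

With this normalization the index no longer grows simply because one algorithm converges earlier and therefore contributes a longer error vector.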

The average values of the indexes over 20 runs are shown in Table

Comparison of various indexes under different inertia weights.

Inertia weight algorithm | Maximum of | Mean value of | Median value of | The Euclidean norm of the global error vector | Convergence generation | The Euclidean norm of the normalized error vector after reaching the convergent generation number |
---|---|---|---|---|---|---|
The constant is 1 | 93.6044% | 90.1465% | 63.3698% | 90.6721 | — | — |
( | 94.0279% | 93.6873% | 93.7169% | 8.4488 | 9 | 0.0294 |
( | 93.9941% | 93.2348% | 93.6419% | 28.2732 | 30 | 0.0348 |
( | 94.0290% | 93.2050% | 93.6719% | 31.5911 | 36 | 0.0320 |
( | 94.0044% | 93.2996% | 93.6618% | 27.2316 | 22 | 0.0324 |
( | 93.9926% | 93.2350% | 93.6206% | 28.1948 | 27 | 0.0341 |

From above experimental results, it can be seen that when the inertia weight is constant, it is very difficult to find which generation of curves converges obviously in Table

In summary, the Particle Swarm Optimization algorithm with dynamic inertia weight is better than the one with constant inertia weight, and the algorithm using a nonlinear inertia weight is better than the one using a linear inertia weight. The MKL-SVM-PSO algorithm has achieved good results with the dynamic nonlinear inertia weight of (

In this paper, an MKL-SVM-PSO algorithm with nonlinear inertia weight is proposed to search for the optimal parameter set of the hybrid kernel Support Vector Machine quickly and accurately, and it achieves better results in pulmonary nodule recognition. The main contributions are as follows:

The PSO algorithm is introduced into the mixed-kernel SVM algorithm and used for the discrimination of benign and malignant pulmonary nodules.

On the basis of dynamic weights, the similarities and differences between linear and nonlinear weights are discussed, and the optimal dynamic nonlinear weights are obtained. The average fitness value of the algorithm approaches the optimal fitness value quickly and smoothly, so that the global optimal solution is easier to obtain.

The Euclidean norm of the normalized error vector is proposed as an index to measure the difference between the optimal fitness curve and the average fitness curve after convergence with different inertia weights. The index addresses the problem that different algorithms converge at different generations, which gives their error vectors different dimensions and makes the discrepancies hard to compare. The validity of the dynamic inertia weight algorithm is verified from a statistical point of view.

The experimental results show that the model training time of MKL-SVM-PSO algorithm is only 1/7 of the training time of MKL-SVM grid search algorithm with better recognition effect. It can be seen that the dynamic inertia weight is better than constant inertia weight in the MKL-SVM-PSO algorithm from Table

Although ACC, used as the fitness value, has produced good experimental results in this method, clinical attention is often focused on the SEN index to prevent missed detections. Our next step is to extend the proposed MKL-SVM-PSO algorithm to multiobjective search in order to achieve accurate identification of pulmonary nodules without missed detections.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

The authors gratefully acknowledge the support of the China Postdoctoral Science Foundation (no. 2016M591468), the Jilin Province Postdoctoral Science Foundation (no. 111900358), the Jilin Provincial Science and Technology Development Plan funded project (no. 20170520050JH), the Education Department of Jilin Province (nos. JJKH20170575KJ and JJKH20181041KJ), and the Key Program of the National Natural Science Foundation of China (nos. NSFC11690012 and NSFC11631003).