Early diagnosis and early treatment are key to saving patients' lives and improving their quality of life. Fourier transform infrared (FT-IR) spectroscopy can distinguish malignant from normal tissues at the molecular level. In this paper, programs based on a pattern recognition method were written to classify unknown samples. Spectral data were pretreated by smoothing and the standard normal variate (SNV) method. Leave-one-out cross validation was used to evaluate the discrimination results of the support vector machine (SVM) method. A total of 54 gastric tissue samples were used in this study, comprising 24 normal tissue samples and 30 cancerous tissue samples. The SVM method achieved a sensitivity of 100%, a specificity of 83.3%, and a total discrimination accuracy of 92.2%.
Cancer does great harm to human health. Tumor development is a complex, multistep process affected by many factors. So far, there have been few effective ways to cure cancer, so patient survival depends largely on detecting cancer at an early stage, and exploring methods for early cancer diagnosis is of great importance. However, by the time changes in cell morphology can be seen under a light microscope, millions of cancer cells already exist. During carcinogenesis, nucleic acids, proteins, carbohydrates, and other biomolecules undergo significant changes in their molecular structures. Fourier transform infrared (FT-IR) spectroscopy, an effective tool for investigating chemical changes at the molecular level, has been used to detect carcinoma. This method can examine human tissues directly without sample pretreatment, so the operation is simple, fast, and convenient compared with gene expression-based methods. At present, with the development of biospectroscopy and spectral analysis technology, the application of FT-IR spectroscopy to distinguishing malignant tissues from normal ones has become a focus [
In this paper, the spectra of gastric tissues were collected with an FT-IR spectrometer. To reduce high-frequency random noise, baseline drift, and light scattering effects, the spectra were preprocessed by smoothing and the standard normal variate (SNV) method. After preprocessing, the spectra showed a high signal-to-noise ratio. To achieve high discrimination accuracy in gastric cancer diagnosis, the SVM method was adopted to classify the spectra of normal and cancerous gastric tissues.
A total of 54 gastric tissue samples were obtained from the Surgery Department of the Third Hospital of Peking University, China. The samples were washed with distilled water and divided into two equal parts: one for pathological study and the other for FT-IR spectroscopic measurement. According to the pathological diagnosis, the samples comprised 30 cancerous tissues and 24 normal tissues.
The FT-IR spectra of the samples were recorded on a Nicolet Magna 750 II FT-IR spectrometer with a Spectra-Tech mid-IR optical fiber. The scan range was 4000 cm−1 to 900 cm−1 at a resolution of 4 cm−1.
First, smoothing is applied to filter high-frequency noise from the FT-IR spectra. Then, the absorption band of CO2, which contains little useful information for the measurement, is replaced with a straight line. Last, the standard normal variate (SNV) method [
Each spectrum was preprocessed by SNV; the spectroscopic data of sample
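The smoothing and SNV steps can be illustrated with a minimal sketch. The text does not say which smoothing filter was used, so the centered moving average below is an assumption, as are the function names, the window width, and the toy absorbance values. SNV is written in its usual form: each spectrum is centered on its own mean and divided by its own sample standard deviation.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=5):
    """Smooth one spectrum with a centered moving average (assumed filter).

    The window must be odd. Near the edges the window is truncated,
    so the output has the same length as the input.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        smoothed.append(mean(spectrum[lo:hi]))
    return smoothed


def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and
    sample standard deviation (n - 1 denominator)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


# Hypothetical absorbance values, for illustration only.
raw = [0.10, 0.12, 0.50, 0.11, 0.09, 0.30, 0.10]
processed = snv(moving_average(raw, window=3))
```

After this transformation every spectrum has zero mean and unit standard deviation, which removes multiplicative scatter and offset differences between samples before classification.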
Support vector machine (SVM) is a kernel-based learning method rooted in structural risk minimization [
For labeled training data of the form (
The optimization problem can be efficiently solved using an equivalent formulation to (
The original FT-IR spectra (Figure
The FT-IR spectra of normal and malignant gastric tissues. (a) Original spectra; (b) preprocessed spectra with smoothing and SNV.
Before discrimination, resampling half mean (RHM) [
RHM scores of gastric tissue samples.
Number | Score | Number | Score | Number | Score | Number | Score |
---|---|---|---|---|---|---|---|
1 | 0 | 15 | 0 | 29 | 9 | 43 | 0 |
2 | 0 | 16 | 0 | 30 | 0 | 44 | 0 |
3 | 0 | 17 | 0 | 31 | 0 | 45 | 0 |
4 | 24 | 18 | 471 | 32 | 0 | 46 | 0 |
5 | 0 | 19 | 325 | 33 | 0 | 47 | 0 |
6 | 15 | 20 | 0 | 34 | 0 | 48 | 0 |
7 | 0 | 21 | 0 | 35 | 0 | 49 | 0 |
8 | 0 | 22 | 0 | 36 | 0 | 50 | 0 |
9 | 0 | 23 | 0 | 37 | 0 | 51 | 0 |
10 | 0 | 24 | 42 | 38 | 0 | 52 | 0 |
11 | 289 | 25 | 0 | 39 | 0 | 53 | 0 |
12 | 0 | 26 | 102 | 40 | 0 | 54 | 0 |
13 | 0 | 27 | 0 | 41 | 0 | | |
14 | 1 | 28 | 0 | 42 | 0 | | |
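For readers unfamiliar with RHM, the scores in the table can be understood through the following rough sketch of the general resampling-by-half-means idea. It is an illustration under stated assumptions and not the authors' program: the number of resampling rounds, the fraction of samples flagged per round, the function name, and the toy data are all hypothetical. In each round, a random half of the samples is drawn without replacement, the whole data set is autoscaled with that half's column means and standard deviations, and the samples with the longest scaled vectors are counted.

```python
import random
from statistics import mean, stdev


def rhm_scores(data, n_iter=500, top_fraction=0.05, seed=0):
    """Count how often each sample is among the longest vectors after
    autoscaling with the statistics of a random half of the data."""
    rng = random.Random(seed)
    n = len(data)
    n_top = max(1, round(top_fraction * n))
    scores = [0] * n
    for _ in range(n_iter):
        half = rng.sample(range(n), n // 2)  # half the samples, no replacement
        cols = list(zip(*(data[i] for i in half)))
        means = [mean(c) for c in cols]
        stds = [stdev(c) or 1.0 for c in cols]  # guard against zero spread
        # Euclidean length of every sample after scaling with the half-sample stats
        lengths = [
            sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, stds)) ** 0.5
            for row in data
        ]
        longest = sorted(range(n), key=lengths.__getitem__, reverse=True)[:n_top]
        for i in longest:
            scores[i] += 1
    return scores


# Hypothetical toy data: nine similar samples and one obvious outlier (last row).
data = [
    [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0], [0.1, 0.1],
    [0.2, 0.2], [0.0, 0.1], [0.3, 0.2], [0.1, 0.3], [10.0, 10.0],
]
scores = rhm_scores(data, n_iter=200, top_fraction=0.1)
```

A sample that is flagged in most rounds accumulates a high score, while typical samples stay at or near zero, which matches the pattern of a few large values among many zeros in the table.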
After the outliers were removed, a discrimination model was built to classify cancerous and normal tissue samples, and leave-one-out cross validation (LOOCV) was used to evaluate the discrimination results of the SVM method. LOOCV predicts each sample with a model trained on the remaining samples. One sample was randomly selected and excluded from the training set. The selected sample was regarded as an unknown one and classified using the model built from the remaining training samples, and the result was recorded. This procedure was repeated until every sample had been selected exactly once.
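The LOOCV procedure described above can be sketched as a short loop. To keep the example self-contained, a nearest-centroid classifier stands in for the SVM; it is not the classifier used in this study, and the function names and two-feature toy data are hypothetical.

```python
def loocv(samples, labels, fit, predict):
    """Leave-one-out cross validation: hold out each sample exactly once,
    train on the rest, and record the prediction for the held-out sample."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions


# Stand-in classifier (nearest class centroid), NOT the SVM of the paper.
def fit_centroids(xs, ys):
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict_centroid(centroids, x):
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical two-feature toy data, for illustration only.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = ["normal", "normal", "normal", "cancer", "cancer", "cancer"]
predicted = loocv(samples, labels, fit_centroids, predict_centroid)
```

Because every sample is held out exactly once, the order in which samples are selected does not affect the result; the loop simply visits them in sequence.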
The discrimination results were shown in Table
Discrimination results using SVM method.
Pathological analysis result | SVM result: Positive (T+) | SVM result: Negative (T−) |
---|---|---|
Cancer (D+) | 27 | 0 |
Normal (D−) | 4 | 20 |
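The reported sensitivity, specificity, and accuracy follow directly from the four counts in the table, as the short check below shows. The function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # cancer samples correctly flagged
    specificity = tn / (tn + fp)               # normal samples correctly cleared
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Counts from the table: cancer 27 positive / 0 negative, normal 4 positive / 20 negative.
sens, spec, acc = diagnostic_metrics(tp=27, fn=0, fp=4, tn=20)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # 100.0% 83.3% 92.2%
```

The counts sum to 51 samples, so the accuracy is 47/51, or about 92.2%.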
From Table
In this paper, FT-IR spectra of normal and cancerous gastric tissues were acquired. After preprocessing by smoothing and SNV, the spectra showed a high signal-to-noise ratio, and the SVM method achieved a discrimination accuracy of 92.2%. This indicates that FT-IR spectroscopy combined with chemometric methods is reliable and practical and could be easily implemented in gastric cancer diagnosis.
This work is supported by Programs for Changjiang Scholars and Innovative Research Team (PCSIRT) in University of China (IRT0705) and National Natural Science Foundation (60708026).