A Terahertz Spectroscopy Nondestructive Identification Method for Rubber Based on CS-SVM

A method is proposed for rubber identification based on terahertz time-domain spectroscopy (THz-TDS) and a support vector machine (SVM). To improve accuracy, the cuckoo search (CS) algorithm is used to optimize the penalty factor C and the kernel function parameter g of the SVM; the resulting model is abbreviated as CS-SVM. Principal component analysis (PCA) is applied to reduce the dimension of the spectral data. The top ten principal components, whose cumulative variance contribution rate reaches 93.93%, are extracted from the original spectral data and used as input to CS-SVM. The identification rate of CS-SVM on the testing sets is 100%, higher than the 96.67% obtained on the same testing sets by both PSO-SVM and Grid search. Experimental results show that CS-SVM can accomplish nondestructive identification of different types of rubber. This method lays a theoretical foundation for the application of terahertz spectroscopy in rubber classification and identification.


Introduction
Natural rubber (NR), isobutylene isoprene rubber (IIR), and styrene-butadiene rubber (SBR) are the main materials of tires. With the continuous development of the automotive industry, their consumption has increased rapidly. The quality of a tire is closely related to the type and quality of the rubber material, so it is important to detect the type and composition of the rubber quickly and accurately [1,2]. Traditional rubber identification methods include Fourier transform infrared spectroscopy [3], scanning electron microscopy [4], X-ray spectrometry [5], and ultraviolet-visible spectrometry [6]. They suffer from problems such as complicated detection procedures, long measurement times, environmental pollution caused by improper treatment of chemical reagents, structural damage caused by X-rays, and the poor penetration of infrared light.
The terahertz (THz) spectrum is a far-infrared spectrum with fingerprint characteristics [7]. Because different materials have different characteristic responses to THz radiation, their type and composition can be identified from the characteristic spectrum [8]. Rubber is a polymer compound with a molecular weight between 100,000 and 1,000,000. Its macromolecular vibrational and rotational energy levels lie in the terahertz band, where it shows strong absorption and resonance. Compared with X-rays, THz photons have lower energy and do not damage the structure of rubber. Compared with infrared light, THz waves penetrate rubber more strongly and can more effectively identify the microstructure and fretting characteristics of rubber.
Hirakawa et al. [9] studied rubber compounds by THz-TDS and observed that the carbon blacks in the samples are easily detectable below 1.5 THz based on THz absorption. Komatsu et al. [10] used THz spectroscopy to detect the content and distribution of the rubber additives talc and magnesium hydroxide in ethylene propylene diene monomer (EPDM). Miao Qing et al. [11] used THz-TDS to investigate the optical and spectral properties of three types of rubber from 0.2 to 1.8 THz.

Advances in Condensed Matter Physics
This paper focuses on the identification of natural rubber (NR), isobutylene isoprene rubber (IIR), and styrene-butadiene rubber (SBR). The spectra of the three types of rubber have been investigated in the time and frequency domains, where some obvious characteristic absorption peaks can be observed from 0.3 to 1.6 THz.

Experimental
Experimental Apparatus. The experimental apparatus consists of a Z-3 terahertz time-domain spectroscopy (THz-TDS) system (ZOMEGA, USA) and an ultrafast femtosecond fiber laser (TOPTICA Photonics AG, Germany). The laser generates pulses of 100 fs duration at a central wavelength of about 800 nm with a repetition frequency of 80 MHz. The laser beam is divided into a pump beam and a probe beam by a beam splitter (CBS) [12]. The schematic diagram of the THz-TDS system is shown in Figure 1. In the frequency range of 0.1 to 3.0 THz, the spectral resolution of the system is less than 5 GHz, the signal-to-noise ratio is greater than 70 dB, and the spectral data acquisition time is about 1 min.
To reduce the strong absorption of THz waves by moisture, the experiment was conducted in an enclosure into which dry air was continuously injected to keep the relative humidity below 3%, and the temperature was kept at room temperature.
Samples. Three types of rubber are to be identified in this paper: IIR, SBR, and NR. They look similar and are difficult to distinguish by eye. Five experimental samples were made for each type of rubber. Each sample was a disk with a thickness of about 1 mm and a diameter of about 12 mm. The inside of each sample was uniform, and the upper and lower surfaces were parallel to each other. Because moisture strongly absorbs THz waves, all the experimental samples were dried in a vacuum oven at a constant temperature of 50 °C for 2 to 3 hours before the experiment to reduce the moisture content. During the experiment, each sample was measured 12 times, 6 times on the front side and 6 times on the back side, giving 60 sets of data for each type of rubber. To eliminate coarse errors, the 60 sets were averaged in pairs to obtain 30 sets of data for each type of rubber, so 90 sets of data were obtained for the three types of rubber.
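The pairwise averaging step can be illustrated with a minimal sketch. The text does not say which measurements were paired, so averaging consecutive pairs is an assumption here, and the function name `average_pairs` and the three-point toy spectra are purely illustrative.

```python
def average_pairs(spectra):
    """Average consecutive pairs of spectra: 2n measurements -> n averaged spectra.

    Assumption: measurements are paired in acquisition order; the source only
    states that two sets at a time were averaged.
    """
    if len(spectra) % 2 != 0:
        raise ValueError("an even number of spectra is required")
    averaged = []
    for i in range(0, len(spectra), 2):
        first, second = spectra[i], spectra[i + 1]
        # Point-by-point mean of the two spectra in the pair
        averaged.append([(a + b) / 2.0 for a, b in zip(first, second)])
    return averaged


# 5 samples x 12 measurements = 60 spectra for one rubber type (toy data)
measurements = [[float(k), float(k) + 1.0, float(k) + 2.0] for k in range(60)]
averaged = average_pairs(measurements)
print(len(averaged))  # 30 averaged spectra per rubber type
```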

Theory and Algorithm
Spectral Parameter Extraction. The reference signal $E_{\mathrm{ref}}(\omega)$ and the sample signal $E_{\mathrm{sam}}(\omega)$ in the frequency domain are obtained from the time-domain reference signal (air) and the time-domain sample signal by fast Fourier transform. Absorbance, a dimensionless relative quantity that indicates a material's absorption level, is used as the optical parameter for spectral analysis to reduce the error caused by sample thickness [13]. Absorbance is calculated by formula (1), in which Absorbance is the absorbance and $\omega$ is the angular frequency of the THz wave.
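Since formula (1) is not reproduced in this text, the sketch below assumes one definition that is commonly used in THz-TDS work, $\mathrm{Absorbance}(\omega) = -\lg\left(|E_{\mathrm{sam}}(\omega)|^{2} / |E_{\mathrm{ref}}(\omega)|^{2}\right)$. The authors' exact expression may differ (for example, in the power of the amplitude ratio), and the function name and toy spectra are hypothetical.

```python
import math


def absorbance(e_sam, e_ref):
    """Absorbance per frequency bin from complex sample and reference spectra.

    Assumed definition (not confirmed by the source):
        A(w) = -log10(|E_sam(w)|^2 / |E_ref(w)|^2)
    """
    if len(e_sam) != len(e_ref):
        raise ValueError("sample and reference spectra must have equal length")
    return [-math.log10((abs(s) / abs(r)) ** 2) for s, r in zip(e_sam, e_ref)]


# Toy complex spectra: the sample amplitude is 10x lower in the first two bins
e_ref = [1.0 + 0.0j, 0.0 + 2.0j, 4.0 + 0.0j]
e_sam = [0.1 + 0.0j, 0.0 + 0.2j, 4.0 + 0.0j]
a = absorbance(e_sam, e_ref)
```

Because only the amplitude ratio enters, a tenfold amplitude drop gives an absorbance of about 2 under this definition, and an unattenuated bin gives 0.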
Support Vector Machine. The support vector machine (SVM), a learning algorithm based on statistical learning theory, was first proposed by Vapnik [14,15]. Based on Vapnik-Chervonenkis (VC) theory and the structural risk minimization (SRM) principle, SVM minimizes both the empirical risk and the confidence interval, aiming to achieve the best generalization ability [16]. To improve the generalization ability of the learning machine, samples that are not linearly separable are mapped into a higher-dimensional space by a kernel function, and an optimal hyperplane is constructed there to separate them linearly [17,18]. The SVM classification principle is to construct the classification hyperplane in the feature space and to use $w^{T}x + b = \pm 1$ to denote the two parallel hyperplanes of the classification function [19]. The classification interval is $2/\|w\|$, so the interval reaches its maximum when $\|w\|$ is smallest. A nonlinear problem is converted into a linear one by the kernel function, and the optimal classifier is obtained in the higher-dimensional space. The solution of the optimal classifier can be transformed into a dual problem, whose objective function, in the standard soft-margin form, is as follows:

$$\max_{\alpha}\; Q(\alpha) = \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j K(x_i, x_j), \quad \text{s.t. } \sum_{i=1}^{n}\alpha_i y_i = 0,\; 0 \le \alpha_i \le C.$$

After solving the dual problem, the optimal classification function is as follows:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i y_i K(x_i, x) + b\right),$$

where $K(x_i, x_j) = \Phi(x_i)\cdot\Phi(x_j)$ is a kernel function satisfying the Mercer condition, $\alpha_i$ are the Lagrange multipliers, $y_i \in \{-1, +1\}$ are the class labels, $C$ is the penalty factor, and $b$ is the classification threshold.
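As a small illustration of how a kernel-based classification function is evaluated, the sketch below computes $\operatorname{sgn}(\sum_i \alpha_i y_i K(x_i, x) + b)$ for a binary problem. It assumes a radial basis function kernel $K(x, z) = \exp(-g\|x - z\|^{2})$, which is the usual meaning of a kernel parameter named g but is not stated explicitly in the text. The support vectors, multipliers, and threshold are hand-picked hypothetical values, not fitted to any data, and the multi-class extension needed for three rubber types is not shown.

```python
import math


def rbf_kernel(x, z, g):
    """RBF kernel K(x, z) = exp(-g * ||x - z||^2) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)


def decision(x, support_vectors, alphas, labels, b, g):
    """Binary SVM decision: sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    score = b
    for alpha, y, sv in zip(alphas, labels, support_vectors):
        score += alpha * y * rbf_kernel(sv, x, g)
    return 1 if score >= 0 else -1


# Hypothetical values chosen by hand; note that sum(alpha_i * y_i) == 0
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
alphas = [1.0, 1.0]
labels = [1, -1]

pred_near_first = decision([0.1, 0.1], support_vectors, alphas, labels, b=0.0, g=0.5)
pred_near_second = decision([1.9, 2.1], support_vectors, alphas, labels, b=0.0, g=0.5)
```

A larger g makes the kernel more local, so each support vector influences a smaller neighborhood; this is why g and the penalty factor C are tuned together.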
CS-SVM Algorithm. Cuckoo search (CS) is a heuristic global optimization algorithm proposed by Xin-She Yang and Suash Deb [20]. It is based on simulating the way cuckoos search for nests in which to lay their eggs. Compared with other algorithms, CS can obtain a globally optimal solution while using fewer parameters. It has attracted wide attention and has been applied to engineering optimization problems [21,22]. The flowchart of the CS-SVM algorithm is shown in Figure 2.
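The flowchart in Figure 2 is not reproduced here, so the following is only a minimal sketch of a simplified cuckoo search, not the authors' implementation. It assumes Lévy-flight steps drawn with Mantegna's algorithm, greedy replacement of each nest, and random abandonment of a fraction `pa` of the non-best nests. The objective `placeholder_error` is a hypothetical stand-in for an SVM validation error as a function of $(\log_2 C, \log_2 g)$; in practice it would be replaced by training and validating an SVM. All parameter values and search bounds are illustrative.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=15, pa=0.25, n_iter=100, scale=0.1, seed=0):
    """Minimize `objective` over box `bounds` with a simplified cuckoo search."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]
    history = [min(fitness)]
    for _ in range(n_iter):
        best = list(nests[fitness.index(min(fitness))])
        # Levy flights relative to the current best nest; keep a move only if it improves
        for i in range(n_nests):
            cand = clip([x + scale * levy_step(rng) * (x - b)
                         for x, b in zip(nests[i], best)])
            f = objective(cand)
            if f < fitness[i]:
                nests[i], fitness[i] = cand, f
        # Abandon a fraction pa of the nests (never the best one) and rebuild them
        best_idx = fitness.index(min(fitness))
        for i in range(n_nests):
            if i != best_idx and rng.random() < pa:
                nests[i] = random_nest()
                fitness[i] = objective(nests[i])
        history.append(min(fitness))
    best_idx = fitness.index(min(fitness))
    return nests[best_idx], fitness[best_idx], history


def placeholder_error(params):
    """Hypothetical stand-in for SVM validation error over (log2 C, log2 g)."""
    log2_c, log2_g = params
    return (log2_c - 3.0) ** 2 + (log2_g + 2.0) ** 2


bounds = [(-5.0, 15.0), (-15.0, 5.0)]  # illustrative search box
best_params, best_error, history = cuckoo_search(placeholder_error, bounds)
```

Searching in $\log_2$ space is a design choice made for this sketch, since C and g typically vary over several orders of magnitude. Because the best nest is never abandoned, the best error recorded in `history` cannot increase from one iteration to the next.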

Results and Discussion
Spectral Analysis. Experimental data of the three types of rubber (IIR, SBR, and NR) were processed in MATLAB and imported into Origin software. Figure 3(a) shows the THz spectra of the three types of rubber in the time domain. Compared with the reference signal (air), all three sample curves show some amplitude attenuation and time delay, and the curves differ from each other. The amplitude attenuation is caused by scattering and absorption by the sample, and the time delay is caused by the sample thickness and refractive index.
The frequency-domain spectra, obtained from the corresponding time-domain spectra by FFT, are shown in Figure 3(b). The spectra of the three types of rubber are attenuated to some degree compared with the reference signal because each rubber absorbs THz waves to a different extent. However, the intensities and positions of the absorption peaks are different. The NR spectrum shows obvious peaks at 0.31 THz and 0.43 THz, and these peaks are higher than those of IIR and SBR.
Owing to the low signal-to-noise ratio of the instrument above 1.4 THz, the effective spectral range is 0.3 to 1.4 THz. Figure 4 shows the absorption spectra of the three types of rubber from 0.3 to 1.4 THz. IIR, SBR, and NR have obvious absorption peaks between 1.0 and 1.4 THz, which are attributed to the nonlinear absorption of terahertz radiation by rubber, similar to that of two-dimensional materials in the optical and microwave bands [23]. The positions of these absorption peaks are shown in Table 1. The results show that the spectra of the three types of rubber are not significantly different and are hard to distinguish directly.
Principal Component Analysis. To eliminate the data redundancy caused by excessive data dimension, PCA is used to reduce the dimension of the absorption spectra. The absorption spectra matrix is reduced from 90 rows and 99 columns to 90 rows and 10 columns by selecting the top ten principal components. Their cumulative variance contribution rate reaches 93.93%, which means that the main information of the original data is retained. The variance contribution rate and cumulative variance contribution rate of the principal components are shown in Table 2.
The two-dimensional score plots of the first three principal components of the three types of rubber are shown in Figure 5. In Figure 5(a), the three types of rubber cluster well, so they can be classified and identified. In Figure 5(b), IIR clusters better than SBR and NR, so it is easy to identify in the score plot of the first and third principal components, whereas SBR and NR cluster relatively poorly and are difficult to distinguish. The three-dimensional score plot of the first three principal components is shown in Figure 6. The three types of rubber cluster well, so they can be identified. Therefore, the principal components effectively extract the classification characteristics of the samples. A mathematical model was then established using the CS algorithm and SVM to obtain a more accurate classification result.
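How the number of retained components follows from a cumulative variance contribution rate can be sketched as follows. Table 2 is not reproduced here, so the component variances below are hypothetical round numbers rather than the paper's values, and the function name `components_for_threshold` is illustrative.

```python
def components_for_threshold(variances, threshold):
    """Return the smallest k whose cumulative variance contribution rate
    reaches `threshold`, together with the cumulative rates up to k."""
    total = sum(variances)
    ordered = sorted(variances, reverse=True)  # largest variance first
    running = 0.0
    rates = []
    for k, variance in enumerate(ordered, start=1):
        running += variance
        rates.append(running / total)
        if rates[-1] >= threshold:
            return k, rates
    return len(ordered), rates


# Hypothetical principal component variances (eigenvalues of the covariance matrix)
variances = [4.0, 2.0, 1.0, 0.5, 0.25, 0.25]
k, rates = components_for_threshold(variances, 0.90)
print(k, rates)  # 4 [0.5, 0.75, 0.875, 0.9375]
```

In the paper's case the same calculation, applied to the spectra's principal components, gives a cumulative rate of 93.93% at ten components.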
Cuckoo Algorithm Modeling. The spectral data of the three types of rubber, after dimension reduction by PCA, are divided into two data sets: a training set and a testing set. Of the 30 sets of data for each type of rubber, 20 are used for training and the remaining 10 for testing. The labels for IIR, SBR, and NR are defined as 1, 2, and 3, as shown in Table 3. The labeled data, represented by the ten principal components with the highest contribution rates, are used as the feature input to the CS-SVM prediction model. The identification rates of the training sets and the testing sets are obtained and compared with those of the PSO-SVM and Grid search models [24,25]. The results are shown in Table 4. The identification rates of the training sets were 100% for all three models. The identification rates of the testing sets for the CS-SVM, PSO-SVM, and Grid search models are 100%, 96.67%, and 96.67%, respectively, and the identification times are 0.81 s, 0.91 s, and 3.26 s, respectively. Experimental results show that CS-SVM is superior to the other two classification and identification models.
The identification results of CS-SVM, PSO-SVM, and Grid search for the 30 testing sets of the three types of rubber are shown in Figure 7. All 30 testing sets are correctly classified and identified by the CS-SVM model, as shown in Figure 7(a). However, the Grid search and PSO-SVM models cannot correctly classify all 30 testing sets, as shown in Figures 7(b) and 7(c). Therefore, compared with Grid search and PSO-SVM, CS-SVM has a shorter identification time and a higher identification rate for the three types of rubber.
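The split sizes and the reported rates can be checked with a short sketch. The text does not say how the 20 training sets per class were chosen, so the seeded random split below is an assumption, and the function names are illustrative. The arithmetic shows that a testing identification rate of 96.67% corresponds to 29 of 30 testing sets being identified correctly.

```python
import random


def split_per_class(labels, n_train, seed=0):
    """Split indices so that each class contributes n_train training items.

    Assumption: the split is random within each class; the source does not
    describe the selection rule.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return train, test


def identification_rate(true_labels, predicted_labels):
    """Percentage of items whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Labels as in Table 3: IIR = 1, SBR = 2, NR = 3, with 30 data sets per type
labels = [1] * 30 + [2] * 30 + [3] * 30
train_idx, test_idx = split_per_class(labels, n_train=20)
test_labels = [labels[i] for i in test_idx]

# Hypothetical prediction with exactly one testing set misidentified
one_error = list(test_labels)
one_error[0] = 2 if one_error[0] != 2 else 3

rate_all_correct = identification_rate(test_labels, test_labels)
rate_one_error = identification_rate(test_labels, one_error)
```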

Conclusions
Time-domain spectra, frequency-domain spectra, and absorbance of three types of rubber are obtained by THz-TDS. PCA is used to reduce the dimension of the spectral feature data, and the top ten principal components, with a cumulative variance contribution rate of 93.93%, are extracted to establish three classification and identification models. The identification rates of the training sets and the testing sets for CS-SVM are 100%, and the identification time is 0.81 s. CS-SVM is better than PSO-SVM and Grid search in both testing-set identification rate and identification time. Experimental results show that the method proposed in this paper can identify three types

Figure 3: Spectra of three types of rubber.

Table 1: THz absorption peaks of the different types of rubber.

Table 2: Variance contribution rate and cumulative variance contribution rate of the principal components.

Table 3: Data classification and sample labels of the three types of rubber.

Table 4: Identification rate and time of the three models.