Modeling Based on the LS-SVM Optimized by PSO

Based on the definition of nonlinear cointegration, this article studies the small-sample nonlinear cointegration test and the NECM (Nonlinear Error Correction Model) built on an LS-SVM (Least Squares Support Vector Machine) optimized by PSO (Particle Swarm Optimization), and designs the logical process of the method. Then, we carry out empirical research on the ship maintenance cost index and several price indexes. Based on the judgment of the type of cointegration relationship, the nonlinear cointegration test for small samples is realized, and the NECM for predicting the ship maintenance cost index is established and compared with the linear VAR model. The results show that the small-sample nonlinear cointegration test and modeling method based on the LS-SVM optimized by PSO can describe the nonlinear cointegration relationship of a small-sample system, and that the resulting NECM performs better and can effectively predict small-sample nonlinear systems. We also compare the prediction results with those of a wavelet neural network algorithm; the comparison shows that the LS-SVM optimized by PSO generalizes better and achieves higher prediction accuracy on small samples.


Introduction
The modeling method of traditional econometrics is based on the asymptotic theory of least squares estimation, which is premised on the stationarity of the time series. In economic systems, especially macroeconomic ones, variables are mostly nonstationary, so traditional statistical inference cannot be applied directly. As a result, Engle and Granger proposed cointegration theory [1] and pointed out that a cointegrated system has three main forms, namely, the VAR (Vector Autoregressive) model, the VMA (Vector Moving Average) model, and the ECM (Error Correction Model), which provide an approach to modeling nonstationary time series. Cointegration theory describes the long-term linear equilibrium relationship between the component sequences of an economic system and can be expressed by the ECM. Because the ECM based on linear cointegration effectively avoids the spurious-regression phenomenon, it has been widely used in economic forecasting [2][3][4].
Linear cointegration implies two basic conditions: first, each time series is integrated of an integer order, and second, all series share the same integer order of integration; these conditions are the basis of linear cointegration theory. However, many time series in economic systems exhibit nonlinearity and long memory [5]. A single time series may not have an integer order of integration but rather a fractional order, or the series may have different integer orders, which means the series are nonlinear with fractal character, and the equilibrium relationship between sequences is often nonlinear as well. For nonlinear systems, linear cointegration theory is not applicable, and nonlinear cointegration theory must be used to study the nonlinear equilibrium relationship between time series. Literature [6] gave the definition of nonlinear cointegration: for a vector time series X_t = (x_{1t}, x_{2t}, ..., x_{nt})^T, the component sequences of {X_t} are said to be nonlinearly cointegrated if (1) each x_{it} (i = 1, ..., n) is an LMM (Long Memory in Mean) sequence or an I(1) sequence, and (2) there is a nonlinear function f(·) such that y_t = f(x_{1t}, x_{2t}, ..., x_{nt}) is an SMM (Short Memory in Mean) sequence.
The function f(·) is called the nonlinear cointegrating function. If the cointegrating function is linear in its arguments, that is,

f(x_{1t}, x_{2t}, ..., x_{nt}) = α_1 x_{1t} + α_2 x_{2t} + ... + α_n x_{nt}, (1)

where α = (α_1, α_2, ..., α_n) is a vector in R^n, and the long memory of each component sequence of {X_t} takes the form of an I(1) sequence, then the nonlinear cointegration reduces to a linear cointegration relationship. Therefore, the definition of nonlinear cointegration is a generalization of the concept of linear cointegration, and linear cointegration is a special case of nonlinear cointegration. For the vector time series X_t = (x_{1t}, x_{2t}, ..., x_{nt})^T, if the component sequences are all LMM sequences (I(1) sequences are all LMM sequences; if an LMM sequence is I(1), its integration orders need not be equal), and there is a nonlinear function f(·) such that f(x_{1t}, x_{2t}, ..., x_{nt}) is an SMM sequence, then {X_t} is said to be nonlinearly cointegrated.
The key to testing and modeling a nonlinear cointegration relationship lies in estimating the nonlinear cointegrating function f(·). Given that general linear cointegration analysis cannot realize the nonlinear cointegration test and modeling, literature [6] proposed a neural network algorithm suitable for testing and modeling the nonlinear cointegration relationship of vector time series and discussed the theoretical basis and feasibility of the method. Using a wavelet neural network, which can approximate nonlinear functions arbitrarily well, literature [7] studied the NECM of nonlinear cointegration systems, provided the modeling method, and carried out the relevant analysis. These studies found that, compared with the linear VAR model, the NECM has a better prediction effect and can effectively predict nonlinear economic systems. For nonlinear system modeling problems, many meta-heuristic algorithms have emerged in recent years, such as Monarch Butterfly Optimization (MBO) [8], the Slime Mould Algorithm (SMA) [9], the Moth Search Algorithm (MSA) [10], Hunger Games Search (HGS) [11], the Runge Kutta method (RUN) [12], the Colony Predation Algorithm (CPA) [13], and Harris Hawks Optimization (HHO) [14]. These algorithms have simple principles, are easy to implement, and perform excellently on the optimization problems of complex nonlinear systems when samples are large. But the basic models of these algorithms also have obvious defects, leaving considerable room for improvement.
For example, MBO has a single local search ability and a limited search range, which can lead to population degradation, and its global search method is too simple to make full use of the population's information; moreover, adopting an elite retention strategy requires parameter setting and sorting operations, which increases the complexity of the algorithm [15]. The structure of SMA is simple and clear, but the algorithm still has shortcomings, such as falling easily into local optima, slow convergence, and low accuracy [16]. The MSA has the advantages of a simple structure, few parameters, high accuracy, and strong robustness, but its exploitation ability still needs improvement, and it also falls easily into local optima [17]. In numerical analysis, the Runge Kutta method is an important family of implicit and explicit iterative methods for approximating solutions of ordinary differential equations, but compared with other intelligent algorithms, its calculation process is more complex.
The CPA algorithm has a long running time, intermittent edges, and parameters that are difficult to tune [18]. For the HHO algorithm, how to balance exploration and exploitation and how to alleviate premature convergence are two critical concerns in HHO research [19]. Like the above algorithms, HGS requires a large number of training samples; it works well on large-sample research objects but is not suitable for small samples [20].
The above research provides effective methods and tools for testing and modeling nonlinear cointegration relationships. However, these methods place certain requirements on the quantity and quality of each series and are suitable for large-sample time series (more than about 100 observations). Neural network and wavelet neural network algorithms are prone to overfitting, their generalization ability is poor [21], and they therefore have limited applicability to relationship testing and modeling on small samples. In view of this, this article proposes to apply the LS-SVM [22], which has excellent learning performance and strong generalization ability, to test the nonlinear cointegration relationship among small-sample time series. Meanwhile, PSO, which is widely used in function optimization, neural network training, fuzzy system control, and other fields, is selected to optimize the parameters of the LS-SVM. This article further studies the modeling of the small-sample NECM by the LS-SVM optimized by PSO. With the proposed method, the nonlinear cointegration relationship among the ship maintenance cost index and several price indexes is tested and modeled, and an empirical analysis is carried out and compared with the VAR model and the wavelet neural network algorithm.

Nonlinear Cointegration
The memory of a time series is mainly judged from the decay rate of its autocorrelation function ρ_k. For a short-memory time series, ρ_k decays at a negative exponential rate as k → ∞, that is, ρ_k ~ c a^{-k} (k = 1, 2, ...), where a and c are constants with a > 1 and c > 0. For a long-memory time series, ρ_k decays at a much slower hyperbolic rate as k → ∞, that is, ρ_k ~ c k^{2d-1} (k → ∞), with 0 < d < 0.5. Therefore, the faster the autocorrelation function of a time series decays, the more likely the series is to have short memory.
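As an illustration of the decay criterion above, the short sketch below (a hypothetical example, not from the article) simulates a short-memory AR(1) series and checks that its sample autocorrelations decay roughly geometrically, i.e., that log ρ_k is close to linear in k:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_k for k = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

# AR(1) with phi = 0.6 is short memory: rho_k = phi**k decays geometrically.
rng = np.random.default_rng(42)
phi, n = 0.6, 20000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

acf = sample_acf(x, 6)
# Geometric decay: log(rho_k) is roughly linear in k with slope log(phi) ~ -0.51.
slope = np.polyfit(np.arange(1, 7), np.log(np.abs(acf)), 1)[0]
```

A long-memory series would instead show a slope that flattens out, since ρ_k shrinks only polynomially in k.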

NECM Model.
Similar to the error correction model (ECM) derived from linear cointegration, for the vector time series X_t = (x_{1t}, x_{2t}, ..., x_{nt})^T, when a nonlinear cointegration relationship exists among its component series, the derived NECM is

ΔX_t = Γ_0 f(X_{t-k}; θ) + Σ_{j=1}^{k} Γ_j ΔX_{t-j} + ε_t, (2)

where f(·) is the nonlinear cointegrating function, θ is a parameter vector, ε_t = (ε_{1t}, ε_{2t}, ..., ε_{nt})^T is a random error sequence, Γ_j (j = 0, 1, 2, ..., k) are n × 1 coefficient matrices, and k is the lag order of the endogenous variables. The NECM of the i-th component is then

Δx_{it} = γ_{i0} f(X_{t-k}; θ) + Σ_{j=1}^{k} Γ_j^(i) ΔX_{t-j} + ε_{it}. (3)

Estimate each parameter in formula (3) and substitute the estimates back into the formula; the NECM corresponding to each component is then obtained. The model can be used to predict each component sequence, and the h-step prediction model of the i-th component is

Δx̂_{i,t+h} = γ_{i0} f(X_{t+h-k}; θ) + Σ_{j=1}^{k} Γ_j^(i) ΔX_{t+h-j}. (4)

The estimate f(X_{t-k}; θ) of the nonlinear function in the NECM is the equilibrium error of {X_t} at time t-k, and it corrects the difference sequence, serving the same role as the error correction term in linear cointegration. As h varies, the estimate f(X_{t+h-k}; θ) of the nonlinear cointegrating function also varies. Therefore, the NECM's predictions of the variables at different time periods are independent of each other.
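As a toy illustration of the error-correction mechanism described above (all coefficients, data, and the function f below are hypothetical stand-ins, not estimates from the article), a one-step NECM forecast for a single component can be sketched as:

```python
import numpy as np

def necm_forecast(levels, f, gamma0, Gammas):
    """One-step NECM forecast of the first component's difference.
    levels: (T, n) array of the vector series X_t.
    f: estimated nonlinear cointegration function R^n -> R.
    gamma0: scalar loading on the error-correction term.
    Gammas: list (length k) of (n,) coefficient vectors on lagged differences."""
    dX = np.diff(levels, axis=0)              # first differences
    ec = gamma0 * f(levels[-1])               # error-correction term at t-1
    ar = sum(G @ dX[-(j + 1)] for j, G in enumerate(Gammas))
    return ec + ar

# Hypothetical two-variable system with a stand-in cointegration function.
levels = np.array([[100., 100.], [103., 101.], [107., 103.], [110., 104.]])
f = lambda x: x[0] - 1.05 * x[1]
pred = necm_forecast(levels, f, gamma0=-0.2, Gammas=[np.array([0.5, 0.1])])
next_level = levels[-1, 0] + pred             # x_{t+1} = x_t + delta
```

The negative loading gamma0 pulls the forecast back toward equilibrium when f(X_{t-1}) is positive, which is the correction role played by the equilibrium-error term.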

Parameter Optimization of the LS-SVM Model Based on PSO

The Support Vector Machine (SVM) is an effective tool for nonlinear problems developed in recent years, with great advantages for small-sample, nonlinear, and high-dimensional problems [25]. The LS-SVM [26] is an extension of the standard SVM; it has good robustness, requires fewer parameters to be optimized, and is widely applied [27].

LS-SVM Model and Selection of Kernel Function.
The LS-SVM is obtained by transforming the inequality constraints of the standard SVM algorithm into equality constraints [28]; it is the form the SVM takes under a quadratic loss function.
In LS-SVM modeling, the kernel function projects the samples into a high-dimensional space, transforming the task into a linear regression problem from which the optimal regression curve is constructed [29]. Therefore, the selection of the kernel function directly affects the generalization ability of the fitted curve. Common kernel functions include the polynomial kernel and the RBF (radial basis function) kernel. Selecting a kernel involves determining both the form of the kernel function and its parameters. Considering the strong generalization ability of the RBF kernel [30,31], this article selects the RBF kernel for the modeling research.
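To make the kernel mapping concrete, the sketch below fits an RBF-kernel LS-SVM regression by solving its linear system. This is a generic implementation of the standard LS-SVM formulation, not the article's code; the names `gamma` and `sigma2` stand for the regularization parameter γ and kernel parameter σ² discussed later.

```python
import numpy as np

def rbf(X, Z, sigma2):
    """RBF kernel matrix K_ij = exp(-||x_i - z_j||^2 / sigma2)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM linear system for the bias b and dual weights alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    return rbf(X_new, X_train, sigma2) @ alpha + b

# Fit a smooth 1-D target; hyperparameters here are illustrative.
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = np.sin(2 * np.pi * X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=100.0, sigma2=0.05)
fit = lssvm_predict(X, b, alpha, 0.05, X)
mse = float(np.mean((fit - y) ** 2))
```

Because the equality-constrained formulation reduces training to one linear solve, the only quantities left to tune are γ and σ², which motivates the PSO search described next.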

Optimization of LS-SVM Model Parameters Based on PSO.
After selecting the RBF kernel for the LS-SVM, two parameters must be determined, namely the regularization parameter γ and the kernel parameter σ². These two parameters largely determine the learning and prediction ability of the LS-SVM, so it is necessary to find the optimal combination of γ and σ².
At present, the usual way to choose the LS-SVM parameters is grid search, which reaches satisfactory results through repeated experiments but is time-consuming and inefficient. Some scholars proposed using gradient descent to select the LS-SVM parameters [32], but this method requires the kernel function to be differentiable and falls easily into local minima during the search. Others have tried an immune algorithm to optimize the LS-SVM parameters [33], reducing the blindness of parameter selection and improving prediction accuracy, but the method is complicated to implement. Still others proposed using a genetic algorithm to determine the LS-SVM parameters [34], but the genetic algorithm needs crossover and mutation operations and many parameters to adjust, making it computationally complex and inefficient. PSO is a stochastic optimization algorithm based on swarm intelligence. Like the genetic algorithm, PSO is a population-based random search tool, but it has no crossover or mutation operations: particles search the solution space by following the current best particle. PSO offers parallel processing, good robustness, simplicity of implementation, and high computational efficiency, and it can find the global optimum of a problem with high probability. Therefore, this article selects the PSO algorithm to optimize the LS-SVM parameters, and the nonlinear cointegration relationship of small-sample time series is tested based on the LS-SVM optimized by PSO.

PSO Algorithm.
In the PSO algorithm, each candidate solution is called a "particle". Multiple particles coexist and search cooperatively: each particle flies toward a better position in the problem space according to its own "experience" and the best "experience" of the neighboring swarm. Each particle represents a potential solution of the system and is described by three indicators: position, velocity, and fitness. The mathematical formulation of the PSO algorithm is as follows [35]. Suppose that in an n-dimensional search space, m particles form a swarm, where X_i = (X_{i1}, X_{i2}, ..., X_{in}) is the current position of particle i; V_i = (V_{i1}, V_{i2}, ..., V_{in}) is its current velocity; P_i = (P_{i1}, P_{i2}, ..., P_{in}) is the best position experienced by particle i, called the individual optimum; and P_g = (P_{g1}, P_{g2}, ..., P_{gn}) is the best position found by the whole swarm so far, called the global optimum. The velocity and position of each particle are updated according to

V_i^{k+1} = w V_i^k + c_1 rand_1() (P_i - X_i^k) + c_2 rand_2() (P_g - X_i^k), (5)
X_i^{k+1} = X_i^k + V_i^{k+1}, (6)

where k denotes the k-th generation of evolution; w is the inertia weight, indicating to what extent the particle's previous velocity is retained: a larger w gives better global search ability, while a smaller w gives stronger local search ability, slowing the particles down and preventing oscillation as they move toward P_g; w is generally defined to decrease linearly over the evolution. c_1 and c_2 are two positive learning factors: c_1 adjusts the step size toward the individual optimum P_i, and c_2 adjusts the step size toward the global optimum P_g; the larger the value, the greater the acceleration toward P_i and P_g.
Generally, c_1 = c_2 = 2.0, and rand_1() and rand_2() are random numbers in the range [0, 1]. Meanwhile, to reduce the chance of particles leaving the search space during evolution, V_i is usually limited to a certain range, namely V_i ∈ [-V_max, V_max], where V_max is a constant. In general, if the search space of the problem is limited to [-X_max, X_max], then V_max = α X_max can be defined, with 0.1 ≤ α ≤ 1.0. The number of particles m is generally 20-40; the particle length n is determined by the optimization problem, as is the range of the particles, and a different range can be set for each dimension. The termination condition of the algorithm can be defined as a maximum number of iterations or a minimum error requirement, or by the specific problem [36].
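The update rules (5) and (6), together with velocity clamping and a linearly decreasing inertia weight, can be sketched as follows. This is a generic minimal PSO minimizing the sphere function; all settings are illustrative, not the article's.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso(fitness, dim, m=30, T=200, x_max=10.0, alpha=0.5, c1=2.0, c2=2.0):
    """Minimal PSO with linearly decreasing inertia and velocity clamping."""
    v_max = alpha * x_max
    X = rng.uniform(-x_max, x_max, (m, dim))       # positions
    V = rng.uniform(-v_max, v_max, (m, dim))       # velocities
    P = X.copy()                                   # individual best positions
    P_fit = np.array([fitness(p) for p in P])
    g = P[P_fit.argmin()].copy()                   # global best position
    for t in range(T):
        w = 0.9 - 0.5 * t / T                      # inertia: 0.9 -> 0.4
        r1, r2 = rng.random((m, dim)), rng.random((m, dim))
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # rule (5)
        V = np.clip(V, -v_max, v_max)              # clamp to [-V_max, V_max]
        X = np.clip(X + V, -x_max, x_max)          # rule (6)
        fit = np.array([fitness(x) for x in X])
        better = fit < P_fit
        P[better], P_fit[better] = X[better], fit[better]
        g = P[P_fit.argmin()].copy()
    return g, P_fit.min()

best, best_fit = pso(lambda x: float((x ** 2).sum()), dim=2)
```

On this smooth test function the swarm collapses onto the global minimum at the origin; the decreasing w shifts the search from exploration to exploitation.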

Basic Procedures for LS-SVM Optimized by PSO.
The LS-SVM kernel function selected in this article is the RBF, and the parameters to be optimized are γ and σ². Therefore, the particle length in PSO is defined as 2. The particle fitness is the mean square error (MSE) between the LS-SVM target values and output values,

MSE = (1/n) Σ_{i=1}^{n} (y_i - ŷ_i)², (7)

where n is the number of training samples, y_i is the target value of the LS-SVM, and ŷ_i is the output value. The smaller the particle fitness, the better the current particle position. According to the basic principle and steps of the PSO algorithm, the logic flow for optimizing the LS-SVM parameters γ and σ² is shown in Figure 1. The optimization steps are as follows:

Step 1. PSO initialization. Define the particle swarm size as m and the maximum number of iterations as T; set the initial inertia weight w to 0.9, decreasing linearly with the iteration count as w(t) = 0.9 - 0.5t/T; define the maximum velocity V_max; and randomly initialize the position and velocity of each particle, i.e., t = 0, i = 1, 2, ..., m.

Step 2. Calculate the fitness of each particle. Substitute the values of γ_i and σ_i² contained in each particle's current position into the LS-SVM model to obtain the model output ŷ_i, which is compared with the target value y_i to compute the particle's fitness.

Step 3. For each particle, if the fitness (i.e., the MSE between the LS-SVM target and output values) of its current position is smaller than the fitness of its previous best position P_i, set the particle's best position to its current position.

Step 4. For each particle, if the fitness of its best position P_i is smaller than that of the global best position P_g of all particles, replace P_g with P_i.

Step 5. Update the velocity and position of each particle according to formulas (5) and (6) to generate new particles.

Step 6. Check whether the termination condition is reached; if not, return to Step 2.
Obtain the optimal particle information and assign the optimal combination of γ and σ² to the LS-SVM model.
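Putting the steps above together, the sketch below runs PSO over (γ, σ²) with the training MSE as fitness. It is illustrative only: the 18-point training set is synthetic, and the LS-SVM solver is a generic implementation rather than the article's; the search ranges follow the article's later setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(X, Z, s2):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / s2)

def lssvm_mse(X, y, gamma, s2):
    """Fitness: MSE between LS-SVM training output and targets."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, s2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    out = rbf(X, X, s2) @ sol[1:] + sol[0]
    return float(np.mean((out - y) ** 2))

# Search box for (gamma, sigma^2); synthetic small-sample training data.
lo = np.array([1e-3, 1e-3])
hi = np.array([40000.0, 10000.0])
X_tr = np.linspace(0, 1, 18).reshape(-1, 1)
y_tr = np.sin(2 * np.pi * X_tr[:, 0]) + 0.05 * rng.standard_normal(18)

m, T = 20, 60
P_pos = rng.uniform(lo, hi, (m, 2))
V = np.zeros((m, 2))
P_best = P_pos.copy()
P_fit = np.array([lssvm_mse(X_tr, y_tr, *p) for p in P_pos])
g = P_best[P_fit.argmin()].copy()
for t in range(T):
    w = 0.9 - 0.5 * t / T
    r1, r2 = rng.random((m, 2)), rng.random((m, 2))
    V = w * V + 2.0 * r1 * (P_best - P_pos) + 2.0 * r2 * (g - P_pos)
    P_pos = np.clip(P_pos + V, lo, hi)     # keep particles inside the box
    fit = np.array([lssvm_mse(X_tr, y_tr, *p) for p in P_pos])
    better = fit < P_fit
    P_best[better], P_fit[better] = P_pos[better], fit[better]
    g = P_best[P_fit.argmin()].copy()

best_gamma, best_s2 = g
best_mse = P_fit.min()
```

In practice one would score fitness on held-out or cross-validated data rather than the training set, since training MSE alone rewards interpolation; the article's rolling verification on 2018-2021 data plays that role.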

Nonlinear Cointegration Test and NECM Modeling Based on LS-SVM Optimized by PSO
The nonlinear cointegration relationship of a small-sample time series is tested by the LS-SVM optimized by PSO, and the NECM is then established. The basic steps are as follows. First, check whether the component sequences have a linear cointegration relationship by calculating the fractional integration order of each sequence; if the fractional orders differ, the component sequences do not have a linear cointegration relationship. Second, based on the LS-SVM optimized by PSO, test the nonlinear cointegration relationship among the small-sample time series and estimate the nonlinear cointegrating function f(·). The inputs of the LS-SVM are the small-sample sequences to be tested, and the target value is an SMM sequence with constant mean; minimizing the mean square error (MSE) between the target and output values is the training objective, and the kernel parameters of the LS-SVM are optimized by the PSO algorithm to obtain the optimized model. Third, perform a memory test on the output sequence of the optimized model. The memory test adopts the modified R/S statistic Q*_n proposed by Lo [21], whose distribution function is

F(v) = 1 + 2 Σ_{k=1}^{∞} (1 - 4k²v²) e^{-2k²v²},

where v is the quantile and F(v) is the cumulative probability, that is, F(v) = P(Q*_n < v). The larger F(v) is, the greater the probability that the sequence has long memory; when v = 3, F(v) ≈ 0.999 99. When the long memory is not significant, the output sequence of the LS-SVM is a short-memory time series, indicating that a nonlinear cointegration relationship exists between the sequences, and f(·) is the nonlinear cointegrating function of the sequences under test. Finally, substitute the obtained nonlinear cointegrating function f(·) into formula (3) and estimate the parameters by least squares using the Eviews software.
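The modified R/S statistic used in the memory test can be computed as follows. This is a generic implementation of Lo's statistic with a Bartlett-weighted long-run variance; the lag choice q = 4 and the two synthetic series are illustrative.

```python
import numpy as np

def lo_modified_rs(x, q=4):
    """Lo's modified R/S statistic, normalized by sqrt(n).
    Short-memory series give moderate values; long-memory series give
    large values (the article compares against the quantile v = 3)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dev = x - x.mean()
    # Long-run variance: gamma_0 + 2 * sum_j (1 - j/(q+1)) * gamma_j
    s2 = np.dot(dev, dev) / n
    for j in range(1, q + 1):
        w = 1.0 - j / (q + 1.0)
        s2 += 2.0 * w * np.dot(dev[:-j], dev[j:]) / n
    Y = np.cumsum(dev)                 # accumulated deviations
    R = Y.max() - Y.min()              # range of the accumulated deviations
    return R / (np.sqrt(s2) * np.sqrt(n))

rng = np.random.default_rng(7)
v_iid = lo_modified_rs(rng.standard_normal(1000))            # short memory
v_rw = lo_modified_rs(np.cumsum(rng.standard_normal(1000)))  # strong persistence
```

White noise stays well below the v = 3 threshold, while a highly persistent series pushes the statistic far above it, which is the basis of the long-memory decision rule.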
Substitute the parameters into the prediction model (4); the calculation result is the predicted value. For a time series {x_t, t = 1, 2, ..., n}, the sample mean is

x̄ = (1/n) Σ_{t=1}^{n} x_t, (8)

and the sample variance is

s² = (1/n) Σ_{t=1}^{n} (x_t - x̄)². (9)

The accumulated deviation from the sample mean x̄ is

Y_t = Σ_{j=1}^{t} (x_j - x̄), t = 1, 2, ..., n, (10)

the range is

R = max_{1≤t≤n} Y_t - min_{1≤t≤n} Y_t, (11)

and the estimated R/S statistic is

Q_n = R / s. (12)

In the presence of short memory and heteroscedasticity, the R/S statistic is not robust. Lo [37] proposed a revised R/S statistic,

Q*_n = R / σ*(q), (13)

where the long-run variance σ*²(q) augments the sample variance with Bartlett-weighted autocovariances up to lag q:

σ*²(q) = s² + (2/n) Σ_{j=1}^{q} (1 - j/(q+1)) Σ_{t=j+1}^{n} (x_t - x̄)(x_{t-j} - x̄). (14)

For a time series {x_t, t = 1, 2, ..., n}, take m observations at a time: {x_t} is divided into l = [n/m] independent subseries of length m. Calculate the modified R/S test statistic Q_m^i (i = 1, 2, ..., l) for each subseries, and then average the l statistics to obtain the modified R/S statistic of the series at length m.

For different values of m, a sequence of modified R/S statistics is obtained, and Mandelbrot proved that Q*_m ~ C m^H, where C is a constant and H is the Hurst exponent. That is,

log Q*_m = log C + H log m. (17)

Empirical Analysis

To examine sequence memory and fractional order, the memory of SMPI, CPI, PPI, and MPI over the sample interval is tested. The modified R/S statistic Q*_n of each index is shown in Table 1. From the test results in Table 1, the modified R/S statistics of SMPI, CPI, PPI, and MPI are all far greater than 3; these series therefore have significant long-memory characteristics and belong to the class of long-memory time series. The fractional order of each index is calculated next, taking SMPI as an example.
Defining m as 1, 2, 3, 4, and 5, respectively, the statistic Q*_n corresponding to SMPI is calculated; the results are shown in Table 2.
The modified R/S statistics Q*_n and the corresponding m values of SMPI are substituted into Equation (17), and H = 0.804 is obtained by least squares regression. Therefore, the fractional order of SMPI is d = H - 0.5 = 0.304. By the same method and steps, the fractional orders of CPI, PPI, and MPI are calculated separately; the results are shown in Table 3.
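The regression in equation (17) amounts to a least-squares line fit in log-log coordinates; the sketch below recovers a known H from synthetic Q values (the numbers are illustrative, not the article's Table 2 data):

```python
import numpy as np

# Recover H from Q*_m ~ C * m**H via log Q = log C + H log m, then d = H - 0.5.
m = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
Q = 1.7 * m ** 0.804                      # synthetic values with known H = 0.804
H, logC = np.polyfit(np.log(m), np.log(Q), 1)   # slope = H, intercept = log C
d = H - 0.5
```

With noiseless synthetic data the fit recovers H exactly; on real Q*_m estimates the slope carries the sampling noise of each block average.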
According to the Q*_n test results for SMPI, CPI, PPI, and MPI and the calculated fractional order of each index, every index belongs to the long-memory class, and the fractional orders of the indexes are not the same. Therefore, there is no linear cointegration relationship among SMPI, CPI, PPI, and MPI; whether a nonlinear cointegration relationship exists among them requires further testing.

Estimation and Testing of Nonlinear Cointegration Relationships

According to the steps of the nonlinear cointegration test described in Section 4.2, this article selects the 18 groups of data (SMPI, CPI, PPI, and MPI) from 2000 to 2017 as training samples to estimate the nonlinear functional relationship and optimize the LS-SVM parameters. Then, the 4 groups of data from 2018 to 2021 are used as test samples and substituted into the optimized LS-SVM model by rolling verification. The revised R/S statistic Q*_n is estimated for the output sequences of lengths 19, 20, 21, and 22 to test whether they are short-memory sequences.
In the training and verification of the LS-SVM model, the input variables are SMPI, CPI, PPI, and MPI, whose minimum value is 100 (the benchmark value in 2000). Among these variables, SMPI is the largest, but its maximum value is less than 600. To leave headroom for prediction, this article sets the maximum value of the input variables to 1000, so the range of the LS-SVM input variables is [100, 1000]. Since the aim is to determine whether a nonlinear cointegration relationship exists among the input variables, the output variable of the LS-SVM model is set as an SMM sequence, according to the definition of nonlinear cointegration.

Model Parameter Setting.
The key parameters of the PSO algorithm are defined as follows: the number of particles is 20; the particle length is 2, the parameters to be optimized being γ and σ²; the search range of γ is (0, 40000) and that of σ² is (0, 10000); the learning factors c_1 and c_2 are both 2; the maximum number of iterations is 5000; the initial inertia weight w is 0.9, decreasing linearly to 0.4 as the iterations proceed; and the maximum velocity V_max is 10. The position and velocity of each particle are randomly initialized, i.e., t = 0, i = 1, 2, ..., 20. Considering that PSO is a random search algorithm, this article runs the defined configuration 30 times and takes the run with the smallest error as the final training result.

Model Optimization Results.

According to the above settings, the optimal parameters are calculated as γ = 34369.670 and σ² = 5.912. The final error with respect to the target value is -0.0251, and the average error is -0.0268. The iteration history of the calculation and the particle distribution at the final iteration are shown in Figures 3 and 4, respectively.

Nonlinear Relationships Test.
The test samples from 2018 to 2021 are substituted into the LS-SVM model optimized by the PSO algorithm, and the modified R/S statistic Q*_n is used to perform the long-memory test on the 4 output sequences. The statistical test results are shown in Table 4. As Table 4 shows, the test results are not significant, which proves that the four output sequences are short-memory sequences and that the nonlinear function determined by the LS-SVM model is the nonlinear cointegrating function of SMPI with CPI, PPI, and MPI.

Prediction of SMPI Based on NECM Model.
Based on the test results of the nonlinear cointegration relationship among SMPI, CPI, PPI, and MPI, this article establishes the NECM prediction model of SMPI.

Model Setting.
Suppose that X_t = (SMPI_t, CPI_t, PPI_t, MPI_t)^T and that f(X_t) is the nonlinear cointegrating function among SMPI, CPI, PPI, and MPI. The model is defined as

ΔSMPI_t = Σ_{j=1}^{k_1} α_j ΔSMPI_{t-j} + Σ_{j=1}^{k_2} β_j ΔCPI_{t-j} + Σ_{j=1}^{k_3} φ_j ΔPPI_{t-j} + Σ_{j=1}^{k_4} ϕ_j ΔMPI_{t-j} + λ f(X_{t-k_5}) + ε_t. (18)

According to the AIC and SC criteria, the lag orders in formula (18) are determined as k_1 = 1, k_2 = 3, and k_3 = k_4 = 1. From the properties of the NECM, the estimate of the nonlinear function f(X_t) in equation (18) is directly tied to the time of the system: as t differs, the estimate of the nonlinear cointegrating function also differs. The change of SMPI at time t, ΔSMPI_t, is affected by the estimate f(X_{t-1}) of the system's deviation from equilibrium at time t-1.
Therefore, the lag order k_5 in the nonlinear function estimation is 1.

Estimation of Model Parameters.
To predict SMPI with the NECM, it is necessary to estimate f(X_t) for the system at different times and substitute the estimate into equation (18) to estimate the equation's parameters. Therefore, this article again uses the data from 2000 to 2017 as training samples, estimates the nonlinear functional relationship with the LS-SVM model optimized by PSO, takes the output as the estimate of f(X_2017), and applies least squares to equation (18) to obtain ΔSMPI_2018. The predicted SMPI for 2018 then follows from SMPI_t = SMPI_{t-1} + ΔSMPI_t, and the error against the actual value is computed. Along the same lines, the SMPI for 2019-2021 is predicted and compared with the actual values using rolling forecast verification. In constructing each ΔSMPI_t prediction model from 2018 to 2021, the LS-SVM model is optimized by the PSO algorithm and the nonlinear cointegrating function among SMPI, CPI, PPI, and MPI is re-estimated, yielding the parameters γ and σ² of each LS-SVM model. The results are shown in Table 5.
With the LS-SVM optimized by PSO in each group, the sequence of estimates f(X_t) (t = 2017, 2018, 2019, 2020) is obtained. The historical changes of each index and these estimates are substituted into Equation (18), and the least-squares estimation of the model parameters is carried out using the Eviews software. Taking ΔSMPI_2018 as an example, the parameter estimation results are shown in Table 6.

Analysis and Comparison of Model Prediction Results.
According to the parameter estimation results in Table 6, formula (18) gives ΔSMPI_2018 = -7.469. Substituting this result and the 2017 ship maintenance cost index into SMPI_t = SMPI_{t-1} + ΔSMPI_t yields SMPI_2018 = 575.235, an error of 0.26% against the actual 2018 ship maintenance cost index. In the same way, the predicted SMPI for 2019 to 2021 and the errors against the actual values are obtained. The prediction results and errors of the NECM are shown in Table 7.
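The rolling reconstruction of the level series from predicted first differences can be sketched as follows (all numbers below are hypothetical placeholders, not the article's estimates or data):

```python
# Rolling reconstruction: SMPI_t = SMPI_{t-1} + dSMPI_t, with the percent
# error measured against the actual values at each step.
last_level = 576.7                      # hypothetical 2017 level
pred_deltas = [-7.5, 12.3, 20.1, 15.4]  # hypothetical predicted dSMPI, 2018-2021
actuals = [575.2, 588.0, 607.5, 624.0]  # hypothetical actual SMPI values

levels, errors = [], []
prev = last_level
for delta, actual in zip(pred_deltas, actuals):
    prev = prev + delta                 # accumulate the predicted change
    levels.append(prev)
    errors.append((prev - actual) / actual * 100.0)   # percent error
```

In the article's procedure each year's delta comes from a freshly re-estimated model before being accumulated, so errors do not simply compound along a single fixed forecast path.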
To compare against the prediction effect of the NECM for SMPI, this article constructs a linear VAR model of SMPI, CPI, PPI, and MPI with the samples from 2000 to 2021,

X_t = A_1 X_{t-1} + A_2 X_{t-2} + ... + A_k X_{t-k} + ε_t, (19)

where the lag order of each variable in formula (19) is set in the same way as in formula (18). The samples from 2000 to 2017 are substituted into equation (19), and the least squares method is adopted to estimate the model parameters. Based on the estimated parameters, the SMPI from 2018 to 2021 is predicted; the prediction results and errors of the VAR model are shown in Table 7.
As can be seen from Table 7, compared with the SMPI predictions of the NECM, the prediction effect of the VAR model is poor and it cannot predict effectively, which also shows that the linear relationship among SMPI, CPI, PPI, and MPI is not significant.
Next, based on the conclusion that a nonlinear relationship exists among SMPI, CPI, PPI, and MPI, we use the wavelet neural network algorithm proposed in literature [7] to study the nonlinear cointegration relationship, establish an NECM to predict SMPI, and obtain the predictions for 2018 to 2021; the results are shown in Table 7. In the course of this research, the calculation process of the wavelet neural network algorithm was found to be more complex than that of the LS-SVM. At the same time, because the number of samples in this paper is small (literature [7] used about 480 training samples), the wavelet neural network overfits, and its accuracy in predicting the future trend is also weaker than that of the model proposed in this article.

Conclusion
In this article, the small-sample nonlinear cointegration test and the NECM based on the LS-SVM optimized by PSO were studied, and the logical process of the method was designed. We then carried out empirical research on the SMPI and several price indexes. Based on the judgment of the type of cointegration relationship, the nonlinear cointegration test for small samples was realized, and the NECM for predicting the SMPI was established and compared with the linear VAR model. The results showed that the small-sample nonlinear cointegration test and modeling method based on the LS-SVM optimized by PSO can describe the nonlinear cointegration relationship of a small-sample system, and that the NECM performs better and can effectively predict small-sample nonlinear systems. We also compared the prediction results with those of the wavelet neural network algorithm; the comparison showed that the LS-SVM optimized by PSO generalizes better and achieves higher prediction accuracy on small samples.
Data Availability

The data presented in this study are available upon reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.