Prognostics for State of Health of Lithium-Ion Batteries Based on Gaussian Process Regression

Accurate estimation and prediction of the performance of lithium-ion (Li-ion) batteries is of theoretical and practical significance for making better use of the batteries and avoiding unnecessary losses. State of health (SOH) estimation is used as a qualitative measure of the capability of a lithium-ion battery to store and deliver energy in a system. To evaluate and predict the SOH of batteries, Gaussian process regression with a neural network (GPRNN) as its covariance function is proposed. Experimental results confirm that the proposed method can be effectively applied to Li-ion battery monitoring and prognostics, based on a quantitative comparison with basic GPR, combination LGPFR, combination QGPFR, and multiscale GPR (SMK-GPR, P-MGPR, and SE-MGPR). The RMSE and MAPE of the three proposed models are reduced significantly compared with those of the other existing methods.


Introduction
Lithium-ion batteries have been widely applied in portable electronic devices, electric and hybrid vehicles, and even military electronics, aerospace avionics, and other automotive vehicles because of their light weight, high energy density, high output voltage, long lifetime, and low self-discharge [1,2]. The safety and reliability of a Li-ion battery in actual applications are critical to system performance: a battery failure not only leads to performance degradation, increased repair costs, and inconvenience but can also cause catastrophic accidents due to overheating and short circuiting [3]. Therefore, effective monitoring and prognostics for Li-ion batteries have attracted much attention, and a reliable and effective battery management system (BMS) is indispensable [4,5]. The precondition of a BMS is the monitoring and estimation of battery states, including the state of charge (SOC), state of health (SOH), and state of life (SOL). The estimation of these three states remains the core of BMS research [4, 6-8]. SOH is the ratio of the capacity released under a given discharge from the fully charged state to the standard cut-off voltage under standard conditions to the corresponding initial capacity; it reflects the degradation of battery performance and the likelihood of accidents. In general, SOC describes short-term changes of the current parameters, whereas SOH describes long-term changes. SOH does not require continuous measurement; periodic measurements suffice in most cases, and the measurement period depends on the application. Given these advantages, this paper focuses on SOH as a reflection of Li-ion battery performance.
With the development of Li-ion batteries, SOH estimation plays an important role as a quantitative description of the battery's capability to store and deliver energy in the system [9,10]. Generally, there are two mainstream approaches to SOH estimation [11]. One is the experience-based approach, such as the cycle counting method, the Ah Low and Ah weighted method, and the event-oriented cumulative method of aging. The other is the performance-based approach, which uses different forms of performance models and considers the aging process and stress factors. According to the source of information used, the performance-based approach is divided into three categories: mechanism-based methods, characteristics-based methods, and data-driven methods. Because of the complex physical and chemical processes inside the battery, many of its rules are difficult to describe directly through the mechanism. A method that describes battery performance from the perspective of test data is called a data-driven approach. The acquired data tend to be strongly uncertain and incomplete, and testing all the factors that may affect battery life in practical applications is unrealistic. Despite these limitations, the data-driven method is still an effective prediction method. It extracts the rules of battery performance from performance test data in order to predict SOH. Data-driven prediction does not require mechanistic knowledge of the object system; it mines the implied information through a variety of data analysis and learning methods. It therefore avoids the difficulty of acquiring a model, which makes it a more practical prediction method.
There are many common data-driven algorithms, such as Particle Filtering (PF), Support Vector Machine (SVM), and Relevance Vector Machine (RVM). PF can combine available information from system measurements and analytic models for SOH prediction [10,12]. The Kalman filter and the extended Kalman filter (EKF) are applied to SOH prediction of batteries by estimating an equivalent circuit model [13,14]. Unscented filtering and Bayesian filtering are classical methods for battery SOH or SOC estimation. These stochastic filtering methods for prognostics are relatively easy to obtain and have a wide scope of application. However, they have difficulty accounting for factors such as environmental interference and dynamic loading characteristics, so their dynamic accuracy and adaptability are still limited. Moreover, the PF algorithm itself must solve the parameter initialization problem, which is time-consuming when there are many parameters. Some methods are mostly improvements and optimizations of PF or KF, such as the sequential Monte Carlo method related to PF, which has attracted attention for SOH prediction [15]. In recent years, SVM has been used in SOH prediction of Li-ion batteries [16]. In addition to the basic SVM algorithm, there are some improved SVM algorithms and fusion algorithms. The SVM method still has some deficiencies: the kernel function must satisfy the Mercer condition, sparsity is limited, the number of support vectors is sensitive to the error boundary, and the loss function and the penalty factor are difficult to determine. SVM also lacks a way to express and manage uncertainty. To solve this problem, RVM approaches have been proposed. The RVM algorithm has higher calculation accuracy and lower computational complexity than SVM and can output uncertainty information for the prediction results [17]. Kim obtained the relevant parameters of an exponential model from EIS test data using the RVM algorithm and gave the prediction outcomes [1]. Oai et al. developed a new technique to predict SOH using a dual-sliding-mode observer. The researchers presented a Li-ion battery SOH prediction concept based on an appropriate SOH definition [18]. However, because capacity data are sparse and fluctuate dynamically, the forecast is unstable if RVM alone is used directly for prediction. Many other methods, such as neural fuzzy logic [19], networks [20], regressions [21], and distributed active learning [22], lack specific strategies for dealing with model uncertainties, which leads to deteriorating prognostic performance under long-term changes in environmental and operating conditions.
To address modeling flexibility and uncertainty representation, Gaussian process regression (GPR) has been investigated for Li-ion battery prognostics, where the degradation trends are learnt from battery datasets with a combination of Gaussian process functions [23]. GPR provides a confidence measure for its output and has been used for SOH prediction as a relatively new machine learning method based on Bayesian theory and statistical learning theory. It is also a nonparametric method that is suitable for high-dimensional nonlinear regression problems and gives an uncertainty representation of the SOH prediction. An improved GPR method has been proposed to capture regeneration phenomena [24]. Li and Xu proposed a prognostics method for state of health estimation of Li-ion batteries based on a mixture of Gaussian process models and a particle filter [25]. However, the forecasting accuracy and uncertainty of these two methods do not meet the requirements of practical application.
In this paper, an approach based on GPR with a neural network kernel function is presented to predict the SOH of Li-ion batteries. The neural network covariance function is well suited to SOH data, since it is less misspecified than stationary covariance functions and allows saturation at different values in the positive and negative directions of the variable $x$ [23]. Its log marginal likelihood is also much larger than that of other covariance functions. The proposed method consists of three models: the neural network covariance function itself, the sum of the neural network and the Matérn (Maternard) covariance functions, and the product of the neural network and the periodic covariance functions. Finally, experiments based on the NASA battery datasets are provided to demonstrate the performance of the three proposed prognostics models. The experimental results show that the predictions closely follow the measured SOH, with RMSE and MAPE below 1% for the second model.
The remainder of this paper is organized as follows. Section 2 describes the background knowledge about the SOH of Li-ion batteries. Section 3 provides the proposed methodology, including GPR and its parameters. Section 4 presents experiments and analysis that demonstrate the performance of the proposed prognostics algorithm. Finally, conclusions are drawn in Section 5.

The SOH of Lithium-Ion Batteries
Four definitions of SOH, each based on a different battery characteristic, are given as follows [26].
(1) From the perspective of the remaining power of the battery, SOH is defined in terms of the current maximum power of the battery (subscript "aged") and its initial power (subscript "new").
(2) From the perspective of starting power, SOH is defined in terms of $\mathrm{CCA}_{\mathrm{comp}}$, the real-time starting power; $\mathrm{CCA}_{\mathrm{new}}$, the predicted starting power released by the battery when the SOC is 100%; and $\mathrm{CCA}_{\min}$, the minimum starting power needed.
(3) From the perspective of impedance measurement, SOH is defined in terms of the $i$th impedance measurement, which varies with the charge/discharge cycles, and the initial impedance (subscript 0).
(4) From the perspective of battery capacity, SOH is expressed as

$$\mathrm{SOH} = \frac{C_i}{C_0} \times 100\%, \tag{4}$$

where $C_i$ is the $i$th capacity value, which degrades with cycling, and $C_0$ is the initial capacity.
In this paper, the capacity-based definition (4) is used to study the performance of the battery.
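As a small illustration of the capacity-based definition, the sketch below converts measured discharge capacities into SOH percentages. The function name `soh_from_capacity` and the sample capacities are assumptions made for this example; they are not taken from the paper or from any dataset.

```python
def soh_from_capacity(capacities_ah, initial_capacity_ah):
    """Return the SOH (in percent) for each measured discharge capacity.

    SOH is taken as the ratio of the capacity at a given cycle to the
    initial capacity, expressed as a percentage.
    """
    if initial_capacity_ah <= 0:
        raise ValueError("initial capacity must be positive")
    return [100.0 * c / initial_capacity_ah for c in capacities_ah]


# Hypothetical capacities (in Ah) measured at three different cycles
# for a cell whose initial capacity is 2 Ah.
soh_values = soh_from_capacity([2.0, 1.7, 1.4], 2.0)
print(soh_values)
```

Under this definition, a cell whose capacity has faded from 2 Ah to 1.4 Ah has an SOH of about 70%.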

Gaussian Process Regression.
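A Gaussian process (GP) describes random variables that are scalars or vectors. It is a stochastic process that governs the properties of functions and allows Bayesian inference in the function-space view. A GP is a collection of random variables, any finite number of which have a joint Gaussian distribution, and it is completely determined by its mean function and covariance function [28].
Consider the simple regression model

$$y = f(x) + \varepsilon.$$

The training set is $\{(x_i, y_i) \mid i = 1, \ldots, n\}$, where $x_i$, $i = 1, \ldots, n$, are the input values, $f$ is the function value, and $y_i$, $i = 1, \ldots, n$, are the output values. $\varepsilon$ represents noise that follows an independent, identically distributed Gaussian distribution with zero mean and variance $\sigma_n^2$; that is,

$$\varepsilon \sim \mathcal{N}(0, \sigma_n^2).$$

Then the prior distribution of the observations $y$ is obtained:

$$y \sim \mathcal{N}\left(0, K(x, x) + \sigma_n^2 I_n\right).$$

It can be used to estimate the predictive values. The joint distribution of the observations $y$ and the predictive values $f_*$ is derived as follows:

$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(x, x) + \sigma_n^2 I_n & K(x, x_*) \\ K(x_*, x) & K(x_*, x_*) \end{bmatrix}\right),$$

where $K(x, x_*)$ represents the covariance between the training data points and the predictive values and $K(x_*, x_*)$ is the covariance of the predictive values.
Using Bayes' rule, the posterior distribution can be obtained. Conditioning the joint Gaussian distribution on the observations gives the posterior distribution used for inference:

$$f_* \mid x, y, x_* \sim \mathcal{N}\left(\bar{f}_*, \operatorname{cov}(f_*)\right), \quad \bar{f}_* = K(x_*, x)\left[K(x, x) + \sigma_n^2 I_n\right]^{-1} y, \quad \operatorname{cov}(f_*) = K(x_*, x_*) - K(x_*, x)\left[K(x, x) + \sigma_n^2 I_n\right]^{-1} K(x, x_*). \tag{10}$$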
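The posterior computation can be sketched in a few lines of NumPy. This is a minimal illustration for a zero-mean GP, not the implementation used in the paper: the function names, the squared exponential kernel, the toy sine data, and all numerical values are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=1.0, signal_std=1.0):
    """Squared exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_std**2 * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_predict(x, y, x_star, kernel, noise_std):
    """Posterior mean and covariance of f* for a zero-mean GP."""
    k_y = kernel(x, x) + noise_std**2 * np.eye(len(x))  # K(x, x) + sigma_n^2 I
    k_s = kernel(x, x_star)                             # K(x, x*)
    k_ss = kernel(x_star, x_star)                       # K(x*, x*)
    # A Cholesky factorization avoids forming the matrix inverse explicitly.
    chol = np.linalg.cholesky(k_y)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    v = np.linalg.solve(chol, k_s)
    mean = k_s.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov


# Toy data (an assumption for illustration): samples of a sine curve.
x = np.linspace(0.0, 5.0, 20)
y = np.sin(x)
x_star = np.array([1.25, 2.5, 3.75])

mean, cov = gp_predict(x, y, x_star, sq_exp_kernel, noise_std=0.05)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 1.96 * std, mean + 1.96 * std  # 95% interval for f*
```

The interval `[lower, upper]` is the kind of 95% confidence band that is drawn around the mean prediction in GPR plots; adding the noise variance to the diagonal of `cov` would give the band for noisy observations instead.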

Choice of Parameters.
A GP is usually parameterized in terms of its covariance function. Some of the most popular covariance functions are the squared exponential covariance function, the constant covariance function, and the periodic covariance function [23,29]. These covariance functions are stationary and are commonly used in battery health prognostics with good results. In this paper, a particular covariance function, called the neural network covariance function, is applied to battery health prognostics; see (7). The neural network covariance function is well suited to SOH data, since it allows saturation at different values in the positive and negative directions of $x$ [23].
For comparison, the predictive distributions for three covariance functions are shown in Figure 1. The 64 data points are produced by a step function with Gaussian noise of standard deviation $\sigma = 0.1$, exactly as in the example in [23]. Figure 1 shows the means and the 95% confidence intervals for the noisy signal in grey. Figure 1(a) uses a single squared exponential covariance function, and Figure 1(b) uses the sum of two squared exponential covariance functions. The latter is more flexible than a single squared exponential, since it has two magnitude and two length-scale parameters. Its predictive distribution looks slightly better, but neither is ideal for this scenario. Figure 1(c) shows the neural network covariance function, which is well suited to this case because it allows saturation at different values in the positive and negative directions of $x$. As shown in Figure 1(c), its predictions are near perfect.
New kernels can be generated from this kernel; for example, the sum of the neural network covariance function and the Matérn (Maternard) covariance function is a new kernel, and the product of the neural network covariance function and the periodic covariance function is also a new kernel. The Maternard covariance function is stationary and nondegenerate, and the periodic covariance function is selected to approximate the local regeneration phenomenon of SOH data. Therefore, these two covariance functions are used in combination with the neural network covariance function in this paper. With the covariance functions chosen here, the simulation results are better than those of other methods, which indicates that the neural network covariance function is a suitable choice for this type of data. The covariance functions used in this paper are as follows [23].
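To make the kernel combinations concrete, the following sketch builds a neural network (arcsine) covariance function and combines it by sum and by product with two other kernels. It is an illustration under stated assumptions rather than a reproduction of the paper's equations: the arcsine form follows the standard neural network covariance function for inputs augmented with a bias term, a Matérn kernel with smoothness 3/2 stands in for the Maternard function (the degree used in the paper is not stated here), and every function name and hyperparameter value is hypothetical.

```python
import numpy as np


def nn_kernel(a, b, l=1.0, sf=1.0):
    """Neural network (arcsine) covariance for 1-D inputs.

    Each input is augmented with a bias term and scaled by 1/l.
    """
    a_aug = np.column_stack([np.ones_like(a), a]) / l
    b_aug = np.column_stack([np.ones_like(b), b]) / l
    inner = a_aug @ b_aug.T
    norm_a = 1.0 + np.sum(a_aug**2, axis=1)
    norm_b = 1.0 + np.sum(b_aug**2, axis=1)
    return sf**2 * np.arcsin(inner / np.sqrt(np.outer(norm_a, norm_b)))


def matern32_kernel(a, b, ell=1.0, sf=1.0):
    """Matern covariance with smoothness 3/2 (an assumed degree)."""
    r = np.abs(a[:, None] - b[None, :]) / ell
    return sf**2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def periodic_kernel(a, b, ell=1.0, p=1.0, sf=1.0):
    """Periodic covariance with period p and length-scale ell."""
    d = np.abs(a[:, None] - b[None, :])
    return sf**2 * np.exp(-2.0 * np.sin(np.pi * d / p) ** 2 / ell**2)


# Illustrative hyperparameter values only; they are not the paper's settings.
def model_i(a, b):
    """Neural network kernel alone."""
    return nn_kernel(a, b, l=2.0, sf=1.5)


def model_ii(a, b):
    """Sum of the neural network and Matern kernels."""
    return nn_kernel(a, b, l=2.0, sf=1.5) + matern32_kernel(a, b, ell=3.0, sf=0.5)


def model_iii(a, b):
    """Product of the neural network and periodic kernels."""
    return nn_kernel(a, b, l=2.0, sf=1.5) * periodic_kernel(a, b, ell=1.0, p=4.0, sf=1.0)


cycles = np.arange(1.0, 11.0)  # hypothetical cycle indices 1..10
gram_matrices = [k(cycles, cycles) for k in (model_i, model_ii, model_iii)]
```

Sums and products of valid covariance functions are again valid covariance functions, which is why the combined kernels can be used directly in the GP prior.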
Typically, the covariance function has some free parameters, which are generally called hyperparameters. As these parameters vary, the predictions differ, which can lead to large fitting errors. To obtain suitable hyperparameters, they are optimized by maximizing the log-likelihood function [28,30]:

$$\log p(y \mid x, \Theta) = -\frac{1}{2} y^{T} K_y^{-1} y - \frac{1}{2} \log \left|K_y\right| - \frac{n}{2} \log 2\pi, \quad K_y = K(x, x) + \sigma_n^2 I_n,$$

where $\Theta$ denotes the hyperparameters. The hyperparameters can thus be determined from the training data. Note that as the length-scale $\ell$ in the kernel varies, the signal variance $\sigma_f^2$ and the noise variance $\sigma_n^2$ also vary. Experiments show that different initial values yield different fitting curves and that the error between them can be large. An appropriate initial value is therefore chosen for the problem at hand.
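The sketch below evaluates the log marginal likelihood of a zero-mean GP and uses it to compare candidate length-scales. It is only an illustration: a coarse grid search stands in for the gradient-based optimization normally used, the squared exponential kernel is chosen for brevity, and the function names, toy data, and candidate values are all assumptions.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale, signal_std):
    """Squared exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_std**2 * np.exp(-0.5 * (d / length_scale) ** 2)


def log_marginal_likelihood(x, y, length_scale, signal_std, noise_std):
    """Log marginal likelihood of a zero-mean GP with Gaussian noise."""
    n = len(x)
    k_y = sq_exp_kernel(x, x, length_scale, signal_std) + noise_std**2 * np.eye(n)
    chol = np.linalg.cholesky(k_y)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    data_fit = -0.5 * y @ alpha
    complexity = -np.sum(np.log(np.diag(chol)))  # equals -0.5 * log|K_y|
    constant = -0.5 * n * np.log(2.0 * np.pi)
    return data_fit + complexity + constant


# Toy data (an assumption for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(30)

# Coarse grid over the length-scale; the other hyperparameters are held fixed.
candidates = [0.05, 0.2, 1.0, 5.0, 20.0]
scores = {ls: log_marginal_likelihood(x, y, ls, 1.0, 0.1) for ls in candidates}
best_length_scale = max(scores, key=scores.get)
```

Because the likelihood surface can have several local optima, a gradient-based optimizer started from different initial values may return different hyperparameters, which is consistent with the sensitivity to initialization noted above.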
A larger amount of training data enables the GPR model to produce better prognostic predictions. Based on the experimental data, the linear mean function $m(x) = ax + b$ is selected to fit the trend of the data. Three covariance functions discussed above are considered: the neural network covariance function shown in (7); the sum of the neural network and Maternard covariance functions, $k(x, x') = k_1(x, x') + k_2(x, x')$; and the product of the neural network and periodic covariance functions, $k(x, x') = k_1(x, x')\,k_3(x, x')$. The three resulting models are denoted as Model I, Model II, and Model III. The hyperparameter spaces of the three models are $\Theta_1 = [a, b, l, \mathrm{sf}]^{T}$, $\Theta_2 = [a, b, l, \mathrm{sf1}, \mathrm{ell}, \mathrm{sf2}]^{T}$, and $\Theta_3 = [a, b, l, \mathrm{sf1}, \mathrm{ell}, p, \mathrm{sf2}]^{T}$, where $a$ and $b$ are the coefficients of the mean function, $\Lambda$ is a scalar multiple of the unit matrix, and sf, sf1, and sf2 control the variance $\sigma_f^2$ in the three models, respectively. All three models are designed to fit the SOH data and make the prediction more accurate.

Raw Data Description of Li-Ion Batteries.
To illustrate the efficiency of the proposed method comprehensively, a dataset on the aging of Li-ion batteries is selected from the data repository of the NASA Ames Prognostics Center of Excellence (PCoE) [31].
The NASA Ames PCoE dataset consists of data from 36 Li-ion batteries: No. 5, No. 6, No. 7, No. 18, and Nos. 25-56. The SOH sequences extracted from the 36 batteries differ in length; batteries No. 5, No. 6, and No. 7 are the most commonly used in the related literature [24,25,27] because their sequences are much longer. The batteries were run through 3 different operational profiles (charge, discharge, and impedance) at room temperature. Charging was carried out in constant current (CC) mode at 1.5 A until the battery voltage reached 4.2 V and then continued in constant voltage (CV) mode until the charge current dropped to 20 mA. Discharge was carried out at a constant current (CC) level of 2 A until the battery voltage fell to 2.7 V, 2.5 V, 2.2 V, and 2.5 V for batteries Nos. 5, 6, 7, and 18, respectively. Impedance measurement was carried out through an electrochemical impedance spectroscopy (EIS) frequency sweep from 0.1 Hz to 5 kHz. Repeated charge and discharge cycles result in accelerated aging of the batteries, while impedance measurements provide insight into the internal battery parameters that change as aging progresses.
The experiments were stopped when the batteries reached the end-of-life (EOL) criterion, which was a 30% fade in rated capacity (from 2 Ah to 1.4 Ah). This dataset can be used for the prediction of SOC, SOH, and the remaining useful life (RUL) of batteries [23].
In this paper, the data from batteries No. 5, No. 6, and No. 7 are processed using (4) to carry out the experiments, including training and testing. Figure 2 shows the SOH curves of these three batteries. Each battery has a total of 168 charge/discharge cycles. Figure 2 indicates that the SOH shows an obvious global degradation trend together with a local regeneration phenomenon.

Training with GPR.
For batteries No. 5, No. 6, and No. 7, the first 100 SOH points are used to train the proposed GPR model and the remaining 68 points are used for prediction. For the training data from these three batteries, the mean function is chosen as the linear function $m(x) = ax + b$ and the covariance function is chosen as one of the three functions discussed above. For these three models, the algorithm for training and prediction based on GPR is as follows:
(a) Determine the training dataset $\{x_i, y_i\}_{i=1}^{n}$, where $x_i$ is the number of charge/discharge cycles and $y_i$ is the corresponding value of SOH at the $i$th charge/discharge cycle. For batteries No. 5, No. 6, and No. 7, $n$ equals 100.
(b) Initialize the hyperparameters of the mean function and covariance function for the different training sets. Here, the initial hyperparameters for Model I are $\Theta_1 = [a, b, l, \mathrm{sf}]^{T} = [0.1, 0, \mathrm{rand}, 2]^{T}$, for Model II are $\Theta_2 = [a, b, l, \mathrm{sf1}, \mathrm{ell}, \mathrm{sf2}]^{T} = [0.1, 0, \mathrm{rand}, 2, 0.9, 2]^{T}$, and for Model III are $\Theta_3 = [a, b, l, \mathrm{sf1}, \mathrm{ell}, p, \mathrm{sf2}]^{T} = [0.5, 1, \mathrm{rand}, 2, 0.9, 2, 2]^{T}$, where "rand", the value of the parameter $l$ in these three models, is a random number between 0 and 1. For comparison with other methods, the hyperparameters of basic GPR are set to the same values as those used in [24].
(c) Optimize the hyperparameters by maximizing the log-likelihood function.
(d) Set the number of prediction steps $m$; for No. 5, No. 6, and No. 7, $m = 68$.
(e) Apply the training data $\{x_i, y_i\}_{i=1}^{n}$ and the testing data $\{x_i, y_i\}_{i=n+1}^{n+m}$ to Models I, II, and III with the optimal hyperparameters. The prediction outputs are then obtained.

Prediction and Comparison.
In this part, the prediction results of five GPR models (basic GPR, combination LGPFR, and Models I, II, and III) for batteries No. 5, No. 6, and No. 7 are shown in Figures 3-5, respectively. In Figure 3, the red lines are the actual SOH values, the blue scatter points are the prediction results, and the grey regions represent the 95% confidence intervals. Taking Figure 3 as an example, for basic GPR, shown in Figure 3(a), the mean prediction moves further away from the actual SOH in the later cycles. Meanwhile, the 95% confidence intervals widen significantly. These two aspects demonstrate the poor prediction performance of basic GPR. For combination LGPFR, shown in Figure 3(b), although the predicted values are close to the actual SOH data points, the 95% confidence intervals are still wide, which indicates high uncertainty in the prediction result. These experiments show the limitations of Li-ion battery SOH prediction based on the basic GPR model and the combination LGPFR model. The simulations with Models I, II, and III for No. 5 are shown in Figures 3(c)-3(e); both the point prediction results and the uncertainty representations are clearly improved.
The prediction errors of the five GPR models for the three batteries are compared in Table 1. Two criteria, the root mean square error (RMSE) and the mean absolute percentage error (MAPE), are introduced to evaluate the prediction performance:

$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left(y_i - \hat{y}_i\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{m} \sum_{i=1}^{m} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%,$$

where $m$ is the number of prediction steps, $y_i$ represents the actual SOH, and $\hat{y}_i$ represents the predicted SOH.

Mathematical Problems in Engineering

Table 1 shows the prognostic RMSEs (%) and MAPEs (%) of the different SOH prediction methods. Compared with the other published methods in [24,27], both the RMSE and MAPE values are reduced greatly. For the three batteries, the prognostic accuracies of SMK-GPR, SE-MGPR, and P-MGPR in [27] are evidently improved compared with combination LGPFR and combination QGPFR in [24]. For the proposed Models I, II, and III, the maximum prediction errors for batteries Nos. 5, 6, and 7 are on average 50% lower than those of the SE-MGPR and P-MGPR models, respectively. Furthermore, the MAPE and RMSE values are less than 1% for all three batteries using Model II. In conclusion, the proposed method shows a much better prediction performance than the other methods. The small values of MAPE and RMSE indicate that the proposed approach is suitable for accurate SOH prediction and satisfies the engineering requirement. The prediction results of the five GPR models for No. 6 and No. 7, shown in Figures 4 and 5, are similar to those for No. 5.
These three experiments on different Li-ion batteries verify the efficiency of the proposed method.

Conclusions
In this paper, a battery SOH estimation approach based on GPRNN is presented. To improve on the long-term prediction accuracy and uncertainty of basic GPR, combination GPFR, and multiscale GPR, three GPR models based on the neural network covariance function are used. Compared with the other mixture methods of GPR and the multiscale GPR algorithms, the proposed GPRNN models are simpler and therefore satisfy the need for real-time prognostics. Moreover, the experimental simulations show that the proposed models have higher prediction accuracy and lower prediction uncertainty for Li-ion batteries exhibiting the complicated regeneration phenomenon. To verify the efficiency of the proposed algorithm further, datasets with a greater number of degradation samples from real industrial applications are necessary. Finally, exploring methodologies with better prognostic performance will be the subject of future work.

Table 1: Prediction errors comparison of different methods for batteries Nos. 5, 6, and 7.
Note. Some experimental results are obtained from [24,27]. LGPFR: Gaussian process functional regression with linear mean function [24]; QGPFR: Gaussian process functional regression with quadratic polynomial mean function [24]; Combination LGPFR: LGPFR with combination of squared exponential covariance function and periodic covariance function [24]; Combination QGPFR: QGPFR with combination of squared exponential covariance function and periodic covariance function [24]; SMK-GPR: the GPR method with spectral mixture kernels [27]; SE-MGPR: multiscale GPR method with squared exponential function [27]; P-MGPR: multiscale GPR method with periodic covariance function [27].
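The two error criteria used for the comparison in Table 1, RMSE and MAPE, can be computed as in the sketch below. The function names and the sample SOH values are assumptions made for illustration and are not results from the paper; MAPE is taken here in its usual percentage form.

```python
import math


def rmse(actual, predicted):
    """Root mean square error over the prediction steps."""
    m = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / m)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    m = len(actual)
    return 100.0 * sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted)) / m


# Hypothetical actual and predicted SOH values (in percent) for three cycles.
actual_soh = [80.0, 75.0, 70.0]
predicted_soh = [81.0, 74.0, 72.0]

print(rmse(actual_soh, predicted_soh))
print(mape(actual_soh, predicted_soh))
```

RMSE carries the units of the SOH values and penalizes large deviations more heavily, while MAPE is a relative measure, so reporting both gives a fuller picture of the prediction error.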