State-of-Charge Estimation of Lithium-Ion Battery Pack Based on Improved RBF Neural Networks



Introduction
Lithium-ion batteries have been widely used as energy storage devices and in electric vehicles due to their desirable balance of energy and power densities. Compared with single lithium battery cells, a lithium battery pack with hundreds or even thousands of battery cells connected in parallel and series is able to provide the required power in various applications [1][2][3]. The battery management system (BMS) plays an important role in maintaining safe and efficient operation of the battery. The State-of-Charge (SOC) of a Li-ion battery pack is a key parameter affecting battery life, safety, and efficient operation [4,5]. Based on an accurate estimate of SOC, effective management strategies can be developed to avoid overcharging and overdischarging, prolong the cycle life of batteries, and prevent safety incidents [6]. Furthermore, with correctly estimated SOC information, drivers can plan their driving time properly.
Due to the complex nonlinear characteristics of Li-ion batteries, SOC cannot be measured directly in real-time applications and needs to be inferred from other measurable variables [7]. Since a battery pack may consist of hundreds or even thousands of battery cells, the computational effort for modelling increases accordingly. Besides, the inconsistency of cells in a battery pack varies over the life of the battery. Thus, it is a challenge to accurately estimate the SOC of the battery pack. Recently, a number of methods have been proposed to improve SOC estimation, and they can be grouped into three general approaches for the estimation of battery pack SOC. The first approach integrates the cell model into the structure of the battery pack [8,9]. However, the inconsistency between different cells in a battery pack is ignored.
In the second category, the single-cell SOC estimation approach is directly extended to battery packs, including the open circuit voltage method [10], the ampere-hour integral method [11], the Kalman filter [12,13], and the equivalent electric circuit model [8]. These methods treat the battery pack as a "big battery" [14], which makes the SOC estimation simpler and quicker. However, the simple model is based on the precise mechanism of single cells. Due to the inconsistency between different battery cells, estimation error inevitably exists. The third category includes various statistical methods. Plett first proposed the Bar-Delta Filter method in 2009 [15], which uses a Sigma Point Kalman Filter (SPKF) to estimate the average SOC of the battery pack and Delta Filters to estimate the difference between each cell's characteristics and the average characteristics. However, the accuracy of the battery SOC estimation remains a key challenge. Dai et al. [16] and Sun and Xiong [14] proposed a dual time-scale Kalman filter based on the equivalent electrical circuit model (EECM), in which the differences in the internal resistance of battery cells are considered. The method proposed by Zheng et al. [17,18] uses the extended Kalman filter (EKF) with the cell mean model (CMM) and the cell difference model (CDM) to estimate the mean SOC of the battery cells and their differences, respectively. This method still requires internal information about the battery pack. Deng et al. [19] proposed a data-driven method in which an efficient feature selection method is used and the SOC of a battery pack is estimated with an autoregressive Gaussian process regression (GPR) model [20,21]. A challenge for GPR modelling is its computation time, which is $O(N^3)$.
In summary, despite the aforementioned progress in battery pack SOC estimation, developing a simple yet accurate model is still an important issue in real-life battery applications. Data-driven methods [22] have gained a lot of interest in recent years for solving highly nonlinear classification and regression problems. The advantages of data-driven methods are their flexibility and model-free [23] characteristics, which make it easy to create new models. As a class of data-driven methods [24], machine learning approaches such as support vector regression [25], the Kalman filter [12,13,17], and backpropagation (BP) neural networks [26] have been successfully used in SOC estimation and prediction. However, the selection of the dataset and input features for building these models is still ad hoc, via trial and error.
To overcome some shortcomings of the aforementioned methods for battery pack SOC estimation, this paper presents an improved RBF method using a fast recursive algorithm (FRA) to estimate the SOC of a battery pack. The FRA method [27] can be used for both neural input selection [28] and hidden layer node selection [29][30][31] in the configuration of RBF networks. Compared to [32], the average cell temperature, the time-mean pack voltage, the time-mean pack temperature, and the time-mean loop current, all over 10-second intervals, can also be added to the initial candidate pool of input variables. Other input candidates can also be included, such as the maximum cell voltage, the minimum cell voltage, the average cell voltage, and the loop current. The statistical variables are adopted to reduce the complexity of the model, and the cell information is used to overcome the inconsistency among single cells.
Then, a compact subset of these candidate variables is selected as the model input by the FRA method. On this basis, an improved RBF model built by the FRA method is used to predict the SOC of the battery pack. The proposed RBF model is automatically constructed by selecting the hidden layer nodes using the FRA method. Furthermore, the parameters of the RBF kernel are optimized by the particle swarm optimization (PSO) algorithm. The rest of this paper is organized as follows. Section 1 introduces the input selection based on the FRA method. In Section 2, the application of the improved RBF neural network to the SOC estimation of a battery pack is introduced in detail. The experimental and simulation results are compared in Section 3. Finally, Section 4 concludes the paper.

Input Selection Using FRA
Based on the theory of series expansion, polynomial NARMAX models can achieve the same modelling performance as various neural networks if certain conditions are satisfied [28]. The input selection of the RBF neural network is thus simplified to determining the structure of the polynomial NARMAX model. The structure of the polynomial NARMAX model can be efficiently detected by selecting important polynomial terms using the FRA method, which has low computational complexity [27].

FRA Method.
Consider the following multiple-input single-output system represented by a linear-in-the-parameters model:

$$y(t) = \sum_{k=1}^{n} \theta_k \varphi_k(\vec{X}(t)) + \varepsilon(t), \tag{1}$$

where $y(t)$, $\vec{X}(t) \in \mathbb{R}^m$, and $\varepsilon(t)$ are the output variable, the input variable vector, and the model error at time instant $t$, respectively. Herein, $m$ and $n$ denote the number of input variables and model terms (mapping functions), respectively, $\varphi_k$ is the nonlinear mapping function, and $\theta_k$ are the linear coefficients of the mapping functions.
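For $N$ given training samples, the system model is expressed in the following matrix form:

$$\vec{y} = \Phi\Theta + \Xi, \tag{2}$$

where $\vec{y} = [y(1), \ldots, y(N)]^T$, $\Phi = [\phi_1, \ldots, \phi_n] \in \mathbb{R}^{N \times n}$ with $\phi_k = [\varphi_k(\vec{X}(1)), \ldots, \varphi_k(\vec{X}(N))]^T$, $\Theta = [\theta_1, \ldots, \theta_n]^T$, and $\Xi = [\varepsilon(1), \ldots, \varepsilon(N)]^T$. Referring to [27], the minimal cost function using the least squares method is given as

$$E = \lVert \vec{y} - \Phi\hat{\Theta} \rVert^2, \tag{3}$$

where

$$\hat{\Theta} = (\Phi^T\Phi)^{-1}\Phi^T\vec{y}. \tag{4}$$

Thus, the minimal cost function is reformulated as follows:

$$E = \vec{y}^T\vec{y} - \vec{y}^T\Phi(\Phi^T\Phi)^{-1}\Phi^T\vec{y}. \tag{5}$$

Use the definitions below:

$$\Phi_k = [\phi_1, \ldots, \phi_k], \qquad R_k = I - \Phi_k(\Phi_k^T\Phi_k)^{-1}\Phi_k^T, \qquad R_0 = I, \tag{6}$$

so that the minimal cost function with the first $k$ mapping functions is $E_k = \vec{y}^T R_k \vec{y}$. The change in the minimal cost function $E$ induced by an additional mapping function $\phi_{k+1}$ is given as follows:

$$\Delta E_{k+1} = E_k - E_{k+1} = \vec{y}^T (R_k - R_{k+1}) \vec{y}. \tag{7}$$

Using the propositions detailed in [27], equation (7) is rewritten as follows:

$$\Delta E_{k+1} = \frac{\left(\vec{y}^T \phi_{k+1}^{(k)}\right)^2}{\phi_{k+1}^T \phi_{k+1}^{(k)}}, \tag{8}$$

where $\phi_{k+1}^{(k)} = R_k \phi_{k+1}$. Obviously, the change $\Delta E_{k+1}$ only concerns the additional mapping function $\phi_{k+1}$. Then, define the recursive matrix $A = [a_{i,j}]_{k \times n}$ and the recursive vector $A_y = [a_{i,y}]^T_{n \times 1}$, the elements of which are defined as follows:

$$a_{i,j} = \left(\phi_i^{(i-1)}\right)^T \phi_j, \qquad a_{i,y} = \left(\phi_i^{(i-1)}\right)^T \vec{y}. \tag{9}$$

Therefore, the net contribution induced by $\phi_{k+1}$ is expressed as

$$\Delta E_{k+1} = \frac{a_{k+1,y}^2}{a_{k+1,k+1}}. \tag{10}$$

And the linear coefficients are estimated by

$$\hat{\theta}_j = \frac{a_{j,y} - \sum_{i=j+1}^{k} \hat{\theta}_i a_{j,i}}{a_{j,j}}, \qquad j = k, k-1, \ldots, 1. \tag{11}$$

Require: the maximum voltage vector $\vec{v}$, the average temperature $\vec{tmp} \in \mathbb{R}^{N \times 1}$ of the battery cells, the circuit current $\vec{I} \in \mathbb{R}^{N \times 1}$, the maximal order of time lags for inputs $l_x = 10$, the maximal order of time lags for output $l_y = 3$, the maximal number of selected terms $m$, and the minimal training error $e$.
Ensure: the SOC vector of the battery pack $\vec{y} \in \mathbb{R}^{N \times 1}$.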
(1) Initialization: form the regression matrix for polynomial term selection.
(2) for $i = 1$ to $n$ do
(3)   calculate the recursive matrix $A$ and vector $A_y$; the elements $a_{j,j}$ and $a_{j,y}$ ($j = 1, \ldots, m$) are calculated recursively.
(4)   calculate the net contribution of the terms using equation (10).
(5)   select the significant term.
(6) end for
(7) Input selection: find the order of the time lags from the selected model terms.
ALGORITHM 1: Input selection using FRA algorithm.
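The selection loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it evaluates each candidate's net contribution to the squared-error cost by explicitly projecting the candidates and the output onto the current residual space (the role played by $R_k$), rather than through the recursive matrix $A$ of [27]. The function name `fra_select`, its arguments, and the toy data in the usage note are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def fra_select(columns, y, max_terms, tol=1e-12):
    """Greedy forward selection by net contribution to the squared error.

    columns: list of candidate regressor columns (each a list of N floats).
    y: list of N output samples.
    Returns (indices of the selected columns in order, their net contributions).
    """
    resid_cols = [list(c) for c in columns]  # R_k * phi_j, starting from R_0 = I
    resid_y = list(y)                        # R_k * y
    selected, gains = [], []
    for _ in range(max_terms):
        best_j, best_gain = None, 0.0
        for j, c in enumerate(resid_cols):
            if j in selected:
                continue
            norm2 = dot(c, c)
            if norm2 <= tol:  # candidate already explained by the chosen terms
                continue
            gain = dot(resid_y, c) ** 2 / norm2  # reduction of the cost function
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break
        p = resid_cols[best_j][:]
        pp = dot(p, p)
        # Remove the selected direction from the output residual ...
        coef_y = dot(p, resid_y) / pp
        resid_y = [a - coef_y * b for a, b in zip(resid_y, p)]
        # ... and from every remaining candidate.
        for j, c in enumerate(resid_cols):
            if j == best_j or j in selected:
                continue
            coef = dot(p, c) / pp
            resid_cols[j] = [a - coef * b for a, b in zip(c, p)]
        selected.append(best_j)
        gains.append(best_gain)
    return selected, gains
```

As a usage example, with three hypothetical candidates of which only the first and third generate the output, `fra_select(cols, y, max_terms=2)` picks those two columns, and the net contributions sum to the total output energy because the fit is exact.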

Complexity
Require: the selected input variable matrix $\Phi \in \mathbb{R}^{N \times m}$ in equation (2), the variable upper/lower bounds $[X_{\min}, X_{\max}]$ and the velocity upper/lower bounds $[v_{\min}, v_{\max}]$, the size of the population $l$, the maximum number of iterations $T$, the crossover factors $CR = [c_1, c_2] \in [0, 1]$, and the acceleration of the particle velocity $w_i$.
Ensure: the SOC vector of the battery pack $\vec{y} \in \mathbb{R}^{N \times 1}$.
(1) Initialization: … centres $x^{1}_{j,0}$ and widths $x^{2}_{j,0}$ of the RBF basis function, where $j = 1, \ldots, l$; thus the initial nonlinear parameters are … and the recursive matrix $A$, $A_y$ using Algorithm 1, respectively.
…
… and $X_{k,i+1}$ denote the velocity and particle at the $i$th iteration for the $k$th selection, and $r_1$ and $r_2$ are random numbers.
(11) end for
(12) add the candidate feature with the minimal PRESS error to the regression matrix $\Phi$, $k = k + 1$.
(13) end while
(14) Identification: calculate the linear coefficients using equation (11).

The SOC of a battery pack is a time sequence, so both the model dependent variables and the model output measured in the past are critical to the estimation of the next SOC value. However, not all of the historical data are needed for SOC estimation, so the maximum order of time lags for these input variables should be determined in advance.
To select the RBF neural network inputs, the problem is converted into polynomial model construction. Thus, the input selection problem is formulated as equation (1).
Herein, the mapping functions are selected from polynomial terms of the lagged outputs and inputs, whose lag orders satisfy $0 \le n_{yk1} \le \cdots \le n_{yki} \le l_y$ and $0 \le n_{xk1} \le \cdots \le n_{xki} \le l_x$, with $l_x = 10$ and $l_y = 3$. Then, the neural network model inputs can be identified by selecting the most significant polynomial terms using the FRA method. The input selection method is detailed in Algorithm 1.
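To make the candidate pool concrete, the sketch below builds the first-degree lagged terms from which such polynomial terms are formed. It is an illustration only: the function name `build_candidates`, the label format, and the restriction to first-degree terms (no products of lagged signals) are assumptions. The output lags start at 1 here, since the lag-0 output is the quantity being predicted.

```python
def build_candidates(inputs, output, lx=10, ly=3):
    """Build lagged candidate columns for term selection.

    inputs: dict mapping a signal name to its list of samples.
    output: list of output (SOC) samples of the same length.
    Returns (labels, columns, target), all aligned to t = max(lx, ly), ..., N-1
    so that every lag is defined for every row.
    """
    start = max(lx, ly)
    n = len(output)
    labels, columns = [], []
    for name, x in inputs.items():
        for lag in range(0, lx + 1):        # input lags 0 .. lx
            labels.append(f"{name}(t-{lag})")
            columns.append([x[t - lag] for t in range(start, n)])
    for lag in range(1, ly + 1):            # output lags 1 .. ly
        labels.append(f"soc(t-{lag})")
        columns.append([output[t - lag] for t in range(start, n)])
    target = output[start:]
    return labels, columns, target
```

The returned columns can be passed directly to a term-selection routine; the lag orders of the selected labels then give the model inputs.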

Improved RBF Model for the SOC Estimation
This paper aims to develop an accurate yet simple model for battery pack SOC estimation. Deng et al. proposed a two-stage algorithm based on the leave-one-out method [30] to improve the performance of RBF neural networks. The selection procedure is automatically terminated by the predicted-residual-sums-of-squares (PRESS) error, so that the constructed RBF neural model is parsimonious and accurate. In this paper, the FRA method is used instead of the two-stage algorithm for RBF neural network construction, which reduces the modelling complexity. To ensure the accuracy of the model, the particle swarm optimization (PSO) algorithm is used to optimize the kernel parameters.

General RBF Neural Network.
An RBF neural model can be formulated as a linear-in-the-parameters model like equation (1) as follows:

$$y(t) = \sum_{k=1}^{n} \theta_k \varphi_k(\vec{X}(t); c_k; \sigma_k) + \varepsilon(t), \tag{13}$$

where $\varphi_k(\vec{X}(t); c_k; \sigma_k)$ is the radial basis activation function of the hidden nodes, which is often chosen as a Gaussian function, $c_k \in \mathbb{R}^m$ are the centers, and $\sigma_k \in \mathbb{R}^1$ denotes the RBF widths.
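Similar to equation (2), the RBF neural model is formulated in matrix form as follows:

$$\vec{y} = \Phi\Theta + \Xi, \tag{14}$$

where $\vec{y}$, $\Theta$, and $\Xi$ collect the outputs, the linear weights, and the model errors over the $N$ samples, and $\Phi = [\phi_1, \ldots, \phi_n] \in \mathbb{R}^{N \times n}$ is the output matrix of the hidden nodes.

[Figure legend: actual SOC; validation SOC using experience inputs; validation SOC using selected inputs; validation SOC error using experience inputs; validation SOC error using selected inputs.]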
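As a small illustration of this matrix form, the sketch below builds the hidden-node output matrix for Gaussian basis functions and fits the linear weights by least squares. It is a sketch under assumptions, not the paper's code: the Gaussian is taken as $\exp(-\lVert x - c \rVert^2 / (2\sigma^2))$ (other scalings of the width are also common), the normal equations are solved by plain Gaussian elimination, and the names `rbf_design`, `solve`, and `fit_weights` are hypothetical.

```python
import math


def rbf_design(X, centres, widths):
    """Hidden-node output matrix: one row per sample, one column per node."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * s * s))
             for c, s in zip(centres, widths)]
            for x in X]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def fit_weights(Phi, y):
    """Least-squares weights from the normal equations (Phi^T Phi) w = Phi^T y."""
    n = len(Phi[0])
    G = [[sum(r[i] * r[j] for r in Phi) for j in range(n)] for i in range(n)]
    g = [sum(r[i] * yi for r, yi in zip(Phi, y)) for i in range(n)]
    return solve(G, g)
```

Solving the normal equations directly is adequate for a handful of well-separated nodes; for many or strongly overlapping nodes an orthogonal or recursive scheme is numerically safer.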

Improved RBF Neural Model.
The performance of the RBF neural model is related to the number of hidden layer nodes and the kernel parameters. Therefore, the construction of the RBF network can be regarded as an optimization problem which depends on the number of hidden layer nodes, the kernel parameters, and the connection weights. In order to improve the accuracy and real-time performance of Li-ion battery pack SOC estimation, the FRA method is used to establish an accurate and compact RBF neural model.
Using the improved RBF neural model based on the FRA method, the hidden layer nodes are selected according to the net contribution of the hidden layer node output. At the same time, the nonlinear kernel parameters are optimized by the particle swarm optimization method. Particle swarm optimization (PSO) [33] is a nonlinear parameter optimization algorithm based on swarm intelligence, and it has been widely used for nonlinear parameter optimization. Because the method is simple and easy to implement, it is applied to the parameter optimization of the RBF kernel function. According to [30], leave-one-out (LOO) cross-validation and the associated predicted-residual-sums-of-squares (PRESS) error are used as an index to select hidden layer nodes and to automatically stop the selection procedure. The hidden layer nodes are selected with the maximal reduction in PRESS error. Thus, the net contribution is changed to the following equation:

$$J_k = \frac{1}{N} \sum_{t=1}^{N} \left( \frac{e_k(t)}{R_k(t,t)} \right)^2, \qquad k = 1, 2, \ldots, n, \tag{15}$$

where $N$ and $n$ are the number of samples and the maximum number of hidden layer nodes, respectively, and $e_k(t)$ and $R_k(t,t)$ are the model error and the corresponding diagonal element of the matrix $R_k$ defined in equation (6) at time instant $t$, respectively. Based on this net contribution, the improved RBF neural network optimized by the PSO method is shown in Algorithm 2.

[Figure legend: actual SOC; training SOC using the proposed RBF, the general RBF, LSSVM, and the RBF improved by the two-stage algorithm; training SOC error using the proposed RBF, the general RBF, LSSVM, and the RBF improved by the two-stage algorithm.]
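The PRESS index can be computed without refitting the model $N$ times, because the leave-one-out residual at sample $t$ equals the ordinary residual divided by the corresponding diagonal element of the residual matrix. The sketch below illustrates this under stated assumptions: it updates the residual vector and that diagonal recursively as columns are added in a given order, it averages the squared leave-one-out residuals over $N$, and the name `press_path` is hypothetical. It is not the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def press_path(columns, y):
    """PRESS error after adding each column in the given order.

    columns: list of regressor columns (hidden-node outputs), each of length N.
    Returns [J_1, ..., J_n], where J_k is the mean squared leave-one-out
    residual of the model built from the first k columns.
    """
    N = len(y)
    e = list(y)             # residual vector, initially y (no terms selected)
    diag = [1.0] * N        # diagonal of the residual matrix, initially ones
    resid_cols = [list(c) for c in columns]
    path = []
    for k in range(len(columns)):
        p = resid_cols[k]   # k-th column projected onto the residual space
        pp = dot(p, p)
        coef = dot(p, e) / pp
        e = [a - coef * b for a, b in zip(e, p)]
        diag = [d - b * b / pp for d, b in zip(diag, p)]
        for j in range(k + 1, len(columns)):
            cj = dot(p, resid_cols[j]) / pp
            resid_cols[j] = [a - cj * b for a, b in zip(resid_cols[j], p)]
        # Leave-one-out residual at sample t is e(t) / diag(t).
        path.append(sum((ei / di) ** 2 for ei, di in zip(e, diag)) / N)
    return path
```

A selection procedure can stop at the first node whose addition no longer lowers the PRESS value. The sketch assumes linearly independent columns and fewer columns than samples; otherwise a division by zero occurs.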

Battery Pack SOC Estimation.
As mentioned earlier, the battery pack SOC is estimated using the improved RBF neural network. The schematic diagram of the proposed method for battery pack SOC estimation is illustrated in Figure 1.
As shown in Figure 1, there are three parts in the proposed method. In the first part, the inputs are determined from the measurements, including the voltage of the battery cell ($V_{cell}$), the voltage of the battery pack ($V_{pack}$), the terminal current ($I_{cir}$), the temperature of the battery cell ($T_{cell}$), the SOC of the battery cell ($SOC_{cell}$), and the SOC of the battery pack ($SOC_{pack}$). Before the model inputs are determined by the FRA method, the candidate inputs are expanded by finding the maximum, the minimum, and the mean of $V_{pack}$, $T_{cell}$, and $I_{cir}$. Then, the delayed sequences obtained by using the delay operators ($z^{-1}, \ldots, z^{-10}$) are adopted to produce the polynomial terms. Thus, the inputs are selected from the terms in the resultant nonlinear autoregressive moving average with exogenous inputs (NARMAX) model. In the second part, the improved RBF model is trained using the FRA method combined with the PSO method. Finally, the SOC of the battery pack is predicted using the built RBF model, in which the kernel parameters ($\mu$, $\sigma$), the number of hidden layer nodes ($n$), and the weights to the outputs ($\Theta$) are optimized by PSO [33].
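For readers unfamiliar with PSO, a generic bounded particle swarm optimizer can be sketched as below. This is the textbook velocity and position update, not the authors' Algorithm 2: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the velocity limit of half the search range, and the function name `pso` are all assumptions. In the setting of this paper, the objective would map a vector of kernel centres and widths to a model error such as PRESS.

```python
import random


def pso(objective, lower, upper, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective(x) over the box [lower, upper] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    vmax = [0.5 * (u - l) for l, u in zip(lower, upper)]  # assumed velocity bound
    pos = [[rng.uniform(l, u) for l, u in zip(lower, upper)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # best position of each particle
    pcost = [objective(p) for p in pos]
    g = min(range(n_particles), key=pcost.__getitem__)
    gbest, gcost = pbest[g][:], pcost[g]       # best position of the swarm
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[k][d]
                     + c1 * r1 * (pbest[k][d] - pos[k][d])
                     + c2 * r2 * (gbest[d] - pos[k][d]))
                v = max(-vmax[d], min(vmax[d], v))          # clamp velocity
                vel[k][d] = v
                pos[k][d] = max(lower[d], min(upper[d], pos[k][d] + v))
            cost = objective(pos[k])
            if cost < pcost[k]:
                pbest[k], pcost[k] = pos[k][:], cost
                if cost < gcost:
                    gbest, gcost = pos[k][:], cost
    return gbest, gcost
```

For example, `pso(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, [-5, -5], [5, 5])` searches a simple two-dimensional bowl whose minimum lies at (1, -2).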

Simulation Results
We first consider a pack with 216 battery cells of the 18650 type connected in series. Eight battery packs in the same configuration were tested. In these tests, the circuit current, the terminal voltage of the battery pack, the terminal voltage of each individual cell, and the temperature between two battery cells were measured every 1 s. The SOC of the battery pack and of the individual battery cells are all estimated every 1 s by the battery management system. The data collected from a battery pack are often too large to be used to establish the estimation model. Since the ageing of battery capacity can be ignored over a short period, the training samples are selected every 30 s to build the improved RBF model. Then, the model inputs are chosen by FRA for the battery pack SOC estimation.

[Figure legend: actual SOC; validation SOC using the proposed RBF, the general RBF, LSSVM, and the RBF improved by the two-stage algorithm.]

Using the FRA method, the maximum voltage $v_{\max}(t)$ of the battery cells, the minimum voltage $v_{\min}(t)$ of the battery cells, the average voltage of the battery pack $v_{avg}(t)$, the voltage of the battery pack $v(t)$, the mean voltage $v_m(t)$ of the past 10 measurements, the mean current $i_m(t)$ of the past 10 measurements, the mean temperature $tmp_m(t)$ of the past 10 measurements, the circuit current $i(t)$, the average temperature $tmp(t)$ of the battery cells, and the estimated SOC $soc(t)$ are adopted as the inputs and output, respectively. …, $tmp_m(t-8)$, and $soc(t-1)$ are selected. To verify the selected inputs, the improved RBF model for SOC estimation is built using the selected inputs and compared with the model built using inputs selected by experience (trial and error). The performance using the different inputs is shown in Table 1.
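The statistical candidates listed above can be derived from raw measurements with a few lines of Python. The sketch below is illustrative only: the function names `cell_statistics` and `trailing_mean` are hypothetical, and the trailing window is assumed to include the current sample and to shrink at the start of the record, which the text does not specify.

```python
def cell_statistics(cell_voltages):
    """Per-time-step maximum, minimum, and average over the cells.

    cell_voltages: list over time, each entry a list of per-cell voltages.
    """
    v_max = [max(v) for v in cell_voltages]
    v_min = [min(v) for v in cell_voltages]
    v_avg = [sum(v) / len(v) for v in cell_voltages]
    return v_max, v_min, v_avg


def trailing_mean(x, window=10):
    """Mean of the last `window` samples up to and including time t."""
    out = []
    for t in range(len(x)):
        seg = x[max(0, t - window + 1): t + 1]
        out.append(sum(seg) / len(seg))
    return out
```

For instance, `trailing_mean(pack_voltage)` would give a series playing the role of $v_m(t)$, and the same helper applies to the current and temperature signals.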
Table 1 shows the RMSE (root mean square error) and the maximum absolute error. Clearly, the model using the selected inputs performs much better, with the absolute error almost always within ±0.08. The simulations are illustrated in Figures 2 and 3. They show that the SOC estimation is more accurate using the selected inputs, for which the generalization error is less than that using experience inputs.
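The two error measures reported in the tables follow their standard definitions and can be computed as in the short sketch below; the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def max_abs_error(actual, predicted):
    """Largest absolute deviation between prediction and reference."""
    return max(abs(a - p) for a, p in zip(actual, predicted))
```

RMSE summarizes the typical error over the whole record, while the maximum absolute error exposes the single worst sample, so the two together give a fuller picture of estimation quality.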
Then, the proposed model is compared with the conventional RBF method, the general least squares support vector machine (LSSVM), and the RBF neural model improved by the two-stage method (TSS_RBF) [30]. The performance of these methods is shown in Table 2.
According to Table 2, the proposed RBF method took more time than the conventional RBF method to train the model, but the validation RMSE of the proposed RBF model is just half of that of the general RBF model. The LSSVM model takes almost 50 times longer to train than the proposed RBF model, and the RBF neural model improved by the two-stage method also takes almost 50 times longer to train than the proposed RBF model. Meanwhile, the validation RMSE of the proposed RBF model is 0.02% lower than that of the other methods. The simulation results are shown in Figures 4 and 5. It is clear that the proposed RBF model has excellent generalization capability and obtains a more accurate SOC than the other methods.

Conclusions
In order to estimate the SOC of a battery pack accurately, it is necessary to adopt a data-driven method to handle the inconsistencies among the cells in the battery pack. This paper first uses the FRA method to select the input variables to improve the precision of the model, because the input features are important to ensure the accuracy of RBF neural networks. The experimental results show that better SOC estimation results can be achieved when a compact set of model inputs is selected. Then, the FRA method is further used to improve the construction of the RBF neural network for battery pack SOC estimation. The hidden nodes of the RBF neural network are again selected using the FRA method, and the particle swarm optimization algorithm is used to optimize the kernel parameters. The results show that the improved RBF model can achieve high estimation accuracy at an acceptable time cost.

Data Availability
The processed data used to support the findings of this study are included within the article. The data source is provided by the partner of Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, and can be obtained from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.