A Rule-Based Energy Management Strategy Based on Dynamic Programming for Hydraulic Hybrid Vehicles

Energy management strategy is very important for improving the fuel economy of hydraulic hybrid vehicles. Rule-based energy management strategies are widely used in engineering practice due to their simplicity and practicality. However, their performance varies considerably with the choice of parameters and control actions. A rule-based energy management strategy is designed in this paper to realize real-time control of a novel hydraulic hybrid vehicle, and a control parameter selection method based on dynamic programming is proposed to optimize its performance. Firstly, the simulation model of the hydraulic hybrid vehicle is built and validated with data measured on a prototype experimental platform. Based on the simulation model, dynamic programming is used to find the global optimal solution of the engine control for the UDDS drive cycle. Then, the engine control parameters of the rule-based energy management strategy are selected according to the engine control trajectory of the global optimal solution. The simulation results show that the 100 km fuel consumption of the proposed rule-based energy management strategy is 12.7 L, which is very close to the global optimal value of 12.4 L, so the strategy is suboptimal but close to the optimum.


Introduction
Hybrid vehicles have attracted wide attention due to advantages that include improved fuel economy, reduced harmful emissions, and the freedom to be optimized [1][2][3]. Hybrid powertrains come in many types, of which the hydraulic hybrid vehicle (HHV) is superior to the others in power density (power per unit mass), braking energy recovery, roundtrip efficiency of the hydraulic accumulator, and cost [4][5][6]. The energy density and power density of different energy storage units are compared in [7], as shown in Figure 1.
Capacitors and electrochemical energy storage (batteries and fuel cells) have relatively low density and are only marginally good at recovering brake energy. Besides, repeated charging and discharging reduces their lifetime. The hydraulic accumulator is superior to the others because of its high braking energy recovery efficiency. Therefore, HHVs applied to urban transport can greatly improve vehicle fuel economy.
The propulsion power of HHVs comes from an Internal Combustion Engine (ICE), a hydraulic accumulator, or both together. The HHV structure can be classified into series, parallel, or power-split type according to the connection between the mechanical system and the hydraulic system [8]. The series HHV has the advantage of a simple structure. Its engine is decoupled from driving conditions, so it has large freedom to be optimized. In a series HHV, all engine energy is converted into hydraulic energy by the hydraulic pump, and the hydraulic motor uses the hydraulic energy to propel the wheels.
The hydraulic accumulator is an energy buffer that stores the braking energy or assists the engine in meeting the vehicle driving power requirement. On the one hand, the hydraulic accumulator can be charged and discharged over the full range of its operating area with high efficiency. A gas-filled bladder-type accumulator with elastomeric foam can reach an efficiency of 95% [9]. However, its energy density is relatively low. On the other hand, the engine is not efficient over its full operating range. The ICE provides power efficiently only when working in a narrow regime around the minimal Brake Specific Fuel Consumption (BSFC) line of the engine map. An energy management strategy [10] is used to control the energy distribution between the engine and the accumulator. The optimization aims to ensure that the engine and the accumulator both work at their best efficiency at the same time. Therefore, a good energy management strategy is needed to realize the optimization.
The energy management strategy directly affects the engine working state, power performance, and economic performance of HHVs and has always been the core of hybrid technology research. In general, strategies can be divided into three categories: rule-based strategies, instantaneous optimization strategies, and global optimal strategies [11].
If the driving condition is known a priori, the global optimal solution can be obtained using dynamic programming (DP). The DP optimization considers the system state not only at the previous moment but also at the next moment as an optimization condition [12]. This "predictability" requires a known driving cycle, so it is limited in practical applications. DP is not a real-time optimal method for a control system because all the disturbances must be known in advance. However, DP yields the global optimal solution and can be used to assess the performance of other energy management strategies, so the global optimal solution is worth studying. Reference [13] studied optimal transmission shifting and power splitting factors of engines and hydraulic motors for HHVs using DP. The global optimal solution based on DP is optimal offline but is not an implementable solution.
Rule-based energy management strategies, also called logical threshold control strategies, are widely used in engineering practice due to their simplicity and practicality. Heuristic rules or fuzzy logic rules are used to determine the control variable output according to pre-set conditions. They mainly include state machine control, threshold control, and power tracking control, which ensure that the main components work in the most efficient area. Several different rule-based energy management strategies were compared and analyzed under Japan's 10-15 operating cycle [14], and the thermostatic energy management strategy was found to be superior to the others. Zoran Filipi et al. [15][16][17][18] detailed system analysis. Regarding parameter setting, the conventional wisdom is that the engine operates at the "sweet spot," which lacks a mathematical basis and cannot exploit the full potential of HHVs. Therefore, the results of rule-based energy management strategies still differ considerably from the global optimal solution.
In this paper, a rule-based energy management strategy is designed to realize real-time and suboptimal control of a novel hydraulic hybrid vehicle. DP optimization is used to determine the engine control parameters of the rule-based energy management strategy. In Section 2, a Hydraulic Hybrid Wheel Drive Vehicle (HHWDV) simulation model is built and verified on the designed experimental platform. In Section 3, DP optimization is applied to find the global optimal solution for preselected drive cycles. In Section 4, the global optimal solution based on DP is obtained under the UDDS cycle. In Section 5, a rule-based energy management strategy is designed and its control parameters are selected according to the engine control trajectory of the global optimal solution. Finally, the rule-based energy management strategy is compared with the global optimal solution.

System Description.
A Hydraulic Hybrid Wheel Drive Vehicle (HHWDV) is proposed in this paper, as shown in Figure 2. Strictly speaking, this new type of HHWDV belongs to the series type of HHVs. Therefore, it possesses all the advantages of a series HHV. The HHWDV decouples the engine from driving conditions. The engine therefore has large freedom to improve its working performance, which leaves room for an energy management strategy. When driving, the engine output energy is converted into hydraulic energy by the hydraulic pump, and the hydraulic energy is converted into mechanical energy by the wheel hydraulic motors. When braking, the vehicle kinetic energy is recovered by the wheel hydraulic motors and stored in the high-pressure accumulator. The energy distribution between the engine and the accumulator is decided by the designed energy management strategy.

System Modelling.
The HHWDV is divided into three parts: body, driver, and control system. In a real situation, the driver only operates the accelerator pedal, the brake pedal, and the gear selection (forward, neutral, or reverse). The body of the HHWDV mainly consists of the engine, hydraulic pump, wheel hydraulic motors, hydraulic accumulator, and vehicle physical components. Their simulation models can be built from the corresponding mathematical equations.
The control system is the core component of the HHWDV, and it consists of supervisory controller, engine controller, and motor controller.
The supervisory controller implements the energy management strategy. According to the chosen energy management strategy, the engine control parameters are determined from the running situation of the HHWDV at each time.
The engine controller ensures that the engine always works at the desired working points. The engine is controlled to work near the minimal Brake Specific Fuel Consumption (BSFC) line all the time. Engine speed is controlled by adjusting the displacement of the hydraulic pump. The actual engine speed, which equals the hydraulic pump speed, tracks the aimed engine speed. Therefore, the engine can be controlled all the time.
The motor controller is also called the vehicle speed controller. The wheel hydraulic motor is a bidirectional variable displacement motor. When braking, the hydraulic motor works as a pump to recover the brake energy. The displacement of the hydraulic motor is determined by the error between the aimed vehicle speed and the actual vehicle speed. The aimed vehicle speed is obtained from the driver, and the actual vehicle speed comes from the vehicle body. Vehicle speed is controlled entirely by this motor controller to follow the driver's operation.
As discussed above, a simulation model of the HHWDV is built as shown in Figure 3. The designed energy management strategy can be applied to the control system model. The system parameters are determined according to a conventional prototype vehicle, as shown in Table 1.

Model Validation.
According to the structure of the HHWDV and its working principles, a hydraulic system experimental platform is designed. Based on the system parameters in Table 1, components are selected to build the physical experimental platform, as shown in Figure 4.
The self-designed experimental platform is a semiphysical simulation test bench. The flywheel is used to simulate the vehicle inertia. The magnetic powder brake is used to simulate the vehicle driving resistance. The electric motor simulates the actual engine operating points. The control system of the experimental platform is developed based on a Digital Signal Processor (DSP). The experimental platform and the simulation model are given the same initial conditions to verify the accuracy of the simulation model.
When the wheels are driven by the high-pressure accumulator independently, the initial conditions are as follows: electric motor and engine are both off, the pressure of the high-pressure accumulator is 19 MPa, the flywheel speed is 0 rpm, and the displacement ratio of the wheel hydraulic motor is 1. Simulation and experiment results are compared in Figures 5 and 6.
The experimental and simulated curves in Figures 5 and 6 nearly match. When the wheels are driven by the high-pressure accumulator alone, the variation of the accumulator pressure with time is almost the same, and the variations of the wheel motor speed and torque also match. Therefore, the simulation model is shown to be accurate.

The Mathematical Modelling Based on Dynamic Programming
The energy distribution between the engine and the hydraulic accumulator is the key of an energy management strategy. This energy distribution problem can be seen as an optimal control problem. The desired solution minimizes fuel consumption for the given drive cycles while meeting the performance demands. Dynamic programming (DP) is a global optimal method, and it is applied to the HHWDV to find the global optimal solution of the engine control with minimal fuel consumption.

DP Optimization Method.
DP is a numerical method for solving optimal control problems based on Bellman's Principle of Optimality [19], which states: "An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision." A complex multistage optimal problem can thus be divided into a series of single-stage optimal problems. Each single-stage problem is solved optimally, and the cost function is minimized by the sequence of decisions taken at each step.
A dynamic system can be expressed as a discrete-time system:
\[ x_{k+1} = f(x_k, u_k, d_k), \]
where $x_k$ is the state vector, $u_k$ is the control vector, and $d_k$ is the disturbance vector. The state vector describes the system state with a particular variable. In this paper, the accumulator SOC is chosen as the state variable. The control vector is limited to the control inputs allowable under the corresponding state variable. The aimed engine power is chosen as the control variable in this paper. The disturbance vector must be known for all steps. The vehicle speed profile of the selected driving cycle is treated as a known disturbance.
During the solution, $x_k$ and $u_k$ are discretized in time and space. The dimension of the state vector and control vector directly influences the computation time. Besides, the dimension decides the computational accuracy and reliability. Therefore, the dimension is chosen as a compromise between computation time and accuracy. The optimal sequence of control vectors minimizes the chosen cost function. For DP, the original problem is divided into N stages and the cost function takes the following form:
\[ J = g_N(x_N) + \sum_{k=0}^{N-1} L_k(x_k, u_k, d_k), \]
where $L_k$ is the cost of stage $k$ and $g_N$ is the terminal cost. The problem is solved backward in stages.
Step $N$:
\[ J_N^*(x_N) = g_N(x_N). \]
At this last step, if the constraint is not satisfied, an infinite penalty is added; otherwise, the penalty is equal to zero.
Step $N-1$:
\[ J_{N-1}^*(x_{N-1}) = \min_{u_{N-1}} \left[ L_{N-1}(x_{N-1}, u_{N-1}, d_{N-1}) + J_N^*(x_N) \right]. \]
Step $k$, for $0 \le k < N-1$:
\[ J_k^*(x_k) = \min_{u_k} \left[ L_k(x_k, u_k, d_k) + J_{k+1}^*(x_{k+1}) \right]. \]
Since $x_k$ and $u_k$ are discretized into finite sets of possible values, the cost function $J_k^*(x_k)$ is also evaluated only at discrete states. Between these states, the cost function is interpolated from the cost values at the discrete states.

The Mathematical Model of the HHWDV.
According to the known vehicle speed, the demand vehicle power at each step is calculated. The aimed engine power is the control input in this paper. The accumulator power is then calculated from the power balance of the HHWDV, in which the efficiency of the hydraulic pump and the wheel hydraulic motors is ignored. The engine universal characteristics function is obtained from a large amount of dynamometer test data, as shown in Figure 7. For a given aimed engine power, the engine speed and the engine torque are interpolated on the minimal BSFC line, for which a third-order polynomial is fitted to the test data. The calculated engine speed and torque then serve as the desired working point, which is achieved by the engine controller.
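To illustrate how an aimed engine power can be mapped to a desired working point, the following minimal Python sketch fits a third-order polynomial of engine speed against engine power and derives the torque from the power relation. The sample points, the choice of fitting speed as a function of power, and the function name `operating_point` are assumptions made for illustration; they are not the test data or the fitted variables of this paper.

```python
import numpy as np

# Hypothetical points on a minimal-BSFC line: engine power (kW) -> speed (rpm).
POWER_KW = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
SPEED_RPM = np.array([1200.0, 1400.0, 1600.0, 1800.0, 2000.0, 2300.0, 2600.0, 3000.0])

# Third-order polynomial fit of speed against power (assumed fitting variables).
COEFFS = np.polyfit(POWER_KW, SPEED_RPM, 3)


def operating_point(p_aim_kw):
    """Return (speed in rpm, torque in N*m) for an aimed engine power in kW."""
    speed_rpm = float(np.polyval(COEFFS, p_aim_kw))
    # Power relation: P[kW] = T[N*m] * n[rpm] / 9549.3
    torque_nm = 9549.3 * p_aim_kw / speed_rpm
    return speed_rpm, torque_nm


speed, torque = operating_point(30.0)
print(f"{speed:.0f} rpm, {torque:.1f} N*m")
```

The torque is derived from the fitted speed rather than fitted separately, so the returned working point always delivers exactly the aimed power.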
At each actual engine operating point, the engine fuel consumption is estimated by an engine regression model. The specific fuel consumption is a function of engine speed and engine torque, described by a regression function characterized by its order number and coefficients. The DP method is used to minimize the engine fuel consumption over the whole drive cycle by offline calculation. The specific fuel consumption is assumed to be constant during the step interval $\Delta t$, and the fuel consumption in each $\Delta t$ is calculated in (13). The cost function of engine fuel consumption is expressed as (14).
The accumulator SOC is chosen as the state variable $x$. For the accumulator, the Boyle-Mariotte law is the basic equation:
\[ p_0 V_0^n = p V^n, \tag{15} \]
where $p_0$ is the pre-charge gas pressure (MPa); $V_0$ is the corresponding volume under the pre-charge gas pressure (L); $p$ is the working pressure (MPa); $V$ is the corresponding volume under the working pressure (L); and $n$ is the air polytropic exponent ($n = 1 \sim 1.4$; isothermal conditions: $n = 1$; adiabatic conditions: $n = 1.4$).
The accumulator flow rate equals the rate of change of the accumulator volume with time.
The accumulator SOC is defined as
\[ SOC = \frac{p - p_{\min}}{p_{\max} - p_{\min}}, \tag{17} \]
where $p$ is the pressure of the accumulator; $p_{\max}$ is the maximal working pressure of the accumulator; and $p_{\min}$ is the minimal working pressure of the accumulator. According to this definition, the accumulator SOC varies from 0 to 1. When the accumulator power is considered [13], the power variation ratio function is obtained from (15)-(17).
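The accumulator relations can be sketched in a few lines of Python. This is a minimal sketch, assuming the SOC is the working pressure normalized linearly between the minimal and maximal working pressures (which is what the 0 to 1 range implies) and that the gas follows the polytropic Boyle-Mariotte relation. The function names and all numeric values are hypothetical and are not the parameters of the HHWDV.

```python
def accumulator_soc(p, p_min, p_max):
    """SOC from pressure, assuming linear normalization between p_min and p_max."""
    if p_max <= p_min:
        raise ValueError("p_max must exceed p_min")
    soc = (p - p_min) / (p_max - p_min)
    # Clip to the physical range so that sensor noise cannot leave [0, 1].
    return min(1.0, max(0.0, soc))


def gas_volume(p, p0, v0, n=1.4):
    """Gas volume at pressure p from p0 * v0**n = p * v**n (polytropic law)."""
    return v0 * (p0 / p) ** (1.0 / n)


# Hypothetical values: pressures in MPa, volume in L.
print(accumulator_soc(p=20.0, p_min=10.0, p_max=30.0))
print(gas_volume(p=20.0, p0=10.0, v0=40.0, n=1.0))
```

With n = 1 the relation reduces to the isothermal case, in which doubling the pressure halves the gas volume; larger exponents give a smaller volume change for the same pressure ratio.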

The Optimization Function Based on DP.
The original optimal problem must be discretized to apply the DP algorithm. If the drive cycle of duration T is divided into N equal increments $\Delta t$, all the equations discussed above can be discretized.
The discretized state equation is also called the state transition equation. The state at step $k+1$ is determined by the control variable and the state at step $k$.
The system mathematical model equations are discretized in the same way. For a real engineering optimal problem, the physical constraints must also be considered. If one of these constraints is violated, an infinite penalty is added to the cost function to exclude the undesirable solutions.

The Realization Based on Matlab.
Considering the compromise between computation time and accuracy, the variables are discretized as listed in Table 2. At each step k, every allowable control vector is evaluated for each available state vector to find the optimal policy with minimal cost function. A backward solution is used to solve the DP problem: it is solved from k = N-1 backward to k = 0, and the optimal decision is obtained at each step.
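The backward solution can be illustrated with a minimal Python sketch on a toy problem. It is not the Matlab implementation of this paper: the fuel-rate function, the linear SOC transition, the accumulator energy capacity `e_cap`, the grids, and the four-step demand profile are all hypothetical, and a large finite penalty `BIG` stands in for the infinite penalty so that the cost-to-go can be interpolated safely.

```python
import numpy as np

BIG = 1e6  # large finite penalty standing in for the "infinite" penalty


def fuel_rate(pe_kw):
    """Toy fuel rate in g/s (hypothetical): fixed overhead plus a linear term."""
    return 0.0 if pe_kw == 0 else 1.0 + 0.1 * pe_kw


def solve_dp(p_req, soc_grid, pe_grid, dt, e_cap, soc_final):
    """Backward DP over the SOC grid; returns cost-to-go J and policy U."""
    n = len(p_req)
    J = np.full((n + 1, len(soc_grid)), BIG)
    U = np.zeros((n, len(soc_grid)))
    # Step N: zero terminal cost only at the required final SOC.
    J[n, np.isclose(soc_grid, soc_final)] = 0.0
    for k in range(n - 1, -1, -1):  # k = N-1, ..., 0
        for i, soc in enumerate(soc_grid):
            for pe in pe_grid:
                # Toy state transition: surplus engine energy charges the accumulator.
                soc_next = soc + (pe - p_req[k]) * dt / e_cap
                if soc_next < soc_grid[0] - 1e-9 or soc_next > soc_grid[-1] + 1e-9:
                    continue  # SOC constraint violated: control is excluded
                # Cost-to-go between grid points is linearly interpolated.
                cost = fuel_rate(pe) * dt + np.interp(soc_next, soc_grid, J[k + 1])
                if cost < J[k, i]:
                    J[k, i], U[k, i] = cost, pe
    return J, U


def simulate(U, p_req, soc_grid, soc0, dt, e_cap):
    """Forward pass: apply the stored policy from the nearest grid state."""
    soc, fuel, trace = soc0, 0.0, []
    for k, p in enumerate(p_req):
        pe = U[k, np.argmin(np.abs(soc_grid - soc))]
        fuel += fuel_rate(pe) * dt
        soc += (pe - p) * dt / e_cap
        trace.append(float(pe))
    return soc, fuel, trace


soc_grid = np.linspace(0.0, 1.0, 11)
p_req = [10.0, 10.0, 10.0, 10.0]  # kW, known disturbance
J, U = solve_dp(p_req, soc_grid, [0.0, 10.0, 20.0], dt=1.0, e_cap=100.0, soc_final=0.5)
soc_end, fuel, trace = simulate(U, p_req, soc_grid, 0.5, 1.0, 100.0)
print(trace, fuel, soc_end)
```

Because the toy fuel rate has a fixed overhead whenever the engine runs, the optimal policy runs the engine for fewer steps at higher power and lets the accumulator cover the rest, which mirrors the intermittent engine operation expected from a series hybrid.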
Mathematical Problems in Engineering
The established optimization functions are coded in the M programming language in the Matlab environment. The speed profile of the selected driving cycle is treated as a known disturbance, and all the variables are discretized according to Table 2. The optimal decision sequence and the minimal cost function value are then computed offline by the programmed code. The obtained optimal decision sequence is the aimed engine power, which is the input of the system simulation model. The minimal fuel consumption of this global optimal solution can be verified by the simulation results.

The Global Optimal Solution of the UDDS Cycle
DP is used to find the global optimal solution of the engine control. The global optimal aimed engine power is obtained from DP. This global optimal solution is calculated offline and should be validated by simulation. The Urban Dynamometer Driving Schedule (UDDS) is chosen for the validation. This cycle simulates an urban route with frequent starts and stops for light-duty vehicles, and the HHWDV is also intended for urban conditions. The global optimal solution is applied to the supervisory controller, and the simulation results are analyzed as follows.
Vehicle speed is controlled by motor controller during the whole UDDS cycle.The actual vehicle speed of the global optimal solution based on DP is compared with the standard UDDS speed profile, as shown in Figure 8.
According to the simulation results, the aimed vehicle speed profile is tracked closely by the global optimal solution. The maximal error between the aimed and actual vehicle speed is 0.51%. Therefore, the designed motor controller is able to realize the drive and brake integration. The global optimal solution based on DP is shown to meet the basic vehicle drive requirements.
Accumulator SOC variation with time under the UDDS cycle is shown in Figure 9.
The initial SOC equals the final SOC; both are 0.5. This ensures that the net energy stored in the accumulator over the cycle is zero. Besides, the accumulator SOC uses the full range of 0 to 1 during the cycle, while staying within 0.3 to 0.7 most of the time. The disadvantage of low energy density is overcome by proper energy distribution. The accumulator can be charged and discharged over the full range of its working area with high efficiency, and the braking energy is recovered efficiently. Therefore, the advantage of hydraulic hybrids is fully achieved under the global optimal solution, and the fuel economy can be improved greatly.
The aimed engine power is the control variable in the DP method. Besides, its value is an important intermediate variable that shows the engine working status. The variation of the required vehicle power and the aimed engine power is shown in Figure 10. The engine works intermittently during the UDDS cycle. When braking, the required vehicle power is negative and the aimed engine power is equal to zero, so the braking energy is recovered by the accumulator. This is characteristic of HHVs. The engine mostly works between 30 kW and 60 kW, where engine efficiency and fuel economy are high, as shown in the engine universal characteristic map.
The actual engine operating points are controlled by the engine controller. When the aimed engine power is zero, the engine idles. The engine is designed to work around the minimal BSFC line all the time, and the actual engine working points are plotted in Figure 11. The results show that the engine works intermittently in its best performance area during the cycle. The actual engine working points are near the minimal BSFC line.
The actual fuel economy potential of the global optimal solution for the HHWDV is evaluated with the simulation model. The global optimal solution of this specific problem is obtained, and the minimal fuel consumption is 1461.1 mL under the UDDS cycle, as shown in Figure 12. The UDDS cycle lasts 1370 s and covers 11.8 km. The fuel consumption per 100 kilometers is therefore 12.38 L.
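The conversion from the cycle fuel consumption to the consumption per 100 km is simple arithmetic, shown here as a short Python check. The function name `fuel_per_100km` is chosen for illustration only; the input values are the ones reported above.

```python
def fuel_per_100km(fuel_ml, distance_km):
    """Convert fuel used over a cycle (mL) into litres per 100 km."""
    return fuel_ml / 1000.0 / distance_km * 100.0


# Global optimal solution under the UDDS cycle: 1461.1 mL over 11.8 km.
print(round(fuel_per_100km(1461.1, 11.8), 2))  # 12.38
```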

Rule-Based Energy Management Strategy
The global optimal solution is obtained offline and is not a real-time method, but its result can serve as a reference for other strategies. In particular, it can be used to evaluate the quality of other energy management strategies.

Rule-Based Energy Management Strategy Design.
An implementable rule-based strategy is designed as shown in Figure 13. This strategy is also called the thermostatic energy management strategy, and it can be applied to the supervisory controller.
Accumulator SOC is the only state variable that determines the aimed engine power [20], because the engine is no longer coupled to vehicle driving conditions in the HHWDV. The overall work area is divided into three areas according to the accumulator SOC. In area A, the accumulator SOC is high, so the engine idles or shuts down and the HHWDV is driven by the accumulator alone. This is called pure hydraulic drive. Area B is the thermostatic control part, in which the engine either works or not. If SOC declines to the lower limit, the engine works at a pre-set threshold power, and the surplus energy is stored in the hydraulic accumulator. If SOC rises to the upper limit, the engine idles or shuts down. In area C, the engine always works. Engine output power is first used to meet the drive resistance power, and the remaining engine power is used to charge the high-pressure accumulator. Importantly, if the resistance power is larger than the engine output power, the high-pressure accumulator assists by supplying high pressure to the hydraulic system.

Parameters Selection Based on the Engine Control Trajectory of the Global Optimal Solution.
The engine control parameters of the rule-based energy management strategy are not easy to determine, and the performance of the rule-based energy management strategy suffers considerably from an improper choice of parameters and control actions. The SOC limit values and the engine power of this control strategy are all pre-set according to the engine control trajectory of the global optimal solution. Based on the analysis of the global optimal solution, the parameters of the rule-based energy management strategy are determined. According to the features of the optimal engine control trajectory from DP, the SOC lower limit is 0.3 and the upper limit is 0.7. The threshold power of the engine is 30 kW and the maximal power is 80 kW.
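The thermostatic rule can be sketched as a small state machine in Python. This is a minimal sketch, not the controller of this paper: it uses the SOC limits of 0.3 and 0.7 and the engine powers of 30 kW and 80 kW selected above, but the class name `ThermostatRule`, the hysteresis flag, and in particular the area C rule (follow the demand power, bounded by the threshold and maximal power) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThermostatRule:
    soc_low: float = 0.3           # SOC lower limit
    soc_high: float = 0.7          # SOC upper limit
    p_threshold_kw: float = 30.0   # engine threshold power
    p_max_kw: float = 80.0         # engine maximal power
    engine_on: bool = False        # hysteresis state kept between calls

    def step(self, soc, p_req_kw):
        """Return the aimed engine power (kW) for the current SOC and demand."""
        if soc >= self.soc_high:       # area A: accumulator drives alone
            self.engine_on = False
        elif soc <= self.soc_low:      # lower limit reached: engine starts
            self.engine_on = True
        # Between the limits (area B) the previous on/off state is kept.
        if not self.engine_on:
            return 0.0
        if soc < self.soc_low:         # area C (assumed rule): follow the demand
            return min(self.p_max_kw, max(self.p_threshold_kw, p_req_kw))
        return self.p_threshold_kw     # area B: constant threshold power


rule = ThermostatRule()
for soc, p_req in [(0.5, 20.0), (0.3, 20.0), (0.5, 20.0), (0.7, 20.0), (0.2, 50.0)]:
    print(soc, p_req, rule.step(soc, p_req))
```

Keeping the on/off state between calls is what makes the strategy thermostatic: inside area B the engine does not toggle at every step but keeps charging until the upper limit is reached, then stays off until the lower limit is reached.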

Comparison with the Global Optimal Solution.
Based on the simulation model, the rule-based energy management strategy is compared with the global optimal solution. The comparison results are shown in Figure 14.
The rule-based energy management strategy is shown to be close to the global optimal solution under the UDDS cycle. The aimed engine power comparison shows that the engine mostly works in the area where its specific fuel consumption is small, and that the rule-based strategy has more engine start-stop events than the optimal one. The accumulator is fully charged in area A, and its SOC is limited to the maximal value of 1. The excess hydraulic oil flows back to the low-pressure reservoir through the accumulator safety valve. This causes the hydraulic oil temperature to rise rapidly, and the mechanical brake system should intervene at once. It also wastes energy and should be avoided. Such a situation does not occur in the global optimal solution. The fuel consumption of the rule-based energy management strategy is 1499.1 mL, which is close to the optimal value of 1461.1 mL during the UDDS cycle. The corresponding fuel consumption per 100 km is shown in Table 3. The effective braking energy recovery and the optimized engine operation result in the fuel economy improvement. Therefore, the rule-based energy management strategy is a suboptimal solution.
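As a worked check of how close the two solutions are, the reported cycle fuel consumptions and the 11.8 km length of the UDDS cycle give the following figures. The relative gap of about 2.6% is derived here from the reported values and is not a number stated in the results.

```latex
\[
\frac{1499.1\ \text{mL}}{11.8\ \text{km}} \times \frac{100}{1000} \approx 12.70\ \text{L per 100 km},
\qquad
\frac{1499.1 - 1461.1}{1461.1} = \frac{38.0}{1461.1} \approx 2.6\%.
\]
```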

Conclusions
Analyzing the global optimal solution based on DP is an efficient method to determine the control parameters of a rule-based energy management strategy. The engine control parameters of the designed rule-based energy management strategy are selected according to the engine control trajectory of the global optimal solution. The simulation results indicate that the 100 km fuel consumption of the proposed rule-based energy management strategy is 12.7 L, which is very close to the global optimal value of 12.4 L. Therefore, the rule-based energy management strategy is implementable in real time and close to the global optimal solution. The effective braking energy recovery and the optimized engine operation result in the fuel economy improvement.
In the future, the designed rule-based energy management strategy will be validated by prototype experimental platform.Other energy management strategies will be compared and analyzed, such as model predictive control (MPC) and instantaneous consumption minimization strategy.

Figure 1: Energy density and power density comparison.

Figure 3: The simulation model of the HHWDV.

Figure 6: The variation comparison of motor speed and torque.

Figure 8: Vehicle speed validation under the UDDS cycle.

Table 3: Performance comparison with the global optimal solution.