A fuel cell is a complex system that produces electricity through an electrochemical reaction. To apply control strategies to a fuel cell formally, a precise dynamic model of it is essential. In this paper, a dynamic model of a real hydrogen fuel cell is obtained to predict its response. The data used to obtain the model were acquired from a real fuel cell subjected to different load patterns by means of a programmable electronic load. Using these data, a nonlinear model based on a hybrid intelligent system is obtained. This hybrid model uses artificial neural networks to predict the output current of the fuel cell very precisely. The hybrid scheme improves the performance of the neural networks, halving the mean squared error obtained with a global model of the fuel cell.

The problems derived from pollution, and the increasingly alarming climate change, have led modern society to look for clean energy sources. One of the most promising technologies for accomplishing hybrid energy topologies based on renewable sources centers on hydrogen, because it can be generated by electrolysers and then stored. Subsequently, the generation of electrical energy from this gas by fuel cells is entirely feasible [

A fuel cell is a complex system consisting of a series connection of individual cells (a stack), where the electric current is produced by an electrochemical reaction, combined with all other systems necessary for its operation, that is, filters and systems that condition the gases involved in the reaction (

A fuel cell behaves as a nonlinear dynamical system, which generates electrical energy through an electrochemical reaction. The energy generated by the fuel cell is not regulated; thus, a control system is necessary for its efficient use [

It is important to be able to predict the behaviour of fuel cells for their efficient use; hence, obtaining an accurate model is a very important task before designing a control strategy. Achieving an accurate model of a system is a fundamental part of its study; however, we do not always have enough information to obtain an acceptable mathematical model. Therefore, we must resort to modelling techniques based on input–output data [

For the current prediction in this paper, several regression techniques were checked. The algorithms based on multiple regression analysis are accepted regression methods used in several applications [

This work is divided into the following sections. After this introduction, the case study is described in detail. Afterwards, the model approach and the employed algorithms are presented. The results section explains the best configuration achieved for the hybrid model and the validation of the resulting prediction model. Lastly, conclusions are presented and future work is outlined.

A single fuel cell of a PEM stack consists of an electrolyte layer in contact with an anode and a cathode on both sides; see Figure

Fuel cell diagram.

The data used for the realisation of the model were acquired through laboratory tests of an air-cooled polymer electrolyte fuel cell (AC-PEFC). Specifically, a PEM FCgen-1020AVS stack from Ballard was used [

Stack + BoP to integrate the fuel cell.

Laboratory implementations to test the fuel cell.

The model proposal implemented in this work is illustrated in Figure

Basic schematic model.

The internal layout of the model, with the previous variable values mentioned above, is shown in Figure

Model approach to predict actual current value.

Figure

Internal schematic of the hybrid model.

To create a hybrid model, the modelling process could be divided into the following steps (Figure

Clustering phase

Regression modelling phase for each cluster

Selection of the best local model (by cluster)

Selection of the best topology for the hybrid model

Flowchart of the hybrid model creation phases.
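Once these phases are complete, the hybrid model consists of a set of centroids and one local regression model per cluster. The following is a minimal sketch of how a prediction could then be routed, assuming Euclidean distance for the cluster assignment. The names `nearest_centroid` and `hybrid_predict`, the toy centroids, and the two placeholder local models are hypothetical and are not taken from the paper.

```python
import math

def nearest_centroid(sample, centroids):
    """Return the index of the centroid closest to the sample (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(sample, centroids[i]))

def hybrid_predict(sample, centroids, local_models):
    """Route the sample to the local model of its cluster and evaluate it."""
    cluster = nearest_centroid(sample, centroids)
    return local_models[cluster](sample)

# Toy illustration: two clusters, each with a placeholder local model
# (in the paper each local model is a trained ANN).
centroids = [(0.0, 0.0), (10.0, 10.0)]
local_models = [
    lambda x: 2.0 * x[0],    # stand-in for the model of cluster 0
    lambda x: x[0] + x[1],   # stand-in for the model of cluster 1
]

low = hybrid_predict((1.0, 0.5), centroids, local_models)
high = hybrid_predict((9.0, 11.0), centroids, local_models)
```

In this sketch only one local model is evaluated per sample, so the cost of a prediction does not grow with the number of clusters beyond the distance computation.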

For the clustering phase, the k-means algorithm was used to obtain groups with similar features. For the regression modelling phase, the ANN algorithm was chosen because of its capacity to predict the output of nonlinear systems with a simple internal configuration. Although this is a hybrid system, the model achieves better results if the regression algorithms are intelligent systems than if they are traditional regression methods. The regression modelling phase uses k-fold cross-validation to obtain a more realistic measurement of the model performance. The k-fold testing method is explained in Figure

K-Fold training and test data selection.
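The k-fold selection of training and test data can be sketched as follows. This is an illustrative example only: the number of folds is not stated in this section, so `k=5`, the shuffling seed, and the function name `k_fold_indices` are assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Assumed values for illustration: 10 samples, 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each model configuration would be trained on `train_idx` and scored on `test_idx` for every fold, and the fold scores combined into a single performance figure.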

As shown in Figure

For the hybrid model topology definition, the number of clusters must be determined. This choice is made based on the global error, which is estimated as a weighted error that takes into account the number of samples in each cluster. The best hybrid configuration is the one with the best overall performance.
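A minimal sketch of this weighted error follows, assuming it is the sample-weighted mean of the per-cluster MSE values. The function name `weighted_mse` is hypothetical. The inputs are the cluster sizes and per-cluster MSE values reported for the three-cluster hybrid in the results tables, and under this assumption the result rounds to the 0.0030 reported there for that configuration.

```python
def weighted_mse(cluster_sizes, cluster_mses):
    """Global MSE of a hybrid model: per-cluster MSE weighted by sample count."""
    total = sum(cluster_sizes)
    return sum(n * e for n, e in zip(cluster_sizes, cluster_mses)) / total

# Values reported in the results section for the hybrid model with 3 clusters.
sizes_h3 = [47719, 50414, 56742]
mses_h3 = [0.0012, 0.0046, 0.0030]

hybrid3 = weighted_mse(sizes_h3, mses_h3)
print(round(hybrid3, 4))  # 0.003
```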

The k-means method is used to create a certain number of groups in an unlabeled data set. The idea is to place centroids in the corresponding hyperspace, so that the data belonging to the same centroid have similar characteristics and represent a data cluster [

Every new sample, once the centroids are trained and correctly placed in the hyperspace, is compared with them and is associated with the centroid that is closest in terms of the chosen distance, usually the Euclidean [

This algorithm has an initial training phase that needs to know the number of clusters to divide the data. This phase could be slow depending on the number of groups and the data size; however, once the training is finished, the cluster assignment is very fast for new data [

The initial location of the centroids is chosen randomly. Then, the location varies, until reaching the greatest distance between them, according to the following procedure:

Each sample is associated with the nearest centroid and is included in a specific list.

Once every sample has been checked and added to the list of its corresponding centroid, the complete list of labelled samples is available.

The location of each centroid is recalculated as the center of the set of samples that have been associated with it.

The procedure is repeated until the centroids are no longer displaced in the successive calculations.

Moreover, as the initial centroids are randomly selected, the procedure is repeated several times until the largest distance between centroids is reached.
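The procedure above can be sketched in plain Python as follows. This is an illustration, not the authors' implementation. Two points are assumptions: the "largest distance between centroids" criterion is interpreted here as the largest minimum pairwise distance between centroids, and the toy samples are invented. Many k-means implementations instead keep the restart with the lowest within-cluster sum of squares.

```python
import math
import random

def assign(samples, centroids):
    """Label each sample with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(s, centroids[c]))
            for s in samples]

def recompute(samples, labels, centroids):
    """Move each centroid to the mean of the samples assigned to it."""
    updated = []
    for c, old in enumerate(centroids):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        if not members:  # empty cluster: keep the previous position
            updated.append(old)
            continue
        updated.append(tuple(sum(m[d] for m in members) / len(members)
                             for d in range(len(old))))
    return updated

def k_means(samples, k, rng, max_iter=100):
    """One k-means run from a random initial placement of the centroids."""
    centroids = [tuple(s) for s in rng.sample(samples, k)]
    for _ in range(max_iter):
        labels = assign(samples, centroids)
        updated = recompute(samples, labels, centroids)
        if updated == centroids:  # centroids are no longer displaced
            break
        centroids = updated
    return centroids, assign(samples, centroids)

def min_centroid_gap(centroids):
    """Smallest pairwise distance between centroids (requires k >= 2)."""
    return min(math.dist(a, b)
               for i, a in enumerate(centroids) for b in centroids[i + 1:])

def k_means_restarts(samples, k, restarts=20, seed=0):
    """Repeat k-means and keep the run whose centroids are furthest apart."""
    rng = random.Random(seed)
    runs = [k_means(samples, k, rng) for _ in range(restarts)]
    return max(runs, key=lambda run: min_centroid_gap(run[0]))

# Invented toy data: two well-separated groups of three points each.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = k_means_restarts(samples, k=2)
```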

The ANN is an intelligent algorithm that uses small processing units called neurons. These neurons are interconnected through links, and each one calculates a function of its inputs. Each input to a neuron has its own weight in the activation function inside the neuron [

The main specific characteristics of ANNs are that they can learn from experience through the generalization of cases [

The activation function defines the new state, or output, of the neuron as level of excitation [

The topology, or architecture, of an ANN is determined by the organization of the neurons, their arrangement, and their connections. The architecture depends on four main parameters: the number of layers in the system, the number of neurons in each layer, the connectivity between neurons, and the activation functions [

The basic structure to interconnect neurons is the multilayer perceptron. This type of ANN is organized in several layers: input, intermediate or hidden and output. A layer is a set of neurons whose input information comes from the same source: the inputs of the ANN for the input layer or the previous layer for the rest of the layers. The output of the neurons in the same layer has the same destination too: the next layer or the output of the ANN (in the case of the output layer).

Normally, the output layer neurons use special activation functions depending on the use of the ANN; for regression, the typical choice is the linear function.
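As an illustration of this structure, the forward pass of a multilayer perceptron with one hidden layer and a single linear output neuron can be sketched as follows. The hyperbolic tangent in the hidden layer is an assumption made here to match the tan-sigmoid function reported in the results section. The function name `mlp_forward` and all weight values are hypothetical placeholders, not trained parameters.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single-hidden-layer perceptron: tanh hidden neurons, linear output."""
    # Each hidden neuron applies tanh to the weighted sum of all inputs.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # The output neuron is linear: a weighted sum of hidden outputs plus a bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Placeholder network: 5 inputs, 2 hidden neurons, 1 output.
x = [1.0, 0.0, 0.0, 0.0, 0.0]
w_hidden = [[1.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0, 0.0]]
b_hidden = [0.0, 0.0]
w_out = [2.0, 3.0]
b_out = 1.0

y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

Because the output neuron is linear, the prediction is not confined to the bounded range of the hidden activation, which is why this choice is common for regression.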

The data set in this research was collected using the BoP system described in the case study section. With this equipment, samples from two different days were collected. A total of 774,391 samples were recorded from these tests and, after discarding the bad measurements, the data set was reduced to 774,385 samples. As the model uses previous values, it was also necessary to eliminate the samples that did not have all the input values required by the model.

Although there were 774,379 valid samples, only 1/5 of them were used to train the hybrid model; they were selected randomly to ensure the generalization of the model. Then, only 154,875 samples were used in the modelling phase.

In addition, samples from a different day were used to validate the resulting hybrid model. A total of 4,832 samples, which were not used in the modelling phase, were recorded from two separate tests (1,489 and 3,343 samples, respectively) and used in the validation phase of the research.

The results of this research could be divided into three different parts: the clustering, the modelling, and the validation.

The clusters were created with the k-means algorithm explained above. Nine hybrid systems were created with different numbers of clusters (between 2 and 10), as the optimal number of groups was previously unknown. The algorithm was trained with random initialisation of the centroids, and the training was repeated 20 times to obtain the best division, that is, the one with the furthest centroids. The number of samples used in the modelling phase for each cluster is shown in Table

Number of samples in each created cluster.

Model | Cl-1 | Cl-2 | Cl-3 | Cl-4 | Cl-5 | Cl-6 | Cl-7 | Cl-8 | Cl-9 | Cl-10 |
---|---|---|---|---|---|---|---|---|---|---|
Global | 154,875 | | | | | | | | | |
Hybrid 2 | | | | | | | | | | |
Hybrid 3 | 47,719 | 50,414 | 56,742 | | | | | | | |
Hybrid 4 | 10,699 | 37,213 | 50,414 | 56,549 | | | | | | |
Hybrid 5 | 10,699 | 28,324 | 30,251 | 37,020 | 48,581 | | | | | |
Hybrid 6 | 10,699 | 22,194 | 28,058 | 28,324 | 28,580 | 37,020 | | | | |
Hybrid 7 | 285 | 10,699 | 22,194 | 28,084 | 28,273 | 28,320 | 37,020 | | | |
Hybrid 8 | 285 | 2,627 | 10,475 | 22,194 | 28,084 | 28,273 | 28,320 | 34,617 | | |
Hybrid 9 | 285 | 2,627 | 5,375 | 8,161 | 22,194 | 27,538 | 28,084 | 28,250 | 32,361 | |
Hybrid 10 | 285 | 2,027 | 2,627 | 5,099 | 8,161 | 22,194 | 26,447 | 27,533 | 28,139 | 32,363 |

The ANN regression algorithm is configured with a single hidden layer. The input layer has 5 inputs, one for each variable explained in the model approach section, and the output layer has only 1 output. Several ANN configurations were trained for each cluster, all of them with a tan-sigmoid activation function for the internal neurons (in the hidden layer) and a linear activation function in the output layer. The configurations differed in the number of neurons in the hidden layer, which varied from 1 to 15.

To train each ANN configuration, the Levenberg-Marquardt optimization algorithm was used. Moreover, to finish the training phase, gradient descent was used based on the MSE (mean squared error). The best ANN configurations for each cluster are indicated in Table

Configuration for each individual hybrid model.

Model | Cl-1 | Cl-2 | Cl-3 | Cl-4 | Cl-5 | Cl-6 | Cl-7 | Cl-8 | Cl-9 | Cl-10 |
---|---|---|---|---|---|---|---|---|---|---|
Global | ANN15 | | | | | | | | | |
Hybrid 2 | | | | | | | | | | |
Hybrid 3 | ANN14 | ANN11 | ANN12 | | | | | | | |
Hybrid 4 | ANN14 | ANN15 | ANN14 | ANN11 | | | | | | |
Hybrid 5 | ANN11 | ANN13 | ANN11 | ANN14 | ANN15 | | | | | |
Hybrid 6 | ANN12 | ANN13 | ANN15 | ANN12 | ANN15 | ANN12 | | | | |
Hybrid 7 | ANN11 | ANN14 | ANN12 | ANN12 | ANN11 | ANN12 | ANN13 | | | |
Hybrid 8 | ANN11 | ANN15 | ANN11 | ANN11 | ANN11 | ANN11 | ANN12 | ANN14 | | |
Hybrid 9 | ANN11 | ANN15 | ANN13 | ANN11 | ANN12 | ANN13 | ANN11 | ANN12 | ANN15 | |
Hybrid 10 | ANN11 | ANN13 | ANN11 | ANN13 | ANN11 | ANN12 | ANN12 | ANN15 | ANN11 | ANN15 |

These configurations were selected using the MSE as the performance measure for the created models. In Table

Mean square error for each individual hybrid model.

Model | Cl-1 | Cl-2 | Cl-3 | Cl-4 | Cl-5 | Cl-6 | Cl-7 | Cl-8 | Cl-9 | Cl-10 |
---|---|---|---|---|---|---|---|---|---|---|
Global | 0.0043 | | | | | | | | | |
Hybrid 2 | | | | | | | | | | |
Hybrid 3 | 0.0012 | 0.0046 | 0.0030 | | | | | | | |
Hybrid 4 | 0.0000 | 0.0010 | 0.0069 | 0.0046 | | | | | | |
Hybrid 5 | 0.0000 | 0.0000 | 0.0102 | 0.0032 | 0.0144 | | | | | |
Hybrid 6 | 0.0000 | 0.0146 | 0.0037 | 0.0000 | 0.0085 | 0.0010 | | | | |
Hybrid 7 | 0.0000 | 0.0000 | 0.0147 | 0.0034 | 0.0058 | 0.0000 | 0.0014 | | | |
Hybrid 8 | 0.0000 | 0.0075 | 0.0000 | 0.0184 | 0.0031 | 0.0017 | 0.0000 | 0.0003 | | |
Hybrid 9 | 0.0000 | 0.0075 | 0.0000 | 0.0000 | 0.0171 | 0.0000 | 0.0030 | 0.0082 | 0.0002 | |
Hybrid 10 | 0.0000 | 0.0000 | 0.0075 | 0.0000 | 0.0000 | 0.0041 | 0.0000 | 0.0074 | 0.0019 | 0.0000 |

To calculate the best hybrid configuration for the whole model, as explained, the number of samples was considered. The performance of the different hybrids and the global model is presented in Table

Mean squared error for each model.

Metric | Global | Hybrid 2 | Hybrid 3 | Hybrid 4 | Hybrid 5 | Hybrid 6 | Hybrid 7 | Hybrid 8 | Hybrid 9 | Hybrid 10 |
---|---|---|---|---|---|---|---|---|---|---|
MSE | 0.0043 | 0.0021 | 0.0030 | 0.0042 | 0.0073 | 0.0046 | 0.0041 | 0.0037 | 0.0047 | 0.0024 |

Two validation data sets were used to check the final hybrid model accomplished with 2 clusters and configurations of 15 and 12 internal neurons. The first test is shown in Figure

Validation test 1.

The second validation data set (Figure

Validation test 2.

In Table

Performance values for the validation tests.

Test | MSE | NMSE | MAE | MAPE |
---|---|---|---|---|
Validation test 1 | 0.5327 | 0.0043 | 0.0704 | 1.9414 |
Validation test 2 | 0.4384 | 8.4272e-4 | 0.1889 | 3.7677 |
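For reference, the four performance measures in the table could be computed as in the following sketch. The exact definitions used by the authors are not given in this section, so these are assumptions: NMSE is taken as the MSE divided by the variance of the measured signal, and MAPE is expressed as a percentage. The function name `regression_metrics` and the sample values are invented for illustration.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, NMSE, MAE and MAPE for measured and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    variance = sum((t - mean_true) ** 2 for t in y_true) / n
    # Assumed normalisation: variance of the measured signal.
    nmse = mse / variance
    mae = sum(abs(e) for e in errors) / n
    # Percentage error relative to each measured value (undefined if any is 0).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MSE": mse, "NMSE": nmse, "MAE": mae, "MAPE": mape}

# Invented values for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(measured, predicted)
```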

Percentage absolute error in the worst part of the validation test 2.

A model of a hydrogen fuel cell has been developed in this work. The model predicts the current in the fuel cell at different working points, and it could be used in several ways, such as control or fault detection. As an example, in the fault detection field, the model output should be similar to the real measurement of a current sensor; if the measured value deviates from the modelled one, this could indicate a sensor failure or a system malfunction.

Since the fuel cell is a nonlinear system, a hybrid model is selected instead of a global one. In this paper, ANNs are used as the regression algorithm due to their accuracy. Furthermore, with the hybrid model, the performance of the ANNs is improved, halving the MSE obtained with a global model.

Very good results are obtained in terms of error in the predicted current, considering that the MSE value is 0.0021 for the hybrid model with 2 clusters. One cluster used an ANN with 15 internal neurons and the other an ANN with 12 neurons. To validate the model, two different data sets were used and, although the maximum MAE was 0.1889, the maximum NMSE was only 0.0043.

As future work, the possibility of predicting future values of the current will be examined. This prediction would increase the fuel cell performance, since the cell could be adapted faster to new working points.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that they have no conflicts of interest.

This work has been funded by the Spanish Ministry of Economy Industry and Competitiveness through the H2SMART-