The Assessment of Hydrogen Energy Systems for Fuel Cell Vehicles Using Principal Component Analysis and Cluster Analysis

Hydrogen energy, which has been recognized as an alternative to fossil fuels, has developed rapidly in fuel cell vehicles. Different hydrogen energy systems perform differently on environmental, economic, and energy aspects, so a methodology for the quantitative evaluation and analysis of these systems helps decision makers select the best scenario. In this paper, principal component analysis (PCA) is used to evaluate the integrated performance of different hydrogen energy systems and select the best scenario, and hierarchical cluster analysis (CA) is used to verify the principal components (PCs) determined by PCA. A case including 11 different hydrogen energy systems for fuel cell vehicles is studied, and the system using steam reforming of natural gas for hydrogen production, pipeline for transportation of hydrogen, hydrogen gas tank for the storage of hydrogen at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles is recognized as the best scenario. The clustering results calculated by CA are consistent with those determined by PCA, indicating that the results calculated by PCA are sound.


Introduction
Due to air pollution, energy shortage, and climate change, the exploration of cleaner alternative transportation fuels is of vital importance [1]. Hydrogen, which has been recognized as a clean energy carrier, has the ability to overcome future energy and environmental problems, and many projects on hydrogen fuel cell vehicles have been launched in different regions [2,3].
Hydrogen fuel cell vehicles have the potential to be the most energy-efficient vehicles and to reduce polluting and other harmful emissions on a well-to-wheel basis [4][5][6]. The hydrogen energy systems for fuel cell vehicles comprise four subsystems over the whole life cycle, namely, the hydrogen production subsystem, the transportation subsystem, the hydrogen refueling station subsystem, and the final utilization subsystem [7]. A variety of technologies can be used in every subsystem. For instance, hydrogen can be produced from various materials such as water, methane, and coal, and hydrogen storage technologies include cryo-compressed hydrogen, high-pressure cold gas storage, and metal-organic hydrides. The use of hydrogen in fuel cell vehicles can lead to environmentally benign transportation, but some emissions are associated with the different technologies for hydrogen production [8], and the effects of these technologies on the environment differ. Similarly, different types of transportation, storage, and utilization of hydrogen in the vehicles have different effects on environmental, economic, and energy aspects. Therefore, combining different technologies in the hydrogen production, transportation, hydrogen refueling station, and final utilization subsystems generates different hydrogen energy systems for fuel cell vehicles, whose environmental, economic, and energy performances also differ.
In searching for the best hydrogen energy system for fuel cell vehicles, we face an important question: how can the best system be determined among the alternatives according to their performances?
A multicriteria methodology searches for the best option among a pool of hydrogen energy options through a set of decision criteria covering environmental, economic, and energy aspects, and can therefore assess the integrated performance of different energy approaches. However, with many different indicators it is difficult to compare the energy options directly: some indicators of energy option A may be superior to those of energy option B, while other indicators of B may be superior to those of A, and in that case it is very difficult to judge which of A and B is better. Determining the best hydrogen energy system for fuel cell vehicles is thus a multidimensional problem, and the key point is to aggregate the multiple criteria of each system into a final score, so that the alternatives can be ranked according to their scores.
Principal component analysis (PCA) is a multivariate technique in which the number of variables is reduced to a smaller number of factors that describe the principal variability or joint behavior of the data set [9,10]. PCA, like discriminant analysis or factor analysis, is a statistical method which can be used for performance evaluation [11].
In this paper, PCA is used as a mathematical tool to evaluate and analyze the integrated performance of 11 hydrogen energy systems for fuel cell vehicles, and cluster analysis is used to verify the principal components determined by PCA.

Theory of Principal Component Analysis (PCA).
PCA is a mathematical tool which performs a reduction in data dimensionality and allows the visualization of underlying structure in experimental data and relationships between data and samples [12]. Each principal component (PC) is a linear combination of the original variables, and one measure of the amount of information conveyed by each PC is its variance [13]. The selected principal components can be used to represent the characteristics of the samples. The procedure of principal component analysis evaluation is described as follows [14,15].
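Step 1. Collect the data about the characteristics (criteria) of the samples, and let the original decision-making matrix be

$$X = (x_{ij})_{m \times n} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}, \tag{1}$$

where $m$ is the number of samples, $n$ is the number of characteristics, and $x_{ij}$ represents the value of the $j$(th) characteristic of the $i$(th) sample.

Step 2. Transform all the criteria in the original decision-making matrix to benefit type. The transformation is carried out according to the type of the characteristic (criterion):

$$y_{ij} = \begin{cases} \dfrac{x_{ij} - \min_{i=1,2,\ldots,m} x_{ij}}{\max_{i=1,2,\ldots,m} x_{ij} - \min_{i=1,2,\ldots,m} x_{ij}}, & \text{the } j\text{(th) characteristic is the benefit type}, \\[2ex] \dfrac{\max_{i=1,2,\ldots,m} x_{ij} - x_{ij}}{\max_{i=1,2,\ldots,m} x_{ij} - \min_{i=1,2,\ldots,m} x_{ij}}, & \text{the } j\text{(th) characteristic is the cost type}, \end{cases} \tag{2}$$

where benefit criteria are the-larger-the-better type, and cost criteria are the-smaller-the-better type.

Step 3. Standard transformation. Standardize each transformed criterion to zero mean and unit standard deviation:

$$y_{sij} = \frac{y_{ij} - \bar{y}_j}{s_j}, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n, \tag{3}$$

where $\bar{y}_j$ and $s_j$ are the mean and the standard deviation of the $j$(th) column of $(y_{ij})_{m \times n}$, and $y_{sj}$ denotes the $j$(th) standardized column (sequence).

Step 4. Calculate the correlation coefficient matrix

$$R = (r_{ij})_{n \times n}, \tag{4}$$

whose elements are calculated by

$$r_{ij} = \frac{\mathrm{Cov}(y_{si}, y_{sj})}{\sigma(y_{si}) \times \sigma(y_{sj})}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, n, \tag{5}$$

where $\mathrm{Cov}(y_{si}, y_{sj})$ is the covariance of sequences $y_{si}$ and $y_{sj}$, and $\sigma(y_{si})$ and $\sigma(y_{sj})$ are the standard deviations of sequences $y_{si}$ and $y_{sj}$, respectively.

Step 5. Solve for the eigenvalues and eigenvectors, then calculate the contribution rate $H$ and the cumulative contribution rate $TH$. The eigenvalues can be determined by (6), and the contribution rate and cumulative contribution rate can be calculated by (7) and (8), respectively:

$$|\lambda I_n - R| = 0, \qquad (\lambda_k I_n - R) V_k = 0, \tag{6}$$

$$H_k = \frac{\lambda_k}{\sum_{j=1}^{n} \lambda_j}, \tag{7}$$

$$TH_k = \sum_{j=1}^{k} H_j, \tag{8}$$

where $\lambda_k$ represents the $k$(th) eigenvalue (sorted in descending order), $I_n$ represents the unit matrix of order $n$, and $V_k$ is the corresponding eigenvector of $\lambda_k$.

Step 6. Express the principal components. Select the first $t$ principal components such that the cumulative contribution rate is greater than 85%; they can then be recognized as significant, meaning that the original $n$ characteristics can be expressed by the new $t$ principal components. The first $t$ principal components can be determined by (9):

$$\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_t \end{pmatrix} = \begin{pmatrix} V_1^{T} \\ V_2^{T} \\ \vdots \\ V_t^{T} \end{pmatrix} \begin{pmatrix} y_{s1} \\ y_{s2} \\ \vdots \\ y_{sn} \end{pmatrix}, \tag{9}$$

where $P_s$ represents the $s$(th) principal component, and $V_k$ is the $k$(th) eigenvector, which has $n$ elements, as shown in (10):

$$V_k = (v_{k1}, v_{k2}, \ldots, v_{kn})^{T}. \tag{10}$$

Step 7. Calculate the weights of the principal components:

$$\omega_k = \frac{H_k}{\sum_{j=1}^{t} H_j}, \quad k = 1, 2, \ldots, t, \tag{11}$$

where $\omega_k$ is the weight of the $k$(th) principal component, and $H_k$ represents the contribution rate of the $k$(th) eigenvalue.

Step 8. Determine the evaluation function of each sample:

$$F_i = \sum_{k=1}^{t} \omega_k P_{ik}, \quad i = 1, 2, \ldots, m, \tag{12}$$

where $P_{ik}$ is the value of the $k$(th) principal component for the $i$(th) sample.

Step 9. Rank the samples according to the rule that the larger the score of the evaluation function, the better the sample.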
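The nine steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data matrix `x` and the `benefit` flags are hypothetical, the standard transformation of Step 3 is assumed to be the usual z-score, and a small Jacobi routine (`jacobi_eigh`, a helper introduced here) stands in for any symmetric eigen-solver. Because the sign of an eigenvector is arbitrary and the procedure does not state a convention, the sketch also assumes that each eigenvector is oriented so that its elements sum to a positive value.

```python
import math

def jacobi_eigh(sym, sweeps=100, tol=1e-22):
    """Eigenvalues (descending) and eigenvectors of a symmetric matrix (cyclic Jacobi)."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = 0.5 * math.atan(2 * a[p][q] / diff) if diff else math.pi / 4
                c, s = math.cos(theta), math.sin(theta)
                for mat in (a, v):  # rotate columns p and q of A and of V
                    for k in range(n):
                        mp, mq = mat[k][p], mat[k][q]
                        mat[k][p], mat[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # rotate rows p and q of A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda k: a[k][k], reverse=True)
    return [a[k][k] for k in order], [[v[i][k] for i in range(n)] for k in order]

def pca_evaluate(x, benefit, threshold=0.85):
    m, n = len(x), len(x[0])
    z = []  # standardized criteria, stored column-wise
    for j, col in enumerate(zip(*x)):
        lo, hi = min(col), max(col)
        # Step 2: min-max transform so that larger is always better
        y = [(val - lo) / (hi - lo) if benefit[j] else (hi - val) / (hi - lo) for val in col]
        # Step 3: z-score (assumed form of the standard transformation)
        mean = sum(y) / m
        sd = math.sqrt(sum((val - mean) ** 2 for val in y) / (m - 1))
        z.append([(val - mean) / sd for val in y])
    # Step 4: correlation matrix (covariance of the standardized columns)
    r = [[sum(p * q for p, q in zip(zi, zj)) / (m - 1) for zj in z] for zi in z]
    # Step 5: eigenvalues, eigenvectors, contribution rates
    lams, vecs = jacobi_eigh(r)
    vecs = [[-e for e in vec] if sum(vec) < 0 else vec for vec in vecs]  # assumed sign convention
    h = [lam / sum(lams) for lam in lams]
    # Step 6: smallest t whose cumulative contribution rate exceeds the threshold
    t, cum = 0, 0.0
    while cum <= threshold:
        cum += h[t]
        t += 1
    pcs = [[sum(vecs[k][j] * z[j][i] for j in range(n)) for k in range(t)] for i in range(m)]
    # Step 7: weights of the retained components
    w = [hk / cum for hk in h[:t]]
    # Step 8: evaluation function of each sample
    f = [sum(wk * p for wk, p in zip(w, row)) for row in pcs]
    # Step 9: indices of the samples from best to worst
    ranking = sorted(range(m), key=f.__getitem__, reverse=True)
    return h, t, w, f, ranking

# Hypothetical data: 5 alternatives x 3 criteria (emissions, cost, efficiency).
x = [
    [120.0, 3.2, 0.55],
    [95.0, 2.8, 0.60],
    [150.0, 4.1, 0.48],
    [110.0, 3.9, 0.52],
    [200.0, 5.0, 0.40],
]
benefit = [False, False, True]  # emissions and cost are cost-type criteria
h, t, w, f, ranking = pca_evaluate(x, benefit)
print(ranking)
```

In this toy data set the second alternative is best on every criterion and the fifth is worst on every criterion, so they should appear first and last in `ranking`. A constant criterion column would cause a division by zero in Step 2 and would need to be dropped beforehand.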

Cluster Analysis (CA).
To verify the principal components determined by principal component analysis, cluster analysis of the original variables (criteria) is used to judge the accuracy of the PCA results.
The hierarchical agglomerative clustering algorithm is used in this paper. Its main idea is as follows: given m observations, the algorithm starts with m clusters, each containing one observation; the Euclidean distances between observations are calculated, the closest points are merged into a single cluster, and the process is repeated until all the observations are included in one cluster [16]. Lu et al. summarized the procedure of the clustering method [17] as follows.
Step 1. Determine the distance between all the observations.
Step 2. Link the two observations with the lowest distance to form a new cluster.
Step 3. Compare the observations in the newly formed cluster with the remaining observations.
Step 4. Repeat Step 3 until all observations belong to one cluster.
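The agglomeration procedure can be illustrated with a short sketch in plain Python. It is only an illustration under assumptions: the points are hypothetical, the function names are introduced here, and the distance between two clusters is taken as the furthest-neighbor (complete linkage) Euclidean distance, which is the combination the case study reports using.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(points, c1, c2):
    """Furthest-neighbor distance between two clusters of point indices."""
    return max(euclidean(points[i], points[j]) for i in c1 for j in c2)

def agglomerate(points, n_clusters=1):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per observation
    merges = []                                   # history of (cluster, cluster, distance)
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: complete_linkage(points, clusters[ab[0]], clusters[ab[1]]),
        )
        dist = complete_linkage(points, clusters[a], clusters[b])
        merges.append((clusters[a], clusters[b], dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, merges

# Hypothetical observations in two dimensions.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (12.0, 0.0)]
groups, _ = agglomerate(points, n_clusters=3)
_, merges = agglomerate(points)  # run to a single cluster to obtain the full merge history
print(sorted(sorted(g) for g in groups))
```

The merge history plays the role of a dendrogram: cutting it at a chosen distance gives the clusters at that level. To cluster variables instead of observations, as in the case study, each point would be one column of the standardized data matrix.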

Figure 1: Framework of multicriteria decision making on hydrogen energy systems.

Framework of Multicriteria Decision Making on Hydrogen Energy Systems.
The framework of multicriteria decision making on hydrogen energy systems is shown in Figure 1. The framework comprises six steps as follows.
Step 2. Determine the criteria for the assessment of the hydrogen energy systems (C 1 , C 2 , . . ., C n ).
Step 3. Use principal component analysis and cluster analysis to analyze the hydrogen energy systems, determining the PCs, the evaluation functions, and the clusters.
Step 4. Compare the results determined by principal component analysis and cluster analysis. If they are consistent, go to Step 5; otherwise, return to Step 3 and redetermine the PCs.
Step 5. Rank the alternatives from the best to the worst.

Case Study
In this section, principal component analysis is applied to evaluate the performance of the 11 scenarios of hydrogen energy systems for fuel cell vehicles provided by Feng et al. [18]. The hydrogen energy systems for fuel cell vehicles comprise four subsystems: the hydrogen production subsystem, the transportation subsystem, the hydrogen refueling station subsystem, and the final utilization for fuel cell vehicles subsystem. Multiple technologies are available for each subsystem, and different scenarios of hydrogen energy system can be obtained by combining those technologies. Eleven options of hydrogen energy system are discussed, and the systems are briefly described as follows [7,18]:
Option 9. Water electrolysis with industrial electricity at refueling stations for hydrogen production, no transportation, hydrogen gas tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 10. Water electrolysis with valley electricity at refueling stations for hydrogen production, no transportation, hydrogen gas tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 11. Methanol synthesis via natural gas (methanol factory), methanol tank by trucks for the transportation, methanol tank for the storage of methanol, and fuel cell vehicles in Peking consume methanol (methanol reforming onboard).
The environmental, economic, and energy performances of the hydrogen energy systems are shown in Table 1. The data were obtained by evaluating the 11 scenarios over the life cycle of fuel cell vehicles, that is, from the exploitation of raw materials to the utilization of hydrogen in fuel cell vehicles [7,18]. With the methodology shown in (2), all the criteria have been transformed into benefit type, as shown in Table 2. After the standard transformation, the standard matrix is obtained, as shown in Table 3, and the correlation coefficient matrix is then calculated by (4) and (5), as shown in Table 4.
The main results of the PCA of the 11 scenarios are shown in Table 5. The variance of each principal component represents the amount of information it conveys, and the cumulative proportion of total variance indicates how much information is retained by selecting a specified number of components [13]. The first three principal components account for 62.472%, 19.040%, and 12.453% of total variance, respectively. Together they convey a large amount of information and account for 93.966% of the variability in the original data, while the other seven components account for only about 6.034%. Therefore, the first three principal components can be used to express the 11 scenarios of hydrogen energy systems for fuel cell vehicles.
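The selection of the number of components can be checked with a few lines of Python. The sketch below uses only the three percentages reported above and the 85% threshold from Step 6; the function name is introduced here for illustration.

```python
def components_needed(rates, threshold=85.0):
    """Return (t, cumulative rate) for the smallest t whose cumulative rate exceeds threshold."""
    total = 0.0
    for t, rate in enumerate(rates, start=1):
        total += rate
        if total > threshold:
            return t, total
    raise ValueError("threshold not reached with the given components")

# Contribution rates (%) of the first three principal components, as reported in the text.
rates = [62.472, 19.040, 12.453]
t, cumulative = components_needed(rates)
print(t, round(cumulative, 3))
```

The first two components together reach only 81.512%, below the threshold, so three components are needed. The rounded percentages sum to 93.965%; the small difference from the reported 93.966% is presumably due to rounding of the individual values.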
From the eigenvectors acquired in the principal component analysis, and according to (9), the principal components can be expressed by the original variables as shown in (13). In the linear combination of the original variables for PC 1 , the coefficients of SO 2 , CO 2 , NO x , waste water, waste solids, and costs are higher than those of the other original variables, meaning that the first principal component is a measure of the performance on these six criteria. In the linear combination for PC 2 , the coefficients of CO, CH 4 , and dust are higher than those of the other original variables. Therefore, PC 2 represents the performance on those three variables (CO, CH 4 , and dust), which have a significant effect on the second principal component, whereas the other variables have little effect on it. In the formula of PC 3 , the coefficient of energy is very large relative to those of the other variables, which are very small, so PC 3 is a measure of the performance of the energy aspect of the systems.
To check the accuracy of the principal components determined by principal component analysis, the hierarchical agglomeration algorithm is used to group the ten original variables. The data after standard transformation, shown in Table 3, are used as the inputs, and the furthest neighbor method and the Euclidean distance are adopted as the cluster method and the distance metric in the cluster analysis, respectively.
The result of the cluster analysis of the original variables is shown in Figure 2. According to the Euclidean distance, the ten variables are first grouped into three clusters, that is, A (SO 2 , CO 2 , NO x , waste water, waste solids, and costs), B (energy), and C (CO, dust, and CH 4 ). The results are consistent with the results obtained from principal component analysis, where PC 1 corresponds to cluster A, PC 2 corresponds to cluster C, and PC 3 corresponds to cluster B.
Meanwhile, each cluster can be further divided into subclusters. For instance, cluster A can be divided into subclusters A1 (SO 2 , CO 2 , NO x , waste water, and waste solids) and A2 (costs). This is also consistent with the results acquired by PCA, because the relative effects of the two groups of variables on PC 1 differ: the coefficients of SO 2 , CO 2 , NO x , waste water, and waste solids are very similar in (13), while the coefficient of costs is relatively small. Similarly, in principal component analysis, the variables CO and dust and the variable CH 4 have different effects on PC 2 , which is why cluster C is divided into C1 and C2. Consequently, the results determined by principal component analysis are consistent with those determined by cluster analysis.
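The consistency check between the two methods can be expressed as a comparison of variable groupings. The sketch below uses the groupings stated in the text; treating "consistent" as exact set equality between the dominant variables of a PC and the members of a cluster is an assumption made here for illustration, and the function name is hypothetical.

```python
# Variables with dominant coefficients in each principal component (from the text).
pc_groups = {
    "PC1": {"SO2", "CO2", "NOx", "waste water", "waste solids", "costs"},
    "PC2": {"CO", "CH4", "dust"},
    "PC3": {"energy"},
}
# Clusters of the original variables obtained by cluster analysis (from the text).
clusters = {
    "A": {"SO2", "CO2", "NOx", "waste water", "waste solids", "costs"},
    "B": {"energy"},
    "C": {"CO", "dust", "CH4"},
}

def match_groups(pc_groups, clusters):
    """Map each PC to the cluster containing exactly the same variables (None if no match)."""
    return {
        pc: next((name for name, members in clusters.items() if members == variables), None)
        for pc, variables in pc_groups.items()
    }

matches = match_groups(pc_groups, clusters)
consistent = all(name is not None for name in matches.values())
print(matches, consistent)
```

With these groupings, PC1 maps to cluster A, PC2 to cluster C, and PC3 to cluster B, which is the correspondence described above. In the framework, an unmatched PC would send the analysis back to Step 3.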
Then, the values of the principal components for the 11 hydrogen energy systems for fuel cell vehicles are calculated, as shown in Table 6. The performance of the 11 systems can thus be expressed by the three principal components instead of the ten original variables.
The weights of the three principal components have been shown in Table 3, and the evaluation function for the systems can then be determined, as shown in (14):

$$F_i = \sum_{t=1}^{3} \omega_t P_{it}, \quad i = 1, 2, \ldots, 11, \tag{14}$$

where $F_i$ is the evaluation function of the $i$(th) option, $P_{it}$ is the $t$(th) principal component of the $i$(th) option, and $\omega_t$ is the weight of the $t$(th) principal component. With (14), the final scores can be calculated, and with the rule that the larger the score, the better the system, the scenarios can be ranked, as shown in Table 7.
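A small numerical sketch may help to see how the evaluation function turns principal component values into a ranking. The contribution rates are those reported in the text, and the weights are assumed to be the contribution rates normalized over the three retained components. The principal component values and option names below are hypothetical and are not the values of Table 6.

```python
# Contribution rates (%) of the three retained components, as reported in the text.
contribution = [62.472, 19.040, 12.453]
# Assumed weight definition: contribution rate normalized over the retained components.
weights = [h / sum(contribution) for h in contribution]

# Hypothetical principal component values for three illustrative options (NOT Table 6).
pc_values = {
    "Option X": [1.2, -0.3, 0.5],
    "Option Y": [-0.4, 0.9, 0.1],
    "Option Z": [0.3, 0.2, -1.0],
}

# Evaluation function: weighted sum of the principal component values of each option.
scores = {
    name: sum(w * p for w, p in zip(weights, values))
    for name, values in pc_values.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)  # the larger the score, the better
print([round(w, 4) for w in weights], ranking)
```

Because the first component carries about two thirds of the total weight under this assumption, an option with a high value on PC 1 tends to rank well even when its other component values are modest.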
The sequence of the scenarios from the best to the worst is Option 2, Option 1, Option 5, Option 6, Option 3, Option 11, Option 4, Option 7, Option 8, Option 10, and Option 9. The best scenario determined by PCA is Option 2, followed by Option 1 and Option 5. Consequently, the hydrogen energy system for fuel cell vehicles with steam reforming of natural gas for hydrogen production, pipeline for the transportation, hydrogen gas tank for the storage at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles is recognized as the best scenario.

Conclusion
Growing concern about the negative environmental, economic, and energy effects of traditional vehicles has prompted decision makers to pay significant attention to clean and environment-friendly ones, and hydrogen energy systems for fuel cell vehicles are promising and attractive technologies for the future. However, the various methods of production, storage, transportation, and utilization lead to different impacts on economic, environmental, and energy aspects, and it is difficult for stakeholders to determine the best one directly from the various hydrogen energy systems. Therefore, developing a multicriteria decision making methodology for hydrogen energy systems is meaningful and valuable for the selection of the best system. A hybrid multicriteria decision making methodology integrating principal component analysis and cluster analysis has been proposed in this paper to assess hydrogen energy systems. Principal component analysis is used to determine the principal components and the evaluation functions of the systems, and cluster analysis is used to verify the principal components determined by principal component analysis. When the results determined by principal component analysis and cluster analysis are consistent, the alternatives can be ranked from the best to the worst according to the value of the evaluation function of each system.
Eleven hydrogen energy systems for fuel cell vehicles have been assessed and analyzed by the proposed method, and the systems have been ranked. The system using steam reforming of natural gas for hydrogen production, pipeline for transportation, hydrogen gas tank for the storage, and gaseous hydrogen for consumption in fuel cell vehicles has been recognized as the best scenario. The clusters determined by CA are consistent with the principal components determined by PCA, indicating that the PCs used for the evaluation of the hydrogen energy systems are sound. The proposed methodology can also be extended to other applications.

Option 1. Steam reforming of natural gas (central factory) for hydrogen production, hydrogen gas cylinder by trucks for transportation, hydrogen cylinder for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 2. Steam reforming of natural gas (central factory) for hydrogen production, pipeline for transportation, hydrogen gas tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 3. Steam reforming of natural gas (central factory) for hydrogen production, liquid hydrogen tank by trucks for transportation, liquid hydrogen tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume liquid hydrogen;
Option 4. Steam reforming of natural gas (central factory) for hydrogen production, Mg 2 Ni hydride cylinder by trucks for transportation, hydride cylinder for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume hydride;
Option 5. Coal gasification (central factory) for hydrogen production, hydrogen gas cylinder by trucks for transportation, hydrogen cylinder for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 6. Coal gasification (central factory) for hydrogen production, pipeline for transportation, hydrogen gas tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume gaseous hydrogen;
Option 7. Coal gasification (central factory) for hydrogen production, liquid hydrogen tank by trucks for transportation, liquid hydrogen tank for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume liquid hydrogen;
Option 8. Coal gasification (central factory) for hydrogen production, Mg 2 Ni hydride cylinder by trucks for transportation, hydride cylinder for the storage of hydrogen at refueling stations, and fuel cell vehicles in Peking consume hydride;

Figure 2: The result of cluster analysis of the original variables.
represents Yuan which is the basic monetary unit of China, 1 Yuan = 0.16 $.

Table 2: The processed data of the 11 scenarios.

Table 3: The standard transformed data of the 11 scenarios.

Table 4: The correlation coefficient matrix.

Table 5: Main results of PCA of the 11 scenarios for the criteria.

Table 6: The values of the principal components for the 11 hydrogen energy systems.

Table 7: The final score and the sequence of the 11 hydrogen energy systems.