Optimal Sizing for Wind / PV / Battery System Using Fuzzy c-Means Clustering with Self-Adapted Cluster Number

Integrating wind generation, photovoltaic power, and battery storage to form hybrid power systems has been recognized as promising for renewable energy development. However, given the system complexity and the uncertainty of renewable resources such as wind and solar, it is difficult to obtain practical solutions for these systems. In this paper, optimal sizing for a wind/PV/battery system is realized by trade-offs between technical and economic factors. Firstly, the fuzzy c-means clustering algorithm is modified with self-adapted parameters to extract useful information from historical data. Furthermore, the Markov model is combined to determine the chronological system states of natural resources and load. Finally, a power balance strategy is introduced to guide the optimization process with the genetic algorithm to establish the optimal configuration with minimized cost while guaranteeing reliability and environmental factors. A case of an island hybrid power system is analyzed, and the simulation results are compared with the general FCM method and the chronological method to validate the effectiveness of the proposed method.


Introduction
Hybrid power systems (HPS) [1,2], especially those dependent on renewable energy generations (REGs), such as solar photovoltaic (PV) together with wind turbine generations (WTGs), have been regarded as the most promising configurations for power supply in remote areas, since it is neither economical nor practical to deliver power over long distances. Although these clean energies provide significant contributions and opportunities, the unpredictable nature [3] of these resources has posed serious challenges to power systems [4,5]. In the context of remote HPS, the greatest obstacle is to maintain power balance, because the adjustable capacity depends only on REGs and batteries [6]. Hence, the dynamic characteristics of the wind speed and solar irradiance, together with the power management of batteries, should be investigated to obtain practical configurations for HPS.
In previous literature, various methods have been introduced for sizing optimization of HPS. The stochastic nature of the REGs has been investigated with several probabilistic and chronological methods. The autoregressive moving average (ARMA) model is utilized to model the uncertainties of wind generation, photovoltaic (PV) power, and load in [3,7,8]. However, parameter estimation for ARMA is somewhat cumbersome. Reference [9] put forward an efficient approach for sizing optimization in a stand-alone HPS with the Hybrid Big Bang-Big Crunch algorithm. Reference [10] analyzed the different results obtained by four heuristic algorithms; nevertheless, the uncertainties of renewable energies were not considered in detail. Reference [11] suggested a method for technical and economic optimization in an isolated PV system, in which the solar radiation classification and power supply reliability calculation are performed hourly. However, only the cluster corresponding to the minimum solar radiation is selected, which may not be suitable in the context of HPS. Other investigations are based on gross chronological data [12,13], for which the computation time is excessive. In [14], the traditional fuzzy c-means (FCM) is adopted, which divides the data of wind speed, solar radiation, and load evenly. Thus the inherent characteristics of the data are handled in a somewhat arbitrary way.

International Journal of Rotating Machinery
The proposed methodology is complementary to the previous studies and takes a step further. First of all, time series analysis [15] is used to describe the characteristics of hourly wind, solar, and load data with FCM, the function of which is to group the elements of data sets that have analogous characteristics. Considering that FCM is sensitive to the initial number of clusters [16], a parameter self-adaptive method is introduced to optimize the initial state. Furthermore, the Markov model [17] is combined to obtain the system scenarios of HPS. Then the correlation and time dependency of data sets are maintained with the time-dependent clusters of the renewable generations and load power consumption. The optimal sizes for WTGs, PV, DG, and batteries are determined with the genetic algorithm (GA), in which a power balance strategy is designed to ensure that the capitalized and operational costs are minimized while the reliability requirements, CO$_2$ emission limits, and battery constraints are satisfied at the same time.
The remainder of the article is organized as follows. The models of the components in HPS and the technique of FCM with self-adapted clustering number are introduced in Section 2. Section 3 presents the objective function and constraints for the optimal sizing method in HPS. Section 4 uses a case of a stand-alone hybrid system located in Hainan, China, to verify the advantage of the proposed methodology, where the self-adapted FCM model is compared with the traditional model and with chronological data. In Section 5, conclusions are summarized and the relationship between reliability and cost is discussed.

Models of the Components in HPS
2.1. The Components in HPS

2.1.1. WTGs Generation System.
The output power of each WTG [10,18] is obtained by (1), and the power curve of the IEC 61400-12 standard is displayed in Figure 1.
The parameters $a$ and $b$ are calculated by (2), where $v_r$ denotes the rated wind speed, $v_{ci}$ and $v_{co}$ are, respectively, the cut-in wind speed and cut-off wind speed, and $P_r$ means the rated power of the WTGs. The overall wind power can be derived with (3), and the total wind energy can be obtained with (4), where $N_{WT}$ is the number of wind turbines and $\Delta t$ is the time step.
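As an illustration of the WTG model, the sketch below assumes the cubic power curve that is common in sizing studies, $P = a v^3 - b P_r$ between cut-in and rated speed with $a = P_r/(v_r^3 - v_{ci}^3)$ and $b = v_{ci}^3/(v_r^3 - v_{ci}^3)$. This form is an assumption, since equations (1)-(4) are not reproduced here, and the function names and turbine parameters are hypothetical.

```python
def wtg_power(v, p_rated, v_ci, v_r, v_co):
    """Output power of one turbine at wind speed v (assumed cubic power curve)."""
    if v < v_ci or v > v_co:
        return 0.0                      # below cut-in or above cut-off: no output
    if v >= v_r:
        return p_rated                  # between rated and cut-off speed: rated power
    a = p_rated / (v_r**3 - v_ci**3)    # assumed form of parameter a
    b = v_ci**3 / (v_r**3 - v_ci**3)    # assumed form of parameter b
    return a * v**3 - b * p_rated


def wind_energy(speeds, n_wt, dt_hours, **turbine):
    """Total energy of n_wt identical turbines over a series of wind speeds."""
    return sum(n_wt * wtg_power(v, **turbine) * dt_hours for v in speeds)
```

With this form the curve is continuous: the output is zero at the cut-in speed and equals the rated power at the rated speed.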

PV System.
For each PV panel, the output power can be obtained [10] with (5), where $P_{PV}(t)$ denotes the PV module power at time $t$; the remaining quantities are the rated power of the panel, the solar irradiance $G$, and the reference solar irradiance $G_{Ref}$, 1000 W/m$^2$. $T_{Ref}$ means the reference temperature on the surface of the panels, which can be set to 25 °C. The temperature coefficient of the PV panels can be set to $-3.7 \times 10^{-3}$. The temperature of each cell is deduced with (6), where $T_{air}$ is the atmospheric temperature, $G$ means the solar irradiance, and $T_{stc}$ means the standard operation cell temperature, set to 25 °C. The overall PV power can be obtained with (7), and the total PV energy can be calculated with (8), where the number of PV panels is given by $N_{PV}$.
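A minimal sketch of the PV model follows, assuming the widely used form in which the rated power is scaled by the irradiance ratio and corrected linearly for cell temperature. The exact expressions (5)-(8) are not reproduced here, so this form, the function names, and the panel rating in the usage note are assumptions; the cell temperature is taken as an input rather than computed.

```python
def pv_power(g, t_cell, p_rated, g_ref=1000.0, t_ref=25.0, k_t=-3.7e-3):
    """Power of one PV panel (assumed form).

    g       solar irradiance in W/m^2
    t_cell  cell temperature in deg C
    p_rated rated panel power (any power unit)
    k_t     temperature coefficient, -3.7e-3 as stated in the text
    """
    return p_rated * (g / g_ref) * (1.0 + k_t * (t_cell - t_ref))


def pv_energy(samples, n_pv, dt_hours, p_rated):
    """Total energy of n_pv panels; samples is a list of (irradiance, cell temperature)."""
    return sum(n_pv * pv_power(g, t, p_rated) * dt_hours for g, t in samples)
```

For example, a hypothetical 0.25 kW panel at 1000 W/m$^2$ and 25 °C delivers its rated power, and each degree above 25 °C lowers the output by 0.37%.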

Battery Bank and the Power Balance Strategy.
To accommodate the stochastic behavior of PV and wind resources, battery banks are widely utilized in hybrid power systems. The power balance strategy is mainly based on the flexibility of batteries and diesel generation. In [14], the diesel generator is the only adjustable power source, and storage devices are not considered. Thus the power balance strategy is limited to only one pattern, namely, the diesel generators run to make up for the power shortage of renewable energies. To maximize the utilization of REGs and minimize diesel generation, a power balance strategy is illustrated in Figure 2.
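Since Figure 2 is not reproduced in the text, the following is only one plausible reading of such a strategy, given as a sketch: renewable surplus charges the battery, a deficit is covered first by the battery and then by the diesel generator, and whatever remains is counted as unmet load. The function name, the argument names, and the lossless battery are assumptions.

```python
def dispatch_hour(p_ren, p_load, soc, soc_min, soc_max, p_dg_max, dt=1.0):
    """One time step of a simple priority dispatch (assumed strategy).

    Returns (new_soc, p_diesel, p_unmet, p_curtailed).
    Powers are in kW, stored energy (soc) in kWh, dt in hours.
    """
    net = p_ren - p_load
    if net >= 0:
        # Surplus: charge the battery up to its maximum level, curtail the rest.
        charge = min(net, (soc_max - soc) / dt)
        return soc + charge * dt, 0.0, 0.0, net - charge
    deficit = -net
    # Deficit: discharge the battery down to its minimum level first.
    discharge = min(deficit, (soc - soc_min) / dt)
    deficit -= discharge
    # Then run the diesel generator; anything left is unserved load.
    p_dg = min(deficit, p_dg_max)
    return soc - discharge * dt, p_dg, deficit - p_dg, 0.0
```

Placing the diesel generator last in the priority order reflects the stated aim of maximizing REG utilization and minimizing diesel generation.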

FCM with Self-Adaptive Cluster Number.
In this section, FCM clustering is modified to identify the operation states of HPS. The computational complexity can be significantly reduced, considering that the number of states will be much smaller than the 8760 h used in the chronological methods. The traditional FCM clustering algorithm can only deal with a prescribed data set whose clustering number is given in advance, which is not flexible in the context of large data sets. A new validity function [19] is introduced to construct the ratio of compactness and divergence; thus the cluster number can be obtained according to the given data set.

FCM Clustering for HPS. The FCM clustering algorithm established by Dunn and then further improved by Bezdek has been used extensively.
The given data set is divided into $c$ clusters according to given criteria so as to optimize an objective function. An effective partition of a given data set should be divergent between clusters and compact within them. The degrees of compactness and divergence are evaluated with a clustering validity function; hence, a new validity function is adopted, in which $m$ is the fuzzy weighting index and is greater than 1. The partition matrix $U^{(0)}$ is set as an initial condition; only two local values of the validity function need to be compared, since the solution is a local minimum of the objective function, which validates the effectiveness of Step 4 in Figure 3.
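The clustering step can be sketched for scalar data such as wind speed as follows. The routine `fcm_1d` is the standard Bezdek iteration, started from deterministic centers instead of an initial partition matrix. The validity function of [19] is not reproduced in the text, so `validity` uses a stand-in ratio of between-cluster divergence to within-cluster compactness, and `choose_cluster_number` scans every candidate number rather than stopping at the first local extremum; all three names are hypothetical.

```python
def fcm_1d(data, c, m=2.0, tol=1e-6, max_iter=200):
    """Standard fuzzy c-means on scalar data; returns (centers, memberships)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: spread the initial centers over the sorted data.
    centers = [xs[(2 * i + 1) * n // (2 * c)] for i in range(c)]
    u = []
    for _ in range(max_iter):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        u = []
        for x in data:
            w = [max(abs(x - v), 1e-12) ** (-2.0 / (m - 1.0)) for v in centers]
            s = sum(w)
            u.append([wi / s for wi in w])
        # Center update: mean of the data weighted by u_ij^m.
        new = []
        for i in range(c):
            num = sum((row[i] ** m) * x for row, x in zip(u, data))
            den = sum(row[i] ** m for row in u)
            new.append(num / den)
        shift = max(abs(a - b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return centers, u


def validity(data, centers, u, m=2.0):
    """Stand-in validity index: divergence between clusters over compactness within."""
    n, c = len(data), len(centers)
    w = [[row[i] ** m for i in range(c)] for row in u]
    total = sum(sum(r) for r in w)
    mean = sum(w[j][i] * data[j] for j in range(n) for i in range(c)) / total
    between = sum(w[j][i] * (centers[i] - mean) ** 2 for j in range(n) for i in range(c))
    within = sum(w[j][i] * (data[j] - centers[i]) ** 2 for j in range(n) for i in range(c))
    return (between / (c - 1)) / (within / (n - c))


def choose_cluster_number(data, c_max, m=2.0):
    """Run FCM for c = 2..c_max and keep the c with the largest validity value."""
    best = None
    for c in range(2, c_max + 1):
        centers, u = fcm_1d(data, c, m)
        score = validity(data, centers, u, m)
        if best is None or score > best[0]:
            best = (score, c, centers)
    return best[1], best[2]
```

The returned centers play the role of the representative states, and the memberships can be hardened (largest membership wins) to label each hour with a state.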
In [14], the cluster number for WTGs is selected by evenly dividing the range between the cut-in and cut-off wind speed. The cluster numbers of PV and load are selected in the same way. However, the inherent uncertainty of the wind resource may be disregarded by this means, and the accuracy of the method may be reduced significantly.
In this paper, the cluster numbers for WTGs, PV, and load are obtained via the method proposed in Section 2.2. More specifically, the wind speed $V$, solar irradiance, and load power are divided into $c_{WT}$, $c_{PV}$, and $c_{LD}$ clusters, respectively. The cluster centers are the representatives of their clusters, namely, the representative states of wind speed, solar radiation, and load.

The Markov Stochastic Process.
In the analysis of a stochastic process, the Markov chain is an effective method to relate the probability of a state with the frequency of the corresponding event. The operation states of the WTGs, PV, and load are each labeled by a state index obtained by the proposed method in Section 2.2. Take a wind farm with four Markov states as an example, shown in Figure 4. Each pair of states is linked by a state transition probability and a failure rate.
If $P_i$, $i = 1, 2, \ldots, c$, are the probabilities of the $c$ states ($c$ is the cluster number), then they should satisfy the balance condition defined by the transition probability matrix. The diagonal elements of this matrix are equal to the negative sum of the off-diagonal elements in the same column, and the elements in the other positions correspond to the respective transition rates.
Each transition rate is determined by the number of transitions from state $i$ to state $j$ and the number of occurrences of state $i$. The rate of departure (RD) is the modulus of the diagonal elements, from which the frequency and duration of state $i$ can be formulated. The system states of HPS are mainly determined by the combination of WT and PV states, which also determine the state of DG and batteries.
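The state indices described above can be estimated from a labeled hourly sequence as sketched below. The sketch assumes the usual frequency and duration relations (transition rate equals transitions divided by time spent in the state, frequency equals probability times departure rate, duration equals the reciprocal of the departure rate); the function name and the dictionary keys are hypothetical.

```python
from collections import Counter


def markov_indices(states, dt_hours=1.0):
    """Estimate probability, departure rate, frequency and duration per state."""
    visits = Counter(states[:-1])                    # time steps spent in each state
    moves = Counter(zip(states[:-1], states[1:]))    # counts of i -> j transitions
    total = len(states) - 1
    result = {}
    for s, n_s in visits.items():
        # Departures are the transitions that leave state s for a different state.
        departures = sum(n for (a, b), n in moves.items() if a == s and b != s)
        rate_out = departures / (n_s * dt_hours)
        prob = n_s / total
        result[s] = {
            "probability": prob,
            "departure_rate": rate_out,
            "frequency": prob * rate_out,
            "duration": 1.0 / rate_out if rate_out > 0 else float("inf"),
        }
    return result
```

For the sequence `[0, 0, 1, 1, 1, 0, 0, 1]`, state 0 is occupied for 4 of 7 steps and is left twice, giving a departure rate of 0.5 per hour and a mean duration of 2 h.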

Sizing Optimization
3.1. Objective Functions. In the proposed optimal sizing methodology of HPS, the main goal is to determine the amounts of each kind of DG. For a given system load, the objective can be set as the optimization of the overall cost.
The objective function (14) minimizes the total cost summed over the generation types, where the subscript $i$ denotes the type of generation, namely, WT, PV, DG, and batteries, and the cost term contains the cost of the unit, installation, and fuel consumption of the HPS. The fuel consumption of the DGs using fossil fuel and the cost model of a diesel generator are then given in turn. For the DGs using renewable energy resources such as WTGs and PV, the operation cost can be ignored. For the DGs using fossil fuel, the operation cost should be accumulated over the studied period. The combustion of fossil fuel contributes to the emission of CO$_2$ and gaseous pollutants. The ramping characteristics of diesel generation are neglected in this article, since the time resolution is set to one hour.
In (17), the fuel consumption cost can be obtained. By means of FCM clustering, it can be reduced to a sum over the clustered system states, in which the diesel output of each state can be obtained with the proposed power balance strategy.
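The reduction from an hour-by-hour sum to a sum over states can be illustrated as below. This is a sketch under assumptions: the fuel cost is taken as price times a fuel rate that depends only on the diesel output, each state carries a probability, and the function names and the linear fuel rate in the usage note are hypothetical rather than the paper's equations.

```python
def fuel_cost_chronological(p_dg_hourly, fuel_rate, price):
    """Fuel cost accumulated hour by hour over the studied period."""
    return sum(price * fuel_rate(p) for p in p_dg_hourly)


def fuel_cost_by_states(states, hours, fuel_rate, price):
    """Fuel cost as a probability-weighted sum over system states.

    states is a list of (diesel output, state probability) pairs.
    """
    return hours * sum(prob * price * fuel_rate(p) for p, prob in states)
```

If the hourly diesel outputs are `[0, 0, 10, 10, 10, 20]`, the three states `(0, 2/6)`, `(10, 3/6)`, and `(20, 1/6)` reproduce the chronological total with three evaluations instead of six. With cluster centers as representative outputs the agreement is approximate rather than exact.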

Constraints.
On the basis of normal operation in the stand-alone HPS, in order to associate the reliability factor and environmental factors, the main constraints of the proposed methodology are as follows.

Power Balance in the Given Time Resolution.
The foundation of the sizing optimization problem is the power balance. The power balance strategy in this paper is illustrated in Figure 2. In the power balance equation, the output power of each kind of DG is taken over the given time interval $t$, the length of which is associated with the length of the planning period and is set to 1 h in this paper.

The Minimum and Maximum Scale of the DGs
The scale of each kind of DG is constrained between a minimum and a maximum value, denoted by the subscripts min and max, respectively.

Battery Constraints.
At time $t$, the charge level of each battery bank should stay between its minimum and maximum values. The capacity of the battery bank sets the maximum charge level, and the depth of discharge (DOD) determines the minimum charge level of the batteries.
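Under the common convention that the depth of discharge is the usable fraction of the capacity, the battery constraint can be written as follows. The symbols $E_{Bat}(t)$ and $C_{Bat}$ are placeholders introduced here for illustration, and the relation between DOD and the minimum level is an assumption rather than the paper's equation.

```latex
E_{Bat,\min} \le E_{Bat}(t) \le E_{Bat,\max}, \qquad
E_{Bat,\max} = C_{Bat}, \qquad
E_{Bat,\min} = (1 - \mathrm{DOD})\, C_{Bat}
```

For instance, a 100 kWh bank with a DOD of 0.8 would be kept between 20 kWh and 100 kWh.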
3.2.4. Reliability Index. LPSP (loss of power supply probability) is selected for its simplicity. It is the ratio of the total time of power imbalance to the studied period and must not exceed the limit LPSP$_{\max}$, where the subscript "max" signifies the maximum limit.
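A short sketch of the reliability index follows, assuming that a time step counts as unbalanced whenever some load is left unserved. The function names are hypothetical; the second function shows the same index evaluated over clustered states, where each state carries a probability.

```python
def lpsp(unmet_power):
    """Share of time steps in which part of the load is not served."""
    short = sum(1 for p in unmet_power if p > 0)
    return short / len(unmet_power)


def lpsp_from_states(states):
    """LPSP over clustered states given as (unmet power, state probability) pairs."""
    return sum(prob for unmet, prob in states if unmet > 0)


def is_reliable(value, lpsp_max):
    """Constraint check: the index must not exceed its maximum limit."""
    return value <= lpsp_max
```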

Environmental Factors.
The CO$_2$ emission of the DGs using fossil fuel can be obtained with (24); accumulated over the total period, it should be constrained by a maximum value CO$_{2,\max}$.
In this article, the three coefficients of a 30-kW diesel generator are set to 0.028144, 0.001728, and 0.0000017.

Using Genetic Algorithm to Get Optimal Solution.
The genetic algorithm (GA) is chosen to solve the sizing problem considering its ability to search for a globally optimal solution of optimization problems. It is inspired by the process of biological evolution, namely, crossover, mutation, and selection. The population of individual solutions is repeatedly modified according to a "fitness" function, typically related to the objective function.
In this article, the optimization problem (14)-(25) is handled with GA, where the variables $N_{WT}$, $N_{PV}$, $N_{DG}$, and $N_{BAT}$ are linked to form the gene strings in the state variable (chromosome), and (14) is set to be the fitness function.
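The GA loop can be sketched on a toy sizing problem as follows. The bounds, unit costs, unit capacities, demand, and penalty are all hypothetical numbers, and the fitness is only a stand-in: a faithful version would evaluate (14) with the power balance strategy over all system states and enforce the LPSP, CO$_2$, and battery constraints. One individual at the upper bounds is seeded so that at least one feasible candidate exists.

```python
import random

# Hypothetical problem data; chromosome = [N_WT, N_PV, N_DG, N_BAT].
BOUNDS = [(0, 20), (0, 200), (0, 5), (0, 50)]
UNIT_COST = [30.0, 1.0, 12.0, 0.8]
UNIT_CAPACITY = [30.0, 0.25, 30.0, 1.0]
DEMAND = 150.0
PENALTY = 1e6


def fitness(ind):
    """Total cost plus a large penalty when the capacity constraint is violated."""
    cost = sum(c * n for c, n in zip(UNIT_COST, ind))
    supply = sum(k * n for k, n in zip(UNIT_CAPACITY, ind))
    return cost + (PENALTY if supply < DEMAND else 0.0)


def run_ga(pop_size=40, generations=60, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size - 1)]
    pop.append([hi for _, hi in BOUNDS])              # seeded feasible individual
    for _ in range(generations):
        pop.sort(key=fitness)
        children = [pop[0][:]]                        # elitism: keep the best
        while len(children) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = min(rng.sample(pop, 3), key=fitness)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            for i, (lo, hi) in enumerate(BOUNDS):
                if rng.random() < p_mut:              # mutation: step by one unit
                    child[i] = min(hi, max(lo, child[i] + rng.choice((-1, 1))))
            children.append(child)
        pop = children
    return min(pop, key=fitness)
```

Because of elitism the best fitness never worsens between generations, and the penalty term keeps any infeasible individual ranked below every feasible one.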

Case Introduction. The data from an island in Hainan province of China are used to analyze the proposed problem. There are abundant wind and solar resources on this island.

Simulation Results and Analysis.
The monthly average wind speed, solar irradiation, temperature, and load power consumption profiles are shown in Figures 5-8.
Firstly, the wind speed $V$, illumination $G$, and temperature $T$ are imported to the HPS model to obtain the power output of the WTGs and PV. Then the clusters for WT, PV, and load are obtained via the method proposed in Section 2.2. The cluster centers Vc, Gc, and Lc are, respectively, the representative states for wind speed, solar irradiance, and load, illustrated in Tables 1-3.
According to the simulation results from the self-adaptive FCM, the optimal cluster numbers for WT, PV, and LD are 10, 5, and 8. Namely, each WT has 10 states, each PV has 5 states, and the overall load has 8 states. There are 80 possible states for the renewable energy generations as a whole. It can be noted here that the scenario of HPS has been significantly simplified.
The operation scenario considered here is greatly simplified to be the aggregation of clustering states. The outputs of DGs and batteries are also determined by these states, according to the power balance strategy demonstrated before.
For a new state, the probability is the product of the state probabilities of the individual WTG, PV, and LD states. The frequency and duration time for a new scenario can be obtained likewise. Then the results from the chronology-based, traditional FCM based, and self-adapted clustering number based GAs are compared; the results are illustrated in Table 4. GA is capable of finding a global optimum but cannot guarantee it. The chronology-based method requires 8760 iterative loops, the traditional FCM based method needs 920 (= 23 × 4 × 10) iterative loops, and the proposed method needs 400 (= 10 × 5 × 8) iterative loops.
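The aggregation of individual states into system scenarios can be sketched as below, taking the scenario probability as the product of the individual state probabilities as stated above. The function name and the uniform probabilities in the usage note are hypothetical.

```python
from itertools import product


def combine_states(wt, pv, ld):
    """Build system scenarios from individual states.

    Each argument is a list of (representative value, probability) pairs.
    Returns a list of ((wind, irradiance, load), probability) scenarios.
    """
    scenarios = []
    for (v, p_v), (g, p_g), (l, p_l) in product(wt, pv, ld):
        scenarios.append(((v, g, l), p_v * p_g * p_l))
    return scenarios
```

With 10 wind states, 5 PV states, and 8 load states this yields the 400 scenarios quoted for the proposed method, and the scenario probabilities sum to one whenever each input list does.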
It should be noted that the wind energy is superfluous on winter nights, and the output power of the PV panels is redundant during summer daytime.
It can be found that the chronology-based method is the most time-consuming owing to its large number of loops. With regard to the overall cost, the proposed method still has advantages. Figure 9 shows the iteration performance of the mentioned algorithms.
The proposed self-adapted FCM clustering model is superior to the traditional methods in two aspects: (1) Compared to the chronology-based methods, with the reduction of the investigated data set, the number of system scenarios of HPS is significantly reduced, and so are the computation burden and CPU time, as shown in Table 4.
(2) Compared to the traditional FCM based method, the clustering numbers of the data sets are inherently obtained and optimized, which increases the probability of obtaining a globally optimal solution. Because the traditional method simply makes the partition by uniform division, its scenario selection is somewhat arbitrary and disregards the inherent stochastic characteristics of the data sets. Thus the basic FCM based method finally obtains a locally optimized solution.
In the proposed method, the benefits of the renewable energies are taken to be the reduction of CO$_2$ and the improvement of LPSP, which have been set as constraints of the problem. To consider the impact of different reliability indices on the investment cost, let CO$_{2,\max}$ be 30000 kg/y. The overall investment cost grows as the reliability requirement (expressed by LPSP$_{\max}$) increases, as shown in Figure 10. The impact of CO$_2$ is similar to that of the LPSP index.
Actually, the benefits of the reduction of CO$_2$ and the improvement of LPSP are negatively related to the cost optimization procedure.

Conclusion
In this paper, a novel method utilizing the self-adapted FCM clustering combined with the Markov model and GA is proposed to determine the best mix of HPS. A power balance strategy is also designed to guide the optimization process. The self-adapted FCM clustering can handle the stochastic characteristics of REGs, and the Markov model can significantly reduce the operational scenarios of REGs. The proposed method has a competitive overall cost, and it can be concluded that the benefits of the reduction of CO$_2$ and the improvement of LPSP are negatively related to the cost optimization procedure.
The future work will include the following: (1) improving the clustering model to further study the correlation among renewable resources; (2) adding the local and global control strategy to the power balance analysis process; (3) extending the proposed method to the operation plan stage of the HPS.

Figure 2: Power balance strategy of HPS.

Figure 10: Sensitivity analysis of LPSP max and cost.
In the FCM formulation above, $c$ means the clustering number and $j$ is the subscript index of the data set $X$, in which $x_j$ ($j = 1, 2, \ldots, n$) is each pattern.
2.2.2. Clustering Procedure. The FCM algorithm with self-adapted clustering number is then outlined with the proposed validity function. The clustering procedure is illustrated in Figure 3.

Table 3: Markov model of load.