Red tides are caused by the combined effects of many marine elements. The complexity of the marine ecosystem makes it hard to identify the relationship between marine elements and red tides. The algorithm of fuzzy
The ocean, as the cradle of humankind, provides abundant biological and mineral resources. With the rapid growth of population and the economy in the 21st century, exploitation of the sea to alleviate resource shortages has been accompanied by pollution and destruction of the marine ecological environment. Because large amounts of untreated wastewater are discharged directly into the ocean, and because of global climate change, the number of harmful red tide species has increased dramatically. The causes of red tides are complex. Although the occurrence mechanism of red tide has not yet been determined, most scholars believe that red tide occurrence is closely related to water eutrophication [
Several studies have addressed red tide prediction. Using a numerical method, Gibson et al. established an NPZ ecological dynamic model and analyzed five kinds of seston feeding functions [
In this paper, a fuzzy
Red tide is caused by the comprehensive action of multiple factors, such as the sudden proliferation and accumulation of some plankton [
Research on red tide prediction has already produced many methods, such as empirical prediction, statistical prediction, numerical model prediction, and artificial neural networks [
Clustering is one of the most basic activities in human understanding of the world. The purpose of clustering is to make things in the same class as similar as possible and things in different classes as different as possible. According to the values the membership degree may take, clustering methods can be divided into hard clustering and fuzzy clustering. In hard clustering, a membership degree of 0 means that the sample does not belong to a category and 1 means that it does. Fuzzy clustering combines fuzzy theory with cluster analysis, so the membership degree can take any value in [0, 1]. The fuzzy clustering algorithm proposed by Dunn and extended by Bezdek is the best-known and most frequently used method [
Let
The objective function of FCM clustering algorithm
The basic idea of the FCM algorithm is to find the fuzzy membership matrix and the cluster centers that minimize the objective function.
The membership degrees satisfy the following equations:
Using the Lagrange multiplier method, the update formulas for the membership degrees and the cluster centers are expressed as
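In one common textbook notation, the standard FCM objective function, the membership constraints, and the two update formulas can be written as below. The symbols are assumptions chosen for illustration and may differ from the notation used by the authors: $x_j$ is the $j$-th of $n$ samples, $v_i$ is the $i$-th of $c$ cluster centers, $u_{ij}$ is the membership degree of $x_j$ in cluster $i$, and $m > 1$ is the fuzzifier.

```latex
\begin{align}
  J_m(U, V) &= \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2},
  \qquad d_{ij} = \lVert x_j - v_i \rVert, \\
  \sum_{i=1}^{c} u_{ij} &= 1, \qquad u_{ij} \in [0, 1], \qquad
  0 < \sum_{j=1}^{n} u_{ij} < n, \\
  u_{ij} &= \left[ \sum_{k=1}^{c}
      \left( \frac{d_{ij}}{d_{kj}} \right)^{\frac{2}{m-1}} \right]^{-1}, \\
  v_i &= \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}.
\end{align}
```

The two update formulas follow from setting the derivatives of the Lagrangian of $J_m$ (with one multiplier per sample for the sum-to-one constraint) to zero.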
The FCM algorithm is described as follows.
Give the number of clustering categories
Randomly select cluster centers
According to formula (
According to formula (
If
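The steps above can be sketched in Python as follows. This is a minimal sketch of standard FCM under stated assumptions, not the authors' implementation: the function name `fcm`, the default fuzzifier `m = 2`, the tolerance `eps`, and the stopping rule (largest change in any membership degree below `eps`) are all illustrative choices.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Standard fuzzy c-means (illustrative sketch).

    X: (n, d) array of samples. Returns (V, U) where V is the (c, d) array
    of cluster centers and U is the (c, n) membership matrix.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Randomly pick c distinct samples as the initial cluster centers.
    V = X[rng.choice(n, size=c, replace=False)].copy()
    U = np.zeros((c, n))
    for _ in range(max_iter):
        # Squared Euclidean distance between every center and every sample.
        d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Center update: mean of the samples weighted by u_ij^m.
        W = U_new ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Assumed stopping rule: memberships have stopped changing.
        converged = np.abs(U_new - U).max() < eps
        U = U_new
        if converged:
            break
    return V, U
```

Because the initial centers are drawn at random, different seeds can lead to different local minima, which is the sensitivity that motivates a more careful choice of initial centers.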
Although the FCM clustering algorithm, as a classical algorithm, is widely applied in a variety of key areas [
The initial cluster centers are selected so that the region within a certain threshold of each center contains more data. This not only ensures that the clustering algorithm searches for the cluster centers in several feasible regions, but also effectively reduces the impact of noise and outliers on the objective function.
Let
Calculate the Euclidean distance between every pair of samples in the dataset to generate the distance matrix
With the center of the two samples as center and the regional threshold
Choose the two nearest samples among the remaining samples outside the region. Repeat Step
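One possible reading of these steps is sketched below in Python. It is an illustration under assumptions, not the authors' procedure: the closest remaining pair of samples is found, a region of radius `radius` is centered on the midpoint of that pair, the number of remaining samples inside the region is taken as its density, those samples are removed, and the process repeats. The function names, the use of the sample count as density, and the final choice of the `c` densest regions are all hypothetical.

```python
import numpy as np

def region_candidates(X, radius):
    """Greedy region search (illustrative sketch).

    Returns a list of (center, count) pairs, one per region found.
    """
    X = np.asarray(X, dtype=float)
    remaining = np.arange(len(X))
    regions = []
    while len(remaining) >= 2:
        P = X[remaining]
        # Pairwise Euclidean distance matrix of the remaining samples.
        D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)
        i, j = np.unravel_index(np.argmin(D), D.shape)
        # Region centered on the midpoint of the closest pair.
        center = (P[i] + P[j]) / 2.0
        inside = np.linalg.norm(P - center, axis=1) <= radius
        inside[[i, j]] = True  # the pair always belongs to its own region
        regions.append((center, int(inside.sum())))
        # Continue with the samples outside the region.
        remaining = remaining[~inside]
    return regions

def initial_centers(X, c, radius):
    """Pick the centers of the c most populated regions (assumed rule)."""
    regions = region_candidates(X, radius)
    regions.sort(key=lambda r: r[1], reverse=True)
    return np.array([center for center, _ in regions[:c]])
```

An isolated outlier never forms a populated region in this sketch, so it cannot become an initial center. If fewer than `c` regions are found, fewer than `c` centers are returned and the radius would need to be reduced.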
Figure
The selection of the initial cluster centers through the principle of regional minimum data density.
According to formula (
As shown in Figure
The selection of the initial cluster centers through the principle of the minimum mean distance.
As the most common analysis method of FCM, the fuzzy clustering algorithm based on the dissimilar objective function considers only the weighted Euclidean distance between the sample data and the cluster centers, without taking into account the distances between the cluster centers [
Dissimilar objective function with distance between cluster centers is defined as [
According to the Lagrange multiplier optimization algorithm under the constraint conditions of formula (
Formula (
In order to obtain the feature weight, the similarity coefficient
(1) The correlation coefficient method is defined as
(2) The least arithmetic average method is defined as
(3) The angle cosine method is defined as
The correlation coefficient method is used to calculate the similarity coefficient
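The three similarity measures listed above can be sketched in Python as follows. These are the textbook forms and may differ in detail from the formulas used by the authors. In particular, the "least arithmetic average method" is assumed here to be what is usually called the arithmetic-mean-minimum method, which is meaningful for non-negative data; the function names are hypothetical.

```python
import numpy as np

def correlation_similarity(a, b):
    """Pearson correlation coefficient between two vectors."""
    a0, b0 = a - a.mean(), b - b.mean()
    return float(np.dot(a0, b0) / np.sqrt(np.dot(a0, a0) * np.dot(b0, b0)))

def arithmetic_mean_minimum_similarity(a, b):
    """2 * sum(min(a_k, b_k)) / sum(a_k + b_k), for non-negative data."""
    return float(2.0 * np.minimum(a, b).sum() / (a + b).sum())

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
r_corr = correlation_similarity(a, b)             # proportional vectors give 1
r_amm = arithmetic_mean_minimum_similarity(a, b)  # 2 * 6 / 18 = 2/3
r_cos = cosine_similarity(a, b)                   # proportional vectors give 1
```

The example shows that the correlation and cosine measures ignore scale, whereas the arithmetic-mean-minimum measure does not.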
The feature weight
The objective function of DWFCM is defined as
According to the Lagrange multiplier optimization algorithm under the constraint conditions of formula (
Because the sample data do not share a common unit and are incomplete, the data are normalized before the samples are classified. The normalization functions are as follows:
The steps of the improved FCM algorithm are as follows.
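As an illustration of this step, the sketch below applies column-wise min-max scaling, which maps every feature to [0, 1]. Min-max scaling is an assumption here: it is a common choice and is consistent with the cluster centers reported in the examples lying in [0, 1], but it is not confirmed to be the authors' normalization function. The function name is hypothetical.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column of X to the range [0, 1] (assumed normalization)."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by 0
    return (X - lo) / span

# Three (nitrogen concentration, phytoplankton density) pairs for illustration.
Z = min_max_normalize([[0.40, 148.0], [0.07, 31.8], [0.56, 205.0]])
```

After scaling, features measured in different units contribute comparably to the Euclidean distance used by the clustering step.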
Set the regional threshold
According to the number of regional thresholds, regional density, and regional radius, accomplish the selection of
Calculate the correlation coefficient
According to formulas (
If
The samples are classified according to the following methods:
In this case, the sample
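A common way to turn a fuzzy membership matrix into class labels is the maximum-membership rule: each sample is assigned to the category in which its membership degree is largest. The sketch below assumes that rule and a membership matrix of shape (categories, samples); both the rule and the function name are illustrative assumptions rather than the authors' stated method.

```python
import numpy as np

def classify(U):
    """Assign each sample (column of U) to its highest-membership category."""
    return np.asarray(U).argmax(axis=0)

# Rows are categories, columns are samples.
U = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])
labels = classify(U)
```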
Changes in marine environmental factors accompany a red tide from its onset to its extinction. The original sample set is composed of marine environmental factors that have a great influence on the red tide. FCM, possibilistic fuzzy
Before clustering, the original samples are normalized. With the horizontal axis representing nitrogen concentration and the vertical axis representing phytoplankton density, the simulation results of the three algorithms are shown in Figures
In Figures
The original sample set with 21 elements.
Sample  Nitrogen concentration (  Phytoplankton density (10^{4}/cubic meter)

1  0.4  148 
2  0.54  109 
3  0.1  71.5 
4  0.07  31.8 
5  0.54  109 
6  0.31  83 
7  0.42  113.1 
8  0.21  93.3 
9  0.18  31.1 
10  0.21  97.2 
11  0.4  74.5 
12  0.36  125 
13  0.56  187 
14  0.39  146 
15  0.07  32 
16  0.47  205 
17  0.43  136 
18  0.43  147 
19  0.26  160 
20  0.32  105.4 
21  0.37  103.7 
The comparison of clustering results in Example
                   FCM           PCM           DWFCM
Clustering center  (0.55, 0.84)  (0.56, 0.81)  (0.54, 0.86)
                   (0.78, 0.50)  (0.85, 0.43)  (0.81, 0.55)
                   (0.18, 0.12)  (0.16, 0.19)  (0.17, 0.18)
Iteration number   11            13            8
Error score        4             3             1
Error rate         20%           15%           5%
The clustering results of FCM algorithm in Example
The clustering results of PCM algorithm in Example
The clustering results of DWFCM algorithm in Example
To further demonstrate the superiority of the proposed optimization model, this example uses another original sample set whose factors also have a great influence on the red tide. Because water temperature is the key factor in the growth rate of plankton and transparency can be used to evaluate the density of plankton, water temperature (°C) and transparency (m) are chosen as the research factors. The 32 original samples are shown in Table
In the FCM algorithm, the membership degrees of each sample across all categories sum to 1, which makes the FCM algorithm sensitive to noise and outliers [
From Figures
The original sample set with 32 elements.
Sample  Water temperature (°C)  Transparency (m) 

1  26.47  1.5 
2  26.22  1.2 
3  24.47  1.2 
4  24.2  3.5 
5  25.15  1.2 
6  24.2  2.5 
7  23.7  2.2 
8  26.6  0.9 
9  26.5  0.4 
10  26.5  0.5 
11  26.1  0.6 
12  26.2  1.5 
13  26.5  0.9 
14  24.9  1.8 
15  25.0  2.5 
16  25.0  2.5 
17  25.8  1.1 
18  26.1  1.8 
19  26.2  1.5 
20  25.4  1.2 
21  27.4  5.0 
22  27.6  2.1 
23  27.0  2.0 
24  26.8  4.0 
25  26.8  5.0 
26  26.8  4.5 
27  26.6  0.9 
28  26.5  0.4 
29  26.5  0.5 
30  26.1  0.6 
31  26.2  1.5 
32  27.5  3.9 
The comparison of clustering results in Example
                   FCM           PCM           DWFCM
Clustering center  (0.64, 0.10)  (0.65, 0.11)  (0.74, 0.12)
                   (0.67, 0.48)  (0.68, 0.50)  (0.69, 0.39)
                   (0.66, 0.84)  (0.69, 0.80)  (0.70, 0.82)
Iteration number   18            19            13
Error score        7             5             2
Error rate         22%           16%           6%
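The error rates in this table are consistent with dividing each error score by the 32 samples and rounding to the nearest percent. The short check below assumes that definition of the error rate; the variable names are illustrative.

```python
n_samples = 32
error_scores = {"FCM": 7, "PCM": 5, "DWFCM": 2}

# Error rate in percent, rounded to the nearest integer.
error_rates = {name: round(100 * errors / n_samples)
               for name, errors in error_scores.items()}
```

Under this assumption the computed values are 22, 16, and 6 percent for FCM, PCM, and DWFCM, matching the table.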
The clustering results of FCM algorithm in Example
The clustering results of PCM algorithm in Example
The clustering results of DWFCM algorithm in Example
In view of the complexity of red tide disasters and the shortcomings of previous prediction algorithms, a DWFCM algorithm is proposed to predict red tide. The initial cluster centers are chosen by the principles of regional minimum data density and minimum mean distance, and the objective function is optimized by using the weighted cluster center. In both simulation examples, the DWFCM algorithm needed fewer iterations and produced a lower error rate than FCM and PCM, which indicates better denoising ability and more accurate predictions of red tide disasters. However, the DWFCM algorithm introduces many parameters, and many experiments are needed to obtain accurate parameter values. Future work will therefore focus on combining the DWFCM algorithm with other algorithms to overcome these defects.
The authors declare that there is no conflict of interest regarding the publication of this paper.
This work was supported by the Grand Science & Technology Program Shanghai China (no. 14DZ1100700).