Classification of Marine Mammals Using a Multilayer Perceptron Neural Network Trained With a Whale Algorithm Developed With a Fuzzy System

The presence of various sounds from natural and man-made sources in the deep sea has made the classification and identification of marine mammals, with the aim of recognizing different endangered species, a topic of interest for researchers and field activists. In this paper, an experimental data set is first created using a designed scenario. The Whale Optimization Algorithm (WOA) is then used to train a multilayer perceptron neural network (MLP-NN). However, due to the large size of the data, the algorithm does not establish a clear boundary between the exploration and exploitation phases. To address this shortcoming, fuzzy inference is used as a new approach to develop and upgrade WOA, called FWOA. By tuning the FWOA control parameters, the fuzzy inference can well define the boundary between the two phases of exploration and exploitation. To measure the performance of the designed classifier, in addition to applying it to benchmark datasets, five benchmark algorithms (CVOA, WOA, ChOA, BWO, and PGO) were also used for MLP-NN training. The measured criteria are convergence speed, the ability to avoid local optima, and classification rate. The simulation results show that FWOA achieves the highest classification accuracy on both marine mammal datasets and provides better results than the other five benchmark algorithms.


Introduction
The deep oceans make up 95% of the oceans' volume and form the largest habitat on Earth 1 . New forms of life are continually being discovered in the ocean depths 2 . Much research has been done in the deep ocean, but unfortunately it is not enough, and many of the ocean's secrets remain unknown 3 . Various species of marine mammals, including whales and dolphins, live in the ocean. Underwater audio signal processing is the newest way to measure the presence, abundance, and migratory patterns of marine mammals 4,5,6 . Whales and dolphins can be recognized either by human-based methods or by intelligent methods. Initially, methods based on a human operator were used to identify whales and dolphins; their advantages include simplicity and ease of use, but their main disadvantages are dependence on the operator's psychological state and inefficiency in environments where the signal-to-noise ratio is low 7 . To eliminate these defects, automatic target recognition (ATR) based on soft computing is used 8 . Contour-based recognition was then applied to whale and dolphin recognition, but it suffers from high time complexity and a low identification rate 9 . The next item in the family of intelligent methods is ATR based on soft computing, which is widely popular due to its versatility and parallel structure 10,11 . The MLP-NN, thanks to its simple structure, high performance, and low computational complexity, has become a useful tool for automatic target recognition 12 . In the past, MLP-NN training used gradient-based methods such as error back-propagation, but these algorithms converge slowly and become stuck in local minima 13,14 . In recent years, meta-heuristic algorithms have increasingly been used for neural network training. The Genetic Algorithm (GA) 15 , Simulated Annealing (SA) 16 , the Biogeography-Based Optimization algorithm (BBO) 17 , the Grey Wolf Optimizer (GWO) 18 , the Social Spider Algorithm (SSA) 19 ,
the Dragonfly Algorithm (DA) 20 , and the IWT algorithm 21 are examples of existing meta-heuristic algorithms that have been used as artificial neural network trainers. GA and SA are likely to escape local optima but converge at a slower rate, which performs poorly in applications that require real-time processing. BBO requires time-consuming calculations. GWO and IWT, despite their low complexity and high convergence speed, fall into the trap of local optima, so they are not suitable for problems with many local optima. The many control parameters and high complexity of the SSA algorithm are among its weaknesses. DA, in addition to having many control parameters, requires time-consuming calculations that make it unsuitable for real-time use. The main reason for getting stuck in local optima is the imbalance between the two phases of exploration and exploitation. To solve problems such as getting stuck in local optima and slow convergence in WOA, various methods have been proposed, including tuning the parameter ɑ with a linear control strategy (LCS) or an arcsine-based non-linear control strategy (NCS-arcsine) to establish the right balance between exploration and exploitation 22 . On the other hand, the No Free Lunch (NFL) theorem logically states that meta-heuristic algorithms do not perform equally well on different problems 23 . Because of the problems mentioned above, and considering the NFL theorem, this article introduces a fuzzy whale algorithm called Fuzzy-WOA (FWOA) for the MLP-NN training problem of identifying whales and dolphins.
To investigate the performance of FWOA, we design an underwater data acquisition scenario, develop an experimental dataset, and also use a well-known benchmark dataset 24 . In this regard, a novel feature extraction approach based on two cepstrum liftering features is utilized to mitigate the impact of the time-varying multipath and fluctuating channel effects. The rest of the article is organized as follows: Section 2 designs an experiment for data collection; Section 3 deals with the preprocessing of the obtained data; Section 4 covers feature extraction; Section 5 describes WOA and its fuzzy extension; Section 6 presents the simulations and discussion; and finally, Section 7 concludes.

The experiment design and data acquisition
As shown in Figure 1, to obtain a real data set of the sounds produced by dolphins and whales, a research ship called the Persian Gulf Explorer, a sonobuoy, a UDAQ_Lite data acquisition board, and three hydrophones (model B&K 8103) spaced at equal distances to increase the dynamic range were used. This test was performed in the port of Bushehr. The array's length was selected based on the water depth, and Figure 2 shows the hydrophones' locations.

The ambient noise reduction and reverberation suppression
The sounds emitted by marine mammals (dolphins and whales) recorded by the hydrophone array are denoted x(t), y(t), and z(t), and the original sound of the dolphins and whales is denoted s(t). The mathematical model of the hydrophone outputs is given in Eq. 1.
x(t) = ∫ h(t − τ) s(τ) dτ (1)

In Equation (1), the Environment Response Functions (ERF) are denoted by h(t), g(t), and q(t) for the three hydrophones. The ERFs are not known, their "tails" are considered uncorrelated 25 , and naturally the first frame of sound produced by the marine mammals does not reach all hydrophones of the array at the same time. Given the sound pressure level (SPL) rating of the B&K 8103 hydrophone and reference 26 , which deals with the underwater audio standard, the recorded sounds must be preamplified by a factor of 10^6. Then, a Hamming window and the Fast Fourier Transform (FFT) are applied to obtain the frequency-domain SPL. Next, the SPL is reduced to a 1 Hz bandwidth by Eq. 2:

SPL1 = SPLm − 10 log Δf (2)

where SPLm is the SPL obtained at each fundamental frequency center in dB re 1 μPa, SPL1 is the SPL reduced to a 1 Hz bandwidth in dB re 1 μPa, and Δf represents the bandwidth of each 1/3-octave band filter. A Wiener filter has been used to minimize the mean square error (MSE) between the ambient noise and the marine mammal sounds 27 . After that, Eq. 3 was used to detect sounds with low SNR, i.e., less than 3 dB, which were deleted from the database.
Here T, V, and A represent the whole available signal, the mammal sound, and the ambient sound, respectively. After that, the SPLs were recalculated at a standard measuring distance of 1 m as Eq. 4:

SPL = SPL1 + 20 log (r) (4)

The ambient noise reduction and reverberation suppression block diagram is shown in Figure 3. In the next part, the effect of reverberation must be removed. In this regard, the common phase is added to the band (reducing the phase-change process by using the delay between the coherent parts of the initial sound is called the common phase) 28 .
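The two SPL corrections above (Eq. 2 and Eq. 4) can be sketched in a few lines of Python. This is a minimal illustration, assuming base-10 logarithms (the convention for decibel quantities) and spherical spreading in Eq. 4; the function names are hypothetical.

```python
import math

def spl_to_1hz_band(spl_m_db, delta_f_hz):
    """Eq. 2: reduce a 1/3-octave band SPL (dB re 1 uPa) to a 1 Hz bandwidth."""
    return spl_m_db - 10.0 * math.log10(delta_f_hz)

def spl_at_1m(spl1_db, r_m):
    """Eq. 4: refer the SPL back to the standard 1 m measuring distance,
    where r_m is the source-receiver range in metres."""
    return spl1_db + 20.0 * math.log10(r_m)

# Example: a 120 dB band level in a 100 Hz wide band, recorded 10 m away.
spl1 = spl_to_1hz_band(120.0, 100.0)  # 120 - 10*log10(100) = 100.0 dB
spl = spl_at_1m(spl1, 10.0)           # 100 + 20*log10(10)  = 120.0 dB
```

The corrections are additive in the dB domain, so they can be applied per frequency bin after the FFT stage.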

Average cepstral features and cepstral liftering features
The effect of ambient noise and reverberation decreases after the audio signal frames are detected in the preprocessing stage. In the next step, the detected signal frames enter the feature extraction stage. The sounds made by dolphins and whales, emitted from a distance to the hydrophone, experience changes in magnitude, phase, and density. The time-varying multipath effect of the fluctuating channel complicates the dolphin and whale recognition task. Cepstral factors with the cepstral liftering feature can significantly alleviate the multipath effects, while the average cepstral coefficients can notably eliminate the time-varying effects of shallow underwater channels 29 . Therefore, this section proposes the use of cepstral features, including average cepstral features and cepstral liftering features, to build a valid data set. The channel response cepstrum and the original dolphin and whale sound cepstrum can be separated by higher and lower cepstrum indices, respectively 30 ; they lie in non-overlapping parts of the liftered cepstrum. Thus, the quality of the features is improved by low-time liftering. After noise and reverberation removal, the frequency-domain frames of the SPLs, S(k), are transferred to the cepstrum feature extraction section. Equation 5 calculates the cepstrum features of the dolphin and whale signals:

c(n) = DCT{ log ( Σ_{k=0}^{N−1} |S(k)|² H_l(k) ) }, l = 0, 1, ..., M (5)


Here S(k) represents the frequency-domain frames of the SPLs of the sounds generated by dolphins and whales, N denotes the number of discrete frequencies used for the FFT, and Hl(k) is the transfer function of the l-th Mel-scaled triangular filter, where l = 0, 1, ..., M. Finally, the cepstral coefficients are transferred back to the time domain as c(n) by the Discrete Cosine Transform (DCT).
As mentioned earlier, the sound produced by dolphins and whales is extracted by a low-time liftering process. Thus, low-time liftering is defined as Eq. 6 to extract the source sound from the whole sound.
w(n) = 1 for 0 ≤ n ≤ Lc − 1, and w(n) = 0 otherwise (6)

Lc is the length of the liftering window, which is used as 15 or 20. The final features are calculated by multiplying the cepstrum c(n) by w(n) and applying the logarithm and DFT functions as Eq. 7 and Eq. 8, respectively.
Xm = [x(0), x(1), ..., x(P−1)]^T. The first 512 cepstrum points (out of 8192 points in one frame, for a sampling rate of 8192 Hz, except for the zeroth index c(0)) correspond to 62.5 ms of liftering coefficients and are windowed from the N indices, which is equivalent to one frame length, to reduce the liftering coefficients to 32 features. The length of the sub-frames, before averaging, is five seconds. During the averaging liftering process, the ten previous frames comprise 50 s of average cepstral features. The final average cepstral features are calculated by smoothing those ten frames. Consequently, the average cepstral feature vector consists of 32 features. In the next phase, the Xm vector becomes the input vector of an MLP-NN, so the number of neural network inputs is equal to P. The whole feature extraction process is indicated in Figure 6, and the output of this stage is shown in Figure 7.
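The core of the liftering pipeline above can be sketched as follows. This is a simplified illustration: it uses the plain real cepstrum and skips the Mel filter bank and DCT of Eq. 5, and the function names are hypothetical rather than the paper's exact implementation.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one frame: IFFT of the log magnitude spectrum.
    (Simplified; the paper additionally applies a Mel-scaled filter bank
    and a DCT before returning to the time/quefrency domain.)"""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # epsilon avoids log(0)
    return np.fft.ifft(log_mag).real

def low_time_lifter(cepstrum, lc=20):
    """Eq. 6: keep only the first Lc cepstral indices, where the source
    (dolphin/whale) component lives; higher indices carry the channel
    response, so zeroing them suppresses the multipath effect."""
    w = np.zeros_like(cepstrum)
    w[:lc] = 1.0
    return cepstrum * w

def average_features(liftered_frames):
    """Smooth the liftered cepstra of consecutive frames to reduce the
    time-varying channel effect (the 'average cepstral features')."""
    return np.mean(np.stack(liftered_frames), axis=0)
```

Averaging over ten consecutive five-second sub-frames, as described above, would correspond to calling `average_features` on a list of ten liftered cepstra.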

Design of an FWOA-MLPNN for automatic detection of sound produced by marine mammals
MLP-NN is the simplest and most widely used neural network 31 . Important applications of MLP-NN include automatic target recognition systems; for this reason, this article uses an MLP-NN as the recognizer 32 . MLP-NN is one of the most robust neural networks for highly non-linear systems. It is also a feedforward network capable of non-linear fitting with high precision. Despite these strengths, one of the persistent challenges of MLP-NN is training, i.e., adjusting the weights and biases of its connections 33 .
Figure 7. Typical visualization of the cepstrum features of a produced sound (Striped Dolphin).

The steps for using meta-heuristic algorithms to train an MLP-NN are as follows. The first phase is to determine how to represent the connection weights. The second phase is to define a fitness function to evaluate these connection weights, which for recognition problems can be the Mean Square Error (MSE). In the third phase, the evolutionary process is used to minimize the fitness function, i.e., the MSE. The design of the evolutionary strategy for training the connection weights is presented in Figure 8 and Eq. 10:

s_j = Σ_{i=1}^{n} (w_ij · x_i) + θ_j (10)

where n represents the number of input nodes, w_ij indicates the connection weight from the i-th input node to the j-th hidden node, and θ_j denotes the bias (threshold) of the j-th hidden neuron. In the final stage, the recognizer needs a meta-heuristic algorithm to tune the parameters mentioned above. The next section proposes an enhanced whale optimization algorithm (WOA) with fuzzy logic, called FWOA, as the trainer.
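The three phases above can be illustrated with a minimal sketch of the weight-vector encoding and the MSE fitness. The layer sizes, activations (tanh hidden, sigmoid output), and function names are assumptions chosen for illustration, not the paper's exact architecture; any metaheuristic can then minimize `mse_fitness` over the flat vector `theta`.

```python
import numpy as np

def unpack(theta, n_in, n_hidden, n_out):
    """Split one flat parameter vector (one search agent / 'whale')
    into the MLP's weight matrices and bias vectors."""
    i = 0
    w1 = theta[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = theta[i:i + n_hidden]; i += n_hidden
    w2 = theta[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = theta[i:i + n_out]
    return w1, b1, w2, b2

def mse_fitness(theta, x, d, n_hidden):
    """Fitness of one candidate: MSE between the MLP outputs and the
    desired targets d. The metaheuristic minimizes this value."""
    n_in, n_out = x.shape[1], d.shape[1]
    w1, b1, w2, b2 = unpack(theta, n_in, n_hidden, n_out)
    h = np.tanh(x @ w1 + b1)                   # hidden layer (Eq. 10 per node)
    o = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))   # sigmoid output layer
    return float(np.mean((o - d) ** 2))
```

The dimensionality of the search space is `n_in*n_hidden + n_hidden + n_hidden*n_out + n_out`, which is what each whale's position vector must hold.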

Fuzzy WOA
This section upgrades WOA using fuzzy inference.In this regard, in the first subsection, it will review WOA, and in the second subsection, it will describe the fuzzy method for upgrading WOA.

Whale optimization algorithm
The WOA optimization algorithm was introduced by Mirjalili and Lewis in 2016, inspired by the hunting behavior of whales 34 . WOA starts with a set of random solutions. In each iteration, the search agents update their positions using three operators: encircling the prey, the bubble-net attack method (exploitation phase), and the search for prey (exploration phase). In encircling the prey, the whales detect the prey and encircle it. WOA assumes that the current best solution is the prey; once the best search agent has been identified, the other search agents update their positions toward it. This behavior is expressed by Eq. 13 and Eq. 14:

D = |C · X*(t) − X(t)| (13)
X(t+1) = X*(t) − A · D (14)
Here t is the current iteration, A and C are coefficient vectors, X* is the position vector of the best solution obtained so far, and X is the position vector. Note that X* should be updated in each iteration if a better solution is found. The vectors A and C are obtained using Eq. 15 and Eq. 16:

A = 2a · r − a (15)
C = 2 · r (16)

where a decreases linearly from 2 to 0 over the iterations and r is a random vector in [0, 1]. In the bubble-net attack method, the whale swims around its prey along a shrinking circle and, simultaneously, along a spiral path. To model this simultaneous behavior, it is assumed that with a probability of 50% the whale chooses either the shrinking encircling mechanism or the spiral model to update its position during optimization. The mathematical model of this phase is defined as Eq. 17.

X(t+1) = D′ · e^{bl} · cos(2πl) + X*(t) (17)

where D′ = |X*(t) − X(t)| is the distance from the i-th whale to the prey (the best solution obtained so far), the constant b defines the shape of the logarithmic spiral, l is a random number in [−1, 1], and p is a random number in [0, 1]. The vector A takes random values in [−1, 1] to bring the search agents closer to the reference whale. In the search for prey, a randomly selected agent is used to update the search agent's position instead of the best search agent. The mathematical model is given in Eq. 18 and Eq. 19:

D = |C · X_rand − X(t)| (18)
X(t+1) = X_rand − A · D (19)
Here X_rand is the position vector of a randomly selected whale from the current population, and the vector A takes random values greater than 1 or less than −1 to force the search agent to move away from the reference whale.
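The three WOA operators can be gathered into a compact minimization loop. The following is a minimal sketch of the standard algorithm (not the fuzzy variant proposed later); the function name, bounds handling, and spiral constant b = 1 are illustrative assumptions.

```python
import numpy as np

def woa_minimize(f, dim, n_whales=30, iters=200, lb=-1.0, ub=1.0, seed=0):
    """Minimal WOA sketch: encircling prey / random search (exploration)
    and spiral bubble-net attack (exploitation)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_whales, dim))
    fits = np.array([f(x) for x in X])
    best, best_fit = X[fits.argmin()].copy(), fits.min()
    b = 1.0                                    # logarithmic-spiral shape constant
    for t in range(iters):
        a = 2.0 - 2.0 * t / iters              # 'a' decreases linearly 2 -> 0
        for i in range(n_whales):
            A = 2 * a * rng.random(dim) - a    # Eq. 15
            C = 2 * rng.random(dim)            # Eq. 16
            if rng.random() < 0.5:             # shrinking-encircling branch
                if np.all(np.abs(A) < 1):      # exploit around the best whale
                    D = np.abs(C * best - X[i])            # Eq. 13
                    X[i] = best - A * D                    # Eq. 14
                else:                           # explore around a random whale
                    Xr = X[rng.integers(n_whales)]
                    D = np.abs(C * Xr - X[i])              # Eq. 18
                    X[i] = Xr - A * D                      # Eq. 19
            else:                               # spiral bubble-net branch
                l = rng.uniform(-1.0, 1.0)
                Dp = np.abs(best - X[i])
                X[i] = Dp * np.exp(b * l) * np.cos(2 * np.pi * l) + best  # Eq. 17
            X[i] = np.clip(X[i], lb, ub)
            fit_i = f(X[i])
            if fit_i < best_fit:               # update the prey X* if improved
                best, best_fit = X[i].copy(), fit_i
    return best, best_fit
```

For MLP training, `f` would be the MSE fitness over the flattened weight vector; here it can be tested on a simple sphere function.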

Proposed fuzzy system for tuning control parameters
The proposed fuzzy model receives the normalized performance of each whale in the population (the Normalized Fitness Value, NFV) and the current values of the parameters a and C. The output gives the amounts of change, denoted Δa and ΔC. The NFV of each whale is obtained by Eq. 20:

NFV = (fit − fit_min) / (fit_max − fit_min) (20)

The NFV value lies in the range [0, 1]. The optimization problem in this paper is of the minimization type, in which the fitness of each whale is obtained directly from the value of the objective function. Eq. 21 and Eq. 22 update the parameters a and C of each whale:

a_new = a + Δa (21)
C_new = C + ΔC (22)

The fuzzy system is responsible for updating the parameters a and C of each member of the population (each whale), and its three inputs are the current values of the parameters a and C and the NFV. Initially, these values are fuzzified by membership functions, and their membership degrees μ are obtained. These degrees are applied to a set of rules that yield the values Δa and ΔC. After determining these values, defuzzification is performed to estimate the numerical values of Δa and ΔC. Finally, these values are applied in Eq. 21 and Eq. 22 to update the parameters a and C. The fuzzy system used in this article is of the Mamdani type. Figure 9
shows the proposed fuzzy model and the membership functions used to adjust the control parameters of the whale algorithm.
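The fuzzify-infer-defuzzify cycle can be sketched for a single input-output pair (NFV in, Δa out). This is only an illustrative Mamdani-style sketch: the triangular membership shapes, the rule set, and the crisp output centers below are assumptions for illustration, not the paper's exact fuzzy design.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def nfv(fit, fit_min, fit_max):
    """Eq. 20: normalized fitness value in [0, 1] (0 = best whale)."""
    return (fit - fit_min) / (fit_max - fit_min + 1e-12)

def fuzzy_delta_a(v):
    """One-input sketch of the rule base: good whales (low NFV) shrink 'a'
    to exploit, poor whales (high NFV) grow 'a' to explore. Consequents
    are crisp centers, so defuzzification is a weighted average."""
    mu_low = tri(v, -0.5, 0.0, 0.5)   # rule 1: NFV low  -> decrease a
    mu_med = tri(v, 0.0, 0.5, 1.0)    # rule 2: NFV med  -> keep a
    mu_high = tri(v, 0.5, 1.0, 1.5)   # rule 3: NFV high -> increase a
    mus = np.array([mu_low, mu_med, mu_high])
    centers = np.array([-0.1, 0.0, 0.1])       # illustrative delta-a values
    return float((mus * centers).sum() / (mus.sum() + 1e-12))
```

The returned Δa would then be applied per whale as in Eq. 21 (`a_new = a + fuzzy_delta_a(nfv(...))`); the full system repeats the same machinery for ΔC.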

Simulation results and discussion
In this section, to show the power and efficiency of MLP-FWOA, in addition to the sounds obtained in Sections 2 and 3, the reference dataset 24 is also used. As already mentioned, the Xm vector is the input of the MLP-FWOA. The dimension of Xm is 680 × 42, which means the data set contains 680 samples with 42 features each. The dimension of the benchmark dataset is 410 × 42. In MLP-FWOA, the number of input nodes is equal to the number of features. 70% of the data was used for training and 30% for testing. To have a fair comparison between the algorithms, a stopping condition of 300 iterations is used. There is no specific equation for obtaining the number of hidden-layer neurons, so Eq. 23 is used 35 .
Here N is the number of inputs and H denotes the number of hidden nodes. Similarly, the number of output neurons is equal to the number of marine mammal classes, i.e., six neurons. For a comprehensive assessment of FWOA's performance, the algorithm is compared with the WOA 34 , ChOA 36 , PGO 37 , CVOA 38 , and BWO 39 benchmark algorithms. The basic parameters and initial values of these benchmark algorithms are given in Table 2. The classifiers' performance is then tested for classification rate, local-minimum avoidance, and convergence speed. Each algorithm is run 35 times, and the classification rate, the mean and standard deviation of the minimum error, and the P-value are shown in Tables 3 and 4. The mean and standard deviation of the minimum error and the P-value indicate the algorithm's strength in avoiding local optima. Figures 11 and 12 also show a comprehensive comparison of the convergence speed and the classifiers' final error rates. The simulations were conducted in MATLAB 2020a on a personal computer with a 2.3 GHz processor and 5 GB of RAM.
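The evaluation protocol (70/30 split, classification rate, and mean/std of the minimum error over 35 independent runs) is straightforward to express in code. This is a generic sketch with hypothetical helper names, not the paper's MATLAB implementation.

```python
import numpy as np

def split_70_30(x, y, seed=0):
    """Shuffle the samples and split them 70% train / 30% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(round(0.7 * len(x)))
    tr, te = idx[:n_train], idx[n_train:]
    return x[tr], y[tr], x[te], y[te]

def classification_rate(pred, true):
    """Fraction of correctly classified samples."""
    return float(np.mean(np.asarray(pred) == np.asarray(true)))

def summarize_runs(min_errors):
    """Mean and standard deviation of the minimum training error across
    independent runs (35 runs per algorithm in this paper)."""
    e = np.asarray(min_errors, dtype=float)
    return float(e.mean()), float(e.std())
```

With 680 samples, the split yields 476 training and 204 test samples; the per-run minimum errors collected from each algorithm feed `summarize_runs` to produce the mean/STD columns of Tables 3 and 4.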
(a) Persian Gulf Explorer. (b) Research sonobuoys. (c) Hydrophone model 8103 of B&K company. (d) UDAQ_Lite data collection board.

Figure 1 .
Figure 1. Items needed to collect the data sets. The raw data include 170 samples of Pantropical Spotted Dolphin (8 sightings), 180 samples of Spinner Dolphin (5 sightings), 180 samples of Striped Dolphin (8 sightings), 105 samples of Humpback Whale (7 sightings), 95 samples of Minke Whale (5 sightings), and 120 samples of Sperm Whale (4 sightings). The trial setup was designed and conducted as shown in Figure 2.

Figure 2 .
Figure 2. Test scenario and location of the hydrophones.
Figure 3. The ambient noise reduction and reverberation suppression block diagram. A cross-correlation pass function, by adjusting each frequency band's gain, eliminates the uncorrelated signals and passes the correlated signals. Finally, in each frequency band, the output signals are combined to generate the estimated signal ŝ(t). The reverberation removal block diagram is indicated in Figure 4. Typical representations of dolphin and whale sounds and their spectrograms are shown in Figure 5.

Figure 4 .
Figure 4. The block diagram of the reverberation removal section.

Figure 6 .
Figure 6. The block diagram of the cepstrum liftering feature extraction process.

Figure 8 .
Figure 8. Introduction of MLP-NN as a search agent for a meta-heuristic algorithm. As previously stated, the MSE is a common criterion for evaluating an MLP-NN, as in Eq. 11:

MSE = Σ_{i=1}^{m} (O_i^k − d_i^k)² (11)

where m is the number of output nodes, O_i^k is the actual output of the i-th node for the k-th training sample, and d_i^k is the corresponding desired output.

Figure 10 .
Figure 10. How to train an MLP-NN using FWOA. In this paper, to classify marine mammals, a fuzzy model of the control parameters of the whale optimization algorithm was designed to train an MLP-NN. The CVOA, WOA, FWOA, ChOA, PGO, and BWO algorithms were used for MLP-NN training. As the simulation results show, FWOA performs powerfully in identifying the boundary between the exploration and exploitation phases; for this reason, it can find the global optimum and avoid local optima. The results indicate that MLP-FWOA, MLP-CVOA, MLP-WOA, MLP-ChOA, MLP-BWO, and MLP-PGO, in that order, perform best at classifying the sound produced by marine mammals. The convergence curves also show that FWOA converges faster than the other five benchmark algorithms.

Figures


Table 2 .
The initial parameters and primary values of the benchmark algorithms

Table 4 .
Results obtained from the different algorithms for the datasets obtained in Sections 2 and 3. As shown in Figures 11 and 12, among the benchmark algorithms used for MLP training, FWOA has the highest convergence speed and PGO the lowest; by adjusting the control parameters through fuzzy inference, FWOA correctly detects the boundary between the exploration and exploitation phases. As shown in Tables 3 and 4, MLP-FWOA has the highest classification rate and MLP-PGO the lowest among the classifiers. The STD values in Tables 3 and 4 indicate that the MLP-FWOA results rank first on both datasets, confirming that FWOA performs better than the other standard training algorithms and demonstrating FWOA's ability to avoid getting caught in local optima. A P-value of less than 0.05 indicates a significant difference between FWOA and the other algorithms.