Research on Underwater Sound Source Localization Based on NLM-EEMD and FCM-Generalized Quadratic Correlation

The accuracy of underwater sound source localization depends on the preprocessing effect of signal denoising and delay estimation performance. In order to meet the needs of high-precision underwater positioning, this article studies a time delay estimation method based on nonlocal mean-enforced mean empirical mode decomposition (NLM-EEMD) autocorrelation denoising preprocessing and fuzzy C-means clustering (FCM)-generalized quadratic for underwater sound source localization technology. The theories such as NLM, EEMD, FCM, and generalized quadratic correlation delay estimation are studied in detail, and the speech signal of the test library in the TIMIT standard library is selected for simulation analysis, which verifies the correctness of the delay estimation method. Experiments were carried out in a larger outdoor pool. The analysis results show that the NLM-EEMD adaptive denoising method effectively reduces noise interference and improves the quality of EEMD decomposition and the FCM-generalized quadratic correlation method increases the correlation between two signals. The time delay estimation accuracy is improved, and the positioning accuracy is 48.66% higher than that of the direct generalized cross-correlation method.


Introduction
Underwater sound source localization has always been the focus and hot issue of hydroacoustic research and is widely used in military and civil elds such as ocean exploration and development, coastal defense, etc. Sound source localization techniques based on acoustic sensor arrays can be divided into three categories: localization methods based on maximum controllable response power beamforming, localization methods based on high-resolution spectral estimation, and localization methods based on TDOA. Carter rst proposed the sound source location method of maximum controllable response power beamforming in 1977 [1] and later generations continue to optimize and improve it. Dibiase Joseph Hector proposed the SPR-PHAT method [2]; Wan and Wu proposed the SRP method [3] based on the principal eigenvector; and Do and Silverman and Lima et al.
proposed an improved search method from coarse to ne, which greatly reduced the amount of computation [4,5]. However, this location method requires a priori knowledge of the sound source and environmental noise, it is sensitive to the initial search value, and it not only reduces the amount of calculation but also reduces the accuracy of estimation [6]. e spatial resolution of the positioning method based on high-resolution spectral estimation is not limited by the signal sampling frequency, and arbitrary positioning accuracy can be achieved under certain conditions. However, this method relies on searching the entire space to determine the location of the sound source, which is more expensive and complex than other algorithms [7,8]. is method is generally not used in single sound source positioning. Compared with the rst two methods, the positioning technology based on TDOA is the simplest one, with high positioning accuracy, less computation, and good real-time performance. It can achieve good results in the application of single sound source positioning [9].
Time delay estimation accuracy is the decisive factor to determine TDOA positioning accuracy. Preprocessing underwater acoustic signals such as denoising or optimizing time delay estimation algorithms are e ective methods to improve time delay estimation accuracy and target positioning accuracy. Knapp et al. and Mosayyebpour et al. estimate time delay based on generalized cross-correlation, cross power spectrum phase, quadratic correlation, adaptive least mean square (LMS), acoustic transfer function, subspace decomposition, and cepstrum filtering methods [10][11][12][13][14][15][16]. Jiafan Yin and Lu Yang et al. used the NLM method to denoise the human ECG signal and eddy current sensor signal, respectively [17,18]. Lu, Liu, and Zhang et al. used the EEMD method to denoise the microseismic signal, pulse signal, and blasting vibration signal of underground mines, respectively [19][20][21]. Li, Yu, and Su et al. combined the FCM method to process pipeline leakage signal, EEG signal, and micro-Doppler radar intermediate frequency signal, respectively [22][23][24].
Based on the above research, this article proposes an underwater sound source location method based on NLM-EEMD and FCM-generalized quadratic correlation. First, NLM-EEMD autocorrelation denoising is performed on the collected signal, which makes up for the unsatisfactory denoising effect of EEMD on low SNR signals. en FCM fuzzy clustering is performed to adaptively extract the effective segment of the signal, making the correlation peak after generalized quadratic correlation time delay estimation more prominent; thus, the accuracy of sound source location is improved [22].

NLM Algorithm.
Let the signal model be expressed as y(t) � x(t) + n(t), where y(t) is the sound sensor that actually collects signals, x(t) is valid signal, and n(t) is noise signal.
For one-dimensional signals y(t), denoising processing [18,[25][26][27][28][29][30] based on NLM is to eliminate the interference of noise n(t), it is the process of recovering valid signal x(t), and is the weighted average of finding all similar blocks in the entire search area M(t) [31,32], the weighted mean K(t) can be obtained by where Z(t) represents the sum of the similarity degrees of all similar blocks, that is, the normalization factor, and the calculation equation is as follows: where ω(t, s) is the weight, and the calculation equation is as follows: where d 2 represents the sum of the squares of the Euclidean distance of the neighborhood blocks of the two sample points t and s, h controls the decay speed of ω(t, s), the value is h � λ ��� � 2L ∆ , ∆ represents the neighborhood block centered on the t sample point, L ∆ represents the neighborhood block centered on the s sample point, λ represents the bandwidth parameter, and its value is expressed as λ � 0.5σ according to the SURE criterion, where σ is the signal variance. e weight ω(t, s) depends on the similarity between the sample points s and t and satisfies the following conditions: e parameter relationship diagram of the NLM algorithm is shown in Figure 1. N represents half of the search area M, and the parameters P, λ, and N jointly determine the final result of the filtering.

EEMD Algorithm.
Empirical mode decomposition (EMD) is essentially a method of decomposing the time domain signal according to the frequency domain scale, producing a series of intrinsic mode function (IMF) components with different characteristics and a residual component; it is different from variational modal decomposition (VMD) [33,34]. But EMD decomposition may cause modal aliasing problems and endpoint effects, so the EEMD decomposition method has been developed. e steps of the EEMD algorithm [21,22] are as follows [35][36][37][38][39]: (1) Gaussian white noise is added to the target signal: (2) e noise-added signal x i (t) is decomposed by EMD to obtain multiple groups of IMF components IMF i k , where k � 1, 2, · · · , K, K is the decomposition scale.
(3) e average value of each group of IMF components is calculated as the final value of EEMD decomposition: e screening process of EEMD is adaptive. In the decomposition process, the Gaussian white noise signal is added, and the Gaussian white noise can be used to cancel each other when multiple trials are superimposed, and the abnormal events in the original signal are "smoothed." erefore, it is a noise-assisted data analysis method, which solves the modal aliasing problem that may occur in EMD and makes the denoising effect better and the final sound source localization accuracy higher. e EEMD decomposition flow chart is shown in Figure 2.

Combination of NLM and EEMD to Reduce Noise.
e signal is first preprocessed by NLM denoising, which can improve the quality of EMD decomposition. To avoid modal aliasing problems and end effects, the signal is decomposed using the EEMD method. EEMD decomposes the signal into a set of IMF components with frequencies from high to low, removes some IMF components that do not meet the threshold condition, and adaptively reconstructs other remaining effective IMF components to obtain the final denoised signal.
Adaptive signal reconstruction [40,41] is to select two evaluation indicators-correlation coefficient and root mean square error-that determine the effective IMF component by calculation and reconstructing the signal. e steps are as follows: (1) e correlation coefficient between each IMF component and the original signal is calculated separately, and the calculation equation is as follows. In the equation, x represents the original signal, j represents the j − th sampling point of the signal, and c i represents the i − th IMF component decomposed (i � 1, 2, · · · , K); the larger the Corr, the higher the correlation.
(2) e root mean square error of each IMF component and the original signal is calculated separately, and the calculation equation is as follows. e smaller the RMSE, the closer it is to the original signal.
(3) Calculating the adaptive threshold λ and δ, the equation is as follows. In the equation, K represents the number of IMF components, and the IMF components whose threshold satisfies the condition of Equation (10) are selected to reconstruct the signal.
e flow chart of EEMD signal reconstruction is shown in Figure 3.

Simulation.
e voice signal of the test library numbered FDAC1-SX304.WAV in the TIMIT standard library is selected, Gaussian white noise is added to make SNR � 0, the signal sampling frequency is set as 16000 Hz, and timefrequency domain analysis and NLM-EEMD noise reduction processing on the signal are performed. Time domain diagram, spectrogram, time spectrum, and Hilbert spectrum of the original signal are drawn in turn, as shown in Figure 4.
Adding Gaussian white noise to the original signal to make SNR � 0, and the original signal with noise is shown in Figure 5. e noisy original signal is first preprocessed by NLM, and the result is shown in Figure 6; then EEMD is

Reconstructed signal
Calculate adaptive threshold, select the IMF component that meets the conditions c j (t), c m (t), or c l (t)(among them: j, m, l ∈ l, 2, ...K) EMD is decomposed to IMFc i (t), i = 1, 2, ...K decomposed, and the decomposed IMF components and the remaining components and the corresponding spectrum are shown in Figure 7; the IMF components are distinguished according to frequency, it is divided into high-frequency component, low-frequency component, and trend quantity. e time domain diagram of each component is shown in Figure 8. e correlation coefficient and root mean square error of each IMF component and residual component and the original signal are calculated, and then the adaptive threshold λ of 0.2692 and δ of 0.0074 are obtained. e IMF components with correlation coefficient greater than 0.2692 and root mean square error less than 0.0074 are selected for reconstruction, and the adaptive reconstruction process is shown in Figures 9 and 10. According to the result of adaptive reconstruction, the components IMF1, IMF2, and IMF3 are selected for reconstruction. Figure 11 shows the comparison diagram of the signal before decomposition and the reconstructed signal and the reconstruction error diagram. Figure 12 shows the comparison of original signal with the original signal with noise and the signal processed by NLM-EEMD; it can be seen that the denoising effect is very obvious, and the original signal characteristics are basically restored.

FCM-Generalized Quadratic Correlation
Adaptive Delay Estimation Algorithm 3.1. FCM Algorithm. FCM algorithm (Fuzzy C-means algorithm) [22] is a clustering algorithm based on flexible fuzzy partition. is method uses the concept of determining the geometric closeness of data points in Euclidean space and obtains the membership degree of each sample point to all class centers by optimizing the objective function, so as to automatically classify the samples.
Supposing the signal point sample is X � x 1 , x 2 , · · · x n , c(2 ≤ c ≤ n) represents the number of cluster centers, A 1 , A 2 , · · · A c represents the corresponding c categories, U represents fuzzy classification matrix, and V � [] 1 , ] 2 , · · · ] n ] represents the cluster center vector, we use the weighted sum of squared distances from the signal point X to each cluster center ] i (1 ≤ i ≤ c) as the objective function: where m is the weighting parameter and u ij and d ij are the membership degree and Euclidean distance from the signal point sample x j to the cluster center υ i , respectively. e essence of the FCM algorithm is to repeatedly modify U and V through iteration. When the algorithm converges, theoretically, various cluster centers and the membership degrees of each sample to each pattern class are obtained, thus completing the fuzzy clustering division.

Generalized Quadratic Correlation Delay Estimation
Algorithm. It is assumed that the acoustic signals collected by the two acoustic sensors are shown in Equation (11). In the equation, s(n) represents the original sound source signal, α 1 and α 2 represent the attenuation factors of the two signals, n 1 (n) and n 2 (n) are noise, and τ represents the time delay difference between the sound source reaching the signals collected by the two sound sensors.
e simplest time delay estimation is to directly perform cross-correlation processing on the two signals x 1 (n) and x 2 (n), as shown in Equation (12).
Substituting (11) into Equation (13) to get: Because (n), n 1 (n), and n 2 (n) are not related to each other, (13) can be further simplified: where R ss (τ) represents the autocorrelation function of the sound source s(n), and R n1n2 (τ) is the cross-correlation function of the two noises n 1 (n) and n 2 (n). If n 1 (n) and n 2 (n) are uncorrelated noises, (14) can be further simplified as: According to the properties of the correlation function, when R 12 (τ) takes the maximum value, τ is the time delay difference between the sound source signal reaching the two acoustic sensors. e generalized cross-correlation algorithm [42][43][44] transforms the cross-correlation function into the frequency domain and then weights the cross power spectrum to achieve the effect of suppressing noise, and then inversely transforms it to the time domain to obtain the generalized cross-correlation function. As shown in (15), G x1x2 (ω) refers to the cross power spectrum of the two collected signals, φ 12 (ω) is the weighting function.
Common weighting functions and their characteristics [45][46][47][48][49] are shown in Table 1. Shock and Vibration e principle block diagram of the generalized crosscorrelation delay estimation algorithm is shown in Figure 13. e quadratic correlation algorithm [50][51][52] is also a common time delay estimation method. It performs a correlation operation on the autocorrelation function and the cross-correlation function of the two time domain signals to improve the time delay estimation accuracy.
It is equivalent to Wiener filtering, which can effectively suppress areas with high noise power and error-prone signal estimation (i.e., areas with low signal-to-noise ratio) but will expand the peak of the cross-correlation function. SCOT It is an improvement on ROTH weighting, considering the effects of both channels, but when G x1x1 (ω) � G x2x2 (ω), SCOT is equivalent to ROTH, so it will also expand the peak of the cross-correlation function. PHAT It is equivalent to whitening filtering, which has a better effect on large signal-to-noise ratios and is suitable for broadband signals. However, when (ω), the correlation function is not a δ function; in addition, it is weighted by G x1x2 (ω). When the signal energy is small, the denominator approaches 0, thereby increasing the error. It can be improved by adding a fixed constant to the denominator.
|c(ω)| 2 is the modulo-squared coherence function, defined as |c(ω)| 2 � |G x1x2 (ω)| 2 /G x1x1 (ω)G x2x2 (ω); the maximum likelihood weighting function gives a large weight to the frequency band with a large signal-to-noise ratio and a small weight to the frequency band of a small signal-to-noise ratio. In this way, the influence of noise can be better suppressed, and it is the optimal filter in the statistical sense.

HB weighted
It has a suppressing effect on the periodic components in the signal, and the effect is similar to that of direct cross-correlation at low signal-to-noise ratio.
When there are obvious periodic components in the signal, it fails; when the signal-to-noise ratio is low, the performance is not as good as PHAT.

Fourier transform
Fourier transform take the conjugate Generalized Weighting Function power spectrum function G x1x2 (ω) generalized cross power spectrum Inverse Fourier Transform Peak detection Delay value Generalized cross-correlation function R 12 (τ) Figure 13: e principle block diagram of the generalized cross-correlation delay estimation algorithm. 8 Shock and Vibration Ignoring the cross-correlation function of signal and noise, (16) can be simplified as where R RS represents pure signal for secondary correlation and R RN represents noise for secondary autocorrelation. Compared with the primary correlation, the secondary correlation reduces the influence of noise on the signal and (ω) x 1 (n) x 2 (n)  can estimate the delay value more accurately in the low signal-to-noise ratio environment. e block diagram of the quadratic correlation delay estimation is shown in Figure 14.
In this article, the advantages of generalized cross-correlation and quadratic correlation are combined for time delay estimation, that is, generalized quadratic correlation time delay estimation [50]. e principle block diagram is shown in Figure 15. e weighting function of generalized cross-correlation chooses SCOT weighting function.

Combination of Delay Estimation of FCM and Generalized Quadratic Correlation.
is article uses the FCM algorithm to adaptively extract the effective segment signal and then performs generalized quadratic        Shock and Vibration correlation processing, which can sharpen the correlation peak and improve the accuracy of delay estimation. e steps are as follows [22]: (1) In the FCM algorithm, the weighting parameter m � 3 is set, and the number of cluster centers is c � 3. e acoustic signal used for underwater positioning studied in this article is the shock signal, so it is divided into three categories: the front part of the shock signal, the shock signal generation segment, and the post-impact signal segment.
(2) Using FCM to process the acquired signal, the effective segment signal point samples are extracted and reconstructed, and the shock generation segment signal is obtained. (3) For the acoustic signals received by other acoustic sensors, the same processing methods of steps (1) to (2) are adopted. In this way, each acoustic sensor corresponds to a signal of an impact generation segment, and the minimum value of the start point and the maximum value of the end point of these signals are taken as the final start point and end point, respectively, that is, the segment min (x(t start1 ), x(t start2 ), · · · · · ·) to max (x(t en d1 ), x(t en d2 ), · · · · · ·) is intercepted from the received signal of the acoustic sensor. e signal is processed as the final pending delay estimation. (4) e generalized quadratic correlation delay estimation is performed to obtain the delay difference between the signals.

e Principle and Process of Underwater Sound Source
Localization. In this article, the ternary acoustic sensor array is used to collect the underwater sound source signal, and the time delay between the signal processing and the signal is estimated, and finally the underwater sound source   position is calculated by using the time delay difference. e schematic diagram of the experimental site is shown in Figure 16. e azimuth distance equation used by the ternary acoustic sensor array for sound source localization is as follows: where φ is the azimuth of the explosion point to be estimated; R 2 � R is the distance of the explosion point to be estimated; c is the speed of sound; d is the distance between the array elements; and τ 12 , τ 13 , and τ 23 represent the time delay difference of the three-channel signals, respectively.

Experimental System Construction.
e experimental device is built as shown in Figure 16, and the sound source localization experiment is conducted in a large outdoor pool. e underwater sound source signal is generated by throwing heavy objects at different angles and distances relative to the ternary array. e distance between the elements of the ternary acoustic sensor array is 10 m and arranged in a straight line. e sensitivity of the acoustic sensor is less than − 200 dB when detecting the sound source in the frequency range of 10 Hz-50000 Hz, the sampling frequency of the acquisition instrument is 128k Hz, and each acquisition time is set to 10 s (Figures 17  and 18).

Experimental Signal
Processing. e collected acoustic signals at different locations are first analyzed in the timefrequency domain, and then NLM preprocessing, EEMD decomposition, and adaptive reconstruction are performed to obtain denoised signals. Finally, combined with the FCM algorithm, based on the generalized quadratic correlation delay estimation, the algorithm estimates the time delay of the three-channel signal in pairs and then solves the sound source position.
Taking the acoustic signal collected in an experiment as an example, the above analysis and processing are carried out. e time domain diagram and spectrogram and time spectrum of the signals collected by the three hydrophones are shown in Figures 19 and 20, respectively. e comparison between the original signal of the threechannel signal and the NLM preprocessing results is shown in Figure 21, the NLM-EEMD denoising results are shown in Figure 22, and the signal adaptive reconstruction process is  16 Shock and Vibration

18
Shock and Vibration  shown in Figure 23. e comparison between the reconstructed signal and the NLM processed signal is shown in Figure 24, and the reconstruction error is shown in Figure 25. It can be seen that the NLM preprocessing has an ideal de-drying effect. In the process of EEMD signal decomposition and reconstruction, the three signals are decomposed into 10 IMF components, adaptive threshold λ 1 is 0.2291, δ 1 is 0.0019, λ 2 is 0.2954, δ 2 is 0.0072, λ 3 is 0.2872, δ 3 is 0.0026, the first channel signal selects the 2nd, 3rd, 8th, 9th, and 10th components for reconstruction, the second channel signal selects the 3rd, 4th, and 8th components for reconstruction, the third channel signal selects the 2nd, 3rd, 4th, 8th, 9th, and 10th for reconstruction, and the original signal is basically restored, and the reconstruction error meets the requirements. Figure 26 shows the clustering diagram and the change diagram of the clustering objective function after the threechannel signal is processed by FCM. Finally, the generalized quadratic correlation delay estimation is performed as shown in Figure 27.

Algorithm Verification and Analysis.
In order to further verify the superiority of the algorithm in this article for underwater sound source localization, three different methods of direct cross-correlation method, generalized cross-correlation method, and generalized quadratic correlation method based on NLM-EEMD-FCM were used for the acoustic signals of the experiments in the previous section. e time delay estimation algorithm is used to calculate the sound source position. e delay estimation diagrams of direct cross-correlation and generalized crosscorrelation are shown in Figures 28 and 29, respectively. e positioning results of the three delay estimation methods are shown in Table 2.
It can be seen from Table 2 that the error is extremely large when using direct cross-correlation, which is basically invalid; for this group of collected signals, after the signal is processed by NLM-EEMD-FCM, the azimuth error is slightly improved, but compared with the distance error reduced by nearly 1.5 meters, it is negligible, so in general, the generalized quadratic correlation delay estimation method based on NLM-EEMD-FCM sharpens the peak value, improves the delay estimation accuracy, and reduces the underwater sound source localization error.

Conclusions
In order to improve the accuracy of underwater sound source location, an underwater sound source location method based on nonlocal mean overall average empirical mode decomposition (NLM-EEMD) autocorrelation denoising preprocessing and fuzzy C-means clustering (FCM)-generalized quadratic correlation time delay estimation is studied in this article. e experiment was carried out in a large outdoor pool. rough the analysis of the collected signal data, it was found that the NLM-EEMD adaptive denoising method has good effect, which improves the decomposition quality of EEMD. e FCM-generalized quadratic correlation method increases the correlation between two signals, sharpens the correlation peak, and improves the accuracy of time delay estimation. Taking the signal data analyzed above as an example, when the sound source distance is about 85 meters, the sound source location error based on NLM-EEMD and FCM-generalized quadratic correlation method is only 1.81%, which is 48.66% less than that of direct generalized cross-correlation method.
is article provides a new idea to improve the accuracy of time delay estimation and also provides a certain reference for the research of high-precision positioning of underwater sound sources.
Data Availability e data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this article.