Phase Clustering Based Modulation Classification Algorithm for PSK Signal over Wireless Environment



Introduction
With the development of science and technology, wireless networks have become the main medium of information transmission in recent decades, and they play an important role in communications, military applications, and other fields. Satellite wireless network communication technology meets the time and place requirements of information transmission through its wide coverage, good broadcasting ability, and independence from geographical conditions. In the field of electronic surveillance and electronic countermeasures, multiple sensors allow combat units to share reconnaissance information through effective collaborative working systems and to generate an overall situation picture with high precision and reliability via information fusion. Thus, cooperative reconnaissance networks based on multiple sensors have become a hot topic in the field of electronic warfare.
As one of the main modulation methods in wireless communication networks, PSK, including multiple phase-shift keying (MPSK) and derived forms such as π/4 differential quadrature phase-shift keying ((π/4)DQPSK), is also one of the most common carrier transmission modes in wireless digital communications. It has high spectrum utilization and strong anti-interference ability; more importantly, it is also relatively simple to implement in circuits. Because of its favorable spectral characteristics and multiple demodulation methods, (π/4)DQPSK is widely used in satellite communication networks and mobile communication systems. Unbalanced quadrature phase-shift keying (UQPSK) is a modulation mode that carries two binary bit streams of different types and rates, obtained by distributing different powers to the two orthogonal components of the carrier. In recent years, it has been widely used in satellite digital communication networks and in transmission and tracking systems between aircraft and ground processing systems.
Existing algorithms for modulation recognition of PSK signals are mostly based on decision theory or statistical pattern recognition [1][2][3]. The former [1] usually analyzes the maximum likelihood function. A sufficient statistic for classification is obtained and simplified, and a suitable threshold is then chosen for comparison with this statistic to achieve modulation classification. Based on this general principle, improved algorithms such as the Quasi-Log Likelihood Ratio (qLLR) [4,5], the Average Log Likelihood Function (ALLF) [6], and the Sequential Probability Ratio Test (SPRT) [7] have been proposed by researchers at home and abroad. However, these methods require many known parameters, are sensitive to symbol synchronization errors and model mismatch, and involve heavy computation, which severely limits their practical application.
According to the statistical features used for classification, statistical pattern recognition methods can be divided into a number of branches, mainly based on instantaneous information in the time domain or other transform domains, spectral correlation [8] (e.g., high order cumulants), constellations, chaos theory, fractal theory, and other properties. The traditional Digitally Modulated Signal Recognition Algorithm (DMRA) [3] adopts several parameters extracted from the instantaneous information of the received signal in the time and frequency domains. This method covers a large set of recognizable modulations and is suitable for real-time data analysis. However, since each threshold depends heavily on the SNR, DMRA cannot be applied effectively in practice, especially at low SNR. Algorithms derived from the wavelet transform [9][10][11] can extract the instantaneous phase of a signal accurately. Yet they are only suitable for pulse-shaped signals and deteriorate seriously for other kinds of signals. Based on spectral correlation, the high order cumulant [12,13] has good noise immunity. However, its computation increases exponentially once the modulation order is greater than or equal to eight, so it cannot achieve online real-time data analysis, which severely limits its practical application. In addition, methods based on constellations [14,15], fractal theory [16], and other features are all restricted to various extents in practice.
Inspired by data mining and image processing, several novel algorithms for noncooperative communication have been proposed in the last one or two years. Approaches based on clustering algorithms are a new trend in Automatic Modulation Classification (AMC) for digital modulations. An advanced method derived from the K-means algorithm was proposed by Weber et al. [17] for Quadrature Amplitude Modulation (QAM) and PSK signals. In that paper, a novel utility function that indicates the best fitting constellation diagram is defined for the AMC decision. Simulations and measurements in a real monitoring environment demonstrate its effectiveness. Xu et al. [18] proposed a new method for phase clustering. Originating from the mountain clustering algorithm [19], this technique obtains multiple peaks in a single calculation pass and avoids repeated peak cutting. On the other hand, it takes about seven times the computation time of the high order cumulant because of its repeated searching. Moreover, the authors' analysis for the case of frequency offset is flawed. Both algorithms are formulated in the following section as baselines for performance comparison.
In view of the above problems, this paper proposes an effective classifier. The paper is structured as follows. In the second part, one traditional and two recent methods for PSK signal classification are formulated, and their shortcomings are pointed out. In the third part, an improved classification method is proposed and elaborated, followed by a robust estimator for frequency offset, which is described in detail. Comparisons and simulations are then presented in the simulation part and demonstrate the feasibility and effectiveness of the proposed methods.

Recent Works
AMC is a significant step after signal detection in a radio monitoring environment and is critical to subsequent processing such as demodulation. The simplified block diagram of the receiver is depicted in Figure 1. After a series of preprocessing steps, the received signal is transformed into a baseband signal. Its modulation is then identified, which enables more effective subsequent signal processing, such as demodulation and decoding.
Since the requirement first arose in electronic monitoring and countermeasures, a great deal of research has been carried out by researchers at home and abroad. Their algorithms and methods have improved the performance of non-data-aided (NDA) AMC of intercepted signals. A traditional algorithm, the high order cumulant, and two novel methods, the advanced K-means algorithm and the phase clustering method, are introduced and elaborated in this section. They are adopted as the baselines for comparison.

High Order Cumulant.
Assume that the signal to be processed, $x(t)$, is a $k$th-order stationary random process with zero mean. According to the basic theory of stochastic processes, the $k$th-order cumulant and the $k$th-order moment of $x(t)$ depend only on the time delays and are independent of the time instant $t$.
For a complex stationary random signal $x(t)$, the high order moments can be written in the unified form (1), where the two indices give the powers of $x(t)$ and $x^*(t)$, respectively. Several commonly used high order cumulants can be expressed in terms of these moments, as in (2).
In theory, the high order cumulant can completely eliminate the effect of Gaussian noise and is an ideal tool for signal processing under Gaussian noise. However, this method is not suitable for online real-time signal processing because of its computational load: the higher the modulation order of the intercepted signal, the more computation it costs, and the cost increases exponentially.
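As an illustration of how such cumulants are computed from sample moments, the following is a minimal sketch. It assumes the convention $M_{pq} = E[x^{p-q}(x^*)^q]$ and the widely used fourth-order relations $C_{40} = M_{40} - 3M_{20}^2$ and $C_{42} = M_{42} - |M_{20}|^2 - 2M_{21}^2$; the normalization used in this paper may differ. The helper names and the noise-free, perfectly balanced test symbols are hypothetical.

```python
import numpy as np

def moment(x, p, q):
    """Sample mixed moment M_pq = E[x^(p-q) * conj(x)^q] (assumed convention)."""
    return np.mean(x ** (p - q) * np.conj(x) ** q)

def fourth_order_cumulants(x):
    """Return (C40, C42) of a zero-mean complex sequence x."""
    m20 = moment(x, 2, 0)
    m21 = moment(x, 2, 1)              # average power E[|x|^2]
    m40 = moment(x, 4, 0)
    m42 = moment(x, 4, 2)
    c40 = m40 - 3 * m20 ** 2
    c42 = m42 - abs(m20) ** 2 - 2 * m21 ** 2
    return c40, c42

def psk_symbols(order, n):
    """Noise-free, unit-power M-PSK symbols visiting every phase equally often."""
    return np.exp(2j * np.pi * (np.arange(n) % order) / order)

for order in (2, 4, 8):
    c40, c42 = fourth_order_cumulants(psk_symbols(order, 512))
    print(order, np.round(c40, 3), np.round(c42, 3))
```

Under these assumptions the magnitude of $C_{40}$ separates BPSK, QPSK, and 8PSK, while orders of eight and above require cumulants of higher order to be told apart, which is the source of the computational growth noted above.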

Advanced K-Means Method.
The K-means algorithm is a widely used method for hard clustering when the number of clusters $K$ in the input data $x$ is known. Equation (3) is the cost formula, where $n$ is the index and $N$ is the length of the input data $x$:

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} m_{n,k}\, \lVert x_n - w_k \rVert^2. \tag{3}$$

The variable $m_{n,k}$ is the membership indicator, which is equal to unity if the input data $x_n$ belongs to cluster $k$ and zero otherwise. It is calculated from the shortest distance of the input data $x_n$ to the prototypes $w_k$, as given in (4), where $k_n$ represents the index $k$ of the winning prototype $w_{k_n}$ for the input data $x_n$:

$$m_{n,k} = \begin{cases} 1, & k = k_n, \\ 0, & \text{otherwise}, \end{cases} \qquad k_n = \arg\min_{k} \lVert x_n - w_k \rVert^2. \tag{4}$$

The K-means algorithm iteratively solves the clustering problem stated in (4) by alternating between a competitive step and a learning step. In the competitive step, the input symbols $x_n$ are allocated to the prototypes $w_k$ in such a way that $J$ in (3) is minimised. In the learning step, the prototype positions are updated by calculating the mean value of the corresponding input symbols $x_n$.
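The alternation between the competitive and learning steps can be sketched as follows. This is a minimal illustration of plain K-means on complex baseband symbols, not Weber's advanced variant; the function name, the noise level, and the initial prototypes are hypothetical choices made for the example.

```python
import numpy as np

def kmeans_complex(x, prototypes, iterations=20):
    """Hard K-means on complex samples x, starting from the given prototypes."""
    w = np.array(prototypes, dtype=complex)
    for _ in range(iterations):
        # Competitive step: each sample selects its nearest prototype.
        winners = np.argmin(np.abs(x[:, None] - w[None, :]), axis=1)
        # Learning step: move each prototype to the mean of its samples.
        for k in range(len(w)):
            members = x[winners == k]
            if members.size:               # an empty cluster keeps its position
                w[k] = members.mean()
    winners = np.argmin(np.abs(x[:, None] - w[None, :]), axis=1)
    cost = np.sum(np.abs(x - w[winners]) ** 2)   # squared-distance cost
    return w, winners, cost

rng = np.random.default_rng(0)
symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, 400)))  # QPSK
noise = 0.1 * (rng.standard_normal(400) + 1j * rng.standard_normal(400))
# Hypothetical initial guess: the true constellation rotated by 0.3 rad.
start = np.exp(1j * (np.pi / 4 + 0.3 + np.pi / 2 * np.arange(4)))
w, winners, cost = kmeans_complex(symbols + noise, start)
```

Each iteration measures the distance from every sample to every prototype, which is one reason clustering-based classifiers can be slow for long observations.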
Weber proposed a novel utility function $U$, given in (5), that indicates which constellation diagram best fits the prototypes calculated by the clustering algorithm. In (5), $J_{K,S}$ is the result of the cost function (3) for a specific constellation pattern $C_{K,S}$ of the considered modulation pool. The variable $K$ represents the number of prototypes, that is, the modulation order of $C_{K,S}$, and $S$ represents the modulation scheme. $c_k$ are the specific symbol positions of the constellation pattern $C_{K,S}$, and $w_k$ are the calculated prototypes. The first term of the utility function, $U_1(C_{K,S})$, indicates the minimisation of the cluster variances, and the second factor, $U_2(C_{K,S})$, evaluates the position of the calculated prototypes $w_k$ relative to the given constellation pattern $C_{K,S}$ and the assignment of the input symbols $x_n$ to the prototypes. To conclude, the utility function $U = U_1^2(C_{K,S}) \cdot U_2(C_{K,S})$ evaluates whether all prototypes are covered by the input samples and whether the cluster variances and the error vector magnitude (EVM) can be minimised. For this reason, this algorithm is named the highest constellation pattern matching (HCPM) algorithm.
A real monitoring environment is employed for the field trial in Weber's paper. Eight signals, four QAM and four MPSK, are used to demonstrate the identification capability. The Two-Threshold Sequential Algorithmic Scheme (TTSAS) [20], a fuzzy algorithm [20], and the K-centre algorithm are adopted for performance comparison. Simulations and measurements show that the proposed method outperforms these three counterparts.
This idea gives a new direction for signal recognition. Better still, it can be extended to QAM signal detection and identification. However, its running time is long, so it is less suitable for real-time signal processing than some other methods. The simulation time of this method is listed in Table 1 in the Simulation section, which shows that it is considerably slower than its counterparts.

Phase Clustering Method.
Another method based on data mining is the phase clustering method proposed by Xu et al. [18]. Inspired by the subtractive clustering method, a novel clustering function is derived for signal classification. Because the method proposed in this paper is an improvement of this method, the specific process is not described here.
The carrier frequency offset is also considered in Xu's paper, under the premise that the received signal has been timing synchronized. When the carrier frequency offset $\Delta f$ satisfies $\Delta f\,T_s \le 0.15$, a sufficiently accurate complex sequence can be obtained from [21], and its phase sequence can be expressed as in (6), where $T_s$ is the symbol period and $n = 1, \ldots, N$ is the sample index, with $N$ the number of samples in each data packet. $\varphi_w(n)$ is the phase of the noise, and the phase of the received signal, $\varphi_r(n)$, is given in (7). In order to eliminate the influence of the frequency offset as far as possible, a new sequence is obtained by differencing $\varphi_r(n)$, as in (8) and (9).
The main idea of Xu's method for reducing the influence of frequency offset is to use the differenced phase of the baseband signal, $\varphi'_r(n)$, as the signal to be processed, and to obtain the modulation order in the same way as in the case without frequency offset. A significant premise stated in that paper is that $\varphi'_r(n)$ is uniformly distributed. However, the effective part (9) of the differenced phase is no longer uniformly distributed, which directly invalidates the phase clustering algorithm.
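The effect of differencing can be illustrated with a toy example. The sketch below assumes noise-free QPSK with rectangular pulses, several samples per symbol, and per-sample differencing; these are assumptions made for illustration and are not taken from Xu's paper. All variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
order, sps, n_sym = 4, 8, 64       # hypothetical QPSK, 8 samples per symbol
df_ts = 0.01                       # assumed offset per sample, in cycles

sym_phase = 2 * np.pi * rng.integers(0, order, n_sym) / order
theta_s = np.repeat(sym_phase, sps)            # rectangular pulses
n = np.arange(theta_s.size)
phase = np.mod(theta_s + 2 * np.pi * df_ts * n, 2 * np.pi)

# First difference, wrapped to [0, 2*pi): the linear ramp becomes a constant.
diff = np.mod(np.diff(phase), 2 * np.pi)
data_part = np.mod(diff - 2 * np.pi * df_ts, 2 * np.pi)

# Fraction of differences that carry no symbol change at all.
unchanged = np.isclose(data_part, 0, atol=1e-9) | np.isclose(data_part, 2 * np.pi, atol=1e-9)
stay = np.mean(unchanged)
```

Under these assumptions most differenced samples fall on a single phase value, so the data-bearing part of the differenced phase is far from uniformly spread over the constellation phases, in line with the objection raised above.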
Four common MPSK signals are used to evaluate classification performance in Xu's paper. The probability of correct classification is adopted as the performance measure.
In order to enhance the recognition efficiency, this paper proposes an improved method for phase clustering, which can effectively reduce the signal processing time without degrading the classification performance. In addition, in view of the carrier frequency offset, this paper also gives a feasible solution for its estimation and correction.

Proposed Method
In order to achieve better NDA classification performance, an improved phase clustering method is proposed in this paper. A robust estimation method is then proposed for frequency offset correction.

Advanced Phase Clustering Method.
The received waveform of a modulated PSK signal can generally be expressed as in (10), where $a_k \in \{\exp(-j(2\pi m/M)),\ m = 0, 1, \ldots, M-1\}$ is the symbol sequence, $M$ is the modulation order, $g(t)$ is the pulse of the shaping filter, $T_s$ is the sample period, $f_c$ is the carrier frequency, and $\theta_0$ is the carrier phase (Figure 3). $w(t)$ is white Gaussian noise with zero mean and variance $N_0$. The Root Raised Cosine (RRC) filter is generally adopted for signal shaping in wireless communication networks and is also considered in this paper. Under the premise of carrier and timing synchronization, the phase of the baseband signal at the output of the matched filter can be written as in (11), where $\varphi_s(n)$ denotes the phase sequence of the transmitted signal, $\varphi_w(n)$ denotes the phase sequence of the noise, and $N$ is the number of samples in each data packet. For a PSK signal, the phase $\varphi_s(n)$, given in (12), is commonly uniformly distributed over the constellation phases, which means that about $N/M$ samples gather closely around each constellation point when an $M$-order modulated signal is sampled at $N$ points.
In order to measure the radian distance between the phase of a sampled point and a reference phase, a distance function $d(x)$ with independent variable $x \in [0, 2\pi)$ is defined in (13). An advanced clustering function is then proposed for phase clustering in (14). It is expressed in fractional form, which remarkably reduces the computation compared with the exponential form, and $\theta \in [0, 2\pi)$ denotes the reference phase variable.
A partition of $\theta$ is set up as the set of reference phases for phase clustering. As shown in formula (14), the distance from every phase to be processed to each reference phase $\theta$ must be measured before clustering. The clustering process is therefore similar to a repeated search, and the computational load of phase clustering is determined entirely by the partition of $\theta$. A suitable partition can not only reduce the amount of calculation but also improve the rate of correct clustering. Uniform spacing from 0 to $2\pi$ is a common way of partitioning the reference phase.
For simplicity, an approximation is used to reduce the amount of calculation. When the distance $d(\varphi_r(n) - \theta)$ reaches the threshold given in (15), the corresponding term of the clustering function can be regarded as in (16). When a baseband PSK signal is to be processed, the advanced phase clustering (APC) function can be expressed as in (17). Assume that the symbol phases $\varphi_s(n)$ are independent and equally probable. Since $\varphi_w(n)$ is a stationary Gaussian phase noise, formula (17) can be transformed into (18). Note that a necessary condition for (18) is that the phase of the noise-free transmitted signal is uniformly distributed: in the ideal constellation, approximately $N/M$ samples are located at each constellation point when the received signal of modulation order $M$ contains $N$ samples. Otherwise, (18) no longer holds.
Owing to the form of the distance function, $d(x)$ has a period of $2\pi$. Moreover, the independent variable $\theta$ of the clustering function (14) enters only through the distance function, so the clustering function also has a period of $2\pi$.
The periodicity of formula (14) is now examined in detail. When $\theta + 2\pi/M < 2\pi$, the periodicity of the clustering function is derived in formula (19). It can be proved in the same way that $V_M(\theta + 2\pi/M - 2\pi) = V_M(\theta)$ if $\theta + 2\pi/M > 2\pi$. The clustering function $V_M(\theta)$ is therefore periodic, with period $T_V = 2\pi/M$ (20). Since a periodic signal has distinctive spectral properties in the frequency domain, the period of the clustering function can be extracted easily through the Fast Fourier Transform (FFT) of $V_M(\theta)$. The frequency at which the magnitude of the Fourier transform is maximal indicates $T_V$. The modulation order $M$ can then be calculated from (20), and the signal is thereby classified. More favorably, the carrier phase $\theta_0$ does not affect $T_V$, which means that the proposed method is also robust to rotation of the signal's constellation.
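The procedure of clustering against a grid of reference phases and reading the period from an FFT can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact expressions: the circular distance $\min(x, 2\pi - x)$, the fractional kernel $1/(1 + (d/\text{width})^2)$, the kernel width, and the grid size are all hypothetical choices, and the test phases are perfectly balanced over the constellation.

```python
import numpy as np

def circular_distance(x):
    """Radian distance on the circle, in [0, pi] (assumed form of d(x))."""
    x = np.mod(x, 2 * np.pi)
    return np.minimum(x, 2 * np.pi - x)

def clustering_function(phases, grid, width=0.1):
    """Fractional-form clustering function evaluated at every reference phase."""
    d = circular_distance(phases[None, :] - grid[:, None])
    return np.sum(1.0 / (1.0 + (d / width) ** 2), axis=1)

def estimate_order(phases, grid_size=256, width=0.1):
    """Estimate the PSK order from the dominant non-DC FFT bin."""
    grid = 2 * np.pi * np.arange(grid_size) / grid_size   # uniform reference phases
    v = clustering_function(phases, grid, width)
    spectrum = np.abs(np.fft.rfft(v - v.mean()))
    return int(np.argmax(spectrum[1:]) + 1)   # bin k corresponds to period 2*pi/k

rng = np.random.default_rng(0)
for order in (2, 4, 8):
    # Balanced symbol phases with an arbitrary constellation rotation of 0.3 rad.
    ideal = 2 * np.pi * (np.arange(512) % order) / order + 0.3
    noisy = ideal + 0.1 * rng.standard_normal(512)
    print(order, estimate_order(noisy))
```

Because the grid spans exactly one interval of length $2\pi$, the index of the strongest FFT bin equals $2\pi/T_V$ and can be read directly as the modulation order; a rotation of the constellation only changes the phase of that bin, not its position.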
If finer recognition is required among PSK signals with the same modulation order, many statistical parameters can be chosen. For example, there are two such pairs: 4PSK and OQPSK, and 8PSK and (π/4)DQPSK. The envelope entropy of the differential phase of the baseband signal can be introduced to distinguish the signals within each pair. The recognition performance at this stage is determined by the statistical parameters introduced and is independent of the APC method.
The APC method has several advantages:
(1) Since it is derived from the coding characteristics rather than statistical properties, the direct influence of SNR is reduced.
(2) As an optimization algorithm for multiple-peak searching, it obtains all the peaks in one calculation pass and avoids repeated peak cutting.
(3) A fraction is used instead of an exponential function in the clustering function, which simplifies the calculation and leads to a much lower computational load.

Frequency Offset Correction.
In a digital communication system, carrier frequency offset is often introduced by the difference between the receiver and transmitter oscillators and is also caused by Doppler frequency shift, channel nonlinearity, and phase noise.
In the wireless network, especially the electronic monitoring and other noncooperative communication systems, the accuracy of frequency offset estimation directly affects the performance of the receiver.
In Xu's method, described in the previous section, the phase clustering algorithm is applied directly to the differenced phase of the received signal. However, the differenced phase of the information-bearing part of the signal is no longer uniformly distributed. In this case, the impact of the frequency offset can only be removed by estimating and correcting the offset.
The frequency offset estimates of the Fitz and L&R algorithms are obtained directly from a weighted summation of the autocorrelation of the signal. As a result, the estimation range of these two algorithms, and of the improved methods based upon them, is heavily affected by the correlation interval, unless the correlation interval takes its maximum value, the number of samples $N$. In that case, however, the amount of calculation increases dramatically, especially when $N$ is large. For this reason, most of the improved methods are derived from Kay's algorithm. The M&M algorithm introduces the autocorrelation function of the signal into the approach of Kay's and L&W's algorithms in order to weaken the influence of phase noise. However, the addition of the autocorrelation phases introduces the phase folding problem. A phase of the baseband signal whose true value is near $-\pi$ or $\pi$ may be changed to a completely different value under the influence of noise, which leads to an error in the frequency offset estimate. The WNALP algorithm is derived from the M&M algorithm; it solves the phase folding problem and broadens the estimation range remarkably. However, signals in real noncooperative environments are usually intercepted at low SNR, which causes great difficulties in subsequent signal processing.
In order to reduce the threshold effect and improve the imbalance between estimation accuracy and estimation range of frequency offset under low SNR, an advanced NDA estimator based on the weighted summation of the differential phase of the autocorrelation is proposed in this paper.
Assume that timing synchronization has been accomplished. The baseband signal sequence with frequency offset, $x(n)$, is expressed as in (21), where $a_n$ ($n = 1, 2, \ldots, N$) is the modulated symbol sequence from the transmitting end, $N$ is the number of sampling points of the selected signal segment at the receiving end, $f_\Delta$ is the unknown frequency offset to be estimated, $T_s$ is the sample period, and $\theta$ is a random initial carrier phase, uniformly distributed in the range $[0, 2\pi)$.
Usually, the channel noise of the communication system, $w(n)$, is considered to be random complex additive Gaussian noise with zero mean and two-sided spectral density $N_0/2$. The amplitude of $x(n)$ is normalized as in (22). Then, under the hypothesis of a sufficiently large SNR, the normalized signal $\bar{x}(n)$ can be approximated as in (23), where $\beta(n)$ is also a Gaussian process with zero mean. The autocorrelation $R_0(m)$ is defined in (24), where $z(n) = \bar{x}^M(n)$ is the normalized baseband signal raised to the power of $M$ and $L$ is the preset maximum correlation interval. Using the same principle as above, the autocorrelation function can be further transformed as in (25), where $\varepsilon(m)$ is also a Gaussian process with zero mean, which leads to (26). We define $\Delta\varphi_0(m) \triangleq \angle\{R_0(m) R_0^*(m-1)\}$. According to the principle of Kay's algorithm, an objective function $J_0$ can be set as in (27), where $\Delta\boldsymbol{\varphi}_0 = [\Delta\varphi_0(1), \Delta\varphi_0(2), \ldots, \Delta\varphi_0(L)]^T$. The estimate of the frequency offset, $\hat{f}_\Delta$, is obtained when the objective function $J_0$ reaches its minimum. The proposed normalized weighted correlation linear estimator can therefore be expressed as in (28), where $w_0$ is the weight of the differential phase, given in (29).
The normalized baseband signal is taken as the signal to be processed, which effectively reduces the performance loss caused by the nonlinear operation of raising to the power of $M$. The weighted summation of the differential phase of the autocorrelation also reduces the influence of noise compared with taking the argument after weighting the conjugate difference of the autocorrelation. It thus provides better estimation accuracy, as described in Kay's paper [5]. Differencing the autocorrelation is a great improvement: compared with the Fitz and L&R algorithms and the improved methods based upon them, it makes the estimation range independent of the maximum correlation interval $L$ and solves the phase folding problem. Meanwhile, the proposed method has the same estimation range as its counterpart, the WNALP algorithm. Importantly, the large estimation variance of the WNALP algorithm at low SNR is reduced, which effectively balances the trade-off between estimation accuracy and estimation range under low SNR.
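The chain of operations described above (amplitude normalization, raising to the $M$th power, autocorrelation, differential phase, weighted summation) can be sketched as follows. This is a minimal illustration, not the estimator of this paper: the parabolic weights are a hypothetical stand-in for the weights of the differential phase, the sample period is normalized to one, and the function and variable names are invented for the example.

```python
import numpy as np

def estimate_offset(x, order, max_lag):
    """Blind estimate of the normalized frequency offset (cycles per sample)."""
    z = (x / np.abs(x)) ** order          # normalize amplitude, then strip PSK data
    n = z.size
    # Autocorrelation R(m) = mean over n of z(n) * conj(z(n - m)), m = 0..max_lag.
    r = np.array([np.mean(z[m:] * np.conj(z[:n - m])) for m in range(max_lag + 1)])
    dphi = np.angle(r[1:] * np.conj(r[:-1]))     # differential phase of R(m)
    m = np.arange(1, max_lag + 1)
    w = m * (max_lag + 1 - m)                    # assumed parabolic weights
    w = w / w.sum()
    return np.sum(w * dphi) / (2 * np.pi * order)

rng = np.random.default_rng(0)
n = 256
symbols = np.exp(2j * np.pi * rng.integers(0, 4, n) / 4)   # QPSK, one sample per symbol
offset = 0.01
rx = symbols * np.exp(1j * (2 * np.pi * offset * np.arange(n) + 0.7))
rx = rx + 0.05 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

f_hat = estimate_offset(rx, order=4, max_lag=n // 2)
corrected = rx * np.exp(-2j * np.pi * f_hat * np.arange(n))  # offset correction
```

Because each differential phase lies in $(-\pi, \pi]$ regardless of the lag, the unambiguous range of this sketch is $|f_\Delta| < 1/(2M)$ cycles per sample and does not shrink as the maximum lag grows.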

Simulation
Computer simulations are performed to test the performance of the methods proposed in this paper. Considering the background of electronic monitoring and countermeasures in satellite and wireless communication networks, the simulation set contains seven common PSK modulation types: BPSK, QPSK, 8PSK, 16PSK, OQPSK, (π/4)DQPSK, and UQPSK. Each simulation result is the average of 1000 independent runs. Because of the noncooperative environment assumed, no a priori knowledge of the intercepted signal is used in any experiment. Regular communication equipment and environments are assumed for these simulations. The signals are shaped by a root raised cosine filter with roll-off factor $\alpha = 0.22$. The intercepted signal is sampled at $N = 512$ points with sampling frequency $f_s = 20$ MHz. We suppose that the symbol rate $R_s$ has been estimated accurately by a suitable algorithm and that the number of samples per symbol period is $N_s = f_s / R_s = 32$. Additive white Gaussian noise is considered. Channel effects such as fading and multipath propagation are ignored, and perfect time and frequency synchronization is assumed. The SNR in this paper is defined as $E_s/N_0$, where $E_s$ is the energy per symbol and $N_0$ is the power spectral density of the Gaussian noise.
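The quantities implied by this setup, and one way of scaling the noise to a target $E_s/N_0$, can be sketched as follows. The sketch assumes that the number of samples per symbol equals the sampling frequency divided by the symbol rate and that the noise variance per complex sample equals the discrete-time symbol energy divided by $E_s/N_0$; the function name and the placeholder waveform are hypothetical.

```python
import numpy as np

# Values quoted in the text.
fs = 20e6             # sampling frequency in Hz
n_samples = 512       # samples per intercepted packet
sps = 32              # samples per symbol period

# Derived quantities (assuming sps = fs / symbol_rate).
symbol_rate = fs / sps                   # symbols per second
symbols_per_packet = n_samples // sps    # symbols observed per packet

def add_awgn(signal, es_n0_db, sps, rng):
    """Add complex white Gaussian noise so that Es/N0 equals es_n0_db."""
    es = np.mean(np.abs(signal) ** 2) * sps     # discrete-time energy per symbol
    n0 = es / 10 ** (es_n0_db / 10)             # noise variance per complex sample
    noise = rng.standard_normal(signal.size) + 1j * rng.standard_normal(signal.size)
    return signal + np.sqrt(n0 / 2) * noise

rng = np.random.default_rng(0)
clean = np.ones(n_samples, dtype=complex)       # placeholder unit-power waveform
noisy = add_awgn(clean, es_n0_db=10.0, sps=sps, rng=rng)
```

With 32 samples per symbol, a given $E_s/N_0$ corresponds to a per-sample SNR about 15 dB lower, which the matched filter recovers before the phase is extracted.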
A common method based on the fourth-order cumulant (HOC) and two recent methods derived from data mining, advanced K-means clustering (AKMC) and phase clustering (PC), are adopted for comparison of classification performance. The classification capability and the simulation time of a single run of each subroutine are shown in Figure 2 and Table 1, respectively.
The same set of signal data is fed to the four subroutines for comparison of classification performance and simulation time. Except for the fourth-order cumulant, the methods show clear trends in correct classification rate and a large SNR tolerance for all the signals involved. The 16PSK signal is correctly classified from approximately 2 dB, and the other six signals are correctly classified down to SNRs below −5 dB. Even so, the methods are clearly distinguished by their simulation times. Table 1 shows that the PC and AKMC methods both take at least twice the simulation time of the APC method proposed in this paper. If the modulation order of the signal to be processed is 16, a cumulant order of 16 must be chosen in the HOC algorithm, and the computation of this algorithm increases exponentially with the order. Its simulation time can therefore be expected to be greater than or equal to that of the APC method. Because the APC algorithm is unstable at low SNR, a few wrong decisions on the modulation order appear, which is why the estimated order of the 16PSK signal is 17 between 1 and 2 dB.
In summary, the APC method proposed in this paper has better classification capability than its counterparts and offers new guidance for practical signal processing. The order classification result and the correct classification rate of the APC method are displayed separately in Figures 4-5.
Frequency offset directly affects the performance of signal classification. In order to verify the effectiveness of the proposed NWALP method, the WNALP algorithm, from which the proposed method is derived, was selected for comparison.
Assume that the signal to be processed is a QPSK signal with additive Gaussian noise. The number of sample points is set to $N = 256$, and the sampling frequency is normalized to unity, $f_s = 1$. The correlation interval of the autocorrelation is set to $L = N/2$. Each simulation of estimation accuracy and estimation range for the frequency offset was run at least 100 times. The estimation variance is adopted as the measure of estimation accuracy. The MCRLB [22] is also calculated as an absolute measure of the theoretically optimal estimate.
The performance of the proposed method compared with the WNALP algorithm is shown in Figure 5 for a normalized frequency offset $f_\Delta = 0.001$, as the SNR varies from −15 to 20 dB in 1 dB steps. Even with such a small frequency offset, the WNALP algorithm still shows a poor estimation performance of about $10^{-2}$ at low SNR. This means that when the frequency offset is small, the error of the algorithm may be of the same order of magnitude as the frequency offset itself, and such a large estimation error leads to a complete failure of the algorithm. In contrast, the estimation accuracy of the proposed method remains steady in the vicinity of $10^{-4}$ at low SNR, an improvement of at least two orders of magnitude over the original WNALP algorithm. Moreover, the estimation error of the proposed method rapidly decreases to a magnitude near the MCRLB. As the SNR increases further, it remains steady at approximately $10^{-7}$ because of the SNR threshold effect.
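For reference, a bound of this kind can be tabulated over the SNR grid of the experiment. The sketch below assumes the commonly cited closed form of the modified Cramér-Rao bound for frequency estimation, $3 / (2\pi^2 T_s^2 N^3 E_s/N_0)$, which may differ in normalization from the expression used in [22]; the function name is hypothetical.

```python
import numpy as np

def mcrb_frequency(n_samples, es_n0_db, ts=1.0):
    """Assumed modified Cramer-Rao bound on the variance of a frequency estimate."""
    es_n0 = 10 ** (np.asarray(es_n0_db, dtype=float) / 10)
    return 3.0 / (2 * np.pi ** 2 * ts ** 2 * n_samples ** 3 * es_n0)

snr_db = np.arange(-15, 21)            # -15 ... 20 dB in 1 dB steps
bound = mcrb_frequency(256, snr_db)    # N = 256, sampling period normalized to 1
```

Under this form the bound falls by a factor of ten for every 10 dB of SNR and by a factor of eight when the observation length doubles, which gives a scale against which estimator variances can be read.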
The estimation range of the carrier frequency offset is usually observed at high SNR in order to obtain wider and more accurate bounds. The estimation ranges of the chosen algorithms at SNR = 15 dB are shown in Figure 6. The proposed NWALP method achieves the same frequency offset estimation range as WNALP, which has been shown to be a better choice for a large estimation range than other algorithms. This estimation range cannot be increased even if the SNR increases. The proposed method can therefore achieve a large frequency offset estimation range.
Figure 7 shows the constellation before and after frequency offset correction in two distinct colors. A signal with frequency offset has a constellation that appears as a blue ring, in which no constellation point can be identified. After frequency offset correction, the approximate original constellation is recovered, shown in red. The proposed method is thus remarkably effective at 0 dB.

Conclusion
This paper presents a robust classifier for NDA recognition in noncooperative wireless environments. Unsupervised clustering of the signal phase is achieved by measuring the radian distance between each signal phase and the reference phase. The proposed method optimizes the clustering function and sharply reduces the computation. Moreover, frequency offset is considered, and an advanced method is proposed for frequency offset estimation and correction. First, the baseband signal is normalized. After the nonlinear operation of raising the signal to the power of $M$, the estimate of the frequency offset is obtained via the weighted summation of the differential phase of the signal's autocorrelation. This method balances estimation accuracy and range at low SNR: it sharply improves the estimation accuracy without shrinking the maximum estimation range, even when the SNR is as low as −15 dB. Seven common PSK signals are used in the simulation experiments. The classification performance and the estimation and correction of frequency offset are demonstrated with several simulation figures, which illustrate the feasibility and practicality of the methods. In general, this classifier, which is derived from data mining and image processing, provides guidance for signal processing in electronic surveillance and electronic countermeasures for communication networks. Further work will address practical problems such as the initial optimization of the clustering centers $\theta$, multipath and Rayleigh channels, and other issues.

Figure 2: Comparison of correct order recognition rate.

Figure 5: Comparison of estimation variance of frequency offset.

Figure 7: Comparison of frequency offset correction.

Table 1: Simulation time for order classification.