Numerical Analysis of Modeling Based on Improved Elman Neural Network

A modeling method based on the improved Elman neural network (IENN) is proposed to analyze nonlinear circuits with memory effects. In this model, the hidden layer neurons are activated by a group of Chebyshev orthogonal basis functions instead of sigmoid functions. The curves of the sum of squared error (SSE) versus the number of hidden neurons and the iteration step are studied to determine the number of hidden layer neurons. Simulation results for a half-bridge class-D power amplifier (CDPA) with two-tone and broadband input signals show that the proposed behavioral model can reconstruct the CDPA system accurately and depict its memory effect well. Compared with the Volterra-Laguerre (VL) model, the Chebyshev neural network (CNN) model, and the basic Elman neural network (BENN) model, the proposed model performs better.


Introduction
Networks, communications, and television systems have entered the digital age. Class-D power amplifiers (CDPAs) [1] have become increasingly popular for audio applications because of their high power efficiency. Since the output transistors of CDPAs operate in the ohmic and cut-off regions, the system is nonlinear. One of the nonlinear phenomena is intermodulation distortion (IMD) [2]. CDPAs also exhibit memory effects: because of dynamically distributed parameters, the node voltages and currents depend not only on the current input but also on past signals [3]. The existence of memory effects [4] is often identified by imbalances between corresponding upper and lower distortion products, such as IMD products of the same order.
Behavioral modeling [5,6] of nonlinear circuits and systems has received much attention in recent years. In behavioral modeling, the nonlinear component is generally considered as a "black box," which is completely characterized by its external responses, that is, in terms of input and output signals, through relatively simple mathematical expressions. Behavioral modeling techniques provide a convenient and efficient means to predict system-level performance without the computational complexity of full circuit simulation or physical-level analysis of nonlinear systems, thereby significantly speeding up the analysis process. Existing PA behavioral models are mainly based on the Volterra series or its expanded and simplified forms [7,8]. However, the large number of coefficients in the Volterra series complicates its practical implementation, which limits the standard Volterra series to "weakly" nonlinear PAs.
Because neural networks offer effective solutions for nonlinear function approximation, system identification, and exclusive-or and encoder problems, behavioral models of PAs based on neural networks have been developed in recent years [9][10][11][12]. To better study the nonlinear characteristics of CDPAs, a new behavioral model based on the improved Elman neural network (IENN) is proposed in this paper. In IENN, a self-connection is added to the context nodes, which makes the neurons more sensitive to the history of the input data. Chebyshev orthogonal polynomials [13,14], instead of sigmoid functions, are employed as the activation functions of the hidden layer neurons to improve the accuracy and convergence rate of IENN and to simplify its structure. The gradient descent (GD) algorithm is used to train the neural network model. Simulation results with a two-tone signal, a linear frequency modulated (LFM) signal, and a binary phase shift keying (2PSK) signal as inputs show that the proposed IENN model depicts the nonlinear distortions of PAs well.
The remainder of this paper is organized as follows. The basic Elman neural network (BENN) is introduced in Section 2. In Section 3, the new behavioral model based on IENN and its training algorithm are presented in detail. Simulation results using two-tone and broadband signals as input are given in Section 4. Conclusions are drawn in Section 5.

The Basic Elman Neural Network
The architecture of BENN [15,16], illustrated in Figure 1, is generally divided into four layers: input layer, hidden layer, context layer, and output layer. The feedforward path consists of the input layer, hidden layer, and output layer, in which the weights connecting two neighboring layers are adjustable. There is a feedback loop between the context layer and the hidden layer, which makes the network sensitive to the history of the input data. In BENN, the context neurons can be treated as memory units, so the model can, in theory, represent the memory effect of a nonlinear system. Furthermore, because the dynamic characteristics of BENN are provided only by internal connections, there is no need to use the state as an input or training signal, which gives BENN an advantage over static feedforward networks.

The Behavioral Model Based on IENN
3.1. The Architecture of Improved Elman Neural Network. The architecture of IENN, presented in Figure 2, is similar to that of BENN, with some changes made to improve the learning speed and output accuracy. To better capture the memory effect of the nonlinear system, a self-feedback with a feedback gain coefficient is added to the context layer neurons. This increases the memory depth and makes the model's output more sensitive to past inputs. The values of the Chebyshev orthogonal basis functions can easily be calculated by recursion, which is simpler than evaluating the sigmoid function. Chebyshev orthogonal basis functions have also been used as activation functions in many neural networks [14,17,18] for different applications and have proved to be fast and accurate. We therefore use the Chebyshev orthogonal basis of the first kind as the activation function in the hidden layer, instead of the sigmoid function used in BENN. The research results indicate that the IENN model can reduce the computational complexity, reduce the training time, and improve the convergence precision.
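The recursive evaluation mentioned above can be illustrated with a short sketch. It uses the standard three-term recurrence of the Chebyshev polynomials of the first kind, $T_0(x) = 1$, $T_1(x) = x$, $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$. The function name `chebyshev_basis` is a hypothetical helper introduced only for illustration and is not taken from the paper.

```python
import math


def chebyshev_basis(x, count):
    """Return [T_0(x), ..., T_{count-1}(x)] for x in [-1, 1].

    Uses the three-term recurrence T_{n+1}(x) = 2*x*T_n(x) - T_{n-1}(x).
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    values = []
    prev, curr = 1.0, x  # T_0(x) and T_1(x)
    for _ in range(count):
        values.append(prev)
        prev, curr = curr, 2.0 * x * curr - prev
    return values


# Cross-check against the closed form T_n(x) = cos(n * arccos(x)).
x = 0.3
basis = chebyshev_basis(x, 6)
closed_form = [math.cos(n * math.acos(x)) for n in range(6)]
```

Each additional basis function costs one multiplication and one subtraction, which is the sense in which the recursion is cheaper than evaluating an exponential for every sigmoid neuron.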
In IENN, the input layer has $M$ nodes, the hidden layer and the context layer each have $N$ nodes, and the output layer has $L$ nodes. The basic functions in each layer are as follows.

[Figure 2: The architecture of IENN, with input layer, context layer, hidden layer (activation function), and output layer.]

Input Layer. The input and the output of the $m$th input layer node are denoted by $u_m(k)$ and $x_m(k)$, respectively, where $k$ represents the $k$th iteration step.

Hidden Layer. The input of the $j$th hidden layer neuron is
$$v_j(k) = \sum_{i} w^{1}_{ij}(k)\, x^{c}_{i}(k) + \sum_{m} w^{2}_{mj}(k)\, x_m(k),$$
where $x^{c}_{i}(k)$ is the output of the $i$th context layer neuron, $x_m(k)$ is the output of the $m$th input layer neuron, $w^{1}_{ij}(k)$ represents the weight from the $i$th context layer neuron to the $j$th hidden layer neuron, and $w^{2}_{mj}(k)$ represents the weight from the $m$th input layer neuron to the $j$th hidden layer neuron.
Since the Chebyshev orthogonal basis functions are defined on the interval [−1, 1], the input of the hidden layer needs to be normalized to this interval; the normalized input is denoted by $\bar{v}_j(k)$. The output of the $j$th hidden layer neuron is
$$h_j(k) = f_j\big(\bar{v}_j(k)\big),$$
where the function $f_j(\cdot)$ indicates the Chebyshev orthogonal basis functions of the first kind given in Figure 2.

Context Layer.
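In the context layer, the output is represented as
$$x^{c}_{i}(k) = \alpha\, x^{c}_{i}(k-1) + h_i(k-1),$$
where $h_i(k-1)$ is the output of the $i$th hidden layer neuron at the previous step and $0 \le \alpha \le 1$ is the self-connection feedback gain of the context layer. When $\alpha = 0$, the network reduces to the BENN.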
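The effect of the self-connection can be seen in a small sketch. It assumes the common modified-Elman form in which each context output is the feedback gain times its own previous value plus the previous hidden output; the function names and the numeric hidden outputs are made up for illustration.

```python
def update_context(prev_context, prev_hidden, alpha):
    """One context-layer step: alpha * x_c(k-1) + h(k-1), element-wise."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * c + h for c, h in zip(prev_context, prev_hidden)]


# Made-up hidden-layer outputs for three consecutive steps.
hidden_history = [[1.0, -0.5], [0.2, 0.4], [0.0, 0.1]]


def run(alpha):
    """Feed the hidden history through the context layer from a zero state."""
    context = [0.0, 0.0]
    for hidden in hidden_history:
        context = update_context(context, hidden, alpha)
    return context


benn_like = run(0.0)  # only the most recent hidden output remains
ienn_like = run(0.5)  # older hidden outputs decay geometrically
```

With a gain of zero the context holds only the last hidden output, which matches the basic Elman behavior; a nonzero gain keeps a geometrically weighted trace of earlier outputs and so lengthens the memory.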

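Output Layer.
The output $\tilde{y}_l(k)$ of IENN can be expressed as
$$\tilde{y}_l(k) = \sum_{j} w^{3}_{jl}(k)\, h_j(k),$$
where $h_j(k)$ is the output of the $j$th hidden layer neuron and $w^{3}_{jl}(k)$ denotes the weight from the $j$th hidden layer neuron to the $l$th output layer neuron.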
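To make the layer-by-layer description concrete, the following is a minimal sketch of one forward step through such a network. Several details are assumptions rather than the paper's definitions: the normalization (rescaling only when a hidden input exceeds magnitude 1), the choice that hidden neuron j uses the order-j Chebyshev polynomial, the linear output layer, and all function and variable names.

```python
def chebyshev(order, x):
    """T_order(x) via the recurrence T_{n+1} = 2*x*T_n - T_{n-1}."""
    prev, curr = 1.0, x
    for _ in range(order):
        prev, curr = curr, 2.0 * x * curr - prev
    return prev


def ienn_step(inputs, context, w1, w2, w3, alpha):
    """One forward step; returns (outputs, hidden, next_context).

    w1[i][j]: context i -> hidden j
    w2[m][j]: input m   -> hidden j
    w3[j][l]: hidden j  -> output l
    """
    n_hidden = len(context)
    v = [sum(w1[i][j] * context[i] for i in range(n_hidden))
         + sum(w2[m][j] * inputs[m] for m in range(len(inputs)))
         for j in range(n_hidden)]
    # Assumed normalization: rescale only when some |v_j| exceeds 1.
    scale = max(1.0, max(abs(value) for value in v))
    v_bar = [value / scale for value in v]
    # Assumption: hidden neuron j is activated by the order-j polynomial.
    hidden = [chebyshev(j, v_bar[j]) for j in range(n_hidden)]
    n_out = len(w3[0])
    outputs = [sum(w3[j][l] * hidden[j] for j in range(n_hidden))
               for l in range(n_out)]
    # Context update with self-feedback gain alpha.
    next_context = [alpha * c + h for c, h in zip(context, hidden)]
    return outputs, hidden, next_context


# Tiny example: 1 input, 2 hidden/context neurons, 1 output (made-up weights).
w1 = [[0.1, 0.2], [0.0, -0.1]]
w2 = [[0.5, 0.3]]
w3 = [[0.4], [0.6]]
outputs, hidden, next_context = ienn_step([0.5], [0.0, 0.0], w1, w2, w3, 0.5)
```

Calling `ienn_step` repeatedly while passing `next_context` back in as `context` carries the memory of earlier inputs forward from step to step.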

Training Algorithm.
Training of neural networks has developed rapidly in recent years [19][20][21]. The gradient descent (GD) algorithm, a basic approach for training neural networks in many areas, searches the parameter space of the network along the direction of steepest descent to minimize the error between the network output and the desired output [22]. In IENN, the gradient descent algorithm is used to update the weights. Assume that the actual system output vector is $y(k) = [y_1(k), y_2(k), \ldots, y_L(k)]^T$ ($T$ denotes the transpose) and that the IENN model output vector at the $k$th iteration is $\tilde{y}(k) = [\tilde{y}_1(k), \tilde{y}_2(k), \ldots, \tilde{y}_L(k)]^T$. The error function, namely, the sum of squared error (SSE), is defined on the difference between $y(k)$ and $\tilde{y}(k)$. Taking the partial derivatives of the error function with respect to the weight parameters gives the increments of the weights $w^1$, $w^2$, and $w^3$, in which $f_j'(\cdot)$ is the first derivative of the hidden layer activation function with respect to the normalized input of the hidden layer neurons, and $\eta_1$, $\eta_2$, and $\eta_3$ represent the learning rates of $w^1$, $w^2$, and $w^3$, respectively. To analyze the error between the system output and the IENN output, the transient absolute error vector is defined as $e(k) = |y(k) - \tilde{y}(k)|$, and its mean value is taken as the mean error.
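A single gradient descent update can be sketched as follows. To keep the example short, it updates only the hidden-to-output weights, for which the gradient is exact when the output layer is linear. The factor of 1/2 in the error function, the linear output layer, and all names and numbers are assumptions of this sketch, not the paper's definitions.

```python
def sse(target, output):
    """Half the sum of squared errors (the 1/2 factor is an assumption)."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))


def forward(hidden, w3):
    """Linear output layer: output_l = sum_j w3[j][l] * hidden[j]."""
    return [sum(w3[j][l] * hidden[j] for j in range(len(hidden)))
            for l in range(len(w3[0]))]


def update_output_weights(hidden, target, w3, eta3):
    """Steepest-descent step: w3[j][l] += eta3 * (y_l - y~_l) * h_j."""
    output = forward(hidden, w3)
    return [[w3[j][l] + eta3 * (target[l] - output[l]) * hidden[j]
             for l in range(len(target))]
            for j in range(len(hidden))]


# Made-up hidden outputs, target, and initial weights.
hidden = [1.0, 0.15, -0.955]
target = [0.8]
w3 = [[0.4], [0.6], [0.1]]

before = sse(target, forward(hidden, w3))
w3_new = update_output_weights(hidden, target, w3, eta3=0.1)
after = sse(target, forward(hidden, w3_new))
```

For a sufficiently small learning rate the error after the step is lower than before it; the updates for the other two weight sets follow the same pattern but additionally involve the derivative of the hidden activation.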

Training Steps of IENN.
By using the GD method, the training steps to determine the optimal number of neurons in the hidden layer are as follows. The initial values of $\alpha$, $\eta_1$, $\eta_2$, and $\eta_3$ are obtained by repeated testing.
Step 1. Prepare the training input and output data. Set the initial number of hidden layer neurons $N = 4$, the maximum number of hidden layer neurons $N_{\max} = 100$, and the maximum iteration step $k_{\max} = 100$. The threshold value of SSE is $\mathrm{SSE}_{\min}$.
Step 5. Increase the number of neurons in the hidden layer; if $N > N_{\max}$, end the training process; otherwise, jump to Step 2.
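The outer loop implied by these steps can be sketched as below. Because the intermediate steps are not available here, the loop body is an assumption: train a network of the current size for at most the maximum number of iterations, accept the size if the SSE reaches the threshold, and otherwise add one neuron. The callback `train`, the default threshold, and the increment of one are all hypothetical.

```python
def select_hidden_neurons(train, n_start=4, n_max=100, k_max=100, sse_min=1e-3):
    """Return (n_hidden, sse) for the first network size that meets sse_min.

    `train(n_hidden, k_max)` is a hypothetical callback that trains a model
    with n_hidden neurons for at most k_max iterations and returns its SSE.
    """
    n_hidden = n_start
    while n_hidden <= n_max:
        sse = train(n_hidden, k_max)
        if sse <= sse_min:
            return n_hidden, sse
        n_hidden += 1  # Step 5: increase the number of hidden neurons
    return None, None  # no size up to n_max reached the threshold


def toy_train(n_hidden, k_max):
    """Stand-in for a real training run: a made-up SSE that shrinks with size."""
    return 1.0 / (n_hidden ** 2 * k_max)


chosen, final_sse = select_hidden_neurons(toy_train, sse_min=2e-5)
```

In practice `toy_train` would be replaced by an actual IENN or BENN training routine, and the returned size would then be weighed against training cost, as is done in the simulation section.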

Simulation Results and Analysis
To verify the correctness and reliability of the IENN model, the training sample sequences are obtained from the half-bridge CDPA shown in Figure 3. As shown in Figure 3, the PWM signal is produced by comparing the two-tone signal with the triangular signal. The frequency and amplitude of the triangular signal are 400 kHz and 9.6 V, respectively. The output signal of the CDPA is denoted by y. A group of training data is extracted at a sampling frequency of 1 MHz. The testing data have the same length and sampling frequency as the training data; they differ only in the starting time. In the simulation results, the starting time of the testing data is taken as 0 ms.
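The comparison that produces the PWM signal can be sketched in a few lines. The sketch covers only the comparator stage, not the half-bridge or output filter of the CDPA. The two-tone parameters (4.36 kHz and 30 kHz, 4 V each) are the ones used later for the two-tone experiments; the 40 MHz simulation grid, the sine phases, and the function names are assumptions made for illustration.

```python
import math


def triangle(t, freq, amplitude):
    """Symmetric triangular wave in [-amplitude, amplitude]."""
    phase = (t * freq) % 1.0
    return amplitude * (4.0 * abs(phase - 0.5) - 1.0)


def two_tone(t, f1, f2, a1, a2):
    """Sum of two sine tones."""
    return (a1 * math.sin(2.0 * math.pi * f1 * t)
            + a2 * math.sin(2.0 * math.pi * f2 * t))


def pwm_sample(t):
    """+1 when the two-tone signal exceeds the carrier, otherwise -1."""
    signal = two_tone(t, 4.36e3, 30e3, 4.0, 4.0)
    carrier = triangle(t, 400e3, 9.6)
    return 1 if signal > carrier else -1


# Assumed fine grid (40 MHz) so that each 400 kHz carrier period is resolved.
step = 1.0 / 40e6
pwm = [pwm_sample(n * step) for n in range(20000)]  # 0.5 ms of signal
duty = sum(1 for s in pwm if s > 0) / len(pwm)
```

Because the peak of the two-tone signal (8 V) stays below the 9.6 V carrier amplitude, the comparator never saturates and the average duty cycle stays close to one half.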

Optimal Neurons Number in Hidden Layer.
To determine the optimal number of hidden layer neurons of the two models, the relationship between SSE($k$) and the number of hidden layer neurons is studied using two-tone training data, with frequencies $f_1 = 4.36$ kHz and $f_2 = 30$ kHz, amplitudes $AM_1 = AM_2 = 4$ V, and a signal length of 0.5 ms. In consideration of the convergence rate and the computational cost, $N = 25$ is chosen as the number of hidden layer neurons in the following discussion.

Simulation Analysis of Four Models with Two-Tone Signal Input.
The Volterra-Laguerre (VL) model [7] and the Chebyshev neural network (CNN) model [17,23,24] are introduced for comparison with the BENN and IENN models. The VL model proposed in [7] has two parameters: the number of Laguerre orthogonal functions and the pole of the Laguerre functions, whose magnitude is less than 1. With three Laguerre functions, this model cannot reconstruct the output well. Here, five Laguerre functions and a pole of 0.97 are chosen, so that 605 parameters need to be estimated. The CNN model in [23] employs a group of Chebyshev orthogonal polynomials to activate the hidden layer neurons, and its iterative training formula is obtained from the GD method. For the three neural network models, the number of hidden layer neurons is set to 25 and the maximum iteration step to 50. Using the two-tone signal as input, the time domain simulation results of the four behavioral models are shown in Figure 5.
In Figure 5, the time domain error is the transient error $y - \tilde{y}$. The mean error and the maximum transient error $e_{\max}$ of the four models with two-tone signal input are listed in Table 1. The two-tone signal is often used to study the memory effect of a nonlinear system [4,25], since the IMD of the signal is easy to measure. With a two-tone signal as training data, the frequency domain simulation results of the four models are given in Figure 6.
In Figure 6(d), $f_1 = 4.36$ kHz and $f_2 = 30$ kHz are the frequencies of the input two-tone signal. $f_3 = f_2 - f_1$ and $f_4 = f_2 + f_1$ are the second-order IMD (IMD2) products. $f_5 = f_2 - 2f_1$ and $f_6 = f_2 + 2f_1$ are the third-order IMD (IMD3) products. The existence of IMD means that the system is nonlinear, and the asymmetry of the IMD demonstrates the memory effect of the system. The circuit output spectrum and the spectrum error are listed in Table 2.
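As a worked check of these definitions, the intermodulation frequencies follow directly from the two input tones. The snippet below only carries out the arithmetic; the dictionary labels are illustrative.

```python
f1_khz, f2_khz = 4.36, 30.0  # two-tone input frequencies from the text

imd_khz = {
    "IMD2 lower (f2 - f1)": f2_khz - f1_khz,
    "IMD2 upper (f2 + f1)": f2_khz + f1_khz,
    "IMD3 lower (f2 - 2*f1)": f2_khz - 2 * f1_khz,
    "IMD3 upper (f2 + 2*f1)": f2_khz + 2 * f1_khz,
}

for name, value in imd_khz.items():
    print(f"{name}: {value:.2f} kHz")
```

The IMD2 products therefore fall at 25.64 kHz and 34.36 kHz, and the IMD3 products at 21.28 kHz and 38.72 kHz, placed symmetrically around the 30 kHz tone. Comparing the levels of each lower and upper pair is what reveals the asymmetry attributed to the memory effect.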
As shown in Figure 6 and Table 2, the spectrum errors of the VL model at IMD2 and IMD3 are relatively large, the asymmetry between the upper and lower sidebands is weakened, and some of the memory effect characteristics are lost. The reason is the short memory length of the VL model. However, the number of parameters in this model is already large, and it grows rapidly as the memory length increases. The spectrum of CNN in Figure 6(b) shows that it has lost almost all of the IMD information. Since the CNN model is a feedforward neural network, its output depends only on the input at the present moment; it cannot express the influence of previous inputs on the output, so the CNN model cannot demonstrate the memory effect. The spectrum errors of the BENN and IENN models are stable; under the same conditions, the spectrum error of BENN is 0.011 dB and that of IENN is $4.92 \times 10^{-6}$ dB. The IENN is therefore much more accurate than BENN.

Simulation Analysis of Four Models with LFM Signal Input.
For the experimental validation, an LFM signal is used as the input of the half-bridge CDPA, with a center frequency of 30 kHz, an amplitude of 8.5 V, a bandwidth of 4 kHz, and a training data length of 2.0 ms. Other simulation parameters are the same as above. Using the LFM signal as training samples, the time domain simulation results of the four behavioral models are shown in Figure 7, and the frequency domain results are given in Figure 8. The mean error and the maximum transient error $e_{\max}$ of the four models in the time domain are listed in Table 3, and the average and maximum spectrum errors are listed in Table 4.
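An LFM test signal with these parameters can be generated as in the sketch below. The sweep direction (an up-chirp from 28 kHz to 32 kHz), the sine phase convention, and the use of the 1 MHz sampling frequency quoted earlier are assumptions; the function name is hypothetical.

```python
import math


def lfm_samples(center_hz, bandwidth_hz, amplitude, duration_s, fs_hz):
    """Linear up-chirp sweeping center - B/2 to center + B/2 over duration."""
    f_start = center_hz - bandwidth_hz / 2.0
    rate = bandwidth_hz / duration_s  # sweep rate in Hz per second
    count = int(round(duration_s * fs_hz))
    samples = []
    for n in range(count):
        t = n / fs_hz
        # Phase is 2*pi times the integral of the instantaneous frequency.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * rate * t * t)
        samples.append(amplitude * math.sin(phase))
    return samples


# 30 kHz center, 4 kHz bandwidth, 8.5 V amplitude, 2.0 ms, sampled at 1 MHz.
lfm = lfm_samples(30e3, 4e3, 8.5, 2.0e-3, 1e6)
```

The resulting 2000 samples span the stated 2.0 ms record; the instantaneous frequency rises linearly across the 4 kHz band, which is what makes the signal a broadband test case.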
As shown in Figure 7 and Table 3, when the LFM signal is used as the input of the CDPA, the transient errors of the VL and CNN models are very large, and the output signal cannot be reconstructed accurately. Under the same conditions, with 25 hidden neurons and 50 iteration steps, the IENN model is more precise than the BENN model. As shown in Figure 8 and Table 4, with LFM training data the spectrum errors of the VL and CNN models are also very large.

Simulation Analysis of Four Models with 2PSK Signal Input.
In further experiments, a 2PSK signal is used as the input of the half-bridge CDPA, with a carrier frequency of 20 kHz, an amplitude of 8.5 V, a 7-bit pseudorandom sequence (m-sequence) as the digital baseband signal, a baseband symbol width of 0.25 ms, and a testing data length of 1.75 ms. Other parameters of the models are the same as above. Using the 2PSK signal as input, the time domain simulation results of the four behavioral models are shown in Figure 9, and the frequency domain results are given in Figure 10. The mean error and the maximum transient error $e_{\max}$ of the four models in the time domain are listed in Table 5, and the average and maximum spectrum errors are listed in Table 6.
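A 2PSK signal of this kind can be sketched as follows. A length-7 m-sequence is produced by a three-stage linear feedback shift register; the feedback taps, the initial state, the bit-to-phase mapping, and the 1 MHz sampling frequency are assumptions, since the text does not specify them. Seven symbols of 0.25 ms each give the stated 1.75 ms record.

```python
import math


def m_sequence_7():
    """One period (7 bits) of an m-sequence from a 3-stage LFSR.

    Assumed configuration: feedback is the XOR of the first and last
    stages, and the register starts in the all-ones state.
    """
    state = [1, 1, 1]
    bits = []
    for _ in range(7):
        bits.append(state[-1])
        feedback = state[0] ^ state[2]
        state = [feedback] + state[:-1]
    return bits


def bpsk_samples(bits, carrier_hz, amplitude, symbol_s, fs_hz):
    """Map bit 1 to carrier phase 0 and bit 0 to phase pi (assumed mapping)."""
    per_symbol = int(round(symbol_s * fs_hz))
    samples = []
    for index, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for n in range(per_symbol):
            t = (index * per_symbol + n) / fs_hz
            carrier = math.sin(2.0 * math.pi * carrier_hz * t)
            samples.append(sign * amplitude * carrier)
    return samples


bits = m_sequence_7()
psk = bpsk_samples(bits, 20e3, 8.5, 0.25e-3, 1e6)
```

Each symbol contains exactly five carrier cycles at 20 kHz, so the phase reversals occur at zero crossings of the carrier.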
It can be seen in Figure 9 and Table 5 that, with a 2PSK signal input, the VL and CNN models cannot reconstruct the CDPA output accurately, and the transient error is still very large. Under the same conditions, with 25 hidden neurons and 50 iteration steps, the IENN model is more precise than the BENN model. As shown in Figure 10 and Table 6, with 2PSK training samples the spectrum errors of the VL and CNN models are still very large, and the memory effect of the CDPA cannot be demonstrated by these models. The spectrum errors of the BENN and IENN models are steady.

Conclusions
In this paper, a behavioral model based on IENN is proposed to describe the nonlinearity and memory effect of CDPAs. In IENN, a group of Chebyshev orthogonal basis functions is employed to activate the hidden layer neurons, which improves the learning speed and accuracy and simplifies the structure of the model. A self-connection is added to the context nodes to make the output more sensitive to the history of the input data.
According to the simulation results, to reach the same error threshold, IENN needs fewer hidden layer neurons and fewer iteration steps than BENN. This means that IENN learns faster and can meet the same requirements with a simpler network structure than many other neural networks. Using the same number of hidden layer neurons and iteration number, simulation results by using the training data of two-tone, LFM and 2PSK