Evolution of Black-Box Models Based on Volterra Series


Introduction
Modeling nonlinear systems has proven to be a challenge in many areas of science. Most natural phenomena and physical devices exhibit nonlinear behavior. It is therefore useful to classify nonlinear systems, so that the right model can be used for each system.
In [1], a complete classification of nonlinear systems is given. If any of the following phenomena occurs, a nonlinear dynamic model has to be used: (1) asymmetric responses to symmetric input signal changes (ASYM), (2) generation of higher-order harmonics in response to a sinusoidal input (HARM), (3) input multiplicity, meaning that one steady-state response corresponds to more than one steady-state input (IM), (4) output multiplicity, meaning that one steady-state input corresponds to more than one steady-state output (OM), (5) generation of subharmonics in response to any periodic input (SHAM), (6) highly irregular responses to simple inputs like impulses, steps, or sinusoids (CHAOS), (7) input-dependent stability (IDS).
A nonlinear system is then classified according to the phenomena it exhibits: (i) mild: ASYM, HARM, and IM, (ii) intermediate: IDS, (iii) strong: OM, SHAM, and CHAOS.
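The classification above can be sketched as a small lookup. The function name and the rule that the most severe observed phenomenon decides the class are assumptions made for illustration; they are not taken from [1].

```python
# Phenomenon labels follow the list in the text; names below are hypothetical.
MILD = {"ASYM", "HARM", "IM"}
INTERMEDIATE = {"IDS"}
STRONG = {"OM", "SHAM", "CHAOS"}


def classify_nonlinearity(observed):
    """Return the class implied by the most severe observed phenomenon (assumed rule)."""
    observed = set(observed)
    unknown = observed - (MILD | INTERMEDIATE | STRONG)
    if unknown:
        raise ValueError(f"unknown phenomena: {sorted(unknown)}")
    if observed & STRONG:
        return "strong"
    if observed & INTERMEDIATE:
        return "intermediate"
    if observed & MILD:
        return "mild"
    return "none observed"
```

For example, a device showing only ASYM and HARM would fall into the mild class under this rule.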
In electrical engineering, signal amplifiers are used for many different purposes. One of the main uses is signal transmission, where a power amplifier (PA) is needed. A radio frequency (RF) power amplifier is a typical nonlinear system. Even when the transistor is operating in a quasilinear region, driven with small-variance input signals, the output signal has nonlinear components, due to the physics of the transistor.
A PA behavioral model (BM) remains in the mildly nonlinear class. The PAs to be modeled present only the mild phenomena under normal operating conditions, when tested with sinusoidal stimuli. None of the other phenomena (OM, SHAM, CHAOS, or IDS), which imply the need for intermediate or strong nonlinear dynamic models, were observed in amplifier measurements. This paper presents a classification of BMs and discusses the evolution of BMs based on Volterra series (VS) used in the modeling of RF PAs, from some of the simplest models to recent ones reported in the literature.

Journal of Applied Mathematics
Figure 1: A PA representation using a nonlinear feedback structure [2].

Classification of Power Amplifier Behavioral Models
The classification of PA BMs used in this work is as in [2]: (i) memoryless (ML): the output envelope reacts instantaneously to variations in the input envelope, (ii) linear memory (LM): BMs that account for envelope memory effects attributable to the frequency characteristics of the input and output matching networks, (iii) nonlinear memory (NLM): dynamic interaction of nonlinearities through a dynamic network.
Figure 1 is used by the authors of [2] to classify the various BMs. Memoryless models are represented by the block "nonlinear/memoryless." Linear memory models additionally account for the "linear dynamic" blocks at the input and output (matching networks). Models that account for nonlinear memory contain all previously mentioned blocks plus the feedback path with its "linear dynamic" block, attributed to electrothermal and/or bias circuitry dynamics.
Although this classification was very complete at the time of [2], the field has developed further, so the classification has to be extended. Some models were recently reported: (i) the pruned Volterra series (rVS1) [3], (ii) the pruned Volterra series (rVS2) [4], (iii) a parallel cascade model (PCM) composed of a static nonlinearity and a reduced Volterra model (PNLrVS) [5], (iv) a parallel cascade subsampled reduced Volterra series with the first branch composed of a static nonlinearity and an rVS model and the other branches being rVS models, all with the same memory depth (PssVS), as detailed in [5].
The last model (PssVS) is a multirate parallel reduced VS. This model presented the best performance among all BMs reported in the literature, in an extensive comparison presented in [6].
A timeline of publications related to the accuracy of VS models is presented in Figure 2, based on a search of the IEEE Xplore database. This search was focused on PA BMs.
A graphical representation of all these models, from the initial classification of nonlinear systems to the modern VS models, is presented in Figure 3.

Evolution of Power Amplifier Behavioral Models
This section presents the evolution of PA BMs and their equations, from nonlinear memoryless models to reduced Volterra series models.

3.1. Nonlinear Memoryless.
The nonlinear part of an amplifier model represents the intermodulation distortion (IMD), or the static part, and is usually composed of polynomials or other nonlinear functions (e.g., tangent-sigmoids, look-up tables). These models do not account for the dynamics of the system. In this section, some of the memoryless nonlinear models are covered.

Power Series.
A nonlinear system with input $u(k)$ and output $y(k)$ can be represented by a power series:

$$y(k) = \sum_{p=0}^{P} c_p\, u^p(k),$$

where $c_p$ are the polynomial coefficients and $P$ is the order. A simple way to estimate a power series is to use linear regression methods, as the polynomial is linear in its parameters. The polynomial regression matrix $\mathbf{U}$ for $N$ measurements and a polynomial degree $P$, and the parameter vector $\boldsymbol{\theta}$, are

$$\mathbf{U} = \begin{bmatrix} 1 & u(1) & u^2(1) & \cdots & u^P(1) \\ 1 & u(2) & u^2(2) & \cdots & u^P(2) \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & u(N) & u^2(N) & \cdots & u^P(N) \end{bmatrix}, \qquad \boldsymbol{\theta} = \begin{bmatrix} c_0 & c_1 & \cdots & c_P \end{bmatrix}^T.$$

Then the LS equation can be applied:

$$\hat{\boldsymbol{\theta}} = \left(\mathbf{U}^T \mathbf{U}\right)^{-1} \mathbf{U}^T \mathbf{y}.$$

This regression matrix results in a Hessian $\mathbf{U}^T\mathbf{U}$ with a high condition number (CN), defined as the ratio of the largest to the smallest singular value in the singular value decomposition of a matrix [8]. A large CN is not desirable in the estimation process, as it implies that small errors in the input can cause large errors in the output. For very high CN, orthogonalization of the regression matrix is required to find the solution.
The Hessian CN can be improved if orthogonal polynomials are used. These polynomials are derived based on the input signal used in the system. Thus, the regressors are closer to the ideal situation for a Hessian (mutually orthogonal regressors).
For real-valued input signals, Chebyshev (derived for single tones) and Hermite (derived for a Gaussian distribution) polynomials are typically applied.
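The least-squares estimation of a power series described in this subsection can be sketched as follows. This is a minimal illustration that assumes a constant term is included and solves the normal equations by Gaussian elimination in plain Python; the function name is hypothetical, and for poorly conditioned problems an orthogonalizing solver (QR or SVD) would be the safer choice, in line with the remarks on the CN above.

```python
def fit_power_series(u, y, order):
    """Least-squares estimate of c_0..c_order for y(k) ~ sum_p c_p * u(k)**p."""
    n = order + 1
    # Regression matrix U: one row [1, u, u^2, ..., u^order] per measurement.
    U = [[uk ** p for p in range(n)] for uk in u]
    # Normal equations (U^T U) theta = U^T y; U^T U plays the role of the Hessian.
    A = [[sum(row[i] * row[j] for row in U) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yk for row, yk in zip(U, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    theta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * theta[j] for j in range(i + 1, n))
        theta[i] = (b[i] - s) / A[i][i]
    return theta
```

With noise-free data generated by a cubic, the estimated coefficients should match the generating ones up to rounding error.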

Baseband Power Series.
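Although polynomial LS estimation is a reasonable way to calculate the IMD components, it also generates "out-of-band" harmonics, as illustrated in Figure 4. The numbers in the figure indicate the respective IMD products (second, third, and so on) of a Taylor series expansion when a two-tone excitation is applied [9].
These are uninteresting for predistortion purposes, the main objective of behavioral modeling. To solve this problem, the first-zone equivalent (or baseband) polynomial is necessary. It can be derived by writing the input signal as [10]

$$u(k) = \operatorname{Re}\left\{\tilde{u}(k)\, e^{j\omega_0 k}\right\} = \frac{1}{2}\left[\tilde{u}(k)\, e^{j\omega_0 k} + \tilde{u}^*(k)\, e^{-j\omega_0 k}\right], \tag{5}$$

where $\tilde{u}(k)$ is the complex envelope and $\omega_0$ is the carrier frequency. So, a binomial-based expression for $u^p(k)$ can be obtained:

$$u^p(k) = \frac{1}{2^p}\sum_{i=0}^{p}\binom{p}{i}\,\tilde{u}^{\,i}(k)\left[\tilde{u}^*(k)\right]^{p-i} e^{j(2i-p)\omega_0 k}. \tag{7}$$

Only the terms where $p$ is odd and $2i - p = \pm 1$ can contribute to the first-zone output, that is, $i = (p+1)/2$ and $i = (p-1)/2$. Then (7) can be written as

$$u^p_{\mathrm{FZ}}(k) = \frac{1}{2^p}\left[\binom{p}{\frac{p+1}{2}}\tilde{u}^{\frac{p+1}{2}}(k)\left[\tilde{u}^*(k)\right]^{\frac{p-1}{2}} e^{j\omega_0 k} + \binom{p}{\frac{p-1}{2}}\tilde{u}^{\frac{p-1}{2}}(k)\left[\tilde{u}^*(k)\right]^{\frac{p+1}{2}} e^{-j\omega_0 k}\right].$$

Using the binomial property $\binom{p}{(p+1)/2} = \binom{p}{(p-1)/2}$ and the relation observed in (5),

$$u^p_{\mathrm{FZ}}(k) = \frac{1}{2^{p-1}}\binom{p}{\frac{p+1}{2}}\operatorname{Re}\left\{\left|\tilde{u}(k)\right|^{p-1}\tilde{u}(k)\, e^{j\omega_0 k}\right\}.$$

Finally, the first-zone filtered input signal can be found as

$$\tilde{u}_p(k) = \frac{1}{2^{p-1}}\binom{p}{\frac{p+1}{2}}\left|\tilde{u}(k)\right|^{p-1}\tilde{u}(k).$$

The component $\frac{1}{2^{p-1}}\binom{p}{(p+1)/2}$ corresponds to the baseband power series coefficients.
If no bias is present in the input/output signals, the regression matrix for the estimation of the coefficients of the baseband polynomial can be written as

$$\mathbf{U} = \begin{bmatrix} \tilde{u}(1) & |\tilde{u}(1)|^{2}\tilde{u}(1) & \cdots & |\tilde{u}(1)|^{P-1}\tilde{u}(1) \\ \vdots & \vdots & & \vdots \\ \tilde{u}(N) & |\tilde{u}(N)|^{2}\tilde{u}(N) & \cdots & |\tilde{u}(N)|^{P-1}\tilde{u}(N) \end{bmatrix}.$$

The baseband polynomial can be written in a compact form:

$$\tilde{y}(k) = \sum_{\substack{p=1 \\ p\ \mathrm{odd}}}^{P} c_p \left|\tilde{u}(k)\right|^{p-1}\tilde{u}(k). \tag{12}$$

For complex Gaussian baseband input signals, a derivation of orthogonal polynomials is found in [11].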
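The first-zone factor and the baseband polynomial can be checked numerically with a short sketch. The function names and the dictionary layout for the coefficients are assumptions made for illustration. For a single tone the factor reproduces the familiar identities: the fundamental component of the cube of a cosine has amplitude 3/4, and that of the fifth power has amplitude 5/8.

```python
from math import comb


def first_zone_gain(p):
    """Factor (1 / 2**(p-1)) * C(p, (p+1)/2) for an odd order p."""
    if p < 1 or p % 2 == 0:
        raise ValueError("only odd orders contribute to the first zone")
    return comb(p, (p + 1) // 2) / 2 ** (p - 1)


def baseband_polynomial(u_env, coeffs):
    """Evaluate sum over odd p of c_p * |u|**(p-1) * u for one complex sample.

    coeffs maps an odd order p to its coefficient c_p (hypothetical layout).
    """
    return sum(c * abs(u_env) ** (p - 1) * u_env for p, c in coeffs.items())
```

In this sketch the binomial factor is assumed to be absorbed into the estimated coefficients, so `baseband_polynomial` takes them as given.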

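3.1.3. Bessel-Fourier Model.
The complex Bessel approximation of a memoryless RF power amplifier is obtained by periodically extending the instantaneous voltage transfer characteristic and expanding it in a complex Fourier series. This derivation was extracted from [12]:

$$y(t) = \sum_{k=-\infty}^{\infty} b_k\, e^{j\alpha k u(t)},$$

where $u(t)$ is the input signal and $y(t)$ the output signal, both with finite dynamic range, and $b_k$ are the Fourier series coefficients. The parameter $\alpha = 2\pi/D$ is determined by the maximum dynamic range of the input, $D$, which defines the period of the Fourier series periodic extension. Hence, for the general $N$-carrier input

$$u(t) = \sum_{l=1}^{N} A_l(t)\cos\left(\omega_l t + \phi_l(t)\right),$$

the output is

$$y(t) = \sum_{k=-\infty}^{\infty} b_k \prod_{l=1}^{N} \exp\left[\,j\alpha k A_l(t)\cos\left(\omega_l t + \phi_l(t)\right)\right]. \tag{16}$$

Employing the Bessel function series approximation

$$e^{jz\cos\theta} = \sum_{n=-\infty}^{\infty} j^{\,n} J_n(z)\, e^{jn\theta},$$

with $z = \alpha k A_l(t)$ and $\theta = \omega_l t + \phi_l(t)$, it is possible to write (16) as

$$y(t) = \sum_{k=-\infty}^{\infty} b_k \prod_{l=1}^{N} \sum_{n_l=-\infty}^{\infty} j^{\,n_l} J_{n_l}\left(\alpha k A_l(t)\right) e^{jn_l\left(\omega_l t + \phi_l(t)\right)}.$$

Rearranging some terms,

$$y(t) = \sum_{n_1=-\infty}^{\infty}\cdots\sum_{n_N=-\infty}^{\infty}\left[\sum_{k=-\infty}^{\infty} b_k \prod_{l=1}^{N} j^{\,n_l} J_{n_l}\left(\alpha k A_l(t)\right)\right] e^{\,j\sum_{l=1}^{N} n_l\left(\omega_l t + \phi_l(t)\right)}.$$

Defining $\beta_k = (b_k - b_{-k})$ for $k = 1 \cdots \infty$ and $\sum_{l=1}^{N} n_l = 1$ (as only the first zonal components will be considered), the first-zone output becomes

$$y_1(t) = j\sum_{k=1}^{\infty} \beta_k \sum_{n_1+\cdots+n_N=1}\ \prod_{l=1}^{N} J_{n_l}\left(\alpha k A_l(t)\right) e^{\,j\sum_{l=1}^{N} n_l\left(\omega_l t + \phi_l(t)\right)},$$

where $J_{n_l}$ denotes the $n_l$th-order Bessel function of the first kind. The coefficients $\beta_k$ may be obtained by using an LS approximation.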

Look-Up Tables.
Look-up tables (LUTs) are the most common type of nonlinear static models in real-world implementations [7]. An advantage in comparison with other methods is that the interpolation and extrapolation behavior can be configured. LUTs also present good accuracy and very fast evaluation. The drawbacks are poor physical interpretation, a high number of parameters, and a lack of continuous differentiability. Linear interpolation is normally used to determine the points within intervals, but other methods such as cubic interpolation and splines are also possible [13].
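A LUT with linear interpolation can be sketched in a few lines. The function name is hypothetical, and clamping outside the table range is just one possible extrapolation rule chosen for this illustration.

```python
from bisect import bisect_right


def lut_eval(x, xs, ys):
    """Piecewise-linear look-up; xs must be strictly increasing.

    Inputs outside [xs[0], xs[-1]] are clamped to the end values
    (an assumed extrapolation rule).
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # index of the interval containing x
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])
```

The kinks at the table points are what makes this model not continuously differentiable; a spline would remove them at the cost of a more expensive evaluation.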

Linear Memory.
The two-box modeling techniques are one way to represent the linear memory of an amplifier. They are also known as modular approaches [14] or feedforward block-oriented models [1]. They are obtained by combining components from the following two classes: static (or memoryless) nonlinearities and causal, linear time-invariant dynamic subsystems. Parametric and nonparametric modeling methodologies can be used. Two arrangements of block-structured models are feasible: the Wiener model (linear-nonlinear) and the Hammerstein model (nonlinear-linear) [15]. The most frequently used configuration for the linear block of this model is an FIR filter. The nonlinear block is commonly represented by a polynomial [1]. Examples of these structures are shown in Figure 5.
If the linear dynamic block is represented by an FIR filter with coefficients $h(m)$ and memory length $M$, the output of this block for the Wiener model is

$$v(k) = \sum_{m=0}^{M-1} h(m)\, u(k-m).$$

For the Hammerstein model, the FIR filter output is

$$y(k) = \sum_{m=0}^{M-1} h(m)\, w(k-m).$$

If the static nonlinearity block is represented by a power series, the output of this block can be formulated for the Wiener model and for the Hammerstein model as follows:

$$y(k) = \sum_{p=0}^{P} c_p\, v^p(k), \qquad w(k) = \sum_{p=0}^{P} c_p\, u^p(k).$$

The overall model output is then the combination of these equations for each model:

$$y_{\mathrm{W}}(k) = \sum_{p=0}^{P} c_p \left[\sum_{m=0}^{M-1} h(m)\, u(k-m)\right]^{p}, \qquad y_{\mathrm{H}}(k) = \sum_{m=0}^{M-1} h(m) \sum_{p=0}^{P} c_p\, u^p(k-m). \tag{25}$$

Equations (25) are a simple way to model a nonlinear amplifier with memory.
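The two block orders can be sketched directly from the description above: an FIR filter followed by a power series for the Wiener model, and the reverse order for the Hammerstein model. The function names and the zero initial conditions are assumptions made for this illustration.

```python
def fir(h, x):
    """Causal FIR filter, sum_m h[m] * x[k-m], with zero initial conditions."""
    return [sum(h[m] * x[k - m] for m in range(len(h)) if k - m >= 0)
            for k in range(len(x))]


def poly(c, x):
    """Static power series sum_p c[p] * x**p, applied sample by sample."""
    return [sum(cp * xk ** p for p, cp in enumerate(c)) for xk in x]


def wiener(h, c, u):
    """Linear block first, then the static nonlinearity."""
    return poly(c, fir(h, u))


def hammerstein(h, c, u):
    """Static nonlinearity first, then the linear block."""
    return fir(h, poly(c, u))
```

The same blocks and the same input generally give different outputs in the two arrangements, which is why the structure has to be chosen (or tested) for the device at hand.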

Nonlinear Memory.
More complex models are necessary to estimate the nonlinear memory, like parallel models or Volterra series. Examples of these models are parallel cascade models.
Any system that can be represented by a truncated VS (26) can also be modeled exactly using parallel cascaded structures [16].
This technique combines various models (Wiener, Hammerstein, Wiener-Hammerstein, etc.) in parallel branches. The overall model structure becomes more complicated with each iteration, as each iteration adds a branch composed of a single model. The value of the cost function decreases or stays constant with each additional branch [17]. An example of this configuration is seen in Figure 6.
This method combines the following favorable properties: (i) it is computationally efficient even for high-order models with large memory-bandwidth products, (ii) it allows the direct extraction of the Volterra kernels, (iii) it offers the convenience of using different methods for the identification of the linear and nonlinear blocks [17].
As a drawback, this method is very sensitive to noise if too many paths are used [14]. Consequently, the paths should be selected carefully, using parametric figures of merit (FOMs) and the system order of the nonlinearity, in order to achieve models with low noise and good convergence.
Although the best estimation methods to identify the coefficients of the parallel Wiener model would be nonlinear ones, Korenberg initially proposed linear methods with acceptable results, as described in [17].

Volterra Series.
The Volterra series describes the mildly nonlinear class of nonlinear systems and has the property of dynamic interaction of nonlinearities, so it is well suited for the description of PAs.
The finite, discrete VS model is given by [18]

$$y(k) = \sum_{p=1}^{P}\ \sum_{m_1=0}^{M-1}\cdots\sum_{m_p=0}^{M-1} h_p(m_1,\ldots,m_p)\prod_{i=1}^{p} u(k-m_i), \tag{26}$$

where $h_p$ is the kernel of order $p$, $k$ and $m_i$ are discrete indices of the sampling interval, and $M$ is the memory length. The sampling interval must be selected to cover the required bandwidth of the input and output signals.
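A direct evaluation of a truncated Volterra series can be sketched as below. The sparse dictionary layout for the kernels, the function name, and the zero-padding of samples before the start of the record are assumptions made for this illustration.

```python
from math import prod


def volterra_output(kernels, u, k):
    """Evaluate a truncated Volterra series at sample index k.

    kernels maps an order p to a dict {(m_1, ..., m_p): h_p(m_1, ..., m_p)};
    delay tuples that are absent are treated as zero coefficients, and
    samples before the start of u are taken as zero.
    """
    def delayed(m):
        return u[k - m] if k - m >= 0 else 0.0

    y = 0.0
    for p, kernel in kernels.items():
        for delays, h in kernel.items():
            if len(delays) != p:
                raise ValueError("delay tuple length must equal kernel order")
            y += h * prod(delayed(m) for m in delays)
    return y
```

Storing only the nonzero kernel entries already hints at pruning: the cost of one output sample is proportional to the number of retained terms rather than to the full kernel size.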
The main disadvantage of a VS-based BM is the number of parameters that have to be estimated and, consequently, that are needed to represent the model. A VS model using 5 delay taps needs 5, 125, 3125, and 78125 parameters for the 1st, 3rd, 5th, and 7th order kernels, respectively. These values are not practical, since an estimation using so many coefficients is computationally very intensive, even for current computers.
By using the symmetry condition, the number of coefficients of a Volterra kernel as a function of the order of nonlinearity is given by the binomial [19]:

$$N_p = \binom{M+p-1}{p}, \tag{27}$$

where $M$ is the number of delay taps used and $p$ is the order of the kernel. Using (27), the above cited model is reduced to 5, 35, 126, and 330 parameters. Unfortunately, this equation is valid only for real-valued signals.
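The parameter counts for real-valued signals can be reproduced with a short sketch. The function names are hypothetical; the full kernel has one coefficient per ordered delay tuple, and the symmetric kernel one per unordered tuple (combinations with repetition).

```python
from math import comb


def full_kernel_size(taps, order):
    """Coefficients of an order-p kernel without exploiting symmetry."""
    return taps ** order


def symmetric_kernel_size(taps, order):
    """Independent coefficients of a symmetric order-p kernel, real signals."""
    return comb(taps + order - 1, order)


orders = (1, 3, 5, 7)
full = [full_kernel_size(5, p) for p in orders]            # [5, 125, 3125, 78125]
symmetric = [symmetric_kernel_size(5, p) for p in orders]  # [5, 35, 126, 330]
```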

Complex Valued Baseband Volterra Series.
In order to obtain the best model performance, it is necessary to adapt the BMs under study to the modern industry-standard PA input/output signals, since these models are designed for linearization purposes. The excitation signals are complex valued, and as a practical matter only first-zone filtered (baseband) equivalent BMs are frequently used, due to the difficulty of implementing bandpass models in hardware.
A closed form for determining the number of independent terms of an odd-order baseband VS kernel using complex signals is the product of binomials [20]:

$$N_p = \binom{M + \lfloor p/2 \rfloor}{\lfloor p/2 \rfloor + 1}\binom{M + \lfloor p/2 \rfloor - 1}{\lfloor p/2 \rfloor},$$

where $M$ is the number of delay taps, $p$ is the order of the kernel, and $\lfloor\cdot\rfloor$ is the floor operation.
As an example, the numbers of parameters of a complex-valued baseband VS using 4 delay taps are 4, 40, 200, and 700 for the 1st, 3rd, 5th, and 7th order symmetric kernels. The use of the symmetry property of the Volterra kernels is necessary in the model extraction process, since it eliminates the linearly dependent columns of the kernel to be estimated.
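The counts quoted for the complex-valued baseband case can be reproduced by a simple combinatorial argument, sketched below. The argument assumes that each term of an odd order p contains (p+1)/2 unconjugated and (p-1)/2 conjugated delayed inputs, and that the kernel is symmetric within each group. This reasoning and the function name are assumptions for illustration and are not taken from [20].

```python
from math import comb


def baseband_kernel_size(taps, order):
    """Independent terms of an odd-order first-zone kernel with complex input."""
    if order < 1 or order % 2 == 0:
        raise ValueError("first-zone kernels have odd order")
    q = order // 2
    unconjugated = comb(taps + q, q + 1)   # multisets of q+1 delays
    conjugated = comb(taps + q - 1, q)     # multisets of q delays
    return unconjugated * conjugated


counts = [baseband_kernel_size(4, p) for p in (1, 3, 5, 7)]  # [4, 40, 200, 700]
```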
Several techniques are employed to estimate VS. If the system is memoryless, the VS reduces to a Taylor series and can be estimated as described in Section 3.1. If the system has only linear memory, it can be estimated using the techniques listed in Section 3.2. If the system presents only nonlinear memory, or both linear and nonlinear memory, some of the strategies described in Section 3.3 can be employed. As shown above, a VS has a very high number of coefficients. This can lead to ill-conditioned Hessian matrices, as shown in [20]. The best way to estimate a BM using VS is to prune some terms while losing as little accuracy as possible. One such simplification keeps only the main-diagonal terms of the VS kernels, which gives the popular parallel Hammerstein (PH) models [21].
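Keeping only the main-diagonal kernel terms can be sketched for the baseband case as one FIR filter per odd order, each driven by the corresponding first-zone basis function. The function name and the dictionary layout for the taps are assumptions made for this illustration, and samples before the start of the record are taken as zero.

```python
def parallel_hammerstein(h, u):
    """Baseband model with main-diagonal kernel terms only.

    h maps an odd order p to a list of FIR taps h_p[m] (hypothetical layout);
    the output is sum_p sum_m h_p[m] * |u(k-m)|**(p-1) * u(k-m).
    """
    y = []
    for k in range(len(u)):
        acc = 0j
        for p, taps in h.items():
            for m, hpm in enumerate(taps):
                if k - m >= 0:
                    x = u[k - m]
                    acc += hpm * abs(x) ** (p - 1) * x
        y.append(acc)
    return y
```

The number of coefficients grows only linearly with the memory depth per order, which is the main reason this pruning is popular.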

Pruned Volterra Series.
There are other models that take into account some physical knowledge of the device and include important interactions of the input signal, as in [4,22,23]. These are the pruned or reduced VS models, which include interactions beyond the terms on the main diagonal. Their accuracy is naturally higher than that of the PH models, but they also use more coefficients.

Cascade Multirate Pruned Volterra Series.
A new model was developed by joining VS pruning techniques, multirate techniques, and the parallelization of models (cascade). Its cascade is composed of different models. The first branch is a nonlinear static block, and the next branches are reduced Volterra series, each one estimated at a different subsampling rate (PssVS). This model and its estimation method are fully explained in [5], and it is shown in Figure 7.
Based on Figure 7 and considering the input signal as $u_{\mathrm{meas}}(k)$, the output signal $y_{\mathrm{NL}}(k)$ is the output of an estimated baseband polynomial (or a look-up table) as in (12). The first residue is

$$y_{\mathrm{res}1}(k) = y_{\mathrm{meas}}(k) - y_{\mathrm{NL}}(k). \tag{29}$$

The second output signal, $y_{\mathrm{dyn}1}(k)$, is the output of an estimated reduced Volterra series BM. Its input signal is $u_{\mathrm{meas}}(k)$, and its target output signal is the first residue, $y_{\mathrm{res}1}(k)$. Then the second residue is

$$y_{\mathrm{res}2}(k) = y_{\mathrm{res}1}(k) - y_{\mathrm{dyn}1}(k). \tag{30}$$

So the $i$th residue for $i \geq 2$ is

$$y_{\mathrm{res}\,i}(k) = y_{\mathrm{res}\,(i-1)}(k) - y_{\mathrm{dyn}\,(i-1)}(k).$$

The overall output of the PssVS model with $B$ branches is described as

$$\hat{y}(k) = y_{\mathrm{NL}}(k) + \sum_{i=1}^{B-1} y_{\mathrm{dyn}\,i}(k).$$

This output is also described in expanded form as in (33), which is linear in the parameters, so its coefficients can be found using linear regression methods such as least squares (LS) [6]. The excitation signals are complex valued, and thus a first-zone filtered odd-order model was designed. This approach is expressed by the polynomial basis function showing only odd terms [24]. In (33), the input vector is $u(k)$, $B$ is the number of branches used in the model, $P$ is the nonlinear (NL) order of the model, and $M$ is the memory depth. A higher order can be used in the static nonlinear branch, and this branch can also be replaced by any other static nonlinear block. The advantage of this model is that it can account for the different memory effects existing in a PA, also at different sampling rates. Applying the input signal at different rates emphasizes the flexibility of the model.
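The residue recursion can be sketched generically: each branch is fitted to the residue left by the previous branches, and the model output is the sum of the branch outputs. This sketch is not the estimation method of [5]. It omits the subsampling, fits the branches one after another rather than jointly, and uses a hypothetical one-parameter LS fitter in place of the baseband polynomial and the reduced Volterra series branches.

```python
def single_basis_fitter(basis):
    """Return a fitter that scales basis(u) by a one-parameter LS gain."""
    def fit(u, target):
        phi = [basis(x) for x in u]
        num = sum(p.conjugate() * t for p, t in zip(phi, target))
        den = sum(abs(p) ** 2 for p in phi)
        gain = num / den if den else 0.0
        return [gain * p for p in phi]
    return fit


def cascade_fit(u_meas, y_meas, branch_fitters):
    """Fit each branch to the current residue; return (model output, last residue)."""
    residue = list(y_meas)
    branch_outputs = []
    for fit in branch_fitters:
        y_branch = fit(u_meas, residue)          # branch output for this stage
        branch_outputs.append(y_branch)
        residue = [r - b for r, b in zip(residue, y_branch)]
    y_model = [sum(vals) for vals in zip(*branch_outputs)]
    return y_model, residue
```

Because each stage is a least-squares fit to the current residue, the residual energy cannot increase when a branch is added, which mirrors the behavior of the cost function noted for parallel cascade models.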
A comparison of this model with the other models cited here, using simulated and measured data obtained from an LDMOS RF PA, was first presented in [5], using only one FOM (NMSE). A further comparison was presented in [6] using five different FOMs, namely, normalized mean square error (NMSE) [21], normalized root mean square error (NRMSE) [25], mean absolute error (MAE) [26], maximal absolute error (MaxAE) [27], and coefficient of efficiency [28]. The final results show a superior performance of the PssVS.
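Four of these FOMs can be sketched with their commonly used definitions. The exact definitions used in [6] are not reproduced in this paper, so the formulas below are assumptions: NMSE in dB normalized by the energy of the measured signal, and the coefficient of efficiency in the usual form of one minus the ratio of error energy to the energy of the deviation from the mean. The NRMSE is left out because its normalization varies between authors.

```python
from math import log10


def nmse_db(y, y_hat):
    """Normalized mean square error in dB (undefined for a perfect fit)."""
    err = sum(abs(a - b) ** 2 for a, b in zip(y, y_hat))
    ref = sum(abs(a) ** 2 for a in y)
    return 10 * log10(err / ref)


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def max_ae(y, y_hat):
    """Maximal absolute error."""
    return max(abs(a - b) for a, b in zip(y, y_hat))


def coefficient_of_efficiency(y, y_hat):
    """One minus error energy over energy of deviations from the mean of y."""
    mean = sum(y) / len(y)
    err = sum(abs(a - b) ** 2 for a, b in zip(y, y_hat))
    spread = sum(abs(a - mean) ** 2 for a in y)
    return 1 - err / spread
```

All four functions accept real or complex samples, since they only use magnitudes of differences.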
These results show a trend in LDMOS RF PA behavioral models that cannot be ignored. Multirate models allow simpler hardware to be used in linearization devices, which are the end products of behavioral modeling. They also present a higher accuracy than any other model reported so far for LDMOS RF PAs. Further efforts in this research direction may reveal even more accurate models that use fewer coefficients with simpler hardware.

Conclusion
This paper presented a classification of nonlinear systems and a modern classification of behavioral models of RF power amplifiers. The evolution of behavioral models was then presented, including the equations that characterize these models. This review covered several models, from basic baseband power series to the recent parallel multirate pruned Volterra series models, and commented on their accuracy. Remarks about future trends in power amplifier behavioral models were also made.

Figure 4: Frequency-domain response of a nonlinear amplifier supplied with a two-tone test input signal.

Figure 6: Example of a parallel Wiener model.

Figure 7: A PA behavioral model representation using multirate and reduced Volterra series in a cascade configuration.