On the Generalization Capabilities of the Ten-Parameter Jiles-Atherton Model

This work analyzes the generalization capabilities of a modified version of the classic Jiles-Atherton model for magnetic hysteresis. The modified model introduces dynamic parameterization, as opposed to the classic model, where the parameters are constant. Two different dynamic parameterizations are considered: a dependence on the excitation and a dependence on the response. The identification process is performed using a novel nonlinear optimization technique called Continuous Flock-of-Starling Optimization Cube (CFSO³), an algorithm belonging to the class of swarm intelligence. The algorithm exploits a parallel architecture and uses a supervised strategy to alternate between exploration and exploitation. Comparisons between the obtained results are presented at the end of the paper.


Introduction
The hysteretic behavior of magnetic materials is a common topic in literature [1][2][3][4][5][6][7], especially in the numerical modeling field [8][9][10][11][12][13][14][15][16][17][18][19]. Presently, multiple phenomenological models exist to represent this behavior. The lack of a physical model suggested the use of advanced numerical techniques to represent it, such as Neural Networks [20][21][22][23][24][25][26][27][28][29][30]. Among the phenomenological models, one that is based on physical considerations and is very common in literature is the one proposed by Jiles and Atherton in 1983 [31][32][33][34]. After being proposed, the model was later refined through the physical explanation of some details concerning the reversible component, and the main equation was adjusted to give a physically sensible result for minor loops. The model is parametric and is defined by five coefficients that describe different physical aspects. The identification of the model is numerical and is approached by solving a least squares (LSQ) problem against reference data that can be either experimental or simulated [35, 36]. The use of this model in practical applications has been favored by the development of new optimization techniques, which allow a finer identification of the model parameters [16,[37][38][39][40][41][42][43][44][45][46][47][48][49]. In parallel, the core of the model has been widely revised over the years.
An approach commonly found in literature imposes a dynamic dependence of the parameters on the model excitation [50, 51]. These modified models feature a higher number of degrees of freedom and are thus able to represent complex dynamics. This, however, comes with the drawback of a more difficult identification procedure. The identification problem for the classic model is nonconvex [52], and different solutions exist for the same material. The different solutions provide different generalization capabilities. Once the model has been identified for a particular set of excitations/responses, the goal is to use it to represent the material under any generic excitation. However, it has been observed that complex waveforms featuring minor loops are poorly reproduced unless particular care is taken in the choice of the identification waveform. At the same time, using a complex waveform for identification makes the identification problem more difficult to solve. For this reason, a powerful optimization algorithm is needed, to address both the difficult convergence and the risk of being trapped in the local minima of a nonconvex problem. For such a task, in this work, we propose the use of the Continuous Flock-of-Starling Optimization Cube (CFSO³) algorithm, described in [53][54][55]. This algorithm belongs to the swarm intelligence class of metaheuristic algorithms and is an analytic evolution of the Flock-of-Starlings (FSO) algorithm found in literature and already applied with success to different problems [56, 57]. The main difference between the FSO and the CFSO³ lies in the criterion for updating the particle positions: the FSO updates the positions according to a discrete law, whereas the CFSO computes analytical trajectories for each particle. Since the trajectories are computed analytically, it is possible to supervise the optimization behavior: the algorithm parameters can be modified to exhibit convergence, divergence, or local oscillations of the particles. The proposed algorithm runs natively on a master-slave parallel architecture. The main idea is to have a master process performing the exploration of the solution space and to refine the solutions in the candidate areas through local exploitations performed by slave processes. The purpose of this work is to investigate the generalization capabilities of a modified Jiles-Atherton model featuring five additional parameters. This new ten-parameter model will be shortly addressed as J10P. The additional parameters are not constants (like the classic ones) but are scaled according to the model state. In particular, two different dependences are investigated: a dependence on the excitation and a dependence on the response. In both cases, the model is identified on a wide set of reference materials, and the results are validated against random waveforms on the same materials. In the first part of this paper, the derivation of the J10P model from the classic one is shortly reviewed. In the second part, the CFSO³ algorithm is analyzed, remarking on its parallel capabilities. In the third part, the identification and validation benchmark is described, along with the results obtained in terms of convergence and generalization. Conclusions and final considerations follow.

The Dynamic Parameterization of the Jiles-Atherton Model
Several models are available in literature to represent the magnetic hysteresis phenomenon [1]. Models span from representations at the microscopic level [2], through domain analysis, to less physical and more application-oriented approaches [7, 13]. Among the different models, the most commonly used in literature are the one by Preisach and the one by Jiles and Atherton. The former is widely used in both scalar [5,[8][9][10][11],15] and vector [3, 19] forms. A recent work extended the hysteresis representation of the Preisach vector model to the 3D domain [12]. The original model proposed by Jiles and Atherton [31][32][33][34] for the hysteretic response of ferromagnetic materials is a differential model that expresses the derivative of the magnetization with respect to the magnetic field through a nonlinear relation. The model is composed of two equations, (1) and (2), which express, respectively, the differential relation and the nonlinear anhysteretic magnetization M_an (in one common formulation):

dM/dH = (1/(1 + c)) (M_an − M)/(δk − α(M_an − M)) + (c/(1 + c)) dM_an/dH, (1)

M_an = M_s [coth((H + αM)/a) − a/(H + αM)]. (2)

There are five parameters involved: M_s, the magnetization saturation; a, the shape parameter; α, the mean field parameter; k, the domain wall pinning constant; and c, the domain flexing constant. The δ term takes into account whether the integration is taking place in the ascending or the descending part of the loop. It is called the directional factor, and its value is +1 or −1, respectively. The strong advantage of the classic Jiles-Atherton model lies in the reduced number of parameters and the physical interpretation attributed to each of them. The identification of these parameters can be approached in several ways, as proposed by the authors of the model [35, 36] and in other notable examples found in literature [16, 43]. Still, the most common approach is to identify them by solving a least squares problem. Different algorithms can be used to solve the inverse problem: classic Particle Swarm Optimization [14, 45], Simulated Annealing [48], Differential Evolution [51], and hybrid approaches based on Genetic Algorithms [47, 52]. The goal of the identification procedure is to find the set of five parameters that describes the material under study. In the classic model, the five parameters are constant. The model used in this work deviates from the physical nature of the original J-A model, introducing a phenomenological approach where each generic parameter p is expressed in (3) as the sum of a constant component p_0 and a dynamic component p_1 scaled by a Gaussian function of the model state x, where x can be either the current excitation (i.e., the magnetic field H) or the last computed value of the model response (i.e., the flux density B). Introducing a dependency on the excitation/response of the model is a strategy already used in other modeling approaches where the hysteretic phenomenon is modeled through Neural Networks [20,22,24,27,30]. Representation of a ferromagnetic material under distorted excitation [23, 25] or in a multifrequency domain [28] is a difficult generalization problem for an ANN, which must be solved by introducing some sort of memory in the neural estimator. This approach is not mandatory in the present model, since its differential nature naturally takes the memory effect into account. However, the representation of minor loops can be enhanced by applying this paradigm to the model parameterization. The dynamic part of the parameter is distributed according to a Gaussian curve, with a width proportional to σ. The value of σ is related to the saturation value of either the magnetic field or the flux density, according to the model used. It is defined during the identification procedure and is equal to 0.2-0.8 times the H_SAT or the B_SAT value. Indeed, the dynamic expansion of all five parameters may be redundant for some simple problems. An incremental analysis from the J5P model to the J10P model can be found in Appendix B.
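As a concrete illustration, the classic five-parameter model can be integrated numerically along a field waveform. The sketch below assumes one common formulation of the J-A equations (the paper's exact equations (1) and (2) are not reproduced); the function names and the sample parameter values are illustrative only, not the paper's.

```python
import math

def m_anhysteretic(h, m, ms, a, alpha):
    """Langevin-type anhysteretic magnetization (one common J-A form):
    M_an = Ms * (coth(He/a) - a/He), with effective field He = H + alpha*M."""
    he = h + alpha * m
    if abs(he) < 1e-12:            # coth singularity: M_an -> 0 as He -> 0
        return 0.0
    x = he / a
    return ms * (1.0 / math.tanh(x) - 1.0 / x)

def dm_dh(h, m, dh, ms, a, alpha, k, c):
    """Differential susceptibility of the classic J-A model (a common
    formulation). delta = +1 on the ascending branch, -1 on the descending."""
    delta = 1.0 if dh >= 0 else -1.0
    man = m_anhysteretic(h, m, ms, a, alpha)
    eps = 1e-3                     # numerical derivative of M_an w.r.t. H
    dman = (m_anhysteretic(h + eps, m, ms, a, alpha)
            - m_anhysteretic(h - eps, m, ms, a, alpha)) / (2 * eps)
    irr = (man - m) / (delta * k - alpha * (man - m))
    return (irr + c * dman) / (1.0 + c)

def run_loop(h_wave, ms=1.6e6, a=1100.0, alpha=1e-5, k=400.0, c=0.2):
    """Explicit-Euler integration of M along a given H waveform
    (illustrative parameter values, chosen for numerical stability)."""
    m = 0.0
    out = []
    for i in range(1, len(h_wave)):
        dh = h_wave[i] - h_wave[i - 1]
        m += dm_dh(h_wave[i - 1], m, dh, ms, a, alpha, k, c) * dh
        out.append(m)
    return out
```

Driving `run_loop` with a ramp in H relaxes the magnetization towards the anhysteretic curve, which is the qualitative behavior the differential relation encodes.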
The degrees of freedom introduced by the J10P model, in terms of flexibility in representing the hysteresis phenomenon, can be seen by operating alternately on the two components of the same parameter. For example, the M_s parameter is basically a vertical scaling factor for the whole hysteresis loop: a larger M_s gives a taller loop, whereas a smaller M_s gives a shorter one. By modifying alternately the M_s,0 and the M_s,1 components, it is apparent how the two changes affect the hysteresis curve. In Figure 1, two sets of hysteresis curves are shown. In Figure 1(a), the M_s,0 parameter assumes the values {85, 95, 16, 1.16, 1.26}, and the M_s,1 parameter is kept at 0. In Figure 1(b), the M_s,0 parameter is set to 16, and the M_s,1 parameter assumes the values {−25, −15, 0, 15, 25}.
In both cases, the dynamic part of the parameter depends on the excitation of the model (thus, x = H). In the first case, the scaling of the hysteresis curve is the same over the whole domain of the cycle. In the second case, the hysteresis curve is scaled near the origin but is not affected near the saturation points.
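The dynamic parameterization described above can be sketched as follows, assuming a Gaussian form for the dynamic part (the paper does not spell out the exact expression, so `dynamic_parameter`, `p0`, `p1`, and `sigma` are hypothetical names for the constant component, the dynamic component, and the Gaussian width):

```python
import math

def dynamic_parameter(p0, p1, x, sigma):
    """J10P-style dynamic parameter: constant part p0 plus a dynamic part p1
    shaped as a Gaussian of the model state x (the field H or the flux
    density B), with width sigma (0.2-0.8 times the saturation value).
    Near the origin the parameter tends to p0 + p1; near saturation it
    tends back to p0, matching the behavior shown in Figure 1(b)."""
    return p0 + p1 * math.exp(-x**2 / (2.0 * sigma**2))
```

With `p1 = 0` the classic constant parameter is recovered, which is how the J5P model is a special case of the J10P model.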

The CFSO³ Algorithm: From the Formulation to the Parallel Strategy
The parameter identification problem for both the classic and the modified J-A models is a nonlinear problem that features several local minima and unstable points [52]. For this reason, the approaches followed in literature for model identification tend to favor highly explorative algorithms [14,45,51].
Still, the solution found by these algorithms alone is seldom an optimal one, due to their lack of exploitation capabilities. For this reason, hybrid techniques based on the combination of explorative and exploitative algorithms [47, 52] have been proposed as a valid alternative to single algorithms. The choice of the exploitation algorithm ranges from simple gradient-descent-based techniques [47] to complex metaheuristics like the Bacterial Chemotaxis Algorithm, often used in electromagnetic inverse problems [44]. Hybrid strategies are well suited to implementation on parallel architectures, distributing the tasks of exploration and exploitation between the master and the slaves [18, 49]. In this work, the identification problem is solved through a novel algorithm that features both exploration and exploitation capabilities and runs natively on a parallel architecture. The CFSO³ algorithm belongs to the family of global optimization methods based on swarm intelligence. It is a very versatile algorithm that is able to exhibit both exploitation and exploration capabilities, according to its configuration. The core of the method is based on a modification of the well-known Particle Swarm Optimization (PSO) that features a "topological" component, called the Flock-of-Starling Optimization (FSO). The topological component is responsible for particle-to-particle interaction. Indeed, in classic PSO algorithms, the only interaction between the particles is the common attraction to the global best. In this algorithm, instead, every particle adds to its velocity vector an average of the velocities of a number of other particles in the swarm. The update equation for the particles is given in the following:

v_i^d ← ω v_i^d + μ (p_best,i^d − x_i^d) + λ (g_best^d − x_i^d) + Σ_{j=1..N} h_j v_j^d, (4)

for d = 1, ..., Δ, where Δ is the dimension of the solution and ω, μ, and λ express the inertial (tendency to maintain the direction), cognitive (tendency to follow the personal best p_best), and social (tendency to follow the global best g_best) coefficients. Without the last term (the sum), this is the equation expressing the classic PSO update law. The term Σ_{j=1..N} h_j v_j^d accounts for the particle-to-particle interaction. Compared to the PSO, the FSO presents better exploration capabilities over very large solution spaces. The FSO, however, is still a discrete algorithm whose behavior is controlled by its coefficients in a way that is not directly apparent. Indeed, by modifying the {ω, μ, λ, h} parameters, it is possible to influence the general conduct of the algorithm, but strict control of convergence, divergence, and oscillation is impossible. A major novelty in this regard was introduced with the CFSO algorithm [54, 55]. In this algorithm, the update equations of the FSO are treated as the state equations of a dynamic system and are integrated in the Laplace domain. In (5), V and X are two vectors expressing the velocities and positions of the different members of the flock. F is a forcing term that takes into account both the personal and the global bests. The submatrix A_{1,1} is expressed in (6) and combines the contributions of the inertia (ω) and of the particle-to-particle interaction (H). The submatrix A_{1,2} accounts for the cognitive and social coefficients (their sum, μ + λ). The integration in the Laplace domain of this system can be performed under some assumptions on the H matrix. However, expressing the vector F as a Laplace transform can be rather tricky. Indeed, in (7), the forcing vector is written in the time domain. The p_best(t) and g_best(t) functions are not known a priori, since they change according to the different points found by the algorithm during the exploration. For this reason, it is not possible to express them directly as Laplace transforms. To solve this problem, the approach proposed in the algorithm evaluates these two functions on a time-window (TW) basis, each time window having a fixed length. Inside the time window, it is assumed that the values of the global and personal bests of each particle do not change and are equal to the ones evaluated at the beginning of the time window. Under this assumption, it is possible to perform the Laplace integration on a time-window basis, thus effectively tracing the trajectory of the particles in closed form within each TW. Updating the particle positions using closed-form trajectories has major advantages in terms of supervising the global behavior of the algorithm. Indeed, the set of particles is now a dynamic system completely defined in the Laplace domain. It is then possible to analyse the poles of the system to understand its stability conditions. Equations (8) express the poles of X(s) for a particular case of H where all the birds are connected (Fully Connected CFSO). The symbol ∼ over the ω, μ, and h parameters denotes a value that is valid only within the current time window. Under the assumptions (ω̃ − h̃)² − 4μ̃ ≠ 0 and (ω̃ + (N + 1)h̃)² − 4μ̃ ≠ 0 (with N the number of connected particles), all poles are simple (either complex or real) with single multiplicity. We can thus apply the inverse Laplace transform and obtain the expression (9) for the single particle trajectory. Stability analysis can be performed by observing the real and imaginary parts of the poles: asymptotic stability is obtained under condition (10), whereas oscillation is obtained under condition (11). Indeed, if the poles have negative real parts, the resulting trajectories in the time domain converge towards an equilibrium point. If the poles have positive real parts, a fast divergence occurs. In both cases, if the poles are complex conjugates, these behaviors are combined with oscillations.
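The discrete FSO update law described above can be sketched as follows; the random weighting of the cognitive and social pulls and all names are assumptions, not the paper's exact formulation:

```python
import random

def fso_update(v, x, p_best, g_best, omega, mu, lam, h, neighbors):
    """One discrete FSO update for a single particle, dimension by dimension:
    inertia + cognitive pull + social pull + topological term, the latter
    being the h-weighted sum of a few other particles' velocities
    ('neighbors' is a list of their velocity vectors)."""
    dim = len(v)
    new_v = []
    for d in range(dim):
        topo = sum(h * vn[d] for vn in neighbors)   # particle-to-particle term
        vd = (omega * v[d]
              + mu * random.random() * (p_best[d] - x[d])
              + lam * random.random() * (g_best[d] - x[d])
              + topo)
        new_v.append(vd)
    new_x = [x[d] + new_v[d] for d in range(dim)]
    return new_v, new_x
```

Setting `h = 0` and dropping `neighbors` recovers the classic PSO update law, which is the relationship between the two algorithms stated in the text.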
Each of these behaviors can be used to create a hybrid strategy, similar to the ones seen in [18, 49, 52], that features both exploration and exploitation capabilities. This strategy is designed to run on a parallel architecture through the algorithm called CFSO³ [53], depicted in Figure 2.
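The pole-based behavior switching can be illustrated numerically. The sketch below assumes the second-order characteristic polynomial implied by the discriminant (ω̃ − h̃)² − 4μ̃ quoted above; sign conventions may differ from the paper's exact pole expressions (8):

```python
import cmath

def cfso_poles(omega, h, mu):
    """Poles implied by the discriminant (omega - h)^2 - 4*mu of the fully
    connected CFSO (a sketch; the paper's exact expressions are assumed)."""
    a = omega - h
    disc = cmath.sqrt(a * a - 4.0 * mu)
    return ((a + disc) / 2.0, (a - disc) / 2.0)

def classify(poles, tol=1e-12):
    """Asymptotic convergence: all real parts negative (condition (10)).
    Pure oscillation: real parts ~0 with nonzero imaginary parts (11).
    Otherwise: divergence."""
    re = [p.real for p in poles]
    if all(r < -tol for r in re):
        return "convergent"
    if all(abs(r) <= tol for r in re) and any(abs(p.imag) > tol for p in poles):
        return "oscillating"
    return "divergent"
```

This is exactly the supervision mechanism exploited by the hybrid strategy: re-tuning {ω̃, h̃, μ̃} per time window moves the poles between the three regimes.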
The CFSO³ algorithm runs on a master-slave parallel cluster. The master runs the CFSO configured to exhibit oscillating behavior (CFSOpi). Its purpose is to explore the solution space and find candidate areas where further investigation can be performed by the slaves. The slaves run the CFSO with asymptotically convergent particles (CFSOas), initialized near the candidate area identified by the master. Each time the master identifies a new area, a slave is assigned to explore it. In the meantime, to further increase the exploration capabilities of the master algorithm, its poles are temporarily switched to an unstable configuration (CFSOus) to escape from the local minimum (the candidate area just found). More details on the algorithm can be found in [53][54][55][56][57]. As a final remark, it should be pointed out that the proposed hybrid approach has been thoroughly compared with other swarm-intelligence-based algorithms, outperforming them on several different benchmark problems. In Table 1, a summary of the parametric configuration of the CFSO used in this work is reported.
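The master-slave supervision loop can be sketched serially (parallel dispatch emulated by a plain loop; every callable here is a hypothetical placeholder, not the paper's implementation):

```python
def cfso3_supervisor(explore_step, exploit, budget, is_candidate):
    """Serial sketch of the CFSO^3 strategy: the master iterates an
    oscillating CFSO ('explore_step' yields candidate points); each
    candidate area found is handed to a slave ('exploit' refines it,
    emulating an asymptotically convergent CFSO, and returns a
    (solution, cost) pair); the master then keeps exploring."""
    best = None
    for _ in range(budget):
        candidate = explore_step()            # master: CFSOpi exploration
        if is_candidate(candidate):
            refined = exploit(candidate)      # slave: CFSOas exploitation
            if best is None or refined[1] < best[1]:
                best = refined
        # here the real master would switch its poles to an unstable
        # configuration (CFSOus) to escape the just-explored area
    return best
```

The design choice mirrors the text: exploration and exploitation never compete for the same particles, because they live in different processes with different pole configurations.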

Assessment of the Generalization Capabilities of the J10P Model
The purpose of this section is to investigate the generalization capabilities of the modified Jiles-Atherton model featuring dynamic parameters by comparing two approaches: the first adjusts the dynamic parameters according to the excitation of the system, the second according to the response of the system. The former will be shortly addressed as J10P_H and the latter as J10P_B. Assessing the generalization capabilities of the model is a process composed of four steps. First, a reference dataset {H_REF, B_REF} representing the magnetic behavior of the material is generated. Second, the model is identified against this reference dataset. Third, a validation dataset {H_VAL, B_VAL} is generated. Fourth, the identified models are tested for accuracy on the validation dataset; the resulting performance error expresses the generalization capabilities of the model, that is, its ability to represent the material behavior regardless of the particular waveform. It is important to emphasize that the capability to generalize the material behavior is the most important feature for a model representing magnetic hysteresis in real applications. Indeed, in that case, the model maintains its accuracy also under distorted excitations, a common situation in power applications due to the harmonic pollution caused by nonlinear loads. In fact, the limited generalization capability of the classic Jiles-Atherton (five-parameter) model for complex waveforms was the main reason for the implementation of the modified model [52]. Previous studies analysed the generalization capabilities of the classic model through an investigation that involved several different magnetic materials and different kinds of excitation waveforms. An example of the J5P shortcomings for these problems can be found in Appendix A.
Nevertheless, a model identified on a simple {H_REF, B_REF} dataset (e.g., just a saturating waveform) is a poorly identified model, unable to correctly represent the material under distorted excitations. On the other hand, increasing the complexity of {H_REF, B_REF} makes the identification problem more difficult to solve but provides a model that is more reliable when complex waveforms are involved. This behavior, however, is not indefinitely monotonic, since very complex {H_REF, B_REF} waveforms go beyond the flexibility of the classic Jiles-Atherton model. This is the reason for the creation of the modified model: removing as much as possible the limits imposed by the few degrees of freedom of the classic model. The workbench implemented in this paper is composed of three main elements: a set of excitations for both reference and validation, an optimization tool used for model identification (CFSO³), and a set of virtual magnetic materials generated through the Preisach model (see Figure 3). The use of the Preisach model as a reference "virtual" material is a technique already used in other works in literature on the generalization properties of the Jiles-Atherton model. Indeed, using a virtual material comes with the added benefit of easily extending the investigation to several different materials, overcoming the need for sample preparation. The Preisach model can be implemented simply in its scalar form [9, 15], and the present analysis can be extended to vector fields, since both the Preisach [12, 19] and the J-A [3,40,46] models have been implemented in vector form. For this work, 200 different magnetic materials were used, each corresponding to a different hysteron distribution in the Preisach model. The hysteron distribution used for all materials was a 2D Gaussian distribution on the {α, β} plane, defined by a 1 × 2 vector for the mean value and a 2 × 2 covariance matrix. A set of four waveforms was used as excitation for each of the 200 materials
to generate the {H_REF, B_REF} dataset. The four waveforms are shown in Figures 4-7. In Figure 4, the major loop, usually found in literature as the main identification waveform, is shown. This set will be addressed as MAJ.
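Generating one such "virtual material" can be sketched as follows; the parameter ranges and the seeding are illustrative assumptions, not the paper's values:

```python
import random

def random_virtual_material(n_hysterons=2000, seed=0):
    """One 'virtual material': hysteron switching thresholds (alpha, beta)
    drawn from a 2D Gaussian on the {alpha, beta} plane with a random mean
    (1 x 2) and a correlated covariance (2 x 2), keeping only the physically
    meaningful half-plane alpha >= beta (the distribution is null for
    beta > alpha)."""
    rnd = random.Random(seed)
    mu_a, mu_b = rnd.uniform(-0.2, 0.2), rnd.uniform(-0.2, 0.2)
    s_a, s_b = rnd.uniform(0.1, 0.5), rnd.uniform(0.1, 0.5)
    rho = rnd.uniform(-0.5, 0.5)      # correlation between the two thresholds
    hysterons = []
    for _ in range(n_hysterons):
        z1, z2 = rnd.gauss(0, 1), rnd.gauss(0, 1)
        x = mu_a + s_a * z1
        y = mu_b + s_b * (rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
        hysterons.append((max(x, y), min(x, y)))   # enforce alpha >= beta
    return hysterons
```

Varying the seed yields the different hysteron distributions, one per material, that the benchmark sweeps over.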
In Figure 5, the major loop with added First-Order Reversal Curves (FORCs) is shown. This waveform is often used to identify the Preisach model, and the richness of the information contained in the curves near the saturation points is well known in literature. This set will be addressed as REV. In Figure 6, a major loop and a minor loop, connected through the First Magnetization Curve (FMC), are shown. The FMC gives information on the shape of the minor loops (indeed, all symmetric minor loops are aligned along the First Magnetization Curve). This set will be addressed as MIN. The last reference dataset, shown in Figure 7, is a complex waveform composed of the major loop, a minor loop, and FORCs, aimed at pushing the flexibility of the modified model to its limits and evaluating its generalization capabilities. This set will be addressed as FULL.
Each material is identified using the four waveforms {MAJ, REV, MIN, FULL}. Identification is performed for both the J10P_H and the J10P_B models. This gives 800 different solutions (200 materials times 4 waveforms) for each of the two models. The error related to these solutions expresses the convergence error of the model in the identification process and is given, in RMSE form, in the following:

ε_conv = sqrt((1/N) Σ_{n=1..N} (B[n] − B_REF[n])²), (12)

where N is the number of samples of the reference set and B[n] and B_REF[n] are, respectively, the nth samples generated by the model and taken from the reference dataset. Since the real goal of the analysis is to test the generalization capabilities of the model, and this must be done against a validation dataset, a set of 25 new waveforms, featuring a random composition of minor loops and FORCs along the major loop, was created. This set of 25 waveforms (an example can be seen in Figure 8) was used on the same 200 magnetic materials. The corresponding error, given in the following, is the performance error and expresses the generalization capabilities of the modified model:

ε_perf = sqrt((1/N) Σ_{n=1..N} (B[n] − B_VAL[n])²). (13)

The results obtained from the analysis are reported in the following. In Table 2, the flexibility results for the identification process are reported. In Table 3, the generalization results are reported. In both tables, the results are obtained by averaging the performances over the 200 different materials. As can be seen from the first table, as long as the waveform is simple, the performances of the J10P_H and J10P_B models on MAJ, REV, and MIN are comparable. On the other hand, on a very complex waveform (FULL), the dynamic dependence of the J10P_B (i.e., relative to B) provides twice the accuracy in the identification procedure. It can be concluded that, for simple waveforms, using a dependence on the excitation H or on the response B leads to the same results, but to identify more complex waveforms, a response dependence is needed. In Table 3, the generalization capabilities of the models are compared. As expected, simpler waveforms
yield worse generalization capabilities than more complex ones. The MAJ results are not very meaningful: if model generalization is the goal, identifying a model on the major loop alone is the worst possible strategy. This is confirmed by the high error for both models. The error, as expected, decreases for more complex waveforms (REV and MIN) and reaches a minimum with the most complex one (FULL). The best generalization results are obtained with the J10P_B using the FULL identification dataset.
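Both the convergence and the performance errors reduce to a plain RMSE between model output and reference samples (a sketch; the paper's exact normalization is assumed):

```python
import math

def rmse(model_out, reference):
    """Convergence/performance error as an RMSE over the N paired samples
    produced by the model and taken from the reference (or validation)
    dataset."""
    n = len(reference)
    return math.sqrt(sum((m - r) ** 2 for m, r in zip(model_out, reference)) / n)
```

Feeding the identification waveform gives the convergence error of Table 2; feeding a validation waveform gives the performance error of Table 3.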

Conclusions
A performance analysis of the ten-parameter model for the representation of the magnetic behavior of a ferromagnetic material was presented. The analysis focused on two key aspects of model performance: its capability to fit a set of experimental data from a material (flexibility) and its ability, once identified, to represent any behavior of the same material (generalization). The analysis compared two configurations of the same model, one featuring an excitation-dependent parameterization and one featuring a response-dependent parameterization. Both configurations were tested through identification and validation on a wide set of virtual magnetic materials. The identification problem faced was rather complex, being a nonlinear and multimodal optimization problem with a large solution space. In light of that, a powerful algorithm based on swarm intelligence, featuring a hybrid configuration that exhibits both exploration and exploitation capabilities on a parallel architecture, was used to identify the model. The results of the analysis confirm that the use of complex waveforms for model identification, despite making the identification procedure harder, pays off in terms of generalization. The most complex waveforms used in this work are better identified, also in terms of generalization, by the response-dependent model J10P_B. This is the most suitable configuration to obtain a model that gives an accurate response when complex and distorted waveforms must be simulated. However, for simpler waveforms, good results (especially in terms of flexibility) have been obtained with the excitation-dependent J10P_H model.

A. On the Flexibility of the Static Jiles-Atherton Model
The shortcomings of the original J-A model can be summarized as a lack of generalization capability due to low flexibility. Figure 9 shows an example of how the classic model is typically identified in literature: by using the simple saturating loop (MAJ) as reference. As can be seen, the performance on the FORCs and minor loops is very poor. Indeed, identifying the model on the MAJ waveform yields unacceptable generalization capabilities if complex waveforms are to be simulated. As shown in the analysis of this work, using waveforms featuring complex patterns grants better generalization capabilities. However, such waveforms are beyond the flexibility limits of the classic J-A model, as can be seen in Figure 10: the model simply cannot be identified.
For the classic J-A model, the best results in terms of generalization capabilities can be obtained by identifying the model on a waveform composed of the major loop, the First Magnetization Curve, and a set of FORCs.Attempts to increase the complexity of the waveform further were unsuccessful.

B. On the Performance of the Static Jiles-Atherton Model with Less Than 10 Dynamic Parameters
The use of ten dynamic parameters yields, as shown in this work, the best performance for the modified J-A model. However, if computational cost is a key factor in the implementation, or if the identification algorithm is not powerful enough to identify the full model correctly, a scaled-down model that falls between the classic and the ten-parameter one can be used. Indeed, not all the parameters need to be dynamic, especially if the problem at hand is easy (i.e., very simple waveforms). Table 4 summarizes the performance of the model obtained by adding the dynamic parameters one at a time. For each model, only the best new parameter (i.e., the one that gave the largest increase in performance) was added. Two performance indexes are given: the RMSE encountered during the identification process (convergence error) and the percentage of performance gained by each model relative to the ten-parameter one (% Incremental Performance). The magnetic field dependence was used, and the waveform under test is the Full Wave. Comparable results were obtained with the flux density dependence.

Figure 3 :
Figure 3: Preisach model for magnetic hysteresis implemented in the Matlab environment. The hysteron distribution on the {α, β} plane can be seen in (b) (the distribution is assumed null for β > α), resulting in the hysteresis cycle in (a).

Figure 6 :
Figure 6: MIN reference dataset: major and minor loops.

Figure 8 :
Figure 8: An example of a validation dataset, featuring two minor cycles, two top-side FORCs, two bottom-side FORCs, and the major loop.

Figure 9 :
Figure 9: Generalization capabilities of the classic J-A model with five parameters.

Figure 10 :
Figure 10: Failure in identifying the model using the Full Wave excitation.
Figure 2: Conceptual representation of the CFSO³ algorithm: the original domain of the solution space (yellow) is explored by the master using the CFSOpi algorithm. Once a candidate area is found by a particle, a slave is dispatched to explore the subdomain (light blue) using the CFSOas algorithm.

Table 2 :
Flexibility of the two models for the four waveforms used during identification.

Table 3 :
Generalization of the two models for the four waveforms used during identification.

Table 4 :
Intermediate models between the J5P and the J10P.