Universal Approximation of a Class of Interval Type-2 Fuzzy Neural Networks in Nonlinear Identification

Neural networks (NNs), type-1 fuzzy logic systems (T1FLSs), and interval type-2 fuzzy logic systems (IT2FLSs) have been shown to be universal approximators, which means that they can approximate any nonlinear continuous function. Recent research shows that embedding an IT2FLS in an NN can be very effective for a wide range of nonlinear complex systems, especially when handling imperfect or incomplete information. In this paper we show, based on the Stone-Weierstrass theorem, that an interval type-2 fuzzy neural network (IT2FNN), which uses a set of rules and interval type-2 membership functions (IT2MFs), is a universal approximator. Simulation results of nonlinear function identification using the IT2FNN for one and three variables and for the Mackey-Glass chaotic time series prediction are presented to illustrate the concept of universal approximation.

The main contribution of this paper is the proposed IT2FNN architectures, which are shown to be universal approximators and are illustrated with several benchmark problems to verify their applicability to real-world problems.

Interval Type-2 Fuzzy Neural Networks
An IT2FNN [15,31,35] combines a TSK interval type-2 fuzzy inference system (TSKIT2FIS) [13,14,33,34] with an adaptive NN in order to take advantage of the best characteristics of both models. In general, when representing an IT2FNN graphically, rectangles are used to represent adaptive nodes and circles to represent nonadaptive nodes. Output values of even nodes (green) and odd nodes (blue) represent uncertainty intervals (Figures 1-4). In this kind of interval type-2 neurofuzzy adaptive network, nodes represent processing units called neurons, which can be classified into crisp and fuzzy neurons.
The IT2FNN-1 architecture has 5 layers (Figure 1) [35] and consists of adaptive nodes with a function equivalent to lower-upper membership in the fuzzification layer (layer 1). Nonadaptive nodes in the rules layer (layer 2) interconnect with the fuzzification layer (layer 1) in order to generate the antecedents of the TSK IT2FIS rules. Adaptive nodes in the consequent layer (layer 3) are connected to the input layer (layer 0) to generate the rule consequents. Nonadaptive nodes in the type-reduction layer (layer 4) evaluate left-right values with the Karnik and Mendel (KM) algorithm [13,14]. The nonadaptive nodes in the defuzzification layer (layer 5) average the left-right values.
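The KM type-reduction step named above can be illustrated with a short sketch. It is a minimal version of the usual iterative switch-point search, assuming crisp rule consequents `y` and lower/upper firing strengths `f_lo`, `f_hi` with at least two rules and strictly positive upper firing strengths. The function names `km_endpoint` and `km_type_reduce` are hypothetical and are not taken from [13,14] or [35].

```python
import numpy as np

def km_endpoint(y, f_lo, f_hi, left=True, max_iter=100):
    """One endpoint of the type-reduced interval (assumes len(y) >= 2, f_hi > 0)."""
    y = np.asarray(y, dtype=float)
    f_lo = np.asarray(f_lo, dtype=float)
    f_hi = np.asarray(f_hi, dtype=float)
    order = np.argsort(y)                      # KM works on sorted consequents
    y, f_lo, f_hi = y[order], f_lo[order], f_hi[order]
    idx = np.arange(len(y))
    f = 0.5 * (f_lo + f_hi)                    # start from the interval midpoints
    y_prev = np.dot(f, y) / np.sum(f)
    for _ in range(max_iter):
        # switch point k such that y[k] <= y_prev <= y[k + 1]
        k = int(np.searchsorted(y, y_prev, side="right")) - 1
        k = min(max(k, 0), len(y) - 2)
        if left:   # left endpoint: upper weights on small y, lower on large y
            f = np.where(idx <= k, f_hi, f_lo)
        else:      # right endpoint: lower weights on small y, upper on large y
            f = np.where(idx <= k, f_lo, f_hi)
        y_new = np.dot(f, y) / np.sum(f)
        if np.isclose(y_new, y_prev, rtol=0.0, atol=1e-12):
            break
        y_prev = y_new
    return float(y_new)

def km_type_reduce(y, f_lo, f_hi):
    """Return the left-right values (y_l, y_r) of the type-reduced interval."""
    return (km_endpoint(y, f_lo, f_hi, left=True),
            km_endpoint(y, f_lo, f_hi, left=False))
```

A defuzzified output in the sense of the last layer is then the average of the two returned values.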
The IT2FNN-3 architecture has 8 layers (Figure 2) [35] and uses IT2FN for fuzzifying the inputs (layers 1-2). Nonadaptive nodes in the rules layer (layer 3) interconnect with the lower-upper linguistic values layer (layer 2) to generate the antecedents of the TSK IT2FIS rules. Adaptive nodes in layer 4 adapt the left-right firing strengths, biasing the rules' lower-upper firing strengths with synaptic weights between layers 3 and 4. Nonadaptive nodes in layer 5 normalize the rules' lower-upper firing strengths. Nonadaptive nodes in the consequent layer (layer 6) interconnect with the input layer (layer 0) to generate the rule consequents. Nonadaptive nodes in the type-reduction layer (layer 7) evaluate left-right values.
Architectures IT2FNN-0 and IT2FNN-2, which will be shown to be universal approximators in Sections 3.2 and 3.3, respectively, are described in more detail in Section 2.1.

IT2FNN-0
Architecture. An IT2FNN-0 is a seven-layer IT2FNN, which integrates a first-order TSKIT2FIS (interval type-2 fuzzy antecedents and real consequents) with an adaptive NN. The IT2FNN-0 (Figure 3) layers are described as follows.
Layer 2. Nonadaptive T1FN. This layer contains T-norm and S-norm fuzzy nodes. The nodes are addressed through the table ℓ of indices of the antecedents of the rules, together with a vector of indices for each node of layer 2; the node outputs are the lower and upper membership function values, and two further vectors hold the even and odd indices of the nodes of layer 2.
Layer 3. Lower-upper firing strength. This layer has nonadaptive nodes for generating the lower-upper firing strengths of the TSK IT2FIS rules (7).
Layer 4. Lower-upper firing strength rule normalization. Nodes in this layer are nonadaptive, and the output is defined as the ratio between the lower-upper firing strength of the $k$th rule and the sum of the lower-upper firing strengths of all rules (13) and (14). If we view these normalized firing strengths as fuzzy basis functions (FBFs) (32) and (33) and the rule consequents $y^k(x)$ as linear functions (16), then $\hat{y}(x)$ can be viewed as a linear combination of the basis functions (20) and (21).
Layer 5. Rule consequents. Each node is adaptive and its parameters are $\{c_i^k, c_0^k\}$. The node's output corresponds to the partial output of the $k$th rule, $y^k(x) = \sum_{i=1}^{n} c_i^k x_i + c_0^k$ (16).
Layer 7. Defuzzification. This layer's node is adaptive, and its output $\hat{y}$, (20) and (21), is defined as the weighted average of the left-right values, controlled by a weighting parameter. This parameter (default value 0.5) adjusts the uncertainty interval defined by the left-right values $[\hat{y}_l, \hat{y}_r]$.
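The layer-by-layer description can be traced with a small forward-pass sketch. It is illustrative only and rests on several assumptions that the text does not confirm: a Gaussian IT2MF with uncertain mean `[m1, m2]` and shared width `sigma`, a product T-norm for the firing strengths, left-right values taken directly from the two normalized FBF expansions (a simplification of the type-reduction step), and a weighting parameter called `gamma`. All function and variable names are hypothetical.

```python
import numpy as np

def it2_gauss_uncertain_mean(x, m1, m2, sigma):
    """Lower/upper membership grades of a Gaussian IT2MF with mean in [m1, m2]."""
    g1 = np.exp(-0.5 * ((x - m1) / sigma) ** 2)
    g2 = np.exp(-0.5 * ((x - m2) / sigma) ** 2)
    lower = np.minimum(g1, g2)
    upper = np.where((x >= m1) & (x <= m2), 1.0, np.maximum(g1, g2))
    return lower, upper

def it2fnn0_forward(x, m1, m2, sigma, C, c0, gamma=0.5):
    """Forward pass for M rules and n inputs; m1, m2, C have shape (M, n)."""
    lower, upper = it2_gauss_uncertain_mean(x, m1, m2, sigma)  # fuzzification
    f_lo = lower.prod(axis=1)            # lower firing strength of each rule
    f_hi = upper.prod(axis=1)            # upper firing strength of each rule
    phi_lo = f_lo / f_lo.sum()           # normalized lower FBFs
    phi_hi = f_hi / f_hi.sum()           # normalized upper FBFs
    y_rule = C @ x + c0                  # linear rule consequents y^k(x)
    y_a, y_b = phi_lo @ y_rule, phi_hi @ y_rule
    y_l, y_r = min(y_a, y_b), max(y_a, y_b)   # assumed left-right values
    y_hat = gamma * y_l + (1.0 - gamma) * y_r  # weighted-average defuzzification
    return float(y_hat), (float(y_l), float(y_r))
```

With `gamma = 0.5` the output is the midpoint of the interval; when `m1 == m2` the membership functions carry no uncertainty and the interval collapses to a single point.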

IT2FNN-2
Architecture. An IT2FNN-2 [31] is a six-layer IT2FNN, which integrates a first-order TSKIT2FIS (interval type-2 fuzzy antecedents and interval type-1 fuzzy consequents) with an adaptive NN. The IT2FNN-2 (Figure 4) layers are described in a similar way to the previous architectures.

IT2FNN as a Universal Approximator
Based on the description of the interval type-2 fuzzy neural networks, it is possible to prove, using the Stone-Weierstrass theorem [5,6,10,30], that under certain conditions the resulting IT2FIS has unlimited approximation power to match any nonlinear function on a compact set [36,37].

Stone-Weierstrass Theorem
Theorem 1 (Stone-Weierstrass theorem). Let $Z$ be a set of real continuous functions on a compact set $U$. If (1) $Z$ is an algebra, that is, the set $Z$ is closed under addition, multiplication, and scalar multiplication, (2) $Z$ separates points on $U$, that is, for every $x, y \in U$, $x \neq y$, there exists $f \in Z$ such that $f(x) \neq f(y)$, and (3) $Z$ vanishes at no point of $U$, that is, for each $x \in U$ there exists $f \in Z$ such that $f(x) \neq 0$, then the uniform closure of $Z$ consists of all real continuous functions on $U$; that is, $(Z, d_\infty)$ is dense in $(C[U], d_\infty)$ [36-38].

Applying Stone-Weierstrass Theorem to the IT2FNN-0
Architecture. In the IT2FNN-0, the domain on which we operate is almost always compact. It is a standard result in real analysis that every closed and bounded set in $\mathbb{R}^n$ is compact. We now apply the Stone-Weierstrass theorem to show the representational power of the IT2FNN with simplified fuzzy if-then rules. We consider a subset of the IT2FNN-0 in Figure 5. The set of IT2FNN-0s with singleton fuzzifier, product inference, center-of-sets type reduction, and Gaussian interval type-2 membership functions consists of all FBF expansion functions of the form (38), (40).
), defined by (27) and (31). If we view the normalized lower and upper firing strengths as fuzzy basis functions (32) and (33) and the rule consequents $y^k(x)$ as linear functions (34), then $\hat{y}(x)$ of (38) and (40) can be viewed as a linear combination of the fuzzy basis functions, and the IT2FNN-0 system is then equivalent to an FBF expansion. Let $Y$ be the set of all the FBF expansions (38) and (40) with the fuzzy basis functions given by (13) and (38), and let $d_\infty$ be the sup-metric, so that $(Y, d_\infty)$ is a metric space [38]. We use the Stone-Weierstrass theorem (Theorem 1) to prove our result.
Suppose we have two IT2FNN-0s $f_1, f_2 \in Y$; the output of each system can be expressed as an FBF expansion of the form (38), (40).
Lemma 3. $Y$ is closed under addition.
Proof. The proof of this lemma requires our IT2FNN-0 to be able to approximate sums of functions. Suppose we have two IT2FNN-0s, $f_1(x)$ and $f_2(x)$, each with its own set of rules. The output of each system can be expressed as an FBF expansion, and their sum can be written over the combined fuzzy basis functions $\Phi$, indexed by pairs of rules (one from each system), where the FBFs are known to be nonlinear. Therefore, an IT2FNN-0 equivalent to the sum of $f_1(x)$ and $f_2(x)$ can be constructed, whose consequents are formed by adding the consequents of the two systems. These consequents can be linear, since the FBFs are a nonlinear interval basis, and therefore the resultant function is a nonlinear interval function (see Figure 5).
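The construction behind the closure lemmas can be checked numerically. The sketch below is a simplification, not the paper's derivation: it works with a single bound (one set of firing strengths, for example the lower ones), Gaussian firing strengths, and linear consequents, and it treats the pairwise products of firing strengths as the firing strengths of the combined rules. The names `rule_terms`, `fbf_output`, and `combine` are hypothetical.

```python
import numpy as np

def rule_terms(x, centers, sigma, C, c0):
    """Firing strengths f_k(x) and linear consequents y^k(x) of one system."""
    f = np.exp(-0.5 * np.sum(((x - centers) / sigma) ** 2, axis=1))
    return f, C @ x + c0

def fbf_output(f, y):
    """FBF expansion: sum_k (f_k / sum_j f_j) * y_k."""
    return float(f @ y / f.sum())

def combine(x, sys1, sys2, op):
    """Single FBF expansion over all rule pairs; op is np.add or np.multiply."""
    f1, y1 = rule_terms(x, *sys1)
    f2, y2 = rule_terms(x, *sys2)
    F = np.outer(f1, f2)                      # firing strength of each rule pair
    Y = op(y1[:, None], y2[None, :])          # combined consequent of each pair
    return float(np.sum(F * Y) / F.sum())
```

For any input, `combine(x, sys1, sys2, np.add)` equals the sum of the two individual outputs and `combine(x, sys1, sys2, np.multiply)` equals their product, which is the algebraic identity the closure arguments rely on.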

Lemma 5. 𝑌 is closed under scalar multiplication.
Proof. Let an arbitrary IT2FNN-0 be $f(x)$ (20). For a scalar $a$, the scalar multiple $a f(x)$ is again an FBF expansion in which every rule consequent is multiplied by $a$. Therefore we can construct an IT2FNN-0 that computes $a f(x)$ in the form of the proposed IT2FNN-0; hence $Y$ is closed under scalar multiplication.
); that is, only one interval type-2 fuzzy set is defined. We define two real-valued consequents with $y^k(x) = \sum_{i=1}^{n} c_i^k x_i + c_0^k$, where $k = 1, 2$. Now we have specified all the design parameters except the consequent parameters; that is, we have already obtained a function $f$ which is in the form of (10) with two rules and given by (18), (20), and (21). Since $x^0 \neq y^0$, there must be some $i$ such that $x_i^0 \neq y_i^0$; hence, with this $f$, we have $f(x^0) \neq f(y^0)$. Separability is satisfied whenever an IT2FNN-0 can compute strictly monotonic functions of each input variable. This can easily be achieved by adjusting the membership functions of the premise part. Therefore, $(Y, d_\infty)$ separates points on $U$.

Lemma 7. For each $x \in U$, there exists $f \in Y$ such that $f(x) \neq 0$; that is, $Y$ vanishes at no point of $U$.
Proof of Theorem 2. From (20) and (21), it is evident that $Y$ is a set of real continuous functions on $U$, which are established by using complete interval type-2 fuzzy sets in the IF parts of the fuzzy rules. Using Lemmas 3, 4, and 5, $Y$ is proved to be an algebra. By using the Stone-Weierstrass theorem together with Lemmas 6 and 7, we establish that the proposed IT2FNN-0 possesses the universal approximation capability.

Applying the Stone-Weierstrass Theorem to the IT2FNN-2
Architecture. We now consider a subset of the IT2FNN-2 in Figure 2. The set of IT2FNN-2s with singleton fuzzifier, product inference, type-reduction defuzzifier (KM) [13,14], and Gaussian interval type-2 membership functions consists of all FBF expansion functions $f : U \subset \mathbb{R}^n \to \mathbb{R}$, and $(Y, d_\infty)$ is a metric space [38]. The following theorem shows that $(Y, d_\infty)$ is dense in $(C[U], d_\infty)$, where $C[U]$ is the set of all real continuous functions defined on $U$. We use the Stone-Weierstrass theorem (Theorem 1) to prove it.
Suppose we have two IT2FNN-2s $f_1, f_2 \in Y$; the output of each system can be expressed as an FBF expansion.
Advances in Fuzzy Systems
Lemma 8. $Y$ is closed under addition.
Proof. The proof of this lemma requires our IT2FNN-2 to be able to approximate sums of functions. Suppose we have two IT2FNN-2s, $f_1(x)$ and $f_2(x)$, each with its own set of rules. The output of each system can be expressed as an FBF expansion. Therefore, an IT2FNN-2 equivalent to the sum of $f_1(x)$ and $f_2(x)$ can be constructed, whose consequents are formed by adding the consequents of the two systems. These consequents can be linear intervals, since the FBFs are a nonlinear basis, and therefore the resultant function is a nonlinear interval function (see Figure 6).

Lemma 9. 𝑌 is closed under multiplication.
Proof. In a similar way to Lemma 8, we model the product $f_1(x) f_2(x)$ of two IT2FNN-2s, which is the last point we need to demonstrate before we can conclude that the Stone-Weierstrass theorem can be applied to the proposed reasoning mechanism. The product $f_1(x) f_2(x)$ can be expressed as an FBF expansion over pairs of rules. Therefore, an IT2FNN-2 equivalent to the product of $f_1(x)$ and $f_2(x)$ can be constructed, whose consequents are the products of the consequents of the two systems, each multiplied by the respective FBF of the expansion (Theorem 1), and there exists $f \in Y$ such that $\sup_{x \in U} |g(x) - f(x)| < \varepsilon$ (Theorem 2). Since $f(x) = f_1(x) f_2(x)$ has the same form as the expansions treated in Lemma 3, we can conclude that $Y$ is closed under multiplication. Note that the consequents of $f_1$ and $f_2$ can be linear intervals, since the FBFs are a nonlinear interval basis, and therefore the resultant function is a nonlinear interval function. Also, even if the consequents of $f_1$ and $f_2$ were linear, their product is evidently a polynomial interval function (see Figure 10).
Lemma 12. For each $x \in U$, there exists $f \in Y$ such that $f(x) \neq 0$; that is, $Y$ vanishes at no point of $U$.
Proof of Theorem 2. From (20) and (21), it is evident that $Y$ is a set of real continuous functions on $U$, which are established by using complete interval type-2 fuzzy sets in the IF parts of the fuzzy rules. Using Lemmas 8, 9, and 10, $Y$ is proved to be an algebra. By using the Stone-Weierstrass theorem together with Lemmas 11 and 12, we establish that the proposed IT2FNN-2 possesses the universal approximation capability.
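Therefore, by choosing an appropriate class of interval type-2 membership functions, we can conclude that the IT2FNN-0 and IT2FNN-2 with simplified fuzzy if-then rules satisfy the five criteria of the Stone-Weierstrass theorem (closure under addition, multiplication, and scalar multiplication, separation of points, and nonvanishing).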

Application Examples
In this section, simulation results using ANFIS, IT2FNN-0, IT2FNN-1 [35], IT2FNN-2, and IT2FNN-3 [35] are presented for nonlinear system identification and for forecasting the Mackey-Glass chaotic time series [39] with $\tau = 60$, using different signal-to-noise ratio values, SNR (dB) = 0, 10, 20, 30, and noise-free data, as the source of uncertainty. These examples are used as benchmark problems to test the ideas proposed in the paper. The IT2FNN-1 and IT2FNN-3 architectures are very similar to IT2FNN-0 and IT2FNN-2, respectively [35], and their results are presented for comparison purposes. The proposed IT2FNN architectures are validated using 10-fold cross-validation [40,41], considering the sum of squared errors (SSE) or the root mean square error (RMSE) in the training or test phase. We use cross-validation to measure the variability of the RMSE in the training and testing phases in order to compare the IT2FNN architectures. The cross-validation partitions are generated with Matlab's crossvalind function, and noise is added with Matlab's awgn function.
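The validation protocol can be sketched in a few lines. This is a NumPy stand-in under stated assumptions, not the Matlab code used for the experiments: `add_awgn` scales the noise to the measured signal power, `kfold_indices` draws a random partition, and `fit` and `predict` are hypothetical callables standing in for training and evaluating ANFIS or an IT2FNN.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise at the given SNR (dB), relative to measured power."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

def kfold_indices(n, k, rng):
    """Randomly partition the indices 0..n-1 into k nearly equal folds."""
    return np.array_split(rng.permutation(n), k)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return per-fold training (TRN) and test (CHK) RMSE values."""
    rng = np.random.default_rng(seed)
    folds = kfold_indices(len(y), k, rng)
    trn, chk = [], []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])      # train on k-1 folds
        trn.append(rmse(y[train_idx], predict(model, X[train_idx])))
        chk.append(rmse(y[test_idx], predict(model, X[test_idx])))
    return np.array(trn), np.array(chk)
```

The spread of the returned `chk` values across folds is the variability of the test RMSE that the comparison between architectures relies on.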
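In $k$-fold cross-validation [40], the original sample is randomly partitioned into $k$ subsamples. Of the $k$ subsamples, a single subsample is retained as validation data for testing the model, and the remaining $k - 1$ subsamples are used as training data.
Table 1 and Figure 7 show the resulting RMSE (CHK) values for ANFIS and IT2FNN; it can be seen that the IT2FNN architectures [31] perform better than ANFIS.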
Experiment 2 (identification of a three-variable nonlinear function). A three-input, one-output IT2FNN is used to approximate the nonlinear Sugeno [27] function $f : \mathbb{R}^3 \to \mathbb{R}$. A total of 216 training data sets are generated with 10-fold cross-validation and 125 for testing; the model uses 2 igaussmtype2 IT2MFs for each input, 8 rules, and 50 epochs. Once the ANFIS and IT2FNN models are identified, a comparison is made using RMSE statistic values and 10-fold cross-validation. Table 2 and Figure 8 show the resulting RMSE (CHK) values for ANFIS and IT2FNN. It can be seen that the IT2FNN architectures [31] perform better than ANFIS.
The Mackey-Glass chaotic time series is a well-known benchmark [39] for systems modeling.

Conclusions
In this paper we have shown that an interval type-2 fuzzy neural network (IT2FNN) is a universal approximator. Simulation results of nonlinear function identification using the IT2FNN for one and three variables have been presented. We have also illustrated the ideas presented in the paper with the benchmark problem of Mackey-Glass chaotic time series prediction.
The nonadaptive nodes in the type-reduction layer evaluate left-right values by adding the lower-upper products of the normalized lower-upper firing strengths and the rule consequents' left-right values. The node in the defuzzification layer is adaptive, and its output $\hat{y}$ is defined as the biased average of the left-right values, controlled by a weighting parameter. This parameter (0.5 by default) adjusts the uncertainty interval defined by the left-right values $[\hat{y}_l, \hat{y}_r]$.

Experiment 3 (predicting the Mackey-Glass chaotic time series). The time series with $\tau = 60$ are generated based on the initial condition $x(0) = 1.2$, using the fourth-order Runge-Kutta method and adding different levels of uniform noise. For comparison with other methods, an input-output vector is chosen for the IT2FNN model with the following format: $[x(t - 18), x(t - 12), x(t - 6), x(t); x(t + 6)]$ (59). A four-input, one-output IT2FNN model is used for Mackey-Glass chaotic time series prediction, choosing 500 data sets for training and 500 data sets for testing with a 10-fold cross-validation test, 2 IT2MFs for each input with membership function igaussmtype2, 16 rules, and 50 epochs. The ANFIS and IT2FNN models are identified, and their RMSE statistical values are compared with 10-fold cross-validation.
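The data preparation for this experiment can be sketched as follows. The sketch assumes the commonly used benchmark form of the Mackey-Glass equation, $dx/dt = 0.2\,x(t-\tau)/(1 + x(t-\tau)^{10}) - 0.1\,x(t)$, a zero history for $t < 0$, an integration step of 0.1, and a delayed term held constant within each Runge-Kutta step; none of these details are stated in the text, and the noise-free series is generated. The names `mackey_glass` and `make_dataset` are hypothetical.

```python
import numpy as np

def mackey_glass(n_samples, tau=60.0, x0=1.2, dt=0.1, a=0.2, b=0.1, p=10):
    """Return x(0), x(1), ..., x(n_samples - 1) integrated with RK4 (assumed setup)."""
    steps_per_unit = int(round(1.0 / dt))
    delay = int(round(tau / dt))
    total = (n_samples - 1) * steps_per_unit
    x = np.zeros(total + 1)
    x[0] = x0
    for i in range(total):
        x_tau = x[i - delay] if i >= delay else 0.0   # assumed zero history
        drive = a * x_tau / (1.0 + x_tau ** p)        # delayed term, held fixed
        rhs = lambda v: drive - b * v
        k1 = rhs(x[i])
        k2 = rhs(x[i] + 0.5 * dt * k1)
        k3 = rhs(x[i] + 0.5 * dt * k2)
        k4 = rhs(x[i] + dt * k3)
        x[i + 1] = x[i] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x[::steps_per_unit]

def make_dataset(series, lags=(18, 12, 6, 0), horizon=6):
    """Build rows [x(t-18), x(t-12), x(t-6), x(t)] with target x(t+6)."""
    t = np.arange(max(lags), len(series) - horizon)
    X = np.stack([series[t - lag] for lag in lags], axis=1)
    y = series[t + horizon]
    return X, y

series = mackey_glass(1024)       # 1024 samples yield exactly 1000 input-output pairs
X, y = make_dataset(series)
X_train, y_train = X[:500], y[:500]
X_test, y_test = X[500:], y[500:]
```

Splitting the 1000 pairs into the first and last 500 is one simple way to obtain the training and test sets; the text does not say how the split was made.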
taking into account RMSE statistic values with 10-fold cross-validation. Table 1 and Figure

Table 3 and Figures 9 and 10 show the number of points that fall outside the uncertainty interval $[\hat{y}_l(x), \hat{y}_r(x)]$ evaluated by the IT2FNN model, together with the training (TRN) and test (CHK) RMSE values obtained for the ANFIS and IT2FNN models. It can be seen that the IT2FNN architectures predict the Mackey-Glass chaotic time series better.

Table 3: RMSE (TRN/CHK) and number of points outside the uncertainty interval, determined by the ANFIS and IT2FNN models with 10-fold cross-validation for Mackey-Glass chaotic time series prediction with $\tau = 60$.