In many practical applications, it turns out to be useful to use the notion
of fuzzy transform: once we have functions $A_1(x)\ge 0,\ldots,A_n(x)\ge 0$ with $\sum_{i=1}^{n}A_i(x)=1$, we can represent each function $f(x)$ by the coefficients $F_i=\left(\int f(x)\cdot A_i(x)\,dx\right)/\left(\int A_i(x)\,dx\right)$. Once we know the coefficients $F_i$, we can (approximately) reconstruct the original function $f(x)$ as $\sum_{i=1}^{n}F_i\cdot A_i(x)$. The original motivation for this transformation came from fuzzy modeling,
but the transformation itself is a purely mathematical transformation.
Thus, the empirical successes of this transformation suggest that
it can also be interpreted in terms of more traditional (nonfuzzy)
mathematics.
Such an interpretation is presented in this paper. Specifically, we show
that the 2002 probabilistic interpretation of fuzzy modeling by Sánchez
et al. can be modified into a natural probabilistic explanation of fuzzy
transform formulas.
1. Introduction: Fuzzy Transform and the Need for Its Probabilistic Interpretation
1.1. Fuzzy Transform: A Definition
The notion of a fuzzy transform (F-transform, for short) turned out to be very useful in many application areas such as image compression and solving differential equations under initial uncertainty; see, for example, [1, 2] and references therein.
Generally speaking, the F-transform of a function f is a vector whose components are weighted local mean values of f. The first step in the definition of the F-transform of $f:X\to\mathbb{R}$ is the selection of a fuzzy partition of the universal set X (e.g., a bounded interval $[a,b]\subset\mathbb{R}$) by a finite set of basic functions $A_1(x)\ge 0,\ldots,A_n(x)\ge 0$, which are continuous and satisfy the condition
$$\sum_{i=1}^{n}A_i(x)=1. \tag{1}$$
Basic functions are called membership functions of respective fuzzy sets, or, alternatively, granules, information pieces, etc. Their choice reflects the type of uncertainty which is related to the knowledge of x.
Once the basic functions are selected, we define the F-transform of a continuous function $f:X\to\mathbb{R}$ as a vector $(F_1,\ldots,F_n)$, where
$$F_i\stackrel{\text{def}}{=}\frac{\int f(x)\cdot A_i(x)\,dx}{\int A_i(x)\,dx}. \tag{2}$$
F-transform satisfies the following properties [1, 2]:
$y=F_i$ minimizes $\int_a^b (f(x)-y)^2 A_i(x)\,dx$,
for a twice continuously differentiable function f, $F_i=f(x_i)+O(h_i^2)$, where $h_i$ is the length of the support of $A_i$.
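The first of these properties can be checked by a short calculus argument, sketched here under the assumption that $\int_a^b A_i(x)\,dx>0$ (the auxiliary function $\Phi$ is introduced only for this sketch):
$$\Phi(y)=\int_a^b (f(x)-y)^2 A_i(x)\,dx,\qquad \Phi'(y)=-2\int_a^b (f(x)-y)\,A_i(x)\,dx=0\iff y=\frac{\int_a^b f(x)\,A_i(x)\,dx}{\int_a^b A_i(x)\,dx}=F_i.$$
Since $\Phi''(y)=2\int_a^b A_i(x)\,dx>0$, this stationary point is indeed the minimum.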
F-transform is used in applications as a “skeleton model” of f. This model provides a compressed image if f is an image [3], values of a trend if f is a time series [4], a numeric model if f is used in numeric computations (integration, differentiation) [5], etc.
Once we know the F-transform components Fi, we can (approximately) reconstruct the original function f as
$$\bar f(x)=\sum_{i=1}^{n}F_i\cdot A_i(x). \tag{3}$$
In [1], the formula (3) is called the F-transform inversion formula. The formula (3) represents a continuous function that approximates f. Under certain reasonable conditions, a sequence of functions represented by (3) uniformly converges to f (see [1] for more details).
Example 1.
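Let us give an example of the F-transform of $x^2$ on the domain [0,1] with respect to $A_1,\ldots,A_5$. For simplicity, we assume that the basic functions $A_1,\ldots,A_5$ are of triangular shape and constitute a uniform fuzzy partition of [0,1]. Their analytical representation is as follows:
$$\begin{aligned}
A_1(x)&=\begin{cases}1-4x, & \text{if } x\in[0,0.25],\\ 0, & \text{otherwise},\end{cases}\\
A_2(x)&=\begin{cases}4x, & \text{if } x\in[0,0.25],\\ 2-4x, & \text{if } x\in[0.25,0.5],\\ 0, & \text{otherwise},\end{cases}\\
A_3(x)&=\begin{cases}4x-1, & \text{if } x\in[0.25,0.5],\\ 3-4x, & \text{if } x\in[0.5,0.75],\\ 0, & \text{otherwise},\end{cases}\\
A_4(x)&=\begin{cases}4x-2, & \text{if } x\in[0.5,0.75],\\ 4-4x, & \text{if } x\in[0.75,1.0],\\ 0, & \text{otherwise},\end{cases}\\
A_5(x)&=\begin{cases}4x-3, & \text{if } x\in[0.75,1.0],\\ 0, & \text{otherwise}.\end{cases}
\end{aligned}\tag{4}$$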
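By (2), the values of the components $F_1,\ldots,F_5$ of the F-transform are
$$\begin{aligned}
F_1&=\frac{1}{0.125}\cdot\int_0^{0.25}x^2\cdot(1-4x)\,dx\approx 0.01,\\
F_2&=\frac{1}{0.25}\cdot\left(\int_0^{0.25}x^2\cdot 4x\,dx+\int_{0.25}^{0.5}x^2\cdot(2-4x)\,dx\right)\approx 0.07,\\
F_3&=\frac{1}{0.25}\cdot\left(\int_{0.25}^{0.5}x^2\cdot(4x-1)\,dx+\int_{0.5}^{0.75}x^2\cdot(3-4x)\,dx\right)\approx 0.26,\\
F_4&=\frac{1}{0.25}\cdot\left(\int_{0.5}^{0.75}x^2\cdot(4x-2)\,dx+\int_{0.75}^{1}x^2\cdot(4-4x)\,dx\right)\approx 0.57,\\
F_5&=\frac{1}{0.125}\cdot\int_{0.75}^{1}x^2\cdot(4x-3)\,dx\approx 0.84.
\end{aligned}\tag{5}$$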
Figure 1 provides a graphical representation of the basic functions $A_1,\ldots,A_5$, of the function $f(x)=x^2$, of its F-transform components $F_1,\ldots,F_5$, and of the inverse F-transform $\bar f(x)$ of $x^2$.
Figure 1: (a) The function $x^2$ on [0,1] and its F-transform components $F_1,\ldots,F_5$ with respect to $A_1,\ldots,A_5$. (b) The function $x^2$ on [0,1] and its inverse F-transform $\bar f$.
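The integrals in Example 1 can also be checked numerically. The following is a minimal sketch, not the implementation used in [1, 2]: the helper names `triangular_partition`, `integrate`, and `f_transform` are hypothetical, and the midpoint rule is simply one convenient way to approximate the integrals in formula (2).

```python
def triangular_partition(n, a=0.0, b=1.0):
    """Uniform triangular fuzzy partition of [a, b] with n basic functions."""
    h = (b - a) / (n - 1)  # distance between neighbouring nodes

    def make(center):
        return lambda x: max(0.0, 1.0 - abs(x - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def f_transform(f, basics, a, b):
    """Components F_i = (int f*A_i dx) / (int A_i dx), as in formula (2)."""
    return [
        integrate(lambda x, A=A: f(x) * A(x), a, b) / integrate(A, a, b)
        for A in basics
    ]


basics = triangular_partition(5)            # A_1, ..., A_5 of Example 1
F = f_transform(lambda x: x * x, basics, 0.0, 1.0)
print([round(v, 3) for v in F])
```

To three decimals, the computed components are approximately 0.010, 0.073, 0.260, 0.573, and 0.844.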
1.2. F-Transform: Original Motivation
The original motivation for F-transform came from fuzzy modeling [1, 2]. For example, in the situation corresponding to the inverse F-transform, we have n rules
$$\begin{aligned}
&\text{If } x \text{ is } A_1 \text{ then } y=F_1,\\
&\qquad\vdots\\
&\text{If } x \text{ is } A_n \text{ then } y=F_n.
\end{aligned}\tag{6}$$
These rules are Takagi-Sugeno (TSK) rules with singleton (constant) right-hand sides. For TSK rules, the value corresponding to a given input x is $\bar f(x)=\left(\sum_{i=1}^{n}F_i\cdot A_i(x)\right)/\left(\sum_{i=1}^{n}A_i(x)\right)$. Since $\sum_{i=1}^{n}A_i(x)=1$, we get formula (3).
The purpose was to show that this type of modeling can be as useful in applications as more traditional techniques such as Fourier transform and wavelet transform. Moreover, F-transform has a potential advantage over Fourier and wavelet transforms: in contrast to the purely mathematical basic functions used in Fourier and wavelet transforms, the basic functions Ai in a fuzzy partition usually come from natural language terms like “low” or “high” (for a detailed description of fuzzy modeling, see, e.g., [6, 7]).
Just like any other tool of applied mathematics, F-transform is not a panacea. It is more successful in some problems, and in other problems, it is less successful. It is therefore desirable to combine F-transform with other mathematical tools, so as to combine relative advantages of different techniques. For combining F-transform with other mathematical tools, it is desirable to come up with a purely mathematical (nonfuzzy) interpretation for this transform.
In particular, since most mathematical data processing tools are based on probability and statistics, it is desirable to come up with a probabilistic interpretation for F-transform.
1.3. The Known Probabilistic Interpretation of Fuzzy Modeling Leads to a Probabilistic Interpretation of F-Transform
We have mentioned that F-transform was originally designed as a particular case of fuzzy modeling. A seminal paper [8] provided a reasonable probabilistic model for a particular case of fuzzy modeling. Specifically, this paper shows that if we use piecewise constant probability density functions for describing the output, then we get a particular case of a fuzzy model—the case when we use product for “and” and sum for “or.” Since F-transform corresponds to exactly this type of fuzzy modeling, we thus get a probabilistic model for F-transform as well.
1.4. What We Do in This Paper
In this paper, we show that a modification of the probabilistic interpretation from [8] enables us to justify formulas of F-transform without making any additional assumptions about the probability distributions. In mathematical terms, this modification consists of using Bayes formulas—and making assumptions about prior distributions (a natural way to describe prior knowledge in statistics) instead of making assumptions about the actual distributions.
Thus, we get an even more natural probabilistic interpretation of F-transform. Specifically:
the paper [8] shows, in effect, that there exists a reasonable probabilistic interpretation of the F-transform formulas;
however, in principle, this interpretation leaves open the possibility that other, equally reasonable assumptions about the probability distributions can lead to different formulas;
in our modified interpretation, we show that the basic probabilistic setting uniquely determines the F-transform formulas—without the need to make any assumptions about the probability distributions.
We also show that a similar modification can be applied to the probabilistic interpretation of general fuzzy modeling formulas.
Comment 1.
From the mathematical viewpoint, the resulting formulas are very similar to the formulas from [8] (with the exception of the Bayes formula step). However, in our opinion, this mathematically minor modification leads to a major change in interpretation: now, to probabilistic researchers, F-transform is
not just a possible model, corresponding to one of the possible reasonable choices of probability distributions,
but the model uniquely emerging from the natural probabilistic setting.
A similar conclusion can be made about the probabilistic interpretation of more general fuzzy models. In other words, our minor modification uncovers an even deeper fundamental meaning of the probabilistic interpretation originally proposed in [8].
2. A Natural Practical Problem that Leads to F-Transform
2.1. Physical Setting: General Discussion
Let us assume that we have a physical process that is characterized by two quantities x and z, and we know that these quantities are related by a functional dependence z=f(x).
In the ideal situation of complete knowledge,
we know the exact value of x,
we have the exact description of the function f.
In this case, we can get the corresponding exact value z=f(x) of the second quantity.
In practice, we know the value x with uncertainty, that is, several different values of x are consistent with our knowledge. We must therefore provide a reasonable estimate for z. Finding such an estimate will be the first problem with which we will be dealing. In this first problem, we assume that the function f is known exactly.
If this function has to be determined empirically, then we will transform the empirical (often, partial) knowledge about f into a reasonable estimate for this function. This will be the second problem with which we will be dealing in this section.
2.2. First Problem: Estimating the Value f(x) for an Imprecisely Known x
If we only know one piece of information Xi about x, what is the reasonable estimate for z=f(x)?
2.3. Second Problem: Estimating the Function z=f(x) Based on Partial Information about the Dependence between x and z
Assume that for every information piece Xi, 1≤i≤n, we have the corresponding measured value Fi of z. Since we know only n numerical characteristics Fi of the unknown function f, we cannot exactly reconstruct this function. Instead, we need to provide a good estimate for each value f(x) of this function.
3. A Natural Probabilistic Problem that Leads to the Probabilistic Interpretation of F-Transform
3.1. Uncertainty in x: A General Probabilistic Description
Assume that we have a model of the estimation procedure, that enables us, given the actual value x, to compute the probability P(Xi∣x)≥0 of this procedure resulting in Xi—under the condition that the actual (unknown) value of the estimated quantity is x.
To simplify formulas, we denote
$$A_i(x)\stackrel{\text{def}}{=}P(X_i\mid x). \tag{7}$$
Since for every x, we must have exactly one of the n possible outcomes, we thus conclude that the probabilities $P(X_1\mid x),\ldots,P(X_n\mid x)$ of different estimation results must add up to one, that is, we must have
$$P(X_1\mid x)+\cdots+P(X_n\mid x)=1. \tag{8}$$
In the above simplified notation, this formula takes the form
$$A_1(x)+\cdots+A_n(x)=1. \tag{9}$$
3.2. First Problem: Estimating the Value f(x) for an Imprecisely Known x
Let us consider the first problem. In practice, we do not know the exact value of the quantity x. Instead, we only have one of the information pieces Xi, 1≤i≤n. Under the assumption that we know Xi, what is the reasonable estimate for z=f(x)?
In terms of probability theory, we would like to find the conditional expected value $F_i\stackrel{\text{def}}{=}E[z\mid X_i]=E[f(x)\mid X_i]$ of $z=f(x)$ under the condition $X_i$.
By definition, this expected value is equal to
$$F_i=E[f(x)\mid X_i]=\int f(x)\cdot P(x\mid X_i)\,dx. \tag{10}$$
Thus, to compute this expected value, we must know the probabilities P(x∣Xi). Instead, we know the probabilities P(Xi∣x).
In general, the problem of reconstructing probabilities $P(H_x\mid X_i)$ of different hypotheses $H_x$ based on the observation $X_i$ from conditional probabilities $P(X_i\mid H_x)$ of this observation under different hypotheses $H_x$ is well known in probability theory; it is solved by applying the Bayes theorem. The continuous version of this theorem is
$$P(H_x\mid X_i)=\frac{P(X_i\mid H_x)\cdot P(H_x)}{\int P(X_i\mid H_y)\cdot P(H_y)\,dy}, \tag{11}$$
in which $P(H_x)$ is a prior probability of the hypothesis $H_x$ (strictly speaking, $P(H_x\mid X_i)$ and $P(H_x)$ are probability densities).
In our case, different hypotheses $H_x$ correspond to different possible values x of the quantity of interest. Thus, (11) takes the form
$$P(x\mid X_i)=\frac{P(X_i\mid x)\cdot P(x)}{\int P(X_i\mid y)\cdot P(y)\,dy}. \tag{12}$$
Since there is no a priori reason to prefer one value of x to the other, it is reasonable to assume that all the values x are equally probable, that is, that all prior values P(x) are equal to each other: P(x)=P0.
Substituting $P(x)=P_0$ into the formula (12) and dividing both the numerator and the denominator by the common factor $P_0$, we get the expression
$$P(x\mid X_i)=\frac{P(X_i\mid x)}{\int P(X_i\mid y)\,dy}. \tag{13}$$
Substituting this expression into formula (10) (and renaming the variable in the denominator), we get
$$F_i=E[f(x)\mid X_i]=\frac{\int f(x)\cdot P(X_i\mid x)\,dx}{\int P(X_i\mid x)\,dx}. \tag{14}$$
In terms of the simplified notation (7), we thus get
$$F_i=E[f(x)\mid X_i]=\frac{\int f(x)\cdot A_i(x)\,dx}{\int A_i(x)\,dx}, \tag{15}$$
that is, exactly the formula (2) corresponding to F-transform.
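This derivation can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: x is drawn from a uniform prior on [0,1], the information piece $X_i$ is drawn with probability $A_i(x)$, and the triangular partition of Example 1 plays the role of the conditional probabilities (7). The function names are hypothetical. The empirical conditional means of $f(x)=x^2$ are then compared with the components given by formula (2).

```python
import random


def triangular_partition(n, a=0.0, b=1.0):
    """Uniform triangular partition; A_i(x) is read as P(X_i | x)."""
    h = (b - a) / (n - 1)

    def make(center):
        return lambda x: max(0.0, 1.0 - abs(x - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def f_transform(f, basics, a, b):
    """F_i from formula (2), computed by numerical integration."""
    return [
        integrate(lambda x, A=A: f(x) * A(x), a, b) / integrate(A, a, b)
        for A in basics
    ]


def simulated_conditional_means(f, basics, a, b, trials, rng):
    """Empirical E[f(x) | X_i] under a uniform prior on x."""
    sums = [0.0] * len(basics)
    counts = [0] * len(basics)
    indices = range(len(basics))
    for _ in range(trials):
        x = rng.uniform(a, b)                    # uniform prior P(x) = P0
        likelihoods = [A(x) for A in basics]     # P(X_i | x) = A_i(x)
        i = rng.choices(indices, weights=likelihoods)[0]
        sums[i] += f(x)
        counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


def square(x):
    return x * x


basics = triangular_partition(5)
exact = f_transform(square, basics, 0.0, 1.0)
simulated = simulated_conditional_means(
    square, basics, 0.0, 1.0, trials=60000, rng=random.Random(0)
)
for i, (e, s) in enumerate(zip(exact, simulated), start=1):
    print(f"F_{i}: formula (2) = {e:.4f}, simulation = {s:.4f}")
```

The two columns agree up to sampling noise, which is what the Bayes-formula argument predicts for a uniform prior.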
3.3. Second Problem: Estimating the Function z=f(x) Based on Partial Information about the Dependence between x and z
In some practical situations, we do not know the exact expression for the function f(x). Instead, we must estimate f(x) from the empirical data, that is, from the previous results of simultaneous measuring x and z.
In each such measurement, the only information that we get about x is one of the values X1,…,Xn. For each case when the information about x is Xi, we have one or several values z.
Ideally, we should have a large number of values z corresponding to each x-measurement result Xi. Based on these values z, we should then be able to reconstruct the conditional distribution of z under the condition of Xi. Based on these conditional distributions, we should be able to reconstruct the values f(x) for all x.
In practice, however, we have only a few values z corresponding to each x-measurement result Xi. In this case, at best, instead of the entire conditional probability distribution, we can only reconstruct a single parameter: the conditional mean $F_i=E[z\mid X_i]$. Since we only know n characteristics Fi of the unknown function f(x), we cannot exactly reconstruct this function. Instead, we need to provide a good estimate for each value f(x) of this function.
Similarly to the first problem, we take the mean as a reasonable estimate. Thus, in the above practical setting, the problem of estimating the function f(x) takes the following form:
for every i, we know the conditional mean Fi=E[f(x)∣Xi];
based on these conditional means, for every x, we want to estimate the mean value $\bar f(x)\stackrel{\text{def}}{=}E[z\mid x]$.
For this problem, the formula of full probability leads to the following result:
$$E[z\mid x]=\sum_{i=1}^{n}E[z\mid X_i]\cdot P(X_i\mid x). \tag{16}$$
By using the notations $\bar f(x)$ for $E[z\mid x]$, $F_i$ for $E[z\mid X_i]$, and $A_i(x)$ for $P(X_i\mid x)$, we can transform the formula (16) into the form
$$\bar f(x)=\sum_{i=1}^{n}F_i\cdot A_i(x), \tag{17}$$
that is, exactly the F-transform inversion formula (3).
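As an illustration of the inversion formula, the following sketch reconstructs $f(x)=x^2$ from its F-transform components for uniform triangular partitions with 5, 9, and 17 basic functions and reports the largest reconstruction error on a grid. The partition, the grid, and all function names are assumptions made for this example only.

```python
def triangular_partition(n, a=0.0, b=1.0):
    """Uniform triangular fuzzy partition of [a, b] with n basic functions."""
    h = (b - a) / (n - 1)

    def make(center):
        return lambda x: max(0.0, 1.0 - abs(x - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def f_transform(f, basics, a, b):
    """Direct F-transform, formula (2)."""
    return [
        integrate(lambda x, A=A: f(x) * A(x), a, b) / integrate(A, a, b)
        for A in basics
    ]


def inverse_f_transform(F, basics):
    """Inverse F-transform, formula (3): x -> sum_i F_i * A_i(x)."""
    return lambda x: sum(Fi * A(x) for Fi, A in zip(F, basics))


def max_error(f, n, a=0.0, b=1.0, grid=200):
    """Largest |f(x) - f_bar(x)| over grid + 1 equally spaced points."""
    basics = triangular_partition(n, a, b)
    f_bar = inverse_f_transform(f_transform(f, basics, a, b), basics)
    points = [a + k * (b - a) / grid for k in range(grid + 1)]
    return max(abs(f(x) - f_bar(x)) for x in points)


errors = {n: max_error(lambda x: x * x, n) for n in (5, 9, 17)}
for n, err in errors.items():
    print(f"n = {n:2d}: max error = {err:.4f}")
```

In this example the largest error occurs at the right endpoint and shrinks as the partition is refined.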
3.4. Conclusion
The above (minor) modification of a probability model from [8] uniquely determined both basic formulas (2) and (3) related to F-transform.
3.5. Relation with the Random Set Interpretation of Fuzzy Sets
It is worth mentioning that the probabilistic interpretation from [8] is related to the random set interpretation of fuzzy sets (see, e.g., [9]).
In this interpretation, the meaning of an imprecise (fuzzy) term like “small” is based on the following idea. The fact that the term is imprecise means that for the same value x, some people will say that this value is small, while other people will say that this value is not small. To take this imprecision into account, we can store, for each person, a set of all the values that this person considers small.
Since there is no prior reason to prefer the opinion of one of these folks, we consider their opinions equally reasonable. We can then take the ratio μsmall(x) of people who consider x to be small as a reasonable measure of smallness (this is actually one of the standard ways to construct a membership function corresponding to a certain term).
We can describe this ratio in probabilistic terms if we assume that all the persons are equally probable. In these terms, the value μsmall(x) can be interpreted as the probability P(small∣x) that a randomly selected person would consider x to be small.
This interpretation of the membership function Ai(x) as the conditional probability P(Xi∣x) is exactly what we used in our probabilistic interpretation of F-transform.
3.6. Terminological Comment
For completeness, let us explain why the above interpretation is called the random sets interpretation.
For crisp (well-defined) properties, each property can be described by the set of all the values that satisfy this property.
For each imprecise property like “small,” instead of a single set describing all the values that satisfy this property, we have several sets describing the opinions of several persons. We consider the opinions of all these persons to be equally valid, so each of N persons has the exact same probability 1/N of being correct. In this case, we have different sets, each occurring with probability 1/N.
In mathematical terms, we can describe this situation by saying that we have a probability distribution on the class of all possible sets. In probability theory, such a distribution is called a random set—similarly to the fact that a probability distribution on the class of all possible numbers is called a random number.
4. Discussion
Let us discuss the consequences of the above results for the meaning and usage of F-transforms (the authors are greatly thankful to the anonymous referees who proposed the main ideas of this discussion). To start this discussion, let us recall why F-transforms were proposed in the first place.
4.1. Need for F-Transforms and the Resulting Main Advantage of F-Transforms: Reminder
One of the main objectives of F-transform is to approximate general functions by functions from a selected finite-parametric family. This is a well-known mathematical problem, and many successful techniques have been developed for solving this problem. For example, we can expand the original function by a polynomial, and then use the first few terms in this expansion as the desired approximation. We can also use transforms such as Fourier transform or wavelet transform, and keep only the first few terms in the corresponding expansion as the desired approximation.
All existing approximation techniques take a function f(x) and approximate this function. In situations in which the only information that we have about the desired dependence y=f(x) are the values of y measured for several values of x, this is the only thing we can do. However, in practice, we often have additional expert knowledge about the dependence y=f(x). It is therefore desirable to take this understanding into account when we approximate a function.
The expert knowledge is often imprecise (fuzzy), that is, formulated in terms of imprecise expert rules. A natural way to describe imprecise rules is to use fuzzy logic and fuzzy modeling, and, as we have shown, the fuzzy modeling approach naturally leads to F-transforms.
The ability to take into account expert knowledge is thus the main advantage of F-transforms, the main reason why F-transform has led to many successful applications.
4.2. The Probabilistic Interpretation of F-Transform Leads to an Additional Advantage of F-Transform in Comparison with Other Approximation Techniques
The above probabilistic interpretation of F-transforms shows that each component $F_i$ of an F-transform can be interpreted as the mean value $E[f(x)\mid X_i]$ of the approximated function f(x) under the condition that the unknown value x is consistent with the measurement result $X_i$. It is well known that in probability theory, the mean value can be alternatively described as the value z that minimizes the mean square difference between this value and the actual value f(x), that is, that minimizes the expression $E[(f(x)-z)^2\mid X_i]$. Thus, the above relation provides an additional advantage of F-transforms in comparison with other approximation tools:
F-transforms not only reflect expert knowledge,
F-transforms also provide a solution which is optimal (in a well-defined reasonable sense).
4.3. Gauging the Accuracy of the Resulting Approximation
We have shown that each component $z=F_i$ of the F-transform provides the most accurate approximation to f(x) in the mean square sense. The next natural question is: how accurate is it? In other words, what is the corresponding mean square difference $\sigma^2=E[(f(x)-z)^2\mid X_i]$? It turns out that the answer to this question can also be provided in terms of F-transforms.
Namely, as is known, for $z=F_i=E[f(x)\mid X_i]$, we have
$$\sigma^2=E[(f(x)-z)^2\mid X_i]=E[f^2(x)\mid X_i]-\left(E[f(x)\mid X_i]\right)^2. \tag{18}$$
That is, $\sigma^2=E[f^2(x)\mid X_i]-F_i^2$. The expression $E[f^2(x)\mid X_i]$ can also be described in terms of F-transforms. Indeed, our result about the relation between F-transform and conditional expected value applies to all possible functions, including the square $f^2(x)$ of the original function f(x). Thus, each value $E[f^2(x)\mid X_i]$ is equal to the ith component $S_i$ of the F-transform of this square.
So, we arrive at the following conclusion. If we only know that x is consistent with the measurement result Xi, then
a reasonable approximation for f(x) is the value Fi: the ith component of the F-transform,
the root mean square accuracy σ of this approximation is determined by the formula $\sigma^2=S_i-F_i^2$, where $S_i$ is the ith component of the F-transform of the function $f^2(x)$.
Similarly, for the second problem (reconstructing f(x) when we only know finitely many values corresponding to different i), the mean square accuracy of the corresponding approximation of the actual (unknown) function f(x) by its inverse F-transform $\bar f(x)$ is equal to
$$\sigma^2(x)\stackrel{\text{def}}{=}E[(f(x)-\bar f(x))^2\mid x]=E[f^2(x)\mid x]-(\bar f(x))^2. \tag{19}$$
The first term $E[f^2(x)\mid x]$ in this difference is equal to the inverse F-transform
$$\overline{f^2}(x)=\sum_{i=1}^{n}S_i\cdot A_i(x), \tag{20}$$
where the values $S_1,\ldots,S_n$ form an F-transform of the squared function $f^2(x)$.
Thus, we arrive at the following conclusion:
If we only know the values $F_1,\ldots,F_n$ of the F-transform of the actual (unknown) dependence f(x), then, as a reasonable approximation to f(x), we can take the inverse F-transform $\bar f(x)=\sum_{i=1}^{n}F_i\cdot A_i(x)$.
If, in addition to the values $F_i$, we also know the F-transform $S_1,\ldots,S_n$ of the square $f^2(x)$, then we can estimate the root mean square accuracy $\sigma(x)=\sqrt{E[(f(x)-\bar f(x))^2\mid x]}$ by using the formula
$$\sigma^2(x)=\overline{f^2}(x)-(\bar f(x))^2, \tag{21}$$
where $\overline{f^2}(x)=\sum_{i=1}^{n}S_i\cdot A_i(x)$ is the inverse F-transform of the squared function.
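The accuracy estimate can be sketched numerically as follows. This example reuses the hypothetical triangular partition of Example 1 with $f(x)=x^2$; the helper names are assumptions of the sketch. It computes the components $F_i$ and $S_i$ and then evaluates $\sigma^2(x)$ as the difference between the inverse F-transform of the square and the square of the inverse F-transform.

```python
import math


def triangular_partition(n, a=0.0, b=1.0):
    """Uniform triangular fuzzy partition of [a, b] with n basic functions."""
    h = (b - a) / (n - 1)

    def make(center):
        return lambda x: max(0.0, 1.0 - abs(x - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def f_transform(f, basics, a, b):
    """Direct F-transform, formula (2)."""
    return [
        integrate(lambda x, A=A: f(x) * A(x), a, b) / integrate(A, a, b)
        for A in basics
    ]


def f(x):
    return x * x


basics = triangular_partition(5)
F = f_transform(f, basics, 0.0, 1.0)                      # F-transform of f
S = f_transform(lambda x: f(x) ** 2, basics, 0.0, 1.0)    # F-transform of f^2


def sigma_squared(x):
    """sigma^2(x) = (sum_i S_i A_i(x)) - (sum_i F_i A_i(x))^2."""
    f_bar = sum(Fi * A(x) for Fi, A in zip(F, basics))
    f2_bar = sum(Si * A(x) for Si, A in zip(S, basics))
    return f2_bar - f_bar ** 2


for x in (0.0, 0.25, 0.6, 1.0):
    print(f"x = {x:.2f}: sigma(x) = {math.sqrt(sigma_squared(x)):.4f}")
```

By Jensen's inequality the computed $\sigma^2(x)$ is nonnegative whenever the basic functions are nonnegative and sum to one, so the square root is well defined.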
5. A Similar Modification of a Probabilistic Interpretation Is Possible for Mamdani-Style Fuzzy Modeling (and Fuzzy Control)
5.1. From F-Transform to Fuzzy Modeling
Let us show that the above modification of a probabilistic interpretation from [8] can be extended from F-transform to a more general case of Mamdani-type fuzzy modeling and fuzzy control.
Comment 2.
In this section, we concentrate on Mamdani's approach since F-transform can be viewed as a particular case of this approach, and since for Mamdani's approach, a probabilistic interpretation is possible [8]. Please note that while Mamdani's approach was historically the first, at present, there are many different approaches to fuzzy modeling and fuzzy control; we mention some of them in this paper, but there are many others; see, for example, [10–12]. How to best interpret these other approaches in probabilistic terms, and whether such an interpretation is at all possible, is an interesting open question.
For example, an interesting question is how to interpret type-2 approaches to fuzzy modeling and fuzzy control; see, for example, [13–16]; maybe via interval-valued probabilities?
5.2. Mamdani's Approach to Fuzzy Modeling and Fuzzy Control: A Brief Reminder
In Mamdani's approach, we start with rules like
“if x is small, then u should be medium”,
and then use membership functions for “small” and “medium” to transform these rules into an exact control strategy.
In general, we have rules
“if x has a property Ai then u has the property Bi” (1≤i≤n),
with known membership functions Ai(x) and Bi(u) for the corresponding properties. Mamdani's methodology is based on saying that for each input x, the value u is a reasonable value of control if and only if one of the above n rules is applicable, that is,
either the first rule is applicable, that is, x satisfies the property A1 and u satisfies the property B1,
or the second rule is applicable, that is, x satisfies the property A2 and u satisfies the property B2,
⋯
or the nth rule is applicable, that is, x satisfies the property An and u satisfies the property Bn.
Once we select functions $f_{\&}(a,b)$ and $f_{\vee}(a,b)$ to represent “and” and “or” (these functions are called t-norm and t-conorm), we can thus describe the degree of our belief $\mu_x(u)$ that u is reasonable (for a given input x) as
$$\mu_x(u)=f_{\vee}\bigl(f_{\&}(A_1(x),B_1(u)),\ldots,f_{\&}(A_n(x),B_n(u))\bigr). \tag{22}$$
In particular, if we select $f_{\&}(a,b)=a\cdot b$ and $f_{\vee}(a,b)=\min(a+b,1)$ (and if the added values do not go beyond 1), we get
$$\mu_x(u)=\sum_{i=1}^{n}A_i(x)\cdot B_i(u). \tag{23}$$
Once we know this membership function, we can find the appropriate value of u by using the so-called centroid defuzzification:
$$\bar u(x)=\frac{\int u\cdot\mu_x(u)\,du}{\int\mu_x(u)\,du}. \tag{24}$$
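The product-sum version of Mamdani's procedure with centroid defuzzification can be sketched as follows. The three triangular membership functions for x and for u (read as "small", "medium", "large" on [0,1]) are hypothetical choices made only for this illustration; because the $A_i$ sum to one and each $B_i$ is at most one, the sum in (23) never exceeds 1.

```python
def triangular_partition(n, a=0.0, b=1.0):
    """n triangular membership functions forming a uniform partition of [a, b]."""
    h = (b - a) / (n - 1)

    def make(center):
        return lambda t: max(0.0, 1.0 - abs(t - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def mamdani_output(x, A, B, u_min=0.0, u_max=1.0):
    """Centroid (24) of mu_x(u) = sum_i A_i(x) * B_i(u), formula (23)."""

    def mu(u):
        return sum(Ai(x) * Bi(u) for Ai, Bi in zip(A, B))

    numerator = integrate(lambda u: u * mu(u), u_min, u_max)
    denominator = integrate(mu, u_min, u_max)
    return numerator / denominator


A = triangular_partition(3)  # hypothetical "small", "medium", "large" for x
B = triangular_partition(3)  # hypothetical "small", "medium", "large" for u

for x in (0.0, 0.25, 0.5, 1.0):
    print(f"x = {x:.2f}: u_bar = {mamdani_output(x, A, B):.4f}")
```

Even at x = 0, where only the rule "small implies small" fires, the output is the centroid 1/6 of the set "small" rather than 0; this is a known feature of centroid defuzzification.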
5.3. A Natural Probabilistic Analog of Mamdani's Approach to Fuzzy Modeling
In [8], it was shown that in a probabilistic setting, we get formulas similar to Mamdani rules corresponding to f&(a,b)=a·b and f∨(a,b)=min(a+b,1)—if we assume a uniform distribution on the outputs. Let us show that by using Bayes formula, we can avoid this additional assumption, and thus, make the resulting probabilistic analog of Mamdani's fuzzy modeling even more natural.
Similarly to the above probabilistic interpretation of F-transform, let us assume that we have possible pieces of information X1,…,Xn about the quantity x, and that for each piece of information, we also know the corresponding probability P(Xi∣x), which we will denote by Ai(x).
Similarly, let us assume that we have possible pieces of information U1,…,Um about u, and we know the corresponding probabilities P(Ui∣u) which we will denote by Bi(u).
We know that u depends on x, but we do not know the exact dependence. Instead, for each information Xi about x, we know the corresponding information Uj about the corresponding u.
Since we did not select any specific order for the information Ui, we can denote the value corresponding to X1 by U1, the value corresponding to X2 by U2, etc. Under this selection, the available information simply means that if x is described by the piece of information Xi, then the corresponding u is described by the piece of information Ui.
Our objective is, given these rules and given a new value x, to find a good estimate for the appropriate u.
Due to the formula of full probability, the conditional probability density $P(u\mid x)$ of u under the condition x has the form
$$P(u\mid x)=\sum_{i=1}^{n}P(u\mid U_i)\cdot P(X_i\mid x). \tag{25}$$
We know the probabilities $P(X_i\mid x)=A_i(x)$. The probability densities $P(u\mid U_i)$ can be determined by using the Bayes theorem, similarly to the F-transform case, as
$$P(u\mid U_i)=\frac{P(U_i\mid u)}{\int P(U_i\mid y)\,dy}, \tag{26}$$
that is, in terms of the values $B_i(u)$, as
$$P(u\mid U_i)=\frac{B_i(u)}{\int B_i(y)\,dy}. \tag{27}$$
Substituting the formula (27) and the expression (7) into the formula (25) (and changing the multiplication order), we get the formula
$$P(u\mid x)=\sum_{i=1}^{n}\frac{A_i(x)\cdot B_i(u)}{\int B_i(y)\,dy}. \tag{28}$$
Once we know these probabilities, we can produce the mean $\bar u$ as a reasonable estimate for u:
$$\bar u(x)=\frac{\int u\cdot P(u\mid x)\,du}{\int P(u\mid x)\,du}. \tag{29}$$
These are exactly the formulas derived in [8] from the additional assumption of a piecewise constant output distribution. Thus, our (minor) modification of [8] indeed uniquely determines the corresponding probabilistic analog of Mamdani's formulas.
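The probabilistic analog can be sketched in the same way as Mamdani's procedure; the only change is that each $B_i(u)$ is divided by its integral before the terms are added. The membership functions below are the same hypothetical triangular ones on [0,1], chosen only for illustration, and the function names are assumptions of the sketch.

```python
def triangular_partition(n, a=0.0, b=1.0):
    """n triangular membership functions forming a uniform partition of [a, b]."""
    h = (b - a) / (n - 1)

    def make(center):
        return lambda t: max(0.0, 1.0 - abs(t - center) / h)

    return [make(a + i * h) for i in range(n)]


def integrate(g, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx


def probabilistic_density(x, A, B, u_min=0.0, u_max=1.0):
    """P(u | x) = sum_i A_i(x) * B_i(u) / int B_i(y) dy, formula (28)."""
    norms = [integrate(Bi, u_min, u_max) for Bi in B]
    return lambda u: sum(
        Ai(x) * Bi(u) / c for Ai, Bi, c in zip(A, B, norms)
    )


def probabilistic_output(x, A, B, u_min=0.0, u_max=1.0):
    """Mean of P(u | x), formula (29)."""
    p = probabilistic_density(x, A, B, u_min, u_max)
    return integrate(lambda u: u * p(u), u_min, u_max) / integrate(p, u_min, u_max)


A = triangular_partition(3)  # hypothetical P(X_i | x)
B = triangular_partition(3)  # hypothetical P(U_i | u)

total_mass = integrate(probabilistic_density(0.25, A, B), 0.0, 1.0)
u_bar = probabilistic_output(0.25, A, B)
print(f"integral of P(u | x = 0.25) = {total_mass:.4f}")
print(f"u_bar(0.25) = {u_bar:.4f}")
```

Because the $A_i(x)$ sum to one and each normalized $B_i$ integrates to one, the resulting $P(u\mid x)$ integrates to one, so the denominator in (29) only matters numerically.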
5.4. In Mamdani-Type Setting, Fuzzy and Probabilistic Formulas Are, in General, Different
It is worth mentioning that
while in F-transform, the probabilistic and fuzzy derivations lead to exactly the same formulas,
in the general fuzzy modeling case, as mentioned in [8], the formulas are somewhat different:
the formula (29) is exactly the same as (24), with P(u∣x) instead of μx(u);
the formula (28) is slightly different from Mamdani's formula (23)—by the integral in the denominator.
5.5. Cases when Fuzzy and Probabilistic Formulas Coincide
For F-transform (and, more generally, in all the cases when the value $\int B_i(y)\,dy$ is the same for all i), this additional denominator simply divides all the values $P(u\mid x)$ by the same constant. This constant appears both in the numerator and in the denominator of the formula (29) and thus, it does not affect the resulting value $\bar u(x)$.
Another case when the fuzzy and probabilistic formulas coincide is the case of the Takagi-Sugeno (TSK) approach; see, for example, [10]. This equivalence is, in effect, proven in [8]. In the TSK approach, rules have the type
“if x has a property Ai then u=fi(x)” (1≤i≤n),
for known functions $f_i(x)$. In the probabilistic setting, we assume that under a piece of information $U_i$, we must take $u=f_i(x)$. Thus, for a given input x, we select $f_i(x)$ with probability $P(X_i\mid x)=A_i(x)$, where $\sum_{i=1}^{n}A_i(x)=1$. The resulting mean $\bar u(x)$ is thus equal to $\sum_{i=1}^{n}A_i(x)\cdot f_i(x)$. For the case when $\sum_{i=1}^{n}A_i(x)=1$, this is exactly the TSK formula.
5.6. Comparison between Fuzzy and Probabilistic Modeling
For Mamdani-type situations when fuzzy and probabilistic formulas are different, the comparison of the corresponding probabilistic and fuzzy rules is done, in detail, in [8].
Let us add three more situations to this comparison, situations that are naturally related to our modified derivation.
5.7. Case when Probabilistic Control Is Better
When the values $\int B_i(y)\,dy$ are different, probabilistic control and fuzzy control lead, in general, to different values $\bar u$. We will show, on an example originally proposed by R. Yager, that in this case, the result of the probabilistic control is closer to common sense than the result of Mamdani's control.
Indeed, let us consider the situation in which we have two rules:
the first rule is a more general rule saying that if x is small, then u should be small;
the second rule is a very specific rule, saying that if x is very close to 0.11, then u should be very close to 0.15.
Intuitively, if we have a value x for which a very specific rule is applicable, for example, the value x=0.11, then this specific rule should have a priority over the general rule. However, since the width of the membership function B2(u) is small, the corresponding term in (23) will practically not affect the resulting estimate (24).
In contrast, in the probabilistic control, the effect of $B_2(u)$ is normalized by, crudely speaking, the total width of the corresponding function $B_2(u)$. Thus, even the most specific rules will have, as desired, a significant influence on the result (29).
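The effect can be made concrete with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration: at the given input both rules are assumed to fire with degree 0.5, $B_1$ ("small") is taken as $1-u$ on [0,1] (area 0.5, centroid 1/3), and $B_2$ ("very close to 0.15") as a triangle of half-width 0.01 centered at 0.15 (area 0.01, centroid 0.15). With areas $s_i=\int B_i(y)\,dy$ and centroids $c_i$, formulas (23)-(24) reduce to $\bar u=\sum_i A_i s_i c_i/\sum_i A_i s_i$, while formulas (28)-(29) reduce to $\bar u=\sum_i A_i c_i$ when the degrees sum to one.

```python
def mamdani_centroid(degrees, areas, centroids):
    """Centroid (24) of (23): each rule is weighted by degree times area."""
    numerator = sum(a * s * c for a, s, c in zip(degrees, areas, centroids))
    denominator = sum(a * s for a, s in zip(degrees, areas))
    return numerator / denominator


def probabilistic_mean(degrees, centroids):
    """Mean (29) of (28): areas cancel, each rule is weighted by its degree."""
    return sum(a * c for a, c in zip(degrees, centroids))


# Hypothetical firing degrees A_1(x), A_2(x) at the given input (sum to 1).
degrees = [0.5, 0.5]
# Hypothetical output sets: broad "small" and narrow "very close to 0.15".
areas = [0.5, 0.01]
centroids = [1.0 / 3.0, 0.15]

u_mamdani = mamdani_centroid(degrees, areas, centroids)
u_probabilistic = probabilistic_mean(degrees, centroids)
print(f"Mamdani:       u_bar = {u_mamdani:.4f}")
print(f"probabilistic: u_bar = {u_probabilistic:.4f}")
```

With these numbers the Mamdani estimate stays close to the centroid 1/3 of the general rule, whereas the probabilistic estimate moves roughly halfway toward 0.15.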
Comment 3.
It should be mentioned that the problem with specific rules occurs only in Mamdani's approach to fuzzy control. In the alternative logical approach, this problem does not appear; see, for example, [17].
5.8. Another Case When Probabilistic Control Is Better
The probabilistic interpretation enables us to naturally consider more general situations in which the rules are themselves probabilistic, that is, when, for each i and j, we know the conditional probability P(Ui∣Xj) that if x has the property Xj, then u has the property Ui.
In other words, instead of the original rules
“if x has the property Xi, then u has the property Ui,”
we now have rules
“if x has the property Xj, then u has the property Ui with probability P(Ui∣Xj).”
Indeed, in this case, due to the formula of full probability, the conditional probability density $P(u\mid x)$ of u under the condition x has the form
$$P(u\mid x)=\sum_{i=1}^{n}\sum_{j=1}^{n}P(u\mid U_i)\cdot P(U_i\mid X_j)\cdot P(X_j\mid x). \tag{30}$$
Here, we know the original probabilities $P(U_i\mid X_j)$ and the probabilities $P(X_j\mid x)=A_j(x)$. The probability densities $P(u\mid U_i)$ can be determined by using the Bayes theorem, as in the expression (27). Substituting the formula (27) and the expression $P(X_j\mid x)=A_j(x)$ into the formula (30) (and changing the multiplication order), we get the formula
$$P(u\mid x)=\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{P(U_i\mid X_j)\cdot A_j(x)\cdot B_i(u)}{\int B_i(y)\,dy}.$$
Once we know these probabilities, we can produce the mean u¯ by using the formula (29).
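For probabilistic rules, the mean can be sketched in closed form. Writing $c_i=\int u\,B_i(u)\,du/\int B_i(y)\,dy$ for the centroid of $B_i$, and assuming that for each j the probabilities $P(U_i\mid X_j)$ sum to one over i, the mean of the above density is $\sum_i\sum_j P(U_i\mid X_j)\,A_j(x)\,c_i$. The rule matrix, degrees, and centroids below are hypothetical values used only to illustrate this double sum.

```python
def mean_output(x_degrees, rule_probs, centroids):
    """Mean of P(u | x) for probabilistic rules.

    x_degrees[j]      -- A_j(x) = P(X_j | x), assumed to sum to 1
    rule_probs[i][j]  -- P(U_i | X_j), each column assumed to sum to 1
    centroids[i]      -- centroid c_i of the output set B_i
    """
    return sum(
        rule_probs[i][j] * x_degrees[j] * centroids[i]
        for i in range(len(centroids))
        for j in range(len(x_degrees))
    )


# Hypothetical input degrees and output centroids (three rules).
x_degrees = [0.5, 0.5, 0.0]
centroids = [1.0 / 6.0, 0.5, 5.0 / 6.0]

# Deterministic rules "X_i implies U_i" correspond to the identity matrix.
deterministic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical uncertain rules; each column sums to 1.
uncertain = [[0.8, 0.1, 0.0], [0.2, 0.8, 0.2], [0.0, 0.1, 0.8]]

u_deterministic = mean_output(x_degrees, deterministic, centroids)
u_uncertain = mean_output(x_degrees, uncertain, centroids)
print(f"deterministic rules: u_bar = {u_deterministic:.4f}")
print(f"uncertain rules:     u_bar = {u_uncertain:.4f}")
```

With the identity matrix the double sum collapses to $\sum_i A_i(x)\,c_i$, that is, to the deterministic-rule case considered earlier.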
5.9. In Some Cases, Fuzzy Control Is Better
We have shown that in some situations, probabilistic control is better than Mamdani's original fuzzy control. However, in other situations, fuzzy control is better. Let us give two examples.
5.10. Case When Mamdani's Formulas Are Better
The above probabilistic formulas only work for the case when ∑i=1nAi(x)=1, that is, in probabilistic terms, when the properties Ai are mutually exclusive. In practice, we may have nonexclusive properties, in which case we may have ∑i=1nAi(x)>1.
It is not clear how to handle this situation within the probabilistic approach. However, such situations are not a problem if we apply fuzzy control: its formulas are applicable no matter whether we satisfy the requirement ∑i=1nAi(x)=1 or not.
5.11. Other Cases When Mamdani's Formulas Are Better
The probabilistic interpretation is only possible when we use multiplication and addition as “and” and “or” operations f& and f∨.
Fuzzy control does not necessarily have to use these operations; it can use different t-norms and t-conorms. It is an empirical fact that in many control situations, using a t-norm different from the product and a t-conorm different from the sum leads to a much better control quality, for example, to a more stable or a smoother control.
In [18], we formulated the problem of selecting the t-norm and the t-conorm as a precise optimization problem, and for several objective functions such as smoothness or stability, we gave explicit analytical solutions to these optimization problems. Specifically, we described the selection that leads to the optimal values of smoothness or stability. In many of these cases, the optimal selection is indeed different from the probabilistic case of product and sum. Thus, in these cases, the fuzzy control methodology indeed leads to a better control quality.
6. Conclusion
Fuzzy transform (F-transform) techniques have recently been shown to be very successful in various applications, including applications where, until recently, only more traditional tools like the Fourier transform or the wavelet transform had been applied. In many other applications, however, the traditional tools have a clear advantage. It is therefore desirable to combine the F-transform with the more traditional tools, so as to combine the relative advantages of both techniques. To make this combination easier, it is desirable to interpret the F-transform in traditional mathematical terms.
In this paper, we describe a modification of a probabilistic interpretation described in [8]. In this modification, the corresponding probabilistic model uniquely leads to the formulas of the F-transform. A similar modification is described in a more general situation of fuzzy modeling.
Acknowledgments
This paper was supported in part by the National Science Foundation Grant HRD-0734825, by Grant 1 T36 GM078000-01 from the National Institutes of Health, and by Grant MSM 6198898701 from MŠMT of the Czech Republic. The authors are thankful to Josef Štěpán and Ron Yager for motivation and valuable discussions, and to the anonymous referees for valuable suggestions.
References
[1] Perfilieva I. Fuzzy transforms: theory and applications. 2006;157(8):993–1023. doi:10.1016/j.fss.2005.11.012
[2] Perfilieva I. Fuzzy transforms: a challenge to conventional transforms. 2007;147:137–196. doi:10.1016/S1076-5670(07)47002-1
[3] di Martino F., Sessa S., Loia V., Perfilieva I. An image coding/decoding method based on direct and inverse fuzzy transforms. 2008;48(1):110–131. doi:10.1016/j.ijar.2007.06.008
[4] Perfilieva I., Novák V., Pavliska V., Dvořák A., Štěpnička M. Analysis and prediction of time series using fuzzy transform. Proceedings of the IEEE International Conference on Neural Networks (IJCNN '08), June 2008, Hong Kong, 3875–3879.
[5] Perfilieva I., de Meyer H., de Baets B., Plšková D. Cauchy problem with fuzzy initial condition and its approximate solution with the help of fuzzy transform. Proceedings of the IEEE International Conference on Fuzzy Systems FUZZ-IEEE (WCCI '08), June 2008, Hong Kong, 2285–2290.
[6] Klir G., Yuan B. 1995. Upper Saddle River, NJ, USA: Prentice Hall.
[7] Nguyen H. T., Walker E. A. 2006. Boca Raton, Fla, USA: Chapman & Hall/CRC.
[8] Sánchez L., Casillas J., Cordón O., Jesus M. J. Some relationships between fuzzy and random set-based classifiers and models. 2002;29(2):175–213. doi:10.1016/S0888-613X(01)00063-9
[9] Nguyen H. T. 2006. Boca Raton, Fla, USA: Chapman & Hall/CRC.
[10] Nguyen H. T., Prasad N. R., Walker C. L., Walker E. A. 2003. Boca Raton, Fla, USA: Chapman & Hall/CRC.
[11] Nguyen H. T., Sugeno M. 1998. Boston, Mass, USA: Kluwer Academic Publishers.
[12] Nguyen H. T., Sugeno M., Tong R., Yager R. 1995. New York, NY, USA: John Wiley & Sons.
[13] Castillo O., Melin P. Special issue on hybrid intelligent systems. 2007;177(10):1997–1998. doi:10.1016/j.ins.2006.09.004
[14] Castillo O., Melin P. Special issue on high order fuzzy sets. 2009;179(13):2053–2054. doi:10.1016/j.ins.2009.01.001
[15] Hagras H. Type-2 fuzzy logic controllers: a way forward for fuzzy systems in real world environments. In: Zurada J. M., Yen G. G., Wang J. (eds.), Computational Intelligence: Research Frontiers, IEEE World Congress on Computational Intelligence (WCCI '08), Plenary/Invited Lectures, June 2008, Hong Kong. Springer Lecture Notes in Computer Science, vol. 5050. Springer, 181–200.
[16] Sepúlveda R., Castillo O., Melin P., Díaz A. R., Montiel O. Experimental study of intelligent controllers under uncertainty using type-1 and type-2 fuzzy logic. 2007;177(10):2023–2048. doi:10.1016/j.ins.2006.10.004
[17] Novák V., Perfilieva I., Močkoř J. 1999. Dordrecht, The Netherlands: Kluwer Academic Publishers.
[18] Smith M. H., Kreinovich V. Optimal strategy of switching reasoning methods in fuzzy control. In: Nguyen H. T., Sugeno M., Tong R., Yager R. (eds.). 2005. New York, NY, USA: John Wiley & Sons, 117–146.