Why Fuzzy Transform Is Efficient in Large-Scale Prediction Problems: A Theoretical Explanation

Irina Perfilieva (1) and Vladik Kreinovich (2)

(1) Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, Ostrava 70100, Czech Republic (osu.eu)
(2) Department of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA (utep.edu)

Research Article. Advances in Fuzzy Systems, vol. 2011, Article ID 985839, Hindawi Publishing Corporation. ISSN 1687-711X, 1687-7101. doi:10.1155/2011/985839

Received 15 May 2011; Accepted 6 June 2011; Published 27 July 2011. Academic Editor: Salvatore Sessa

Copyright © 2011 Irina Perfilieva and Vladik Kreinovich. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In many practical situations like weather prediction, we are interested in large-scale (averaged) values of the predicted quantities. For example, it is impossible to predict the exact future temperature at different spatial locations, but we can reasonably well predict average temperature over a region. Traditionally, to obtain such large-scale predictions, we first perform a detailed integration of the corresponding differential equation and then average the resulting detailed solution. This procedure is often very time-consuming, since we need to process all the details of the original data. In our previous papers, we have shown that large-scale prediction results of similar quality can be obtained if, instead, we apply a much faster procedure: first average the inputs (by applying an appropriate fuzzy transform) and then use these averaged inputs to solve the corresponding (discretization of the) differential equation. In this paper, we provide a general theoretical explanation of why our semiheuristic method works, that is, why fuzzy transforms are efficient in large-scale predictions.

1. Formulation of the Problem

1.1. Predictions Are Needed

One of the main objectives of science is to predict the future values of physical quantities. For example, it is desirable to predict tomorrow's weather, the weather for several days ahead, and so forth. For a spreading flu epidemic, it is desirable to predict how this epidemic will spread if we do not introduce any restrictions on travel, and how this spread will change if such restrictions are introduced.

1.2. Detailed Predictions Are Often Impossible

Of course, ideally, it is desirable to have predictions which are as detailed as possible. For example, ideally, we would like to know the exact value of tomorrow's temperature and wind speed at all possible spatial locations within a given region, or to predict exactly where the epidemic will spread and exactly how many people will fall ill if we do not introduce any travel restrictions.

However, in many practical situations, such a detailed prediction is impossible. In some of these situations, prediction is potentially possible, but it requires such a large amount of computations that even on the fastest modern computers, the computations finish long after the future event (that we are trying to predict) has already occurred.

1.3. Large-Scale Predictions Are Usually Sufficient

In many practical situations in which we cannot predict the exact values of the future quantities, it is often sufficient to predict the average values of the future quantities, averaged over certain areas.

For example, from the practical viewpoint, even though we cannot predict the exact value of tomorrow's temperature at all possible spatial locations, it would be beneficial to predict the average temperature over a given small geographic region. Similarly, for an epidemic, even though we are unable to predict where exactly it will spread and how many people will fall ill in different small towns, it is very beneficial to be able to predict how many people on average will get ill in the region.

For predicting time series, for example, financial time series formed by the prices of different stocks at different moments of time, though it is impossible to predict the exact values of the future prices, it is desirable to at least be able to predict the trends, that is, the prices averaged over a certain time period.

Comment 1.3.

For clarity and simplicity, in the following text, we will describe the case when both the input x(t) and the output y(t) depend only on time t. The exact same formulas can also be applied if we have a spatial dependence; in this case, t and s are the corresponding spatial points.

1.4. Towards a Precise Mathematical Description of Quantities Predicted by Large-Scale Prediction

Instead of predicting the values $y(t)$ for different moments of time $t$, we predict the weighted averages $\bar{y}(t)$, that is, the average of the values $y(s)$ for the values $s$ which are close to $t$.

It is reasonable to assume that for different moments $t$ we use the same averaging, that is, the weight with which the value $y(s)$ contributes to $\bar{y}(t)$ depends only on the difference $t-s$ and not on the absolute values of $t$ or $s$. Under this assumption, the general formula for the weighted average takes the form

$$\bar{y}(t) = \int w(t-s)\, y(s)\, ds, \tag{1}$$

where all the weights are nonnegative and, for each $t$, the total weight of all the values $y(s)$ is equal to 1:

$$\int w(s)\, ds = 1. \tag{2}$$

1.5. An Example and a Useful Equivalent Reformulation of Averaging

A natural example of such averaging is a Gaussian averaging, where we use Gaussian weights:

$$w(s) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{s^2}{2\sigma^2}\right). \tag{3}$$

It is often convenient to represent this Gaussian weight function as

$$w(s) = \mathrm{const} \cdot W(s), \tag{4}$$

where the new weight function $W(s)$ is described by a simpler formula

$$W(s) = \exp\left(-\frac{s^2}{2\sigma^2}\right). \tag{5}$$

This new weight function satisfies the property

$$\max_s W(s) = 1. \tag{6}$$
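The relation between the two Gaussian weight functions can be checked numerically. The following Python sketch is only an illustration: the function names, the value of sigma, and the midpoint-rule integration are our own assumptions, not part of the original text.

```python
import math

def gaussian_W(s, sigma):
    """Unnormalized Gaussian weight W(s) = exp(-s^2 / (2 sigma^2)); W(0) = 1 is its maximum."""
    return math.exp(-s * s / (2.0 * sigma * sigma))

def gaussian_w(s, sigma):
    """Normalized Gaussian weight w(s) = W(s) / (sqrt(2 pi) * sigma)."""
    return gaussian_W(s, sigma) / (math.sqrt(2.0 * math.pi) * sigma)

def midpoint_integral(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

sigma = 1.5  # illustrative value (assumption)

# Total weight of w over a window wide enough to hold essentially all of the mass.
total_weight = midpoint_integral(lambda s: gaussian_w(s, sigma),
                                 -10 * sigma, 10 * sigma, 20000)

# The constant that links w and W: w(s) = const * W(s).
const = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
```

Under these assumptions, `total_weight` is numerically equal to 1, while `gaussian_W` reaches its maximum value 1 at s = 0.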

1.6. Large-Scale Quantities and Fuzzy Transform

A similar representation is often useful for other weight functions as well. In general, once we know this new weight function $W(s)$, we can use the normalization condition (2) to find that

$$w(s) = \frac{W(s)}{\int W(t)\, dt}. \tag{7}$$

Thus, in terms of the new weight function $W(s)$, the weighted average (1) takes the form

$$\bar{y}(t) = \frac{\int W(t-s)\, y(s)\, ds}{\int W(s)\, ds}. \tag{8}$$

Expression (8) is a particular case of the expression of a fuzzy transform, which is, in general, defined as

$$Y = \frac{\int A(s)\, y(s)\, ds}{\int A(s)\, ds} \tag{9}$$

for some function $A(s) \ge 0$ for which $\max_s A(s) = 1$. For a special uniform case [2, 3], we have several functions $A(s)$ of the form $A_n(s) = W(t_n - s)$, where $W(s)$ is a given function. The corresponding values $Y_n$ of the fuzzy transforms are then equal to

$$Y_n = \frac{\int A_n(s)\, y(s)\, ds}{\int A_n(s)\, ds} = \frac{\int W(t_n - s)\, y(s)\, ds}{\int W(s)\, ds}, \tag{10}$$

that is, they coincide with the values $\bar{y}(t_n)$ corresponding to different points $t_n$.

Thus, from the mathematical viewpoint, the weighted averages are simply the values of the fuzzy transform.
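As a minimal sketch of how such fuzzy transform values can be computed from sampled data, the Python code below replaces the integrals by sums over sample points. The function names, the Gaussian shape, the grid, and the nodes are assumptions made for illustration. The denominator is taken as the sum of the weights actually used, which is a discrete stand-in for the integral of W and differs from it near the ends of the sampled interval.

```python
import math

def f_transform(y, grid, nodes, W):
    """Discrete fuzzy transform components:
    Y_n = sum_k W(t_n - s_k) * y_k / sum_k W(t_n - s_k)."""
    components = []
    for t_n in nodes:
        weights = [W(t_n - s_k) for s_k in grid]
        components.append(sum(w * v for w, v in zip(weights, y)) / sum(weights))
    return components

def gaussian_W(s, sigma=1.0):
    """Gaussian basis shape with max_s W(s) = W(0) = 1."""
    return math.exp(-s * s / (2.0 * sigma * sigma))

grid = [0.1 * k for k in range(101)]   # sample points s_k on [0, 10] (assumption)
nodes = [2.0, 5.0, 8.0]                # hypothetical nodes t_n

# A constant signal is reproduced by any weighted average.
constant = f_transform([3.0] * len(grid), grid, nodes, gaussian_W)

# For the signal y(s) = s, the symmetric window around the central node returns the node.
linear = f_transform(grid, grid, nodes, gaussian_W)
```

The values for the outer nodes 2.0 and 8.0 are slightly pulled toward the middle of the interval, because the Gaussian window is cut off at the ends of the grid.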

1.7. Typical Prediction Procedure: Solving a Differential Equation

Most relations in physics are described by differential equations. In particular, the relation between the observed signals x(t) and the predicted values y(t) can also be described by a differential equation.

1.8. Traditional Procedure for Large-Scale Predictions

Since prediction usually means solving a known differential equation, a usual procedure for large-scale predictions is as follows:

first, we use the known values x(t) to solve the differential equations and get the values y(t);

then, we apply the weighted average procedure (8) to the resulting values $y(t)$ and get the desired large-scale predictions $\bar{y}(t)$.

1.9. Drawbacks of the Traditional Procedure

The main drawback of the traditional procedure is that we spend a lot of computation time to get a detailed solution y(t)—but at the end, we only return a few values corresponding to large-scale predictions.

For example, in weather prediction, we spend hours of computer time on high-performance supercomputers to solve a complex system of differential equations with thousands of variables and then only use the large-scale weighted average of this solution.

1.10. Natural Idea

We are only interested in large-scale predictions, that is, only in the weighted averages of the result y(t) of solving the differential equation, averages that ignore the fine structure of the solution y(t). So why not start with the averaged values of the input x(t), that is, why not ignore the fine structure of x(t) from the very beginning and thus save computation time?

In other words,

traditionally, we first integrate the differential equation and then average the solution;

what we propose is that we first average and only then integrate; in this manner, we will need fewer values to integrate and, thus, less computation time.

1.11. Empirically, This Idea Seems to Work

For several differential equations, we implemented the above idea of how to speed up computations. Specifically,

first, instead of the original input $x(t)$, we use the fuzzy transform values $X_1, \ldots, X_n$;

then, we use the values $X_i$ in the discretized version of the original differential equation;

finally, we use the results $Y_1, \ldots, Y_n$ of this solution as an estimate for the desired large-scale averages (that is, for the fuzzy transform of $y(t)$).

Surprisingly, we got a very good approximation to the values $Y_i$ computed from the detailed solution $y(t)$.
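As a rough illustration of the steps above, the following Python sketch compares the two orders of operations on a toy problem. Everything specific in it is an assumption made for illustration and is not taken from our experiments: the test equation dy/dt = -a*y + x(t), the explicit Euler discretization, the Gaussian weights, the grid sizes, and all names.

```python
import math

def gaussian_average(values, grid, nodes, sigma):
    """Discrete Gaussian-weighted averages (fuzzy transform components) of `values` at `nodes`."""
    out = []
    for t_n in nodes:
        w = [math.exp(-(t_n - s) ** 2 / (2.0 * sigma ** 2)) for s in grid]
        out.append(sum(wk * vk for wk, vk in zip(w, values)) / sum(w))
    return out

def euler(x_values, step, a, y_start=0.0):
    """Explicit Euler scheme for the illustrative equation dy/dt = -a*y + x(t)."""
    y = [y_start]
    for x_k in x_values[:-1]:
        y.append(y[-1] + step * (-a * y[-1] + x_k))
    return y

# Hypothetical test problem: a slow trend plus fine structure in the input.
a, T, dt, h, sigma = 0.5, 20.0, 0.01, 0.25, 0.5
fine_grid = [k * dt for k in range(int(round(T / dt)) + 1)]
nodes = [n * h for n in range(int(round(T / h)) + 1)]
x_fine = [math.sin(0.3 * t) + 0.5 * math.sin(20.0 * t) for t in fine_grid]

# Traditional order: integrate on the fine grid, then average the solution.
y_fine = euler(x_fine, dt, a)
Y_traditional = gaussian_average(y_fine, fine_grid, nodes, sigma)

# Proposed order: average the input first, then integrate on the coarse grid.
X = gaussian_average(x_fine, fine_grid, nodes, sigma)
Y_fast = euler(X, h, a)

# Compare away from the ends of the interval, where the averaging window is truncated.
interior = [n for n, t in enumerate(nodes) if 3 * sigma <= t <= T - 3 * sigma]
max_gap = max(abs(Y_fast[n] - Y_traditional[n]) for n in interior)
```

In this toy setting, the coarse integration takes 80 Euler steps instead of 2000, and `max_gap` measures how far the fast large-scale values are from the traditional ones at the interior nodes.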

1.12. What We Do in This Paper

In this paper, we provide a theoretical explanation for the empirical success of the fuzzy-transform-based methods of speeding up computations.

This explanation makes us confident that this fuzzy transform technique can be successfully used in other large-scale prediction problems as well.

2. Theoretical Explanation

2.1. Linearization

Usually, the effect of each input value x(t) on the prediction results is small. In this sense, we can say that the inputs are relatively small. Thus, we can use the standard technique of dealing with dependence on small values:

expand the dependence of y(t) on x(s) in a Taylor series,

ignore quadratic and higher order terms, and thus

keep only linear terms in this dependence.

In this case, we get the following dependence:

$$y(t) = y_0(t) + \int y_1(t,s)\, x(s)\, ds, \tag{11}$$

for some functions $y_0(t)$ and $y_1(t,s)$.
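To see where this linear form comes from, it may help to look at a discretized version. The following is a sketch in notation of our own choosing (the sample points $s_j$, the spacing $\Delta s$, and the functions $F_t$ are assumptions introduced only for this illustration): suppose that the input is represented by finitely many values $x_j = x(s_j)$ and that $y(t) = F_t(x_1, \ldots, x_n)$ for some smooth function $F_t$.

```latex
% First-order Taylor expansion of the discretized dependence around x = 0:
y(t) = F_t(0,\ldots,0)
     + \sum_{j=1}^{n} \frac{\partial F_t}{\partial x_j}(0,\ldots,0)\, x_j
     + O\!\left(\|x\|^2\right).
% Denote the constant term and the rescaled partial derivatives by
y_0(t) = F_t(0,\ldots,0), \qquad
y_1(t,s_j)\,\Delta s = \frac{\partial F_t}{\partial x_j}(0,\ldots,0).
% Dropping the quadratic and higher-order terms gives a Riemann sum:
y(t) \approx y_0(t) + \sum_{j=1}^{n} y_1(t,s_j)\, x(s_j)\, \Delta s
     \;\longrightarrow\; y_0(t) + \int y_1(t,s)\, x(s)\, ds
     \quad (\Delta s \to 0).
```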

2.2. Shift-Invariance

We are interested in systematic predictions, predictions that need to be repeated again and again. In these predictions, there is no fixed moment of time: if we start with the same input repeated later (i.e., shifted in time, from $x(t)$ to $x_{\mathrm{new}}(t) = x(t - t_0)$), we get the same result (similarly shifted) $y_{\mathrm{new}}(t) = y(t - t_0)$.

For the formula (11), this shift-invariance means that

first, we must have $y_0(t) = y_0(t - t_0)$ for all $t$ and $t_0$; in particular, for $t_0 = t$, we conclude that $y_0(t) = y_0(0)$, that is, $y_0$ should not depend on time at all: $y_0(t) = y_0$;

second, we must have $y_1(t,s) = y_1(t - t_0, s - t_0)$ for all $t$, $s$, and $t_0$; in particular, for $t_0 = s$, we conclude that $y_1(t,s) = y_1(t-s, 0)$, that is, the function $y_1(t,s)$ should only depend on the difference $t-s$.

Thus, we arrive at the following dependence:

$$y(t) = y_0 + \int y_1(t-s)\, x(s)\, ds. \tag{12}$$
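The shift-invariance of a dependence of this convolution type can be checked on a small discrete example. In the Python sketch below, time is assumed to be periodic (cyclic indices) so that shifts are exact. This periodicity, the sample values, and the function names are assumptions made for illustration only.

```python
def respond(x, kernel, y0):
    """Discrete periodic analogue of y(t) = y0 + integral of y1(t - s) * x(s) ds."""
    n = len(x)
    return [y0 + sum(kernel[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

def shift(signal, m):
    """Cyclic shift by m samples: shifted[i] = signal[i - m]."""
    n = len(signal)
    return [signal[(i - m) % n] for i in range(n)]

x = [0.0, 1.0, 4.0, -2.0, 0.5, 3.0, 0.0, -1.0]         # arbitrary input samples
kernel = [0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0, 0.0]   # hypothetical samples of y1
y0 = 2.0

y = respond(x, kernel, y0)

# Feeding the shifted input must give the shifted output.
y_new = respond(shift(x, 3), kernel, y0)
mismatch = max(abs(p - q) for p, q in zip(y_new, shift(y, 3)))
```

Here `mismatch` is zero up to rounding: because the kernel depends only on the index difference, shifting the input by three samples shifts the output by the same three samples.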

2.3. Main Result: Formulation

In the traditional approach, we first find the detailed output (12) and then average it by applying the averaging

$$\bar{y}(t) = \frac{\int W(t-s)\, y(s)\, ds}{\int W(s)\, ds}. \tag{13}$$

An alternative approach is to first apply the same averaging to the original signal $x(t)$, resulting in

$$\bar{x}(t) = \frac{\int W(t-s)\, x(s)\, ds}{\int W(s)\, ds}, \tag{14}$$

and try to use this averaged signal $\bar{x}(t)$ as the input to the corresponding dynamical system (i.e., in effect, to transformation (12)):

$$\bar{y}_f(t) = y_0 + \int y_1(t-s)\, \bar{x}(s)\, ds. \tag{15}$$

Our claim is that these two approaches always lead to the same result, that is,

$$\bar{y}_f(t) = \bar{y}(t) \tag{16}$$

for all moments of time $t$.

Proof.

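In terms of the normalized weight function (7), the traditionally averaged output has the form

$$\bar{y}(t) = \int w(t-s)\, y(s)\, ds, \tag{17}$$

where $y(s)$ is determined by formula (12). Substituting the expression

$$y(s) = y_0 + \int y_1(s-u)\, x(u)\, du \tag{18}$$

into formula (17), and taking into account that, due to the normalization condition (2), the constant term contributes $\int w(t-s)\, y_0\, ds = y_0$, we conclude that

$$\bar{y}(t) = y_0 + \iint w(t-s)\, y_1(s-u)\, x(u)\, ds\, du, \tag{19}$$

that is,

$$\bar{y}(t) = y_0 + \int \bar{w}(t,u)\, x(u)\, du, \tag{20}$$

where

$$\bar{w}(t,u) \stackrel{\mathrm{def}}{=} \int w(t-s)\, y_1(s-u)\, ds. \tag{21}$$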

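Similarly, in terms of the normalized weight function $w(t)$, we have

$$\bar{x}(t) = \int w(t-s)\, x(s)\, ds. \tag{22}$$

Substituting the corresponding formula

$$\bar{x}(s) = \int w(s-u)\, x(u)\, du \tag{23}$$

into expression (15) for $\bar{y}_f(t)$, we conclude that

$$\bar{y}_f(t) = y_0 + \iint y_1(t-s)\, w(s-u)\, x(u)\, ds\, du, \tag{24}$$

that is,

$$\bar{y}_f(t) = y_0 + \int \bar{w}_f(t,u)\, x(u)\, du, \tag{25}$$

where

$$\bar{w}_f(t,u) \stackrel{\mathrm{def}}{=} \int y_1(t-s)\, w(s-u)\, ds. \tag{26}$$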

In view of formulas (20) and (25), to prove that the values $\bar{y}(t)$ and $\bar{y}_f(t)$ always coincide, it is sufficient to prove that the corresponding functions $\bar{w}(t,u)$ and $\bar{w}_f(t,u)$ coincide for all $t$ and $u$. These functions are defined by expressions (21) and (26).

To prove that these expressions coincide, let us transform one into the other. In expression (26), the normalized weight function $w$ is evaluated at the point $s-u$. In contrast, in expression (21), it is evaluated at the point $t-s$, where $s$ is the integration variable. To transform expression (26) into the form (21), let us introduce a new integration variable $v$ for which $s-u = t-v$. From this formula, we conclude that $s = t+u-v$, hence $t-s$ takes the form $t-(t+u-v) = v-u$. Thus, in terms of the new variable $v$, the integrand in (26) takes the form

$$y_1(t-s)\, w(s-u) = y_1(v-u)\, w(t-v) = w(t-v)\, y_1(v-u). \tag{27}$$

Since the substitution $s = t+u-v$ maps the whole real line onto itself and $|ds| = |dv|$, the integrals of these two expressions also coincide:

$$\int y_1(t-s)\, w(s-u)\, ds = \int w(t-v)\, y_1(v-u)\, dv. \tag{28}$$

The right-hand side of this equality is exactly expression (21); the only difference is that we use a different name for the integration variable ($v$ instead of $s$). Thus, the functions $\bar{w}(t,u)$ and $\bar{w}_f(t,u)$ indeed coincide and, hence, $\bar{y}_f(t) = \bar{y}(t)$.

The equality is proven.
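The same equality can be observed numerically in a discrete setting. The Python sketch below assumes periodic (cyclic) time so that both orders of operations are computed exactly on the same grid. The sample values, the weights, and the function name are hypothetical.

```python
def cyclic_convolve(kernel, signal):
    """Cyclic convolution: result[i] = sum_j kernel[(i - j) mod n] * signal[j]."""
    n = len(signal)
    return [sum(kernel[(i - j) % n] * signal[j] for j in range(n)) for i in range(n)]

x = [0.0, 1.0, 4.0, -2.0, 0.5, 3.0, 0.0, -1.0]          # arbitrary input samples
y1 = [0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0, 0.0]        # hypothetical samples of y1
w = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]          # nonnegative weights, sum = 1
y0 = 2.0

# Traditional order: compute the detailed output, then average it.
y = [y0 + c for c in cyclic_convolve(y1, x)]
y_bar = cyclic_convolve(w, y)

# Proposed order: average the input, then apply the same linear transformation.
x_bar = cyclic_convolve(w, x)
y_bar_f = [y0 + c for c in cyclic_convolve(y1, x_bar)]

gap = max(abs(p - q) for p, q in zip(y_bar, y_bar_f))
```

In this linear, shift-invariant setting, `gap` is zero up to rounding. The constant term survives the averaging unchanged precisely because the weights sum to 1.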

Comment 2.3.

In the ideal case, when quadratic terms can be completely ignored and there is no dependence on absolute time, the new method leads to exactly the same large-scale predictions as the traditional one. In practice, if we take into account that

the quadratic terms are small but non-zero, and that

there may be an underlying trend-like dependence on absolute time (like global warming in weather prediction),

we end up with approximate equality between the traditional and fuzzy-transform-based predictions, and this approximate equality is what we observed in our experiments.

Since large-scale predictions are approximate anyway, this approximate equality means that, in terms of accuracy, the new predictions are, in effect, as good as the traditional ones. Since the new predictions are much faster to compute, they have a clear practical advantage.
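The effect of small quadratic terms can be illustrated on a toy model. In the Python sketch below, the dependence y = y0 + c + eps * c^2, where c is the linear response to the input, is a hypothetical example of a weak nonlinearity. The cyclic time grid, the sample values, and the names are likewise assumptions made for illustration.

```python
def cyclic_convolve(kernel, signal):
    """Cyclic convolution: result[i] = sum_j kernel[(i - j) mod n] * signal[j]."""
    n = len(signal)
    return [sum(kernel[(i - j) % n] * signal[j] for j in range(n)) for i in range(n)]

def gap_between_orders(eps, x, y1, w, y0=0.0):
    """Largest difference between 'predict, then average' and 'average, then predict'
    for the toy model y = y0 + c + eps * c**2, where c is the cyclic convolution of y1 and x."""
    def predict(inp):
        return [y0 + c + eps * c * c for c in cyclic_convolve(y1, inp)]
    traditional = cyclic_convolve(w, predict(x))
    fast = predict(cyclic_convolve(w, x))
    return max(abs(p - q) for p, q in zip(traditional, fast))

x = [0.0, 1.0, 4.0, -2.0, 0.5, 3.0, 0.0, -1.0]          # arbitrary input samples
y1 = [0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0, 0.0]        # hypothetical samples of y1
w = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]          # nonnegative weights, sum = 1

gap_linear = gap_between_orders(0.0, x, y1, w)     # purely linear model
gap_small = gap_between_orders(0.01, x, y1, w)     # weak quadratic term
gap_larger = gap_between_orders(0.1, x, y1, w)     # stronger quadratic term
```

In this toy model, the two orders agree exactly when eps = 0, and for eps > 0 the discrepancy grows in proportion to eps (it equals eps times a local weighted variance of the linear response), so a small quadratic term leads to a small discrepancy.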

Acknowledgments

This work was supported in part by the National Science Foundation Grant HRD-0734825, by Grant 1 T36 GM078000-01 from the National Institutes of Health, and by Grant MSM 6198898701 from MŠMT of Czech Republic. The authors are thankful to the anonymous referees for valuable suggestions.

References

[1] I. Perfilieva, "Fuzzy transforms," in Transactions on Rough Sets II, J. F. Peters, Ed., vol. 3135 of Lecture Notes in Computer Science, pp. 63–81, Springer, 2004.

[2] I. Perfilieva, "Fuzzy transforms: theory and applications," Fuzzy Sets and Systems, vol. 157, no. 8, pp. 993–1023, 2006. doi:10.1016/j.fss.2005.11.012

[3] I. Perfilieva, "Fuzzy transforms: a challenge to conventional transforms," Advances in Imaging and Electron Physics, vol. 147, pp. 137–196, 2007. doi:10.1016/S1076-5670(07)47002-1

[4] F. Di Martino, V. Loia, and S. Sessa, "Fuzzy transforms method and attribute dependency in data analysis," Information Sciences, vol. 180, no. 4, pp. 493–505, 2010. doi:10.1016/j.ins.2009.10.012

[5] F. Di Martino, V. Loia, and S. Sessa, "Fuzzy transforms method in prediction data analysis," Fuzzy Sets and Systems, vol. 180, no. 1, pp. 146–163, 2011. doi:10.1016/j.fss.2010.11.009

[6] V. Novák, M. Štěpnička, I. Perfilieva, and V. Pavliska, "Analysis of periodical time series using soft computing methods," in Proceedings of the Computational Intelligence in Decision and Control (FLINS '08), pp. 55–60, World Scientific, Singapore, September 2008.

[7] I. Perfilieva, V. Novák, and A. Dvořák, "Fuzzy transform in the analysis of data," International Journal of Approximate Reasoning, vol. 48, no. 1, pp. 36–46, 2008. doi:10.1016/j.ijar.2007.06.003

[8] I. Perfilieva, V. Novák, V. Pavliska, A. Dvořák, and M. Štěpnička, "Analysis and prediction of time series using fuzzy transform," in Proceedings of the IEEE World Congress on Computational Intelligence, pp. 3875–3879, Hong Kong, 2008.

[9] M. Štěpnička, A. Dvořák, V. Pavliska, and L. Vavříčková, "Linguistic approach to time series analysis and forecasts," in Proceedings of the IEEE World Congress on Computational Intelligence (WCCI '10), Barcelona, Spain, 2010.

[10] M. Štěpnička, A. Dvořák, V. Pavliska, and L. Vavříčková, "Fuzzy approach to time series analysis: part II – problems of linguistic descriptions," in Proceedings of the 5th International Conference on Fuzzy Set Theory and Applications (FSTA '10), p. 127, 2010.

[11] M. Štěpnička, A. Dvořák, V. Pavliska, and L. Vavříčková, "A linguistic approach to time series modeling with help of the F-transform," Fuzzy Sets and Systems, vol. 180, no. 1, pp. 164–184, 2011.

[12] M. Štěpnička, V. Pavliska, V. Novák, and I. Perfilieva, "Time series analysis and prediction based on fuzzy rules and the fuzzy transform," in Proceedings of the IFSA World Congress/EUSFLAT Conference, pp. 483–488, Universidade Técnica de Lisbon, Lisbon, Portugal, 2009.