Spurious Regression

The spurious regression phenomenon in least squares occurs for a wide range of data generating processes, such as driftless unit roots, unit roots with drift, long memory, trend stationarity, and broken-trend stationarity. Indeed, spurious regressions have played a fundamental role in the building of modern time series econometrics and have revolutionized many of the procedures used in applied macroeconomics. Spin-offs from this research range from unit-root tests to cointegration and error-correction models. This paper provides an overview of results about spurious regression, pulled from disparate sources, and explains their implications.


Introduction
During the last 30 years, econometric theory has undergone a revolution. In the late seventies, economists and econometricians recognized that insufficient attention was being paid to trending mechanisms and that, in fact, most macroeconomic variables were probably nonstationary. Such an appraisal gave rise to an extraordinary development that substantially modified the way empirical studies in time-series econometrics are carried out. Research in nonstationarity has advanced significantly since the early important papers, such as Granger and Newbold [1], Davidson et al. [2], Hendry and Mizon [3], Plosser and Schwert [4], Bhattacharya et al. [5], and Phillips [6]. Nelson and Plosser [7] asserted that many relevant U.S. macroeconomic time series were governed by a unit root (a random trending mechanism), based on Dickey and Fuller's [8] unit-root test. Several years later, Perron [9] argued that the trending mechanism in macro variables was deterministic in nature, with some transcendent structural breaks. The debate continues between "unit rooters" and "deterministic trenders", though there is very general consensus as to the presence of a trending mechanism in the levels of most macroeconomic series. In the words of Durlauf and Phillips [10]: "Traditional analyses of economic time series frequently rely on the assumption that the time series in question are stationary, ergodic processes . . . . However, the assumptions of the traditional theory do not provide much solace to the empirical worker. Even casual examination of such time series as GNP reveals that the series do not possess constant means."
Econometrics should work hand-in-hand with economic theory by providing it with the tools it requires to understand economic activity. The modeling of trending mechanisms is thus a major goal of time-series econometrics; that said, it should be acknowledged that neither unit roots nor deterministic trends are able to model most series satisfactorily: they are what Phillips [11] labeled "heroically naive" concepts. Spurious regression can be considered as having played a fundamental role in this development. To understand it, we paraphrase Granger et al. [12], who provide an illuminating definition: "A spurious regression occurs when a pair of independent series, but with strong temporal properties, is found apparently to be related according to standard inference in a Least Squares regression." Phillips [6] showed analytically that, when regressing two independent stochastic trends, the estimates of the regression coefficient do not converge to their real value of zero (Phillips's [6] results are detailed in a simple case in Appendix A). Particularly noteworthy among the various spin-offs of the aforementioned studies is the Error Correction Model, first proposed by Sargan [13] as a link between static equilibrium economic theory and dynamic empirical models, and further developed by Hendry. In this way, spurious regression has contributed to the general improvement in the level of empirical work. Phillips [19] presented a counterargument on the usefulness of spurious regression: trend specifications are just coordinate systems for representing behavior over time. Phillips argues that even if the series are statistically independent, when they include a trending mechanism in their DGP, they admit a regression representation, even in the absence of cointegration. This is in sharp contrast to the usual concept of spurious regression, which we usually conceive as the statistical identification of a commonality of trending mechanisms. Phillips ventures that such "spurious" results constitute
an adequate representation of the data. His main result applies to regressions of stochastically trended series on time polynomials, as well as to regressions among independent random walks. Phillips [19] proves that Brownian motions can be represented by deterministic functions of time with random coefficients. Given that standardized discrete time series with a unit root (hereinafter UR) converge weakly to Brownian motion processes, it is argued that deterministic time functions may be used to model them. Such representations include polynomial trends and trend breaks, as well as sinusoidal trends; it is also proved that a stochastic process can represent an arbitrary deterministic function on a particular interval, so a regression of a UR process on an independent UR process is also a valid representation of the data. In both cases, the t-statistics diverge at rate T^{1/2}, which is consistent, since such parameterizations reflect a partial, though correct, specification of the DGP. One of the most significant conclusions of Phillips concerns the long-standing debate of UR versus trend stationarity. To quote Phillips: ". . . Our results show that such specifications [trend stationary processes] are not, in fact, really alternatives to a UR model at all. Since the UR processes have limiting representations entirely in terms of these [deterministic] functions, it is apparent that we can mistakenly "reject" a UR model in favor of a trend "alternative" when in fact that alternative model is nothing other than an alternate representation of the UR process itself."
This perspective has the virtue of allowing variables with different trending mechanisms (deterministic or stochastic) to be related, without being limited to the somewhat restrictive case of cointegration. Phillips advances this as an appropriate approach to the study of stochastically unbalanced relationships, such as those that may arise between variables like interest rates, inflation, money stock, and GDP (for further detail, see Phillips [20]). To the best of our knowledge, little has been done in the way of bringing the most important works in this field together, treating them in any kind of standardized way, making connections between them, or studying the profound implications for economics they might carry. This article aims to rectify this situation.

Appraisal of the Spurious Phenomenon
Much progress has been made with Least Squares statistical inference since it was first proposed, more than two centuries ago, as a means of estimating the course of comets [21]. Theoretical developments in econometrics address the nonexperimental nature of economic data sets. Least Squares (LS) offers a trade-off between simplicity and powerful inference. Nevertheless, LS has certain limitations, such as the potential confusion between correlation and causality, and, used unwisely, it may produce misleading evidence. Statisticians and econometricians have been aware of the "spurious phenomenon" since Yule [22] and Pearson [23]; for excellent reviews of these works, see Hendry and Morgan [24] and Aldrich [25]. These results led to the common expertise in the time-series field that indicated the need to differentiate potentially nonstationary series before using them to run regressions, or to detrend them by fitting trend lines estimated with LS; see Morgan [26]. There are many examples of spurious regression. Some of these are commented on in Phillips [19], where we discover the implausible relationship between "the number of ordained ministers and the rate of alcoholism in Great Britain in the nineteenth century"; the equally "remarkable relationship" presented in Yule [27] concerning the "proportion of Church of England marriages to all marriages and the mortality rate over the period 1866-1911"; and the "strange relationship" between the price level and cumulative rainfall in the UK, which was advanced as a curious alternative version of quantitative theory by Hendry [28]. Plosser and Schwert [4] presented another example of nonsense correlation when they proposed their quantity theory of sunspots. The main argument is that the log of nominal income can be explained by the log of accumulated sunspots. Not only did they find statistically significant estimates, but the goodness of fit, measured with the R², is quite high: 0.82 (a variant of this example, used by the authors to demonstrate the
danger of nonsense correlation, was taken seriously 100 years ago by Jevons [29]). Granger and Newbold [1] computed a Monte Carlo experiment in which a number of regressions, specified as in (1), y_t = α + βx_t + u_t, with t = 1, . . ., T, T being the sample size, were run using simulated variables, each perfectly independent of the others. The variables x_t and y_t are independent; under standard regularity conditions, LS delivers no evidence of a linear relationship between y and x, and, in particular, β should be statistically equal to zero. Nevertheless, in Granger and Newbold's [1] experiment the variables were generated as I(1) processes (the I(d) notation, for d an integer, refers to the number of differences to be performed so that the variable becomes stationary), and standard inference pointed, spuriously, to a significant relationship.
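The experiment is easy to replicate. The sketch below is our own illustration, not the authors' original code; the sample size, replication count, and seed are arbitrary choices. It regresses pairs of independent driftless random walks and records how often the nominal 5% t-test on β rejects:

```python
import numpy as np

def slope_tstat(y, x):
    """t-statistic of the slope in y = a + b*x + u, classical standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    s2 = u @ u / (T - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
T, reps = 200, 500
rej = sum(abs(slope_tstat(np.cumsum(rng.standard_normal(T)),    # independent
                          np.cumsum(rng.standard_normal(T)))) > 1.96  # random walks
          for _ in range(reps))
rate = rej / reps
print(f"rejection rate of H0: beta = 0 at the nominal 5% level: {rate:.2f}")
```

Even though y and x are independent by construction, the rejection rate comes out far above 5%, in line with Granger and Newbold's findings.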

Data Generating Processes
Research on spurious regression has been making use of increasingly complex Data Generating Processes (DGPs). In what follows, break fractions satisfy T_b/T ∈ (0, 1), where T is the sample size, and the long-memory parameter satisfies d ∈ (−1/2, 3/2). Only DGPs 1, 2 and 10 (for d < 0.5) satisfy the weak stationarity definition. The remaining DGPs generate nonstationary series (notice that DGP 6 is a special case of DGP 10; nevertheless, both will be treated as if they were different, given that each has been considered independently in the literature). FI processes deserve further discussion: contrary to regular ARMA(p, q) processes (made popular by Box and Jenkins in the 1970s), such as DGP 1, whose autocorrelation function decays at an exponential rate (short memory), FI(d) processes have an autocorrelation function that decays at a hyperbolic rate (long memory).
There are a number of empirical examples of time series in which dependence decays slowly across time. This phenomenon, known as long memory or long-range dependence, was observed in geophysical data, such as river flow data [32], and in climatological series [33], as well as in economic time series [34]. In three important papers [35-37], the authors extended these processes to provide more flexible low-frequency (long memory) behavior by considering I(d) processes with noninteger values of d. As pointed out by Granger and Joyeux [36], "It was standard procedure to consider differencing time series to achieve stationarity" (thus obtaining a form of the series that can be identified as an ARMA model); however, "Some econometricians were reluctant to this technique, believing that they may be losing information, by zapping out the low frequency components." But using infinite-variance series without differencing them was also a source of difficulties at that time. Fractional integration encompasses ARMA models for d = 0 and ARIMA models for d = 1. The process is stationary and ergodic when d ∈ (−1/2, 1/2); nonstationary but mean-reverting (that is, it returns to its equilibrium or long-run behavior after any random shock) when 1/2 < d < 1; and nonstationary and mean-averting when d ≥ 1. Mean stationary processes (DGPs 2 and 3) have been used to model the behavior of real exchange rates, unemployment rates, the current account, and several great ratios, such as the output-capital ratio and the consumption-income ratio. Unemployment has also been conceived as a nonstationary fractionally integrated process [38]. In Appendix B, a few more examples are provided, in which the link between time-series econometrics and important economic issues is acknowledged.
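The contrast between exponential (short memory) and hyperbolic (long memory) decay can be seen directly from the moving-average weights of the two classes of processes. A minimal sketch of our own, with d = 0.4 and an AR coefficient of 0.5 as arbitrary illustrative values:

```python
import numpy as np

def fi_ma_weights(d, n):
    """MA(infinity) weights psi_j of (1 - L)^(-d): psi_0 = 1 and
    psi_j = psi_{j-1} * (j - 1 + d) / j, which decay hyperbolically, ~ j^(d-1)."""
    psi = np.empty(n + 1)
    psi[0] = 1.0
    for j in range(1, n + 1):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi

lag = 50
psi = fi_ma_weights(0.4, lag)       # long memory: FI(d) with d = 0.4
phi = 0.5 ** np.arange(lag + 1)     # short memory: AR(1) weights with rho = 0.5
print(f"weight at lag {lag}: FI(0.4) {psi[lag]:.4f} vs AR(0.5) {phi[lag]:.1e}")
```

At lag 50 the fractional weight is still of visible magnitude while the AR(1) weight is numerically negligible; an FI(d) sample path can be simulated by convolving these weights with i.i.d. innovations.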

Spurious Regression Since the Roaring Twenties
We now begin our survey of the development of the theory of spurious regression. The related literature is vast, for which reason we focus mainly on a limited selection of articles which, in our view, are particularly representative. As noted earlier, Pearson [23] and Yule [22] became aware of several problems concerning the interpretation of the correlation coefficient at the end of the nineteenth century, forging along the way the concept of spurious correlation. The debate, at that time, was greatly concerned with the interpretation of correlation as causality, although the seeds of the modern interpretation that prevails in econometrics are also present; further details can be obtained from Aldrich's [25] historical review. Unless otherwise specified, the regression specification for which all asymptotics are presented is hereinafter expression (1). We focus mainly on the rate of divergence of the relevant t-ratios and leave aside, in most cases, the asymptotic distributions that would be obtained had the statistics been correctly normalized. This is because such distributions have nuisance parameters that prevent one from using them to draw correct inference; moreover, practitioners cannot know the adequate normalization a priori. However, in some cases [52-54] the asymptotic distributions are important because of their relevance for practitioners (Valkanov suggests a T^{1/2} normalization of the t-ratio and the simulation of the distribution in the context of long-horizon regressions; Sun proposes a consistent t-statistic with a nuisance-free asymptotic distribution; and Kim, Lee and Newbold emphasize the fact that, even if the t-ratios of their regression do not diverge, inference cannot be drawn because of the nuisance parameters).

Yule's Experiment
Spurious correlation was evidenced by Yule [27] in a computerless Monte Carlo experiment. By shuffling decks of playing cards, Yule obtained independent series of random numbers. In fact, Yule generated independent I(0), I(1), and I(2) series and computed correlation coefficients amongst them. Such correlation coefficients provided correct inference when using I(0) series, but became nonsensical when the order of integration of the variables was higher. With independent I(0) variables, the correlation coefficient remained close to zero. The same estimate computed with I(1) independent variables no longer worked; many times it was close to unity, resulting in what we now refer to as a spurious correlation. If the variables were I(2), the most probable outcome was a correlation coefficient close to unity in absolute value: the spurious phenomenon was even stronger.
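Yule's card-shuffling experiment translates directly into a modern simulation. The following sketch is ours (series length, replication count, and seed are arbitrary); it compares the typical size of the sample correlation for independent I(0), I(1), and I(2) series:

```python
import numpy as np

rng = np.random.default_rng(7)
T, reps = 100, 1000
mean_abs_corr = {}
for d in (0, 1, 2):
    r = []
    for _ in range(reps):
        a = rng.standard_normal(T)
        b = rng.standard_normal(T)
        for _ in range(d):                  # integrate d times: I(0) -> I(1) -> I(2)
            a, b = np.cumsum(a), np.cumsum(b)
        r.append(abs(np.corrcoef(a, b)[0, 1]))
    mean_abs_corr[d] = np.mean(r)
print(mean_abs_corr)
```

The mean absolute correlation stays near zero in the I(0) case but grows markedly with the order of integration, mirroring Yule's findings.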

Reappraisal in the Seventies: Spurious Least Squares
Throughout most of the past century, it was commonly recommended to first-difference the series if they seemed to have a trending mechanism. However, not every econometrician was in agreement with such a method because, it was argued, differencing causes losses in the information contained in the original series. A profound reappraisal of this issue began with Granger and Newbold's [1] article, which allowed the spurious regression in Least Squares estimates to be identified. The Monte Carlo experiment described above revealed, among other things, that a high R² and a low Durbin-Watson statistic (hereinafter DW) should be considered as a sign of misspecification. They also pointed out, comparing the outcome of simulation with many results in applied econometrics, that problems of misspecification seemed to be widespread. It was proposed that first-differenced series be used, although the authors warned about the risks of catch-all solutions; their results may be considered as the seed of many fruitful extensions in time-series econometrics. Phillips [6] proposed the theoretical framework necessary to allow an understanding of Granger and Newbold's earlier results and provided a first insight into the phenomenon of spurious regression; his development set the groundwork for the spurious regression literature in econometrics. Whilst Granger and Newbold used i.i.d. noise in their simulations, Phillips allowed a flexible autocorrelation structure, as well as some degree of heterogeneity. He then proved that, when specification (1) is estimated, the asymptotics in Table 2 are obtained, where O_p(m) denotes the order in probability and →_p denotes convergence in probability. The most relevant results are that the R² does not collapse and that the t-statistic associated with β diverges at a rate T^{1/2}. This means that the t-statistic will exceed the classical 5%-level critical values (−1.96 and 1.96) as the sample size grows and, with a sufficiently large sample, the null hypothesis of the t-statistic, H_0: β = 0, will be rejected with certainty.
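The T^{1/2} divergence is easy to visualize by simulation. In the sketch below (our illustration; sample sizes, replication count, and seed are arbitrary), the median |t_β| from regressions of independent random walks grows roughly in proportion to the square root of the sample size:

```python
import numpy as np

def slope_tstat(y, x):
    """t-statistic of the slope in y = a + b*x + u, classical standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    s2 = u @ u / (T - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(1)
reps = 300
med = {}
for T in (50, 200, 800):
    t = [abs(slope_tstat(np.cumsum(rng.standard_normal(T)),
                         np.cumsum(rng.standard_normal(T))))
         for _ in range(reps)]
    med[T] = np.median(t)          # median |t| keeps growing with T
print(med)
```

No fixed critical value can survive a statistic that drifts upward with T, which is exactly why the rejection becomes certain in large samples.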

Theory at Last: Asymptotics in Nonsensical Regressions
Table 3: Orders in probability: variables y_t and x_t both independently generated by DGP 9. Both variables are integrated of the same order.
Park and Phillips [55] demonstrated the presence of spurious regression when the independent variables are I(2). Marmol [56] proved that the phenomenon of spurious regression occurs in a more general nonstationary framework; he demonstrated that spurious regression occurs when the independent series x_t, y_t are integrated of order d for d ∈ N = {1, 2, . . .} (see DGP 9). Empirically relevant DGPs, such as the I(2) process (identified, as mentioned earlier, with uncontrolled inflation), can produce spurious regression. The asymptotic results are somewhat similar to those of Phillips (see Table 3): although the parameter estimate of the constant term, α, diverges at a different rate, the t-statistic associated with β diverges at the same rate as in Phillips's seminal work, that is, O_p(T^{1/2}).
One limitation of Marmol's [56] results is that both the dependent and the independent variables share the same order of integration. Banerjee et al. [57] provided Monte Carlo evidence of spurious regression when the order of integration of each variable is different, and Marmol extended these results: indeed, spurious regression persists in the presence of the so-called unbalanced regressions, and the rate of divergence of t_β remains T^{1/2}. De Jong [59] extended the study of spurious regression using independent driftless unit-root processes; he used DGP 6 to generate the series and ran a specification that operates with a logarithmic transformation of both variables, a direction which has proved extremely relevant for empirical purposes (applying the logarithmic transformation in econometrics is typical when the practitioner wants either to homogenize the variance or to obtain direct estimates of the average elasticity amongst variables). A regression limit theory has also been developed for nonstationary panel data with large numbers of cross-section and time-series observations, and Kao [62] studied the Least-Squares Dummy Variable (LSDV) estimator, for which the spurious regression phenomenon is still present when independent nonstationary variables are generated by DGPs 6 and 7. When the unit root includes a drift (DGP 7), amongst the most relevant consequences of such drift is the fact that there is not only a stochastic trend but also a deterministic one; in the long run, the deterministic trend dominates the stochastic one (see Appendix C). The asymptotic results of estimating (1) using independent variables generated by DGP 7 are given in Table 6; note that t_β grows at rate T instead of T^{1/2}, contrary to the results presented so far, due to the presence of the deterministic trend.
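Unbalanced regressions are equally easy to illustrate. The sketch below is ours (d = 2 for y and d = 1 for x; sample size, replication count, and seed are arbitrary); it regresses an independent I(2) series on an I(1) series:

```python
import numpy as np

def slope_tstat(y, x):
    """t-statistic of the slope in y = a + b*x + u, classical standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    s2 = u @ u / (T - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(2)
T, reps = 200, 500
rej = 0
for _ in range(reps):
    y = np.cumsum(np.cumsum(rng.standard_normal(T)))  # I(2): doubly integrated noise
    x = np.cumsum(rng.standard_normal(T))             # I(1): driftless random walk
    if abs(slope_tstat(y, x)) > 1.96:
        rej += 1
rate = rej / reps
print(f"unbalanced regression, 5%-level rejection rate: {rate:.2f}")
```

The orders of integration differ, yet the t-test still spuriously rejects far more often than 5%.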

Spurious Regression and Long Memory: An Unforgettable Extension
Bhattacharya et al. [5], among the first papers to deal with spurious inference in econometrics, proved that the R/S test yields spurious inference concerning long memory when the data have a deterministic trend mechanism (see Section 5). The first studies of spurious regression using long-memory processes are those of Cappuccio and Lubian [63] and Marmol [64]. The authors use the nonstationary fractionally integrated processes specified in DGP 11. Under these conditions, the asymptotics of an LS regression as specified in expression (1) are given in Table 7 (here the underlying theory is an invariance principle; see Davydov, 1970, Theorem 2, and Sowell, 1990, Theorem 1). Note that, as in most of the previous cases, the t-statistic of β diverges at rate T^{1/2}. The main difference lies in the DW, the rate of divergence of which varies according to the degree of long memory, measured by d. These results can be understood as an argument against the usefulness of the "rule of thumb": a regression was usually considered spurious when R² > DW; however, when long memory is present in the variables, the regression may well be spurious even if R² < DW. It is worth mentioning that the fractional integration parameter, d, is the same in both variables. This "shortcoming" is fixed by Tsay and Chung [65], who used a manifold approach: they used two DGPs (Nos. 10 and 11) and then combined them. We can summarize their results by stating that t_β = O_p(T^r), where 0 < r < 1, which means that the t-ratio always diverges; the divergence rate r depends on the memory parameters of the two series (cases (1)-(6)). Special attention should be given to the Durbin-Watson statistic, which collapses to zero as usual, except in the second case, where DW = O_p(1) and hence does not converge to zero as it does elsewhere. This can also be interpreted as an important indication of the need for caution when using the rule of thumb, which states that when R² > DW there may be spurious regression. As in Cappuccio and Lubian [63], there is evidence that the rule of thumb may be a dangerous tool. Another
important result appears when we compare model (1) and model (2). The reader may notice that the variables used in the second model are merely the first differences of those used in the first. What is surprising is that the spurious regression persists after the variables have been differenced (the same argument was advanced by Marmol [64]). It should be mentioned that the alternative strategy, "detrending" (removing the trend component in the data by running a regression on time), was also known to provide spurious results [10, 66]. This goes against fifty years of tradition (differencing is used to deal with spurious regression; see Dickey and Pantula [67], e.g.). Tsay and Chung [65] actually go further by suggesting that the spurious phenomenon is due to the long-memory properties of the series and not to the presence of unit roots (see also Cappuccio and Lubian [63]). Sun [68] shows that spurious regression can occur between two stationary generalized fractional processes, as long as their generalized fractional differencing parameters sum to a value greater than 1/2 and their spectral densities have poles at the same location.

Spurious Regression with Stationary Series: Size Matters
Granger et al. [12] show, both through Monte Carlo experiments and theoretically, that the spurious regression phenomenon may occur amongst independent stationary series (see also Mikosch and Stǎricǎ [69]). Two independent AR(1) series (see DGP 2, with autoregressive parameter ρ_w) were run together to estimate regression (1), resulting in a rejection rate of the t-statistic greater than the expected 5%. Granger et al. [12] provide a theoretical proof of why this happens: although the variance of the estimates does not diverge, it may not be unity, depending on the values of the DGP parameters (the DGP innovations may be drawn from distributions other than the normal, such as the Cauchy, exponential, or Laplace, although these remain i.i.d. and independent of each other). These results are extended to long-span MA processes (see DGP 1). Curiously, here the spurious regression does not depend on the sample size, but rather on the MA span parameter, q_w. What is interesting about this result is the fact that it better explains the spurious regression phenomenon in small samples. This completes the theoretical framework necessary to understand the Monte Carlo results of Granger and Newbold [1] and Ferson et al. [30].
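A sketch of the stationary case (ours; ρ = 0.9, the sample sizes, and the seed are arbitrary illustrative choices) shows both the size distortion and its rough insensitivity to the sample size:

```python
import numpy as np

def slope_tstat(y, x):
    """t-statistic of the slope in y = a + b*x + u, classical standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    s2 = u @ u / (T - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])

def ar1(rng, T, rho):
    """Stationary AR(1) series with a draw from the stationary distribution as start."""
    y = np.empty(T)
    y[0] = rng.standard_normal() / np.sqrt(1 - rho**2)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + rng.standard_normal()
    return y

rng = np.random.default_rng(3)
reps, rho = 500, 0.9
rate = {}
for T in (100, 1000):
    rej = sum(abs(slope_tstat(ar1(rng, T, rho), ar1(rng, T, rho))) > 1.96
              for _ in range(reps))
    rate[T] = rej / reps
print(rate)
```

The rejection rate is well above the nominal 5% at both sample sizes: here the problem is a size distortion, not divergence with T.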
Granger et al.'s [12] results differ from all the others in that the effect being discussed is not an asymptotic phenomenon but rather a size distortion. Size distortions arise because standardizing a test statistic is difficult unless the exact form of the spectral density of the residuals is known. The intuition behind the spurious regression phenomenon here is thus different from the one that underlies all the other results.
Mikosch and de Vries [70] provide an alternative theory to explain spurious-type behavior akin to the financial risk measurement literature (Furman and Zitikis [71, page 7] suggest that "the validity of the CAPM (Capital Asset Pricing Model) is closely related to the linearity of the regression function of the return on asset i given the return on the market portfolio of all assets"; nonlinearities such as mean or variance shifts may induce the spurious regression phenomenon, which implies that CAPM beta regressions should be estimated cautiously). In the words of Mikosch and de Vries [70]: "Estimators of the coefficients in equations of regression type which involve financial data are often found to vary considerably across different samples. This observation pertains to finance models like the CAPM beta regression, the forward premium and the yield curve regression. In economics, macro models like the monetary model of the foreign exchange rate often yield regression coefficients which significantly deviate from the unitary coefficient on money which is based on the theoretical assumption that money is neutral." Mikosch and de Vries [70] prove that, when the distribution of the innovations is heavy-tailed, that is, when there is a departure from the normality assumption usually made, the use of standard statistical tools (such as LS) can be misleading (the non-normal behavior of financial data has been widely documented; see Fama [72]).

The Last Newcomer: Trend Stationarity
Hassler [73] studied the spurious regression phenomenon from a different perspective. He considered the possibility of spurious regression when the variables have a deterministic trend component, what we earlier defined as trend stationarity (DGP 4). Thus, there are two independent nonstationary variables. Hassler's [73] results, as with those of Kim et al. [52], who worked with the same DGPs, are as in Table 8.
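A minimal sketch of the trend stationary case (ours; the intercepts, the trend slopes 0.05 and 0.03, and the noise scale are arbitrary illustrative values):

```python
import numpy as np

def slope_tstat(y, x):
    """t-statistic of the slope in y = a + b*x + u, classical standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    s2 = u @ u / (T - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(4)
T, reps = 200, 500
time = np.arange(T)
rej = 0
for _ in range(reps):
    y = 1.0 + 0.05 * time + rng.standard_normal(T)  # independent trend stationary
    x = 2.0 + 0.03 * time + rng.standard_normal(T)  # series, in the style of DGP 4
    if abs(slope_tstat(y, x)) > 1.96:
        rej += 1
rate = rej / reps
print(f"two independent trend stationary series: rejection rate {rate:.2f}")
```

The two deterministic trends dominate the noise, so the t-test declares a relationship essentially every time, even though the stochastic parts are independent.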
In addition, Kim et al. [52] proved that when one of the deterministic trend components is taken out (i.e., either β_y = 0 and β_x ≠ 0, or vice versa), the t-ratios converge to a centered normal distribution, although its variance is not unity, which may provoke spurious rejection of the null hypothesis that the parameter is zero. That means that running a trend stationary variable (DGP 4) on a mean stationary variable (DGP 2), or vice versa, lessens the "spuriosity" in the regression. The authors also generated independent series with DGP 4, as did Entorf [60], although they ran a different specification. The asymptotics provided in Kim et al. [74] only concern β and t_β; we computed the orders in probability of the other two parameters, given their relevance, and we include in Appendix D a guide on how to obtain these asymptotics (Table 9). It should be noted that the t-ratio converges to a normal distribution; however, since the variance is not unity, spurious regression may well still be present.
A relevant extension of Hassler [73] and Kim et al. [52] was provided by Noriega and Ventosa-Santaulária [75], in which possibly multiple structural breaks are added to the trend stationary DGP, that is, DGP 5. Adding breaks to the specification of the DGPs makes the divergence rate of the t-statistic associated with β return to the usual T^{1/2} "norm". They also proved that adding a deterministic trend to the regression specification (see (4.2)) does not prevent the phenomenon of spurious regression; t_δ remains O_p(T^{1/2}). In an effort to unify the related literature, Noriega and Ventosa-Santaulária [76] filled a number of gaps until then unaddressed. They studied several previously overlooked combinations of DGPs and summarized most of the other results. DGPs 2-9 may indistinctly generate x and/or y (obviously, not every combination was carried out by Noriega and Ventosa-Santaulária [76]; besides those presented previously, the following combinations were studied by Hassler [77, 78]: DGP 6-DGP 2, DGP 2-DGP 3, DGP 3-DGP 2, and DGP 3-DGP 3). In fact, amongst the new possible combinations, the divergence rate of t_β usually remains T^{1/2}. The combination of a trend stationary process and an I(1)-plus-drift process stands out, since the divergence rate of the t-ratio is T rather than T^{1/2}, as in Kim et al. [52], although this result should have been anticipated given the asymptotic dominance of the deterministic trend over the stochastic one. This combination is thus clearly linked to Entorf's [60] results.
Granger et al. [12] showed that the use of a heteroskedasticity and autocorrelation consistent (HAC) covariance matrix diminishes size distortions in some cases, but their Monte Carlo evidence also showed that this is true only when the sample size is large (greater than 500). The effectiveness of HAC standard errors (see Newey and West [79]) when the DGPs are stationary or have a deterministic trend is not so obvious. We performed a simple simulation to support this point.
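To see what such a simulation might look like (our own sketch, not the authors' code; ρ = 0.9, a Bartlett bandwidth of 20, and the sample size are arbitrary choices), compare classical and Newey-West HAC t-tests on independent AR(1) series:

```python
import numpy as np

def tstats(y, x, L):
    """Classical and Newey-West (Bartlett kernel, bandwidth L) slope t-statistics."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    Q = X.T @ X
    t_cls = b[1] / np.sqrt((u @ u / (T - 2)) * np.linalg.inv(Q)[1, 1])
    Xu = X * u[:, None]                         # moment contributions x_t * u_t
    S = Xu.T @ Xu / T
    for l in range(1, L + 1):
        G = Xu[l:].T @ Xu[:-l] / T
        S += (1 - l / (L + 1)) * (G + G.T)      # Bartlett weights
    Qn = Q / T
    V = np.linalg.inv(Qn) @ S @ np.linalg.inv(Qn) / T
    t_hac = b[1] / np.sqrt(V[1, 1])
    return t_cls, t_hac

def ar1(rng, T, rho):
    """Stationary AR(1) series with a stationary start value."""
    y = np.empty(T)
    y[0] = rng.standard_normal() / np.sqrt(1 - rho**2)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + rng.standard_normal()
    return y

rng = np.random.default_rng(5)
T, reps, rho = 1000, 300, 0.9
rej_cls = rej_hac = 0
for _ in range(reps):
    tc, th = tstats(ar1(rng, T, rho), ar1(rng, T, rho), L=20)
    rej_cls += abs(tc) > 1.96
    rej_hac += abs(th) > 1.96
rate_cls, rate_hac = rej_cls / reps, rej_hac / reps
print(f"classical: {rate_cls:.2f}, HAC: {rate_hac:.2f}")
```

At this large sample size the HAC-based test brings the rejection rate down toward the nominal level, consistent with the pattern reported by Granger et al.; at small T the correction is far less reliable.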

Next of Kin: Statistical Tests, Long Horizon, and Instrumental Variables
Not all work on the spurious phenomenon has been monopolized by the simple LS estimator. Stochastic trending mechanisms have been used in most of the studies, although some exceptions are presented at the end of this section. Equally relevant are Instrumental Variables (IV) estimates, which have also been analyzed [80].
Giles [81] studied two important residual tests: the Jarque-Bera test (hereinafter JB; see Jarque and Bera [82]) and the Breusch-Pagan-Godfrey test [83, 84], used to test for normality and for autocorrelation/heteroskedasticity in the residuals of an LS regression, respectively. Using independent driftless random walks (DGP 6), he proved that both statistics diverge at rate T: the null hypothesis of normality or of serial independence/homoskedasticity will eventually be rejected, given a large enough sample (these residuals are neither normal nor serially independent nor homoskedastic, so the statistical tests provide appropriate inference by rejecting the null). Furthermore, Ventosa-Santaulária and Vera-Valdés [85] studied the behavior of the classical Granger-causality test (see Granger [86]; hereinafter GC); it is proved that the classical GC test fails to accept the null hypothesis of no GC between independent broken-trend (DGP 5) or broken-mean (DGP 3) processes, whether the series are differenced or not. Several specification tests have also been studied by way of finite-sample experiments using independent driftless random walks (see DGP 6) [93], together with the asymptotics of the RESET and Keenan tests; both test statistics are O_p(T), so the null of linearity is eventually rejected as the sample size grows. In the same vein, Noriega and Ventosa-Santaulária [94] proved that, when the variables are generated independently by any combination of DGPs (3), (5), (7), and (8), the Engle-Granger cointegration test spuriously rejects the null of no cointegration in an indeterminate number of cases, since the relevant t-statistic is O_p(T^{1/2}): the t-statistic may diverge toward +∞ or −∞, the sign depending on the unknown values of the DGP parameters, and Monte Carlo evidence suggests that in many cases the sign is negative (there is, of course, an important exception: if both variables are independent I(1) processes, the Engle-Granger test tends to accept the null of no cointegration).
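Giles's divergence result for the JB statistic can be illustrated as follows (our sketch; the JB statistic is computed from the standard skewness-kurtosis formula, and the sample sizes, replication count, and seed are arbitrary):

```python
import numpy as np

def jarque_bera(u):
    """JB = T/6 * (S^2 + (K - 3)^2 / 4), with S and K the sample skewness and kurtosis."""
    T = len(u)
    z = u - u.mean()
    s2 = (z**2).mean()
    S = (z**3).mean() / s2**1.5
    K = (z**4).mean() / s2**2
    return T / 6 * (S**2 + (K - 3) ** 2 / 4)

def spurious_residuals(rng, T):
    """LS residuals from regressing one driftless random walk on another (DGP 6)."""
    y = np.cumsum(rng.standard_normal(T))
    x = np.cumsum(rng.standard_normal(T))
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

rng = np.random.default_rng(6)
reps = 300
med_jb = {T: np.median([jarque_bera(spurious_residuals(rng, T))
                        for _ in range(reps)])
          for T in (100, 800)}
print(med_jb)
```

The median JB statistic grows with the sample size, so the normality null is eventually rejected; as noted above, this rejection is appropriate, since the residuals of a spurious regression are genuinely non-normal.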
Spurious regression has also been identified in the context of long-horizon (henceforth LH) regressions, which are used in situations where previous "short-term" studies (that is, studies using the variables in levels, without any kind of temporal aggregation) have failed. In fact, Valkanov's [95] insight is that rolling summations of I(0) variables, that is, LH variables, behave asymptotically as I(1) series; hence, the theory of spurious regression between independent I(1) variables, discussed earlier, should give readers the intuition. Late-eighties studies provided interesting results in economics and finance. As asserted by Valkanov [95]: "The results . . . are based on long-horizon variables, where a long-horizon variable is obtained as a rolling sum of the original series. It is heuristically argued that long-run regressions produce more accurate results by strengthening the signal coming from the data, while eliminating the noise." More precisely, if the specification to be estimated is (1), a long-horizon "reinterpretation" is carried out by building partially aggregated variables. As usual, let w = x, y; then w_t^k = Σ_{j=0}^{k−1} w_{t+j}. The LH regression specification can usually be one of the following:

y_t^k = α + β x_t + u_t,    y_t^k = α + β x_t^k + u_t.    (4.3)
Such specifications are used mostly in the estimation of the equity/dividend relationship and to test the neutrality of money or the Fisher effect. Based upon the previous specifications, Valkanov [95] and Moon et al. [54] proved that this regression strategy also exhibits the spurious regression phenomenon. To do so, they let the overlap in the summations grow as a fixed fraction of the sample size, k = λT, where λ ∈ (0, 1). Valkanov [95] then defines the following DGP, where (1 − L)φ(L)x_t = u_xt: the variable x_t follows an autoregressive process whose highest root is unity, whilst the remaining roots, represented in the polynomial φ(L), make it invertible. Let ω_t = (u_yt, u_xt)′ be a martingale difference sequence with covariance matrix [σ²_11, σ_12; σ_21, σ²_22] and finite fourth moments. We could argue that, when β = 0, the t-ratio associated with its estimate should be small enough for the null hypothesis to be accepted.
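A minimal simulation conveys the problem. The sketch below (numpy; taking y to be i.i.d. noise, setting φ(L) = 1, choosing λ = 0.25, and the seed are our own simplifications of the DGP above) regresses the rolling sum y_t^k on a unit-root regressor with k = λT, so the null β = 0 is true by construction:

```python
import numpy as np

def lh_tratio(T, lam, rng):
    """t-ratio on beta in the LH regression of y^k on x, with k = lam*T.
    y is i.i.d. noise (the null beta = 0 holds); x has a unit root."""
    k = max(1, int(lam * T))
    y = rng.standard_normal(T)
    x = np.cumsum(rng.standard_normal(T))
    c = np.concatenate(([0.0], np.cumsum(y)))
    yk = c[k:] - c[:-k]               # rolling sums y_t + ... + y_{t+k-1}
    xk = x[: T - k + 1]
    X = np.column_stack([np.ones_like(xk), xk])
    b, *_ = np.linalg.lstsq(X, yk, rcond=None)
    e = yk - X @ b
    s2 = e @ e / (xk.size - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

rng = np.random.default_rng(1)
t_small, t_big = lh_tratio(200, 0.25, rng), lh_tratio(2000, 0.25, rng)
# |t| tends to grow like T^(1/2); the rescaled ratio t/sqrt(T) stays bounded
print(t_small, t_big, t_big / np.sqrt(2000))
```

Repeating the experiment over many seeds shows |t| drifting well past 1.96 as T grows, which is the spurious-regression symptom discussed next.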

Journal of Probability and Statistics
As in previous results, that does not happen: t_β = O_p(T^{1/2}). Valkanov [95] and Moon et al. [54] suggest the use of a rescaled t-ratio, t_β/T^{1/2}. Although the limiting distribution of this t-statistic is neither normal nor pivotal, it can easily be simulated and hence used. Lee [96] extends the results to fractionally integrated processes and finds that the t-ratio associated with β in (3) also diverges. Ventosa-Santaulária [80] studies the asymptotics of Instrumental Variables (IV) estimates using DGPs 5 and 7, not only for x and y, but also for a spurious and independent instrument. It is shown that the t-ratio still diverges at rate T^{1/2}, which confirms the Monte Carlo simulations carried out by Leybourne and Newbold [97]. This result can also be seen as a complement to those presented by Phillips and Hansen [98] and Hansen and Phillips [99], in which the use of spurious instruments is proposed to improve the estimates whenever strong endogeneity is present between x and the residual term in cointegrated relationships.
Sun [53] developed a convergent t-statistic that corrects the spurious regression phenomenon. The new t-ratio is based on an estimate of the parameter variance computed in the same manner as HAC standard errors; it converges to a nondegenerate limiting distribution in many cases of spurious regression. He considered the regression between two independent nonstationary I(d) processes with d > 1/2, as well as the regression between an independent nonstationary I(d) process and a linear trend (see (4.5)). In previous studies, the presence of unit roots in the series generally produced a T^{1/2}-divergent t-ratio. To avoid such divergence, Sun proposes rescaling the parameter estimate using a new standard error, computed in the same way as a HAC estimate. The main difference is that HAC estimates usually require a bandwidth or truncation lag, whereas Sun suggests using the entire sample length: the standard error weights all sample autocovariances of the relevant score series with a kernel function κ that belongs to a class ensuring positive definiteness. Whether x_t ∼ I(d) or x_t = t, well-defined asymptotic distributions for the t-ratio are provided.
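A rough numerical illustration of the idea follows (this is not Sun's exact estimator; we use a Bartlett kernel, which guarantees a nonnegative variance estimate, and simply compare a conventional truncation lag with a bandwidth equal to the sample size; the DGP and seed are illustrative):

```python
import numpy as np

def bartlett_lrv(v, M):
    """Bartlett-kernel long-run variance of v with truncation lag M
    (the Bartlett weights keep the estimate nonnegative)."""
    T = v.size
    v = v - v.mean()
    lrv = v @ v / T
    for j in range(1, min(M, T - 1) + 1):
        lrv += 2.0 * (1.0 - j / (M + 1.0)) * (v[j:] @ v[:-j]) / T
    return lrv

rng = np.random.default_rng(2)
T = 500
y = np.cumsum(rng.standard_normal(T))         # two independent random walks
x = np.cumsum(rng.standard_normal(T))
X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ b
v = (x - x.mean()) * u                        # score series driving var(beta_hat)
sxx = ((x - x.mean()) ** 2).sum()
se_conv = np.sqrt(T * bartlett_lrv(v, int(4 * (T / 100) ** 0.25))) / sxx
se_full = np.sqrt(T * bartlett_lrv(v, T - 1)) / sxx   # bandwidth = sample size
print(b[1] / se_conv, b[1] / se_full)
```

With the full-sample bandwidth, the persistence of the score series inflates the standard error, taming the t-ratio in the spirit of Sun's correction.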

What to do if One Fears Spurious Regression
The spurious regression phenomenon pervades many subfields of time series analysis. It might be controlled by using correctly scaled t-ratios, as suggested earlier, but a clear idea of which DGP best emulates the properties of the series (this could be labeled "DGP-ification") would still be necessary. Examples of this are: (i) evidence of a UR must be obtained before a cointegration analysis is undertaken; (ii) the nature of the trending mechanism should be identified prior to the application of a transformation intended to render the series stationary; and (iii) a test such as Robinson's [100] could be undertaken in order to identify long-memory behavior. This can be achieved by applying a battery of tests to our series. Such an approach is not free of failures: many statistical tests are known to yield spurious evidence under specific circumstances (see the previous subsection). Nevertheless, pretesting the series remains an adequate strategy and allows practitioners to be aware of the potential difficulties they could face. In this section we include a short, and incomplete, list of tests that are employed in the DGP-ification of the series:

Concluding Remarks
Spurious regression can arise whenever a trending mechanism is present in the data; even some stationary autocorrelated processes cause spurious results. Applied macroeconomists and financial experts have been steadily incorporating technical advances in the analysis of spurious regression, a phenomenon identified for many empirically relevant data generating processes. These include stationary processes with an AR or long MA structure and/or level breaks; random walks (with or without drift); trend stationarity (with possible level and trend breaks); long-memory processes (whether stationary or not), and so on. These processes have been associated with unemployment rates, price levels, real exchange rates, monetary aggregates, gross domestic product, and various financial variables. The use of least squares with such variables entails a high risk of obtaining a spurious relationship.
Differencing the series may not always prevent spurious estimates, nor should the R² > DW rule of thumb be seen as an adequate device to identify a spurious regression. Cointegration analysis appears better suited to preventing nonsensical statistical relationships, although one should bear in mind Phillips's [20] results and study further the statistical relationship at hand. Out-of-sample forecast evaluation could be an option. Most macroeconomic variables are either nonstationary or very persistent. Pretesting the variables in order to identify the nature of the trending mechanism thus arises as the golden rule to avoid nonsense regression: once the DGP is correctly identified, spurious regression is "easier" to deal with.
Attaining a clear understanding of any problem is the first step toward finding its solution.

A. Spurious Regression Using Independent UR Processes
The theoretical explanation of the spurious regression phenomenon documented in Granger and Newbold [1] was provided by Phillips [6]. We present here a simple version of Phillips's [6] results. Assume two independent nonstationary unit-root processes, y_t = y_{t−1} + u_yt and x_t = x_{t−1} + u_xt, where, for simplicity, u_zt ∼ N(0, σ²_z) for z = x, y, and u_xt ⊥ u_yt. By solving both equations recursively we obtain y_t = Y_0 + Σ_{i=1}^t u_yi and x_t = X_0 + Σ_{i=1}^t u_xi, where X_0 and Y_0 are initial conditions. It can be proved (see, e.g., Hamilton [133, pages 486 and 548]) that T^{−1/2} z_[Tr] →_d σ_z ω_z(r), where ω_z(r) is a standard Brownian motion and →_d denotes convergence in law. Therefore (assuming z_0 = 0): T^{−3/2} Σ z_t →_d σ_z ∫₀¹ ω_z(r) dr, T^{−2} Σ z_t² →_d σ_z² ∫₀¹ ω_z(r)² dr, and T^{−2} Σ x_t y_t →_d σ_x σ_y ∫₀¹ ω_x(r) ω_y(r) dr. The LS formula for β̂ is β̂ = [Σ x_t y_t − T^{−1}(Σ x_t)(Σ y_t)] / [Σ x_t² − T^{−1}(Σ x_t)²]. (A.5) Replacing the sums that appear in this formula with the above asymptotic expressions and letting T → ∞ yields (A.6). The same can be done with the variance of the regression as well as with the t-ratio associated with β̂:

A.7
Note that the t-ratio diverges as the sample size grows. The usual critical values to test the null hypothesis β = 0 are ±1.96 (5% level), so it is straightforward to see that the null hypothesis will always be rejected for a sufficiently large sample: this is a simple example of spurious regression.
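This divergence is easy to reproduce numerically. The following sketch (numpy; the replication count, sample sizes, and seed are illustrative choices) estimates the rejection rate of the nominal 5% t-test across independent random-walk pairs:

```python
import numpy as np

def spurious_t(T, rng):
    """t-ratio on beta from LS of one driftless random walk on another,
    independent one (so the true beta is zero)."""
    y = np.cumsum(rng.standard_normal(T))
    x = np.cumsum(rng.standard_normal(T))
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    s2 = e @ e / (T - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

rng = np.random.default_rng(4)
reps = 200
for T in (50, 500):
    rej = np.mean([abs(spurious_t(T, rng)) > 1.96 for _ in range(reps)])
    print(T, rej)  # the rejection rate rises toward 1 as T grows
```

This mirrors the Granger-Newbold finding: even at T = 50 the nominal 5% test rejects in a large majority of replications.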

C. Dominance of the Deterministic Trend Over the Stochastic Trend
In order to explain the dominance of the deterministic trend over the stochastic trend, we may study the sums of a UR with drift. Assume x_t is generated by DGP 7: x_t = μ_x + x_{t−1} + u_xt. Solving this recursively we get x_t = x_0 + μ_x t + ξ_xt, where the first term is an initial condition, the second term represents the deterministic trend, and the third term accounts for the stochastic trend; note that ξ_xt = Σ_{i=1}^t u_xi. Then (all sums run from 1 to T) the orders in probability of the underbraced sums can be found in Hamilton [133, Chapters 16 and 17]. Note that the leading term in expressions (C.2) and (C.3) is always the deterministic component (T² in Σ x_{t−1} and T³ in Σ x²_{t−1}). This is why it is said that the time trend asymptotically dominates the stochastic components.
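The dominance is visible even in a single simulated path. Under DGP 7 with x_0 = 0, the ratios Σx_t/T² and Σx_t²/T³ should settle near μ_x/2 and μ_x²/3, respectively (with μ_x = 1, near 0.5 and 1/3); the sketch below checks this (numpy; the parameter values are illustrative):

```python
import numpy as np

# DGP 7: x_t = mu_x + x_{t-1} + u_xt, so x_t = x_0 + mu_x*t + xi_xt.
# The deterministic trend dominates: T^-2 * sum(x_t) -> mu_x/2 and
# T^-3 * sum(x_t^2) -> mu_x^2/3, whatever the stochastic component does.
rng = np.random.default_rng(5)
mu_x, T = 1.0, 20000
x = np.cumsum(mu_x + rng.standard_normal(T))   # random walk with drift, x_0 = 0
print(x.sum() / T ** 2, (x ** 2).sum() / T ** 3)  # approx 0.5 and 0.333
```

The stochastic component contributes only lower-order fluctuations, of relative size O_p(T^{−1/2}) here, which is why it vanishes from the leading terms.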

D. Asymptotics of LS Estimates of Specification 2
We present a guide as to how to obtain the orders in probability of the estimates and their associated t-ratios using LS when the variables y and x are generated by DGP 4 and specification (2) is estimated. The proof was obtained with the aid of Mathematica 4.1 software. We use classical LS. Let Θ = (α, β, δ)′:

D.4
The orders in probability of the underbraced expressions can be found in Hamilton [133, Chapter 16]. We can fill the previously cited matrices and then compute the LS parameter estimates and the t-statistic associated with each one. The asymptotics are computed by the program available at the following URL: http://www.ventosasantaularia.com/JPSappD.pdf.
Some examples can be found in Perron and Vogelsang [39], Wu [40], Wu et al. [41], and D'Adda and Scorcu [42]. Trend stationarity and I(2) processes (DGPs 4 and 9, resp.) have been used to model growing variables, real and nominal, such as output, consumption, and prices; several macro variables have been conceived as DGPs 5, 6, and 7 [9, 43-45]. Variables identified as I(2) processes can also be found in Juselius [46, 47], Haldrup [48], Muscatelli and Spinelli [49], Coenen and Vega [50], and Nielsen [51].

β̂ →_d (σ_y/σ_x) [∫₀¹ ω_x(r)ω_y(r) dr − ∫₀¹ ω_x(r) dr ∫₀¹ ω_y(r) dr] / [∫₀¹ ω_x(r)² dr − (∫₀¹ ω_x(r) dr)²] = O_p(1).
. . . and Anderson [14], Davidson et al. [2], Hendry and Mizon [3], and Hendry et al. [15], and the entire theory of cointegration, first proposed by Granger [16], Granger and Weiss [17], and Engle and Granger [18] (the linking of cointegration with error-correction models can also be found in these articles).
Table 1 provides a summary of those appearing in this survey, where u_wt are independent innovations obeying Phillips [6, Assumption 1], ε_wt is an i.i.d. white noise with mean zero and variance σ², and DU_iwt, DT_iwt are dummy variables allowing changes in the trend's level and slope, respectively; that is, DU_iwt = 1(t > T_b^iw) and DT_iwt = (t − T_b^iw) 1(t > T_b^iw), where 1(·) is the indicator function and T_b^iw is the unknown date of the ith break in w. We denote the break fraction as λ_iw.

Table 1 :
The DGPs for w_t = x_t, y_t, z_t. Note: TS, br, and dr stand for trend stationarity, breaks, and drift, respectively.
w_t = u_wt, with (1 − L)^d u_wt = ε_wt for d ∈ [0, 3/2)

Table 2 :
Orders in probability: variables y_t and x_t both independently generated by DGP 6.

Table 4 :
Marmol [58] later allowed the relevant DGPs for both y and x to be integrated of different orders: y_t ∼ I(d_1) and x_t ∼ I(d_2), where d_1, d_2 ∈ N = {1, 2, . . .}. Marmol's [58] results are twofold, depending on whether d_1 > d_2 or vice versa: (i) when d_1 > d_2; (ii) when d_1 < d_2.

Table 5 :
(iii) in both cases, R² = O_p(1).
Entorf's results apply to fixed-effects models where N, the dimension of the cross-section, is treated as fixed. The phenomenon of spurious regression in panel data under more general conditions has been further developed; Phillips and Moon [61] provided

Table 4 :
Orders in probability: variables y_t and x_t both independently generated by DGP 9. The variables are integrated of different orders, d_1 ≠ d_2.

Table 5 :
Orders in probability: variables y_t and x_t both independently generated by DGP 9. The variables are integrated of different orders.

Table 6 :
Orders in probability: variables y_t and x_t both independently generated by DGP 7.

Table 7 :
Orders in probability: variables y_t and x_t both independently generated by DGP 11; the variables are integrated of the same order. There are thus four fractionally integrated processes: two stationary, FI(d_1) and FI(d_2) with d_i ∈ (0, 1/2), and two nonstationary, with integration order d_i ∈ (1/2, 3/2). The authors then studied several specifications using six DGPs. The first four are used in the estimation of specification (1), whilst the remaining two estimate y_t = α + δt + u_t: (1) y_t ∼ FI(1 + d_1) and x_t ∼ FI(1 + d_2), so the variables may be integrated of order d ∈ (1/2, 3/2), which generalizes Phillips's [6] results; (2) y_t ∼ FI(d_1) and x_t ∼ FI(d_2), where d_1 + d_2 > 1/2; (3) y_t ∼ FI(1 + d_1) and x_t ∼ FI(d_2), where d_2 > 1/2; (4) y_t ∼ FI(d_1) and x_t ∼ FI(1 + d_2), where d_1 > 1/2; (5) y_t ∼ FI(1 + d_1), where d_1 > 0; (6) y_t ∼ FI(d_1), where d_1 > 0.
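Fractionally integrated series of this kind can be simulated by truncating the MA(∞) representation of (1 − L)^{−d} ε_t, whose weights follow the standard binomial recursion. The sketch below is a minimal illustration (numpy; the values d = 0.4, T = 300, the burn-in, and the function names are our own choices):

```python
import numpy as np

def fi_weights(d, n):
    """First n MA weights of (1 - L)^{-d}: psi_0 = 1,
    psi_j = psi_{j-1} * (j - 1 + d) / j."""
    psi = np.empty(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi

def fi_series(d, T, rng, burn=500):
    """Truncated MA(infinity) simulation of an FI(d) series."""
    n = T + burn
    psi = fi_weights(d, n)
    eps = rng.standard_normal(n)
    return np.convolve(eps, psi)[:n][burn:]   # keep the last T observations

rng = np.random.default_rng(6)
x = fi_series(0.4, 300, rng)     # stationary long memory: d in (0, 1/2)
print(fi_weights(0.4, 4))        # psi_0..psi_3 = 1, 0.4, 0.28, 0.224
```

Adding 1 to d (equivalently, cumulating the series) produces the nonstationary cases discussed above.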

Table 8 :
Orders in probability: variables y_t and x_t both independently generated by DGP 4.

Table 9 :
Orders in probability: variables y_t and x_t both independently generated by DGP 4. The estimated specification is (2).

Table 10 :
t-ratio rejection rates using a ±1.96 critical value (5% level for a standard normal distribution): spurious regression using LS and HAC standard errors.

Table 10 .
The variables x and y are independently generated either by DGP 4, 5, or 6. Lag selection was done as in Granger et al. [12], that is, using the formula l = integer[4(T/100)^{1/4}]. Simulation results show that size distortions are less severe when HAC standard errors are used, but they remain extremely high (results would perhaps improve further if the data underwent a pre-whitening procedure).
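For reference, the lag-selection rule just quoted is straightforward to implement (the function name is ours):

```python
def hac_lag(T):
    """Lag-selection rule of Granger et al. [12]:
    integer part of 4 * (T/100)^(1/4)."""
    return int(4 * (T / 100) ** 0.25)

print([hac_lag(T) for T in (50, 100, 500, 1000)])  # [3, 4, 5, 7]
```

The rule grows very slowly with T, which is one reason HAC corrections cannot absorb the full persistence of spurious-regression residuals.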
Kim et al. [87] study several important nonlinearity tests: the RESET test [88], the McLeod and Li test [89], the Keenan test [90], the Neural Network test [91], White's Information test [92], and the Hamilton test [93].

1. Drawing inference concerning the nonstationarity of the series can be done by means of Dickey-Fuller-type (DF) tests (see Dickey and Fuller [8, 101]). The original DF test distinguishes between the null hypothesis of a UR (DGPs 6 and 7) and the alternative of stationarity (DGPs 1, 2, and 4). Other well-known UR tests are: (i) the KPSS test [102]; (ii) the GLS-detrended DF test [103]; (iii) the Phillips-Perron test [104]; and (iv) the Ng and Perron test [105, 106].
2. The UR tests previously mentioned provide severely biased results under the hypothesis of trend stationarity in the presence of structural breaks: DF-type tests (i) over-accept the null hypothesis of a unit root when there is a trend/level break in the trend-stationary process [9, 107-113], and (ii) over-reject the null hypothesis when there is a trend/level/variance break in the unit-root process [114-117]. Several alternatives are available. Perron [9] suggested the use of a DF-type test with level and trend breaks specified in the auxiliary regression (see DGP 5); the break dates must be decided by the practitioner. Zivot and Andrews [118] also proposed modifying the DF test in the same direction as Perron, except that they allowed the break date to be endogenously determined; their test allows for a single break (see Lumsdaine and Papell [44] for an extension that allows for two breaks) under the alternative hypothesis (DGP 5) and rules out the possibility of a break under the null hypothesis of a UR (DGP 8); Carrion-i-Silvestre and Sansó [119] proposed a test where a break under the null hypothesis is taken into account.
3. Bai and Perron [120] proposed a test to distinguish between DGPs 2 and 4 and DGPs 3 and 5. The test presupposes that the trending mechanism is exclusively deterministic.
4. . . . stationary when the latter contains a trending mechanism. Moreover, Mikosch and Stărică [123] and Mikosch and Stărică [69] proved that the sample autocorrelation function (sample ACF) can also be a misleading statistical tool when used to identify LM; stationary series that include a nonlinear component might yield a sample ACF usually attributed to LM processes (see also Teverovsky and Taqqu [124]). Several short-memory processes may thus seem to behave as LM processes; this phenomenon can be labeled spurious long memory [69].
5. Many other tests have been proposed to identify LM while controlling for possible nonlinearities (structural breaks in the mean or the variance of the series). See Liu et al. [125], Robinson [100], Lobato and Robinson [126], Giraitis et al. [127, 128], Berkes et al. [129], Zhang et al. [130], et al. [131], and Jach and Kokoszka [132].
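As an illustration of item 1, a minimal, unaugmented DF regression can be hand-rolled in a few lines (numpy only; this is a sketch, not a substitute for packaged implementations, and the DF critical value quoted in the comment is the usual tabulated one, not computed here):

```python
import numpy as np

def df_tstat(x):
    """Unaugmented Dickey-Fuller t-statistic from
    dx_t = c + gamma * x_{t-1} + e_t.
    Under the unit-root null gamma = 0; compare against DF critical
    values (roughly -2.86 at the 5% level with a constant), not the
    standard normal ones."""
    dx = np.diff(x)
    xl = x[:-1]
    X = np.column_stack([np.ones_like(xl), xl])
    b, *_ = np.linalg.lstsq(X, dx, rcond=None)
    e = dx - X @ b
    s2 = e @ e / (e.size - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

rng = np.random.default_rng(3)
t_rw = df_tstat(np.cumsum(rng.standard_normal(500)))  # unit root: moderate t
t_iid = df_tstat(rng.standard_normal(500))            # stationary: very negative t
print(t_rw, t_iid)
```

In practice one would use an augmented version (lagged differences, possibly a trend term) and, as the list above stresses, remain alert to break-induced distortions.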

Table 11
summarizes empirical applications in macroeconomics and finance.
We will now describe the process involved in establishing the aforementioned proof. α̂, β̂, and δ̂ and their corresponding t-ratios are functions of the following expressions (unless indicated otherwise, all sums run from t = 1 to T). Let w = y, x: