We propose a robust mean change-point estimation algorithm for linear regression under the assumption that the errors follow the Laplace distribution. By representing the Laplace distribution as an appropriate scale mixture of normal distributions, we develop an expectation maximization (EM) algorithm to estimate the position of the mean change-point. We investigate the performance of the algorithm through several simulations and find that our method is robust to the distribution of the errors and is effective in estimating the position of the mean change-point. Finally, we apply our method to the classical Holbert data and detect a change-point.
Change-point analysis has been an active research area since the early 1950s. In the more than sixty years since, numerous articles have been published in various journals and proceedings. Chen and Gupta [
The Schwarz information criterion (SIC) proposed by Schwarz [
However, in practice, the true distribution of the data is unknown and is difficult to determine, especially when a change-point is present in the data. The normal assumption is therefore not always suitable, because real data often show heavy tails and skewness. In such cases, a robust change-point detection model based on a heavy-tailed distribution may perform better than the normal model. Osorio and Galea [
The symmetric Laplace distribution, also known as the double exponential distribution or the first law of Laplace, is another heavy-tailed error distribution besides the Student
In recent years, statistical models based on Laplace distribution have developed rapidly both in theory and application. Purdom and Holmes [
In this paper, we study the single mean change-point problem in a linear regression model whose errors follow the Laplace distribution. We fit the model with the EM algorithm and use the SIC model selection method to estimate the position of the mean change-point. Then, we investigate the robustness of the algorithm through simulations under different error distributions. Finally, we apply our method to a stock market data set.
The symmetric Laplace distribution is commonly denoted by
Andrews and Mallows [
Suppose random variable
Let
Suppose random variables
Proposition
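The scale-mixture idea used in this section can be illustrated numerically. The following is a minimal sketch, not the paper's own derivation: it assumes one common parameterization, in which a Laplace variate with location `mu` and scale `b` is normal with variance `2 * b**2 * W` conditional on a standard exponential mixing variable `W`. The function name and all parameter values are hypothetical.

```python
import numpy as np


def sample_laplace_via_mixture(mu, b, size, rng):
    """Draw Laplace(mu, b) variates as a scale mixture of normals.

    Assumed parameterization (not taken from the paper):
    W ~ Exp(1) and X | W ~ N(mu, 2 * b**2 * W).
    """
    w = rng.exponential(scale=1.0, size=size)  # latent mixing variable
    z = rng.standard_normal(size)              # standard normal draw
    return mu + b * np.sqrt(2.0 * w) * z


rng = np.random.default_rng(0)
x = sample_laplace_via_mixture(mu=1.0, b=2.0, size=200_000, rng=rng)

# For a Laplace(mu, b) variate, E|X - mu| = b and Var(X) = 2 * b**2.
mean_abs_dev = np.mean(np.abs(x - 1.0))
variance = np.var(x)
```

Under this parameterization the sample mean absolute deviation should be close to `b` and the sample variance close to `2 * b**2`. Treating the mixing variable as missing data is what makes an EM algorithm applicable.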
The single mean (regression coefficients) change-point problem in a Laplacian linear regression model can be formulated as testing the following null hypothesis:
Denote
Given the initial values
In order to obtain
Therefore, we can obtain
Denote
Denote
The likelihood function of
From the stochastic representation in Section
So, the likelihood function of the complete data
Consequently, the complete log-likelihood function is
Given initial values
The marginal conditional pdf of
Denote
It can be seen from (
The work by Chen [
In the Laplacian linear regression model, the Schwarz information criterion under
The selection criteria are to choose a model with a change-point in the
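A SIC-based search for a single mean change-point can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact procedure: the EM iteration is written as iteratively reweighted least squares with weights `1 / |residual|`, the scale is assumed common to both segments, and the penalty counts `p + 1` parameters without a change-point and `2 * p + 1` with one. The names `laplace_fit` and `sic_scan`, the admissible range of `k`, and all simulated values are hypothetical.

```python
import numpy as np


def laplace_fit(X, y, n_iter=100, eps=1e-8, tol=1e-8):
    """EM-style reweighted least squares for Laplace errors.

    Returns the coefficients and the sum of absolute residuals.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares start
    for _ in range(n_iter):
        # E-step: weights proportional to 1 / |residual| (floored by eps).
        w = 1.0 / np.maximum(np.abs(y - X @ beta), eps)
        # M-step: weighted least squares.
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, np.sum(np.abs(y - X @ beta))


def sic_scan(X, y):
    """Return (SIC without change, minimum SIC with change, argmin k)."""
    n, p = X.shape
    _, s0 = laplace_fit(X, y)
    # -2 * max log-likelihood of Laplace errors is 2n*log(2b) + 2n, b = s/n.
    sic_null = 2 * n * np.log(2 * s0 / n) + 2 * n + (p + 1) * np.log(n)
    sic_k = {}
    for k in range(p + 1, n - p):  # k = size of the first segment
        _, s1 = laplace_fit(X[:k], y[:k])
        _, s2 = laplace_fit(X[k:], y[k:])
        b = (s1 + s2) / n          # common scale for both segments
        sic_k[k] = 2 * n * np.log(2 * b) + 2 * n + (2 * p + 1) * np.log(n)
    k_hat = min(sic_k, key=sic_k.get)
    return sic_null, sic_k[k_hat], k_hat


# Hypothetical data: intercept shifts from 0 to 3 after observation 60.
rng = np.random.default_rng(1)
n, k_true = 120, 60
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
mean = np.where(np.arange(n) < k_true, 0.0 + 1.0 * x, 3.0 + 1.0 * x)
y = mean + rng.laplace(loc=0.0, scale=0.5, size=n)

sic_null, sic_min, k_hat = sic_scan(X, y)
```

A change-point is declared when `sic_min` is smaller than `sic_null`, and `k_hat` is then the estimated position. The floor `eps` only guards against division by zero when a residual vanishes.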
In this section, we investigate the performance of the proposed approach in detecting a mean change-point through simulations, and we compare our procedure with the change-point detection procedure proposed by Chen [
Simulation 1: normal distribution:
Simulation 2: Laplace distribution:
Simulation 3:
Simulation 4:
Simulation 5: log-normal distribution:
Simulation 6: Cauchy distribution.
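One way to generate data for simulations of this kind is sketched below. Every setting here is an assumption made for illustration: the sample size, the design of the covariate, the regression coefficients, and the parameters of each error distribution are hypothetical, and only the four error distributions named above are included.

```python
import numpy as np

# Error samplers for the four named distributions (parameters assumed).
ERROR_SAMPLERS = {
    "normal": lambda rng, n: rng.standard_normal(n),
    "laplace": lambda rng, n: rng.laplace(loc=0.0, scale=1.0, size=n),
    "log-normal": lambda rng, n: rng.lognormal(mean=0.0, sigma=1.0, size=n),
    "cauchy": lambda rng, n: rng.standard_cauchy(n),
}


def simulate_change_point_data(n, k, beta_before, beta_after, error, rng):
    """Simulate a simple regression whose coefficients change after k points."""
    x = rng.uniform(0.0, 10.0, size=n)            # hypothetical covariate
    X = np.column_stack([np.ones(n), x])          # intercept and slope
    before = X @ np.asarray(beta_before, dtype=float)
    after = X @ np.asarray(beta_after, dtype=float)
    mean = np.where(np.arange(n) < k, before, after)
    y = mean + ERROR_SAMPLERS[error](rng, n)
    return X, y


rng = np.random.default_rng(2024)
datasets = {
    name: simulate_change_point_data(
        n=200, k=100, beta_before=[1.0, 2.0], beta_after=[3.0, 2.0],
        error=name, rng=rng,
    )
    for name in ERROR_SAMPLERS
}
```

Repeating such a draw many times for each true position and each error distribution, and recording the estimated position each time, yields summaries of the kind reported in the tables of this section.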
To evaluate the finite-sample performance of the proposed method, 500 replications are conducted for each error distribution. In each replication, the initial values of
Finally, the mean and standard difference of the estimated change-point position In the In the skew In the An interesting phenomenon appears in the case of Cauchy distribution, where the normal method simply uses integers around 100 to estimate the true values of
Comparison of results for Laplace model and normal model based on 500 replications in Simulations 1–3.
Error distribution | True position | Estimate (Laplace) | Estimate (normal) | diff. (Laplace) | diff. (normal) | sd. (Laplace) | sd. (normal) |
---|---|---|---|---|---|---|---|
Simulation 1 (normal) | 40 | 40.07 | 40.11 | 0.07 | 0.11 | 1.92 | 1.77 |
Simulation 1 (normal) | 60 | 59.85 | 59.81 | 0.14 | 0.18 | 1.83 | 1.93 |
Simulation 1 (normal) | 80 | 79.92 | 79.91 | 0.07 | 0.08 | 1.88 | 1.85 |
Simulation 1 (normal) | 100 | 99.92 | 99.90 | 0.07 | 0.10 | 1.78 | 1.74 |
Simulation 1 (normal) | 120 | 120.00 | 119.97 | 0.00 | 0.02 | 1.78 | 1.74 |
Simulation 1 (normal) | 140 | 139.87 | 139.90 | 0.12 | 0.09 | 1.64 | 1.53 |
Simulation 1 (normal) | 160 | 159.92 | 159.85 | 0.08 | 0.14 | 1.86 | 1.77 |
Simulation 2 (Laplace) | 40 | 39.98 | 38.89 | 0.02 | 0.11 | 2.50 | 2.96 |
Simulation 2 (Laplace) | 60 | 59.94 | 60.08 | 0.06 | 0.08 | 2.47 | 2.83 |
Simulation 2 (Laplace) | 80 | 80.06 | 80.14 | 0.06 | 0.14 | 2.95 | 3.19 |
Simulation 2 (Laplace) | 100 | 100.21 | 99.99 | 0.21 | 0.01 | 2.83 | 3.18 |
Simulation 2 (Laplace) | 120 | 120.03 | 120.07 | 0.03 | 0.07 | 2.23 | 2.91 |
Simulation 2 (Laplace) | 140 | 140.13 | 140.23 | 0.13 | 0.23 | 2.29 | 2.52 |
Simulation 2 (Laplace) | 160 | 160.03 | 159.81 | 0.03 | 0.19 | 2.55 | 5.29 |
Simulation 3 | 40 | 40.14 | 40.79 | 0.14 | 0.79 | 2.54 | 11.26 |
Simulation 3 | 60 | 59.96 | 61.10 | 0.04 | 1.10 | 2.51 | 13.22 |
Simulation 3 | 80 | 80.22 | 80.80 | 0.22 | 0.80 | 2.56 | 12.67 |
Simulation 3 | 100 | 100.05 | 99.32 | 0.05 | 0.68 | 2.99 | 12.68 |
Simulation 3 | 120 | 120.03 | 119.53 | 0.03 | 0.47 | 2.99 | 10.06 |
Simulation 3 | 140 | 140.11 | 139.51 | 0.11 | 0.49 | 2.82 | 12.15 |
Simulation 3 | 160 | 159.67 | 159.27 | 0.33 | 0.73 | 7.54 | 11.82 |
Comparison of results for Laplace model and normal model based on 500 replications in Simulations 4–6.
Error distribution | True position | Estimate (Laplace) | Estimate (normal) | diff. (Laplace) | diff. (normal) | sd. (Laplace) | sd. (normal) |
---|---|---|---|---|---|---|---|
Simulation 4 | 40 | 42.00 | 46.42 | 2.00 | 6.42 | 16.74 | 28.94 |
Simulation 4 | 60 | 60.67 | 61.58 | 0.67 | 1.58 | 9.62 | 19.85 |
Simulation 4 | 80 | 80.37 | 81.44 | 0.37 | 1.44 | 7.76 | 17.26 |
Simulation 4 | 100 | 99.18 | 99.03 | 0.82 | 0.97 | 8.76 | 14.99 |
Simulation 4 | 120 | 119.82 | 120.36 | 0.18 | 0.36 | 11.13 | 16.98 |
Simulation 4 | 140 | 139.26 | 137.25 | 0.74 | 2.75 | 10.98 | 22.95 |
Simulation 4 | 160 | 158.21 | 152.70 | 1.79 | 7.30 | 15.74 | 30.94 |
Simulation 5 (log-normal) | 40 | 41.23 | 49.78 | 1.23 | 9.78 | 13.49 | 38.92 |
Simulation 5 (log-normal) | 60 | 60.40 | 63.87 | 0.40 | 3.87 | 4.83 | 27.14 |
Simulation 5 (log-normal) | 80 | 80.58 | 83.08 | 0.58 | 3.08 | 4.66 | 25.60 |
Simulation 5 (log-normal) | 100 | 100.00 | 101.35 | 0.00 | 1.35 | 6.50 | 21.20 |
Simulation 5 (log-normal) | 120 | 120.44 | 117.74 | 0.44 | 2.26 | 6.03 | 24.65 |
Simulation 5 (log-normal) | 140 | 139.92 | 136.16 | 0.08 | 3.84 | 5.48 | 27.81 |
Simulation 5 (log-normal) | 160 | 158.80 | 152.27 | 1.20 | 7.73 | 13.71 | 35.13 |
Simulation 6 (Cauchy) | 40 | 51.75 | 98.10 | 11.75 | 58.10 | 46.99 | 67.04 |
Simulation 6 (Cauchy) | 60 | 64.81 | 98.77 | 4.81 | 38.77 | 34.23 | 66.56 |
Simulation 6 (Cauchy) | 80 | 81.63 | 81.44 | 1.63 | 1.44 | 7.76 | 17.26 |
Simulation 6 (Cauchy) | 100 | 98.61 | 99.34 | 1.39 | 0.66 | 29.76 | 64.17 |
Simulation 6 (Cauchy) | 120 | 119.83 | 105.36 | 0.17 | 14.64 | 28.86 | 65.80 |
Simulation 6 (Cauchy) | 140 | 134.43 | 107.23 | 5.57 | 32.78 | 36.47 | 63.55 |
Simulation 6 (Cauchy) | 160 | 149.95 | 105.12 | 10.05 | 54.90 | 46.53 | 66.94 |
Another simulation design with
Holbert [
SIC calculated by Laplace method and normal method for Holbert data.
Time point | Calendar month | NYAMSE | BSE | SIC Laplace | SIC normal |
---|---|---|---|---|---|
1 | Jan. 1967 | 10581.6 | 78.8 | — | — |
2 | Feb. 1967 | 10234.3 | 69.1 | 364.2829 | 368.5739 |
3 | Mar. 1967 | 13299.5 | 87.6 | 363.3368 | 367.8817 |
4 | Apr. 1967 | 10746.5 | 72.8 | 363.3110 | 367.7757 |
5 | May 1967 | 13310.7 | 79.4 | 361.7661 | 366.4980 |
6 | Jun. 1967 | 12835.5 | 85.6 | 359.5049 | 365.7947 |
7 | Jul. 1967 | 12194.2 | 75.0 | 357.4092 | 364.8795 |
8 | Aug. 1967 | 12860.4 | 85.3 | 354.8215 | 363.9410 |
9 | Sep. 1967 | 11955.6 | 86.9 | **353.8327** | 363.5574 |
10 | Oct. 1967 | 13351.5 | 107.8 | 354.3397 | 363.5818 |
11 | Nov. 1967 | 13285.9 | 128.7 | 357.2243 | 364.6607 |
12 | Dec. 1967 | 13784.4 | 134.5 | 360.1891 | 365.4162 |
13 | Jan. 1968 | 16336.7 | 148.7 | 361.0186 | 365.3077 |
14 | Feb. 1968 | 11040.5 | 94.2 | 362.3622 | 365.5670 |
15 | Mar. 1968 | 11525.3 | 128.1 | 363.4817 | 366.6527 |
16 | Apr. 1968 | 16056.4 | 154.1 | 364.0982 | 366.8008 |
17 | May 1968 | 18464.3 | 191.3 | 362.8076 | 366.9825 |
18 | Jun. 1968 | 17092.2 | 191.9 | 360.4358 | 367.2177 |
19 | Jul. 1968 | 15178.8 | 159.6 | 359.1133 | 367.3715 |
20 | Aug. 1968 | 12774.8 | 185.5 | 359.5155 | 368.4097 |
21 | Sep. 1968 | 12377.8 | 178.0 | 359.5954 | 368.3030 |
22 | Oct. 1968 | 16856.3 | 271.8 | 356.8656 | 363.5156 |
23 | Nov. 1968 | 14635.3 | 212.3 | 355.3476 | **358.1847** |
24 | Dec. 1968 | 17436.9 | 139.4 | 357.0433 | 361.1139 |
25 | Jan. 1969 | 16482.2 | 106.0 | 360.8022 | 364.8916 |
26 | Feb. 1969 | 13905.4 | 112.1 | 360.8808 | 365.1567 |
27 | Mar. 1969 | 11973.7 | 103.5 | 360.4217 | 365.0086 |
28 | Apr. 1969 | 12573.6 | 92.5 | 360.9285 | 365.3012 |
29 | May 1969 | 16566.8 | 116.9 | 363.5041 | 367.3072 |
30 | Jun. 1969 | 13558.7 | 78.9 | 364.3949 | 368.2468 |
31 | Jul. 1969 | 11530.9 | 57.4 | 362.7110 | 368.2235 |
32 | Aug. 1969 | 11278.0 | 75.9 | 360.3198 | 367.7685 |
33 | Sep. 1969 | 11263.7 | 109.8 | 362.6290 | 368.1350 |
34 | Oct. 1969 | 15649.5 | 129.2 | — | — |
35 | Nov. 1969 | 12197.1 | 115.1 | 358.0474 | 361.4956 |
The bold SIC value in Table
Minimum SIC values in Holbert data based on different models.
Method | Laplace | Normal | | |
---|---|---|---|---|
SIC without a change-point | 358.0474 | 361.4956 | 358.082 | 363.358 |
min SIC with a change-point | 353.8327 | 358.1847 | 355.035 | 357.416 |
Estimated change-point position | 9 | 23 | 23 | 9 |
Scatter plot and regression lines of BSE versus NYAMSE. The red points and line correspond to the first 9 observations; the green points and line correspond to observations 10–35.
In this paper, we proposed a Laplace linear regression model with a mean change-point and developed an EM algorithm with the SIC model selection criterion to estimate the position of the mean change-point. We investigated the performance of the algorithm in several simulations and found that it behaves well when the errors follow the Laplace distribution. In addition, our Laplace method is more robust than the normal method in estimating the position of the mean change-point when the errors follow skewed and heavy-tailed distributions, especially when the true change-point lies near the beginning or the end of the data. Finally, we applied our method to the Holbert data and detected a mean change-point. Considering the difficulty in estimating the unknown degrees of freedom in the Student
The author declares that there is no conflict of interests regarding the publication of this paper.