Change Point Detection with Robust Control Chart

Monitoring a process over time using a control chart allows quick detection of unusual states. In Phase I, historical process data, assumed to come from an in-control process, are used to construct the control limits. In Phase II, the process is monitored on an ongoing basis using the control limits from Phase I. In Phase II, observations falling outside the control limits, or unusual patterns of observations, signal that the process has shifted from its in-control settings. Such signals trigger a search for an assignable cause and, if the cause is found, corrective action is implemented to prevent its recurrence. The purpose of this paper is to introduce a new methodology for constructing a robust control chart when nonnormal or contaminated data may arise in Phase I. Through extensive Monte Carlo simulations, we examine the behavior and performance of the proposed MM robust control chart when there is a shift in the process mean.


Introduction
Statistical process control (SPC) concepts and methods have become very important in the manufacturing and process industries. Their goal is to monitor the performance of a process over time in order to judge whether or not the process remains in a "state of statistical control." This state of control is said to hold if certain process or product variables remain near their desired values and the only source of variation is "common-cause" variation, that is, the variation which affects the process all the time and is essentially inevitable within the current process. Shewhart charts are used to monitor key product variables in order to detect the occurrence of any event having a "special" or "assignable" cause. By discovering assignable causes, long-term improvements in the process and in product quality can be achieved by eliminating the causes or improving the process or its operating procedures.
Detecting one or more change points in a batch of observations has attracted substantial investigation in the statistical, engineering, and econometric literature. Given an ordered sequence of observations, usually but not necessarily taken at equally spaced times, there is a change point between two successive observations if their statistical distributions differ. Between change points, the distributions are usually taken to be identical. In practice, knowing when a process has changed would simplify the search for the special cause: if the time of the change could be identified, process engineers would have a smaller search window within which to look for the special cause. Consequently, the special cause can be determined more quickly, and the actions needed to improve quality can be carried out sooner.
In this paper, we analyze the efficiency of a change point estimator for the process mean [1] for each of the Shewhart $\bar{X}$, Median, and proposed MM control charts once the chart issues a signal. The derivation of the change point estimator, shown in the appendix, is due to Hinkley [2], who discussed the asymptotic properties of the estimator. Whenever the Shewhart $\bar{X}$, Median, or proposed MM chart signals that a special cause is present, the estimator provides practitioners with a useful estimate of the time of the process change. In Section 2, we introduce a model for a step change in the location of a process. A step change in a process mean occurs when the mean suddenly changes its value and then remains at the new value until corrective action has been taken. On the basis of this step-change model, we adopt the estimator of the time of the process change for use when the corresponding chart signals. In Section 5, we analyze the performance of each chart by means of Monte Carlo simulation.

Process Step-Change Model
Suppose that the process is initially in control, with observations coming from a normal distribution with a known mean $\mu_0$ and a known standard deviation $\sigma_0$. After an unknown point in time $\tau$ (known as the process change point), the process location changes from $\mu_0$ to $\mu_1 = \mu_0 + \delta\sigma_0/\sqrt{n}$, where $n$ is the subgroup size and $\delta$ is the unknown magnitude of the change. We also assume that once this step change in the process location occurs, the process remains at the new level $\mu_1$ until the special cause has been identified and removed. We let $\bar{X}_T$ be the first subgroup average to exceed a control limit and assume that this signal is not a false alarm. Hence, $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_\tau$ are the subgroup averages that come from the in-control process, while $\bar{X}_{\tau+1}, \bar{X}_{\tau+2}, \ldots, \bar{X}_T$ are from the changed process.
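The step-change model can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function name `simulate_until_signal`, the use of known-parameter 3-sigma limits, and the choice to redraw any in-control subgroup average that would fall outside the limits (so that the eventual signal is not a false alarm) are ours, not part of the model as stated above.

```python
import math
import random

def simulate_until_signal(mu0, sigma0, n, tau, delta, rng, k=3.0):
    """Return (xbars, T): subgroup averages up to the first signal after tau."""
    se = sigma0 / math.sqrt(n)            # standard deviation of a subgroup average
    lcl, ucl = mu0 - k * se, mu0 + k * se
    mu1 = mu0 + delta * se                # process mean after the step change
    xbars = []
    # Subgroups 1..tau come from the in-control process. Averages outside the
    # limits are redrawn (an illustrative simplification) to rule out false alarms.
    while len(xbars) < tau:
        x = rng.gauss(mu0, se)
        if lcl <= x <= ucl:
            xbars.append(x)
    # Subgroups tau+1, tau+2, ... come from the changed process; stop at the
    # first average that falls outside a control limit.
    while True:
        x = rng.gauss(mu1, se)
        xbars.append(x)
        if x < lcl or x > ucl:
            return xbars, len(xbars)

rng = random.Random(7)
xbars, T = simulate_until_signal(mu0=0.0, sigma0=1.0, n=5, tau=100, delta=2.0, rng=rng)
```

Here `T` plays the role of the signal time and `xbars[tau:]` holds the averages from the changed process.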

Definitions
To illustrate, we concentrate on robust estimates for the simple location-scale model. Let $x_1, \ldots, x_n$ be $n$ observations on the real line satisfying
$$x_i = \mu + \sigma\varepsilon_i, \quad i = 1, \ldots, n, \qquad (3.1)$$
where the $\varepsilon_i$ are independent and identically distributed with variance equal to 1. We are interested in estimating $\mu$; the scale $\sigma$ is a nuisance parameter. We consider the M-location estimates proposed by Huber [3], who defined $\hat{\mu}_n$ as the solution of an estimating equation of the form
$$\sum_{i=1}^{n} \psi\left(\frac{x_i - \hat{\mu}_n}{\sigma_n}\right) = 0, \qquad (3.2)$$
where $\sigma_n$ is a robust estimate of the residual scale and $\psi : \mathbb{R} \to \mathbb{R}$ is a bounded, nondecreasing, and odd real function. For $\psi$ in (3.2), we focus on the continuous and differentiable influence function $\psi_c$ of [4], defined in (3.3), where $c > 0$ is a user-chosen tuning constant and $p_4(u) = 38.4 - 175u + 300u^2 - 225u^3 + 62.5u^4$; see [5] for other choices of smooth functions $\psi$.
The scale estimate $\sigma_n$ in (3.2) is an S-estimate of scale (see [6]), which is defined as follows. Let $\rho : \mathbb{R} \to \mathbb{R}$ be a bounded, continuous, and even function satisfying $\rho(0) = 0$, and let $b \in (0, 1)$. The S-scale $\sigma_n$ is defined by
$$\sigma_n = \inf_{t \in \mathbb{R}} s_n(t), \qquad (3.4)$$
where, for each $t \in \mathbb{R}$, $s_n(t)$ is the solution of
$$\frac{1}{n}\sum_{i=1}^{n} \rho\left(\frac{x_i - t}{s_n(t)}\right) = b. \qquad (3.5)$$
Associated with this family are the S-location estimates $\tilde{\mu}_n$, given by
$$\tilde{\mu}_n = \arg\inf_{t \in \mathbb{R}} s_n(t). \qquad (3.6)$$
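Beaton and Tukey [7] proposed a family of functions $\rho_d$ given by
$$\rho_d(u) = \begin{cases} 3(u/d)^2 - 3(u/d)^4 + (u/d)^6, & |u| \le d, \\ 1, & |u| > d, \end{cases} \qquad (3.7)$$
where the tuning constant $d$ is positive. According to Yohai [8], the M-location estimates obtained with an S-scale estimate are called MM-location estimates. Specifically, the estimates $\hat{\mu}_n$, $\sigma_n$, and $\tilde{\mu}_n$ solve the following system of equations:
$$\frac{1}{n}\sum_{i=1}^{n} \rho_d\left(\frac{x_i - \tilde{\mu}_n}{\sigma_n}\right) = b, \qquad \sum_{i=1}^{n} \rho_d'\left(\frac{x_i - \tilde{\mu}_n}{\sigma_n}\right) = 0, \qquad \sum_{i=1}^{n} \psi_c\left(\frac{x_i - \hat{\mu}_n}{\sigma_n}\right) = 0. \qquad (3.8)$$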
Let $d = 1.548$ for $\rho_d$ in (3.7), $b = 0.5$ in (3.5), and $c = 1.525$ for $\psi_c$ in (3.3); these choices yield a location estimate $\hat{\mu}_n$ with a 50% breakdown point and 95% efficiency when the errors have a normal distribution.
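The S-scale and S-location steps can be sketched numerically. The code below is a minimal illustration, not the authors' implementation: it assumes the standard Beaton-Tukey biweight form of $\rho_d$ (scaled so that $\rho_d(u) = 1$ for $|u| \ge d$), solves the scale equation by a simple fixed-point iteration, and replaces the minimization over $t$ with a crude grid search around the median. The function names, the grid, and the starting value are all illustrative choices, and the final M-step with $\psi_c$ is not included.

```python
import statistics
from statistics import NormalDist

def rho_biweight(u, d=1.548):
    """Tukey biweight rho, scaled so that rho(u) = 1 for |u| >= d (assumed form)."""
    if abs(u) >= d:
        return 1.0
    v = (u / d) ** 2
    return 3.0 * v - 3.0 * v ** 2 + v ** 3

def s_scale(x, t, b=0.5, d=1.548, tol=1e-8, max_iter=200):
    """Solve mean(rho((x_i - t) / s)) = b for s by fixed-point iteration."""
    resid = [xi - t for xi in x]
    # Start from the normalised median absolute residual.
    s = statistics.median([abs(r) for r in resid]) / 0.6745
    if s <= 0.0:
        return 0.0
    for _ in range(max_iter):
        avg = sum(rho_biweight(r / s, d) for r in resid) / len(resid)
        s_new = s * (avg / b) ** 0.5      # unchanged when avg equals b
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s

def s_location_scale(x, grid_size=41):
    """Grid search for the minimiser (S-location) and minimum (S-scale) of s_n(t)."""
    med = statistics.median(x)
    mad = statistics.median([abs(xi - med) for xi in x])
    half = grid_size // 2
    candidates = [med + mad * (k - half) / half for k in range(grid_size)]
    scale, loc = min((s_scale(x, t), t) for t in candidates)
    return loc, scale

# Illustrative data: 501 evenly spaced quantiles of the standard normal.
x = [NormalDist().inv_cdf((i + 0.5) / 501) for i in range(501)]
loc, scale = s_location_scale(x)
```

For data that follow a standard normal shape, the returned scale should be close to 1 with these constants; a production implementation would use a proper one-dimensional optimizer rather than a grid.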

Sample Median
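The sample median has been used in early process control charts because it is insensitive to behavior in the tails of the distribution. However, under the normal distribution, the efficiency of the sample median drops off rapidly towards its asymptotic value of 0.64 as the sample size increases. For a random sample of $n$ observations $X_1, X_2, \ldots, X_n$ with order statistics $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$, the sample median, denoted by MD, is defined as follows:
$$\mathrm{MD} = \begin{cases} X_{((n+1)/2)}, & \text{if } n \text{ is odd}, \\ \dfrac{X_{(n/2)} + X_{(n/2+1)}}{2}, & \text{if } n \text{ is even}. \end{cases} \qquad (3.9)$$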
The appeal of the sample median, MD, is that it is easy to determine, requires only the middle values to calculate, can be used when a distribution is skewed, is not affected by outliers, and has the maximal 50 percent breakdown point. Moreover, its gross-error sensitivity is low, and as the sample size $n$ increases, the variance of MD decreases as $1/n$ while the maximum bias does not change. Hence, bias is the property of importance for large sample sizes, and MD is the estimator that possesses the smallest maximum bias for a given proportion of contamination $\varepsilon$: Huber [3] showed that it minimizes the maximum asymptotic bias over contamination neighborhoods. On the other hand, the disadvantages of the sample median are that it is difficult to handle in mathematical equations, does not use all available values, and can be misleading when the data come from a long-tailed distribution, as it may discard some useful information (see [9, 10]). Nevertheless, the sample median has become a good general-purpose estimator and is generally regarded as an alternative to the sample mean, especially whenever outliers might be present in the data.

Median Absolute Deviation from the Sample Median
The median absolute deviation from the sample median, denoted as MAD, is a more robust scale estimator than the standard deviation. The MAD was first introduced by Hampel [11], who attributed it to Gauss. It is simple and easy to compute and is mainly used for detecting outliers in data. The estimate is often used as an initial value for the computation of more efficient robust estimators. Let $X_1, X_2, \ldots, X_n$ be a random sample of $n$ observations with sample median MD; then $\mathrm{MAD} = \mathrm{Median}\{|X_i - \mathrm{MD}|\}$. MAD possesses the following properties: (i) it has the maximal 50 percent breakdown point, which is twice that of the interquartile range (IQR); (ii) in the case of the standard normal distribution, $\Phi$, the influence function of the MAD estimator, $\mathrm{IF}(x; \mathrm{MAD}, \Phi)$, is a step function that takes on two values. This IF is bounded by the sharpest possible bound among all scale estimators. With regard to the optimality properties of MAD, Martin and Zamar [12] established expressions for the maximum asymptotic bias of M-estimates of scale over contamination neighborhoods as a function of the fraction of contamination, and showed that similarly strong results hold in terms of maximum asymptotic bias for MAD as for MD.
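As a brief illustration of the two estimators, the following sketch computes the sample median from the order statistics and the MAD as the median of absolute deviations from it. The function names are illustrative, and no normal-consistency factor is applied to the MAD here.

```python
def sample_median(x):
    """Sample median from the order statistics (odd and even n)."""
    s = sorted(x)
    n = len(s)
    if n == 0:
        raise ValueError("sample_median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                      # middle order statistic
    return 0.5 * (s[mid - 1] + s[mid])     # average of the two middle values

def mad(x):
    """Median absolute deviation from the sample median (unscaled)."""
    md = sample_median(x)
    return sample_median([abs(xi - md) for xi in x])

data = [2, 4, 4, 5, 100]                   # one gross outlier
location, scale = sample_median(data), mad(data)
```

With the outlier 100 in `data`, the median stays at 4 and the MAD at 1, whereas the sample mean is pulled to 23.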

Control Limits
In this paper, we consider the general equations for constructing control limits (see [13]). With a robust location estimator $T$ and the corresponding scale estimator $S$, the control limits are given by
$$T \pm 3\,\frac{S}{A\sqrt{n}}. \qquad (3.10)$$
The constant $A$ in (3.10) is determined in such a way that $S/A$ is an unbiased estimator of the scale parameter. The most commonly used control charts are Shewhart $\bar{X}$ charts using the sample range. For $\bar{X}$ charts using the sample standard deviation, $T$ in (3.10) is the sample mean $\bar{X}$ and $S$ is the sample standard deviation, with $A = c_4$. Shewhart modified these control limits using rational subgroups (see [14]), in which $m$ rational subgroups, each of size $n$, are taken. Following Shewhart's suggestion, these subgroups are formed so that the between-group variability is maximized while the within-group variation is minimized. In this view, then,
$$\bar{\bar{X}} = \frac{1}{m}\sum_{i=1}^{m} \bar{X}_i, \qquad \bar{S} = \frac{1}{m}\sum_{i=1}^{m} S_i, \qquad (3.11)$$
where $\bar{X}_i$ and $S_i$ are the subgroup mean and standard deviation, respectively. Since each subgroup estimate is an unbiased estimate of the corresponding parameter (with $S_i/c_4$ unbiased for the scale), the control limits using rational subgroups are
$$\bar{\bar{X}} \pm 3\,\frac{\bar{S}}{c_4\sqrt{n}}. \qquad (3.12)$$
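In practice, the control limits are the average of the control limits for the $m$ subgroups. For the $\bar{X}$ chart in which the sample range is used as the scale estimator, the scale is estimated with the average range, computed by averaging over the $m$ subgroups:
$$\bar{R} = \frac{1}{m}\sum_{i=1}^{m} R_i. \qquad (3.13)$$
Then, the control limits are defined as
$$\bar{\bar{X}} \pm 3\,\frac{\bar{R}}{d_2\sqrt{n}}, \qquad (3.14)$$
where $A = d_2$ is the standard tabled constant for the sample range.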
To construct the control charts under a normal distribution using the robust estimators, we determine the appropriate constant $A$ for each estimator through computer simulation. To illustrate, a sample of size $n$ was taken from $N(0, 1)$, and the constant $A$ was computed by averaging over 100,000 repetitions. For instance, if $S$ is the scale estimator, then over 100,000 repetitions we expect $E(S/A) = \sigma$ for any $\sigma$. Table 1 exhibits the constant $A$ such that $E(\text{scale estimator}/A) = \sigma$. When the scale estimator is the sample range, the simulated values for $E_1$ agree closely with the standard tabled values. $E_2$ and $E_3$ are the corresponding estimator pairs for the Median and the proposed MM charts.
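The simulation of the constant $A$ can be sketched as follows. This is an illustrative sketch rather than the code used for Table 1: the function name `simulate_constant_A`, the seed, and the reduced number of repetitions are assumptions. Because the samples are drawn from $N(0, 1)$, the average of the scale estimator over the repetitions is itself the estimate of $A$.

```python
import random

def sample_range(x):
    """Sample range, the scale estimator of the classical X-bar chart."""
    return max(x) - min(x)

def simulate_constant_A(scale_estimator, n, reps, seed=0):
    """Monte Carlo estimate of A such that E(S / A) = sigma under N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += scale_estimator(sample)
    # sigma = 1 here, so the mean of S over the repetitions estimates A.
    return total / reps

A_range = simulate_constant_A(sample_range, n=5, reps=20000, seed=1)
```

For the sample range with $n = 5$, the estimate should land near the standard tabled value $d_2 \approx 2.326$; any other scale estimator (for example a MAD function) can be passed in the same way.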
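Thus, in the same way, the estimators for location and scale for the Median-MAD chart are given by the averages over the $m$ subgroups:
$$\overline{\mathrm{MD}} = \frac{1}{m}\sum_{i=1}^{m} \mathrm{MD}_i, \qquad \overline{\mathrm{MAD}} = \frac{1}{m}\sum_{i=1}^{m} \mathrm{MAD}_i. \qquad (3.15)$$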

Confidence Regions (Confidence Set)
One of the benchmarks for assessing the performance of a control chart is a confidence region for the time of the process change. A confidence region on the change point gives practitioners useful starting points for searching their process log books and records for the special cause. It provides a "search window" for the special cause and aids quicker identification. Practitioners can then act on the special cause sooner, improving quality and reducing process downtime.
We use the likelihood function to obtain a confidence region for the process change point. The confidence region approaches in the statistics literature that involve the likelihood function rely on asymptotic theory (see [15]). In process monitoring, however, there are relatively small time intervals between the process change point and the time of the control chart signal. Thus, approximations based on asymptotic theory may not be appropriate.
Box and Cox [16] proposed a method involving the log likelihood function for constructing a possibly noncontiguous confidence region (also called a confidence set) on a parameter. Their approach can be used to build a confidence set CS for the process change point using the log likelihood function, having the form
$$\mathrm{CS} = \{t : \ln L(\hat{\tau}) - \ln L(t) < D\}. \qquad (3.17)$$
Here, $\ln L(\hat{\tau})$ is the maximum value of the log likelihood function, $\hat{\tau}$ is the MLE of $\tau$ (i.e., the value of $t$ that maximizes the log likelihood function), and $\ln L(t)$ is the value of the log likelihood function at $t$. We let $k_1$ and $k_2$ represent constants determined by the subgroup averages. Thus, $\ln L(t)$ can be expressed as
$$\ln L(t) = k_1 + k_2\,(T - t)\left(\bar{X}_{T,t} - \mu_0\right)^2. \qquad (3.19)$$
Box and Cox [16] proposed using $D = \tfrac{1}{2}\chi^2_{1,\alpha}$ to obtain a $100(1 - \alpha)\%$ confidence region. Siegmund [17] used asymptotic theory to develop a $100(1 - \alpha)\%$ confidence set for the change point of a normal process mean based on the log likelihood function. He proposed using the value
$$D = -\ln\left(1 - \sqrt{1 - \alpha}\right). \qquad (3.20)$$
By means of Monte Carlo simulation, we study the confidence sets obtained with a nominal confidence coefficient of $1 - \alpha = 0.90$. In accordance with Box and Cox [16] and Siegmund [17], the $D$ values for a 90% confidence set are $D = 1.353$ and $D = 2.97$, respectively. We observed that the value $D = 1.353$ suggested by Box and Cox [16] provides 90% coverage for values of $\delta$ between 2.0 and 3.0, while Siegmund's [17] value of $D = 2.97$ provides at least 90% coverage for $\delta \ge 1.0$. By trial and error, the value $D = 3.7$ provides at least 90% coverage for $\delta \ge 0.5$.
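The two quoted thresholds can be checked with a few lines of arithmetic. The sketch below assumes that the Box and Cox value is half the upper-$\alpha$ quantile of a chi-square distribution with one degree of freedom (computed here as the square of a standard normal quantile) and that Siegmund's value has the form $-\ln(1 - \sqrt{1 - \alpha})$, which reproduces the quoted 2.97; the function names are illustrative.

```python
import math
from statistics import NormalDist

def d_box_cox(alpha):
    """D = 0.5 * chi-square(1 df) upper-alpha quantile."""
    # For one degree of freedom the chi-square quantile is a squared normal quantile.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return 0.5 * z * z

def d_siegmund(alpha):
    """D = -ln(1 - sqrt(1 - alpha)), the assumed form of Siegmund's threshold."""
    return -math.log(1.0 - math.sqrt(1.0 - alpha))

d90_box_cox = d_box_cox(0.10)     # about 1.353
d90_siegmund = d_siegmund(0.10)   # about 2.97
```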

Methodology
To analyze the performance of the control charts, we consider the Shewhart $\bar{X}$, Median, and proposed MM control charts. When a control chart signals that a process change has occurred, the change point estimator (see the appendix) is applied to the data to estimate the time of the change: we find the value of $t$ in the range $0 \le t < T$ that maximizes $C_t = (T - t)\left(\bar{X}_{T,t} - \mu_0\right)^2$. The reverse cumulative average, $\bar{X}_{T,t} = (T - t)^{-1}\sum_{i=t+1}^{T} \bar{X}_i$, is the overall average of the $T - t$ most recent subgroup averages, and the value of $t$ maximizing $C_t$ is our estimator of the last subgroup from the in-control process.
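A minimal sketch of the estimator follows. It computes the statistic $(T - t)(\bar{X}_{T,t} - \mu_0)^2$ for every $t$ with a single backward pass and returns the maximizer. The confidence-set helper is an additional assumption: it takes the log likelihood difference to be the gap in that statistic divided by $2\sigma_{\bar{X}}^2$, with $\sigma_{\bar{X}} = \sigma_0/\sqrt{n}$, which is what the normal model implies. All function names are illustrative.

```python
def change_point_statistics(xbars, mu0):
    """Return [C_0, ..., C_{T-1}], C_t = (T - t) * (mean(xbars[t:]) - mu0)**2."""
    T = len(xbars)
    stats = [0.0] * T
    tail_sum = 0.0
    for t in range(T - 1, -1, -1):
        tail_sum += xbars[t]               # sum of subgroup averages t+1..T
        stats[t] = (tail_sum - (T - t) * mu0) ** 2 / (T - t)
    return stats

def estimate_change_point(xbars, mu0):
    """Value of t in 0 <= t < T that maximizes C_t."""
    stats = change_point_statistics(xbars, mu0)
    return max(range(len(stats)), key=stats.__getitem__)

def confidence_set(xbars, mu0, sigma_xbar, D):
    """All t whose log likelihood is within D of the maximum (assumed scaling)."""
    stats = change_point_statistics(xbars, mu0)
    c_max = max(stats)
    return [t for t, c in enumerate(stats)
            if (c_max - c) / (2.0 * sigma_xbar ** 2) < D]

# Toy data: five in-control averages followed by three shifted ones.
xbars = [0.0] * 5 + [3.0] * 3
tau_hat = estimate_change_point(xbars, mu0=0.0)
cs = confidence_set(xbars, mu0=0.0, sigma_xbar=1.0, D=3.7)
```

On the toy data the estimate is 5, the last in-control subgroup, and widening `D` from 2.97 to 3.7 enlarges the confidence set from `[5]` to `[4, 5]`.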

Simulation Study
We now analyze the performance of the change point estimator with the three control charts through a simulation study. We assume that the process is initially in control, with observations coming from a normal distribution with a known mean $\mu_0$ and a known standard deviation $\sigma_0$. After an unknown point in time $\tau$ (known as the process change point), the process location changes from $\mu_0$ to $\mu_1 = \mu_0 \pm \delta\sigma_0/\sqrt{n}$, where $\delta$ is the unknown magnitude of the change. We also assume that once this step change in the process location occurs, the process remains at the new level $\mu_1$ until the special cause has been identified and removed.
To illustrate, the data are generated under different distributional settings. In addition to the normal distribution, two alternative distributional forms are considered: the contaminated model (Case 2 of (5.2)) and the Slash distribution. Under each type of distribution and for each run, $m = 30$ subgroups of size $n = 5$ are used to construct the control limits, and summary statistics are calculated. To assess the performance of the corresponding chart, observations for subgroups 1 to 100 are generated from the standard normal distribution. Then, starting from subgroup 101, observations are randomly generated from a normal distribution with mean $\delta$ and standard deviation 1 until each of the Shewhart $\bar{X}$, Median, and proposed MM control charts produces a signal. The procedure was repeated 10,000 times for each of the magnitudes studied, namely $\delta = 1.0$, 2.0, and 3.0. For each simulation run, the change point estimate was computed. Subsequently, the average of the estimates of $\tau$ over the 10,000 simulation runs was computed, along with its standard error, the expected length, and the coverage probability.
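The simulation procedure for the normal case and the $\bar{X}$ chart can be sketched as below. This is a simplified illustration, not the study's code: it covers only the classical chart with range-based limits, uses far fewer runs, applies the shift on the $\delta\sigma_0/\sqrt{n}$ scale of the step-change model, treats $\mu_0$ as known in the estimator, and ignores any signal before subgroup 101 (an assumption standing in for the no-false-alarm condition). All names are hypothetical.

```python
import math
import random

D2_N5 = 2.326   # standard tabled range constant for subgroups of size 5

def phase1_limits(rng, m=30, n=5):
    """Control limits from m in-control subgroups (n must be 5 to match D2_N5)."""
    means, ranges = [], []
    for _ in range(m):
        sub = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sub) / n)
        ranges.append(max(sub) - min(sub))
    centre = sum(means) / m
    half_width = 3.0 * (sum(ranges) / m) / (D2_N5 * math.sqrt(n))
    return centre - half_width, centre + half_width

def one_run(rng, delta, tau=100, n=5, mu0=0.0, sigma0=1.0):
    """Return (T, tau_hat) for one simulated Phase II sequence."""
    lcl, ucl = phase1_limits(rng, n=n)
    shift = delta * sigma0 / math.sqrt(n)
    xbars = []
    while True:
        mean = mu0 if len(xbars) < tau else mu0 + shift
        xbar = sum(rng.gauss(mean, sigma0) for _ in range(n)) / n
        xbars.append(xbar)
        # Signals before the change point are ignored (illustrative assumption).
        if len(xbars) > tau and not (lcl <= xbar <= ucl):
            break
    T = len(xbars)
    best_t, best_c, tail = 0, -1.0, 0.0
    for t in range(T - 1, -1, -1):         # maximise (T - t) * (mean - mu0)**2
        tail += xbars[t] - mu0
        c = tail * tail / (T - t)
        if c > best_c:
            best_t, best_c = t, c
    return T, best_t

def simulate(delta, runs=200, seed=0):
    """Average signal time and average change point estimate over the runs."""
    rng = random.Random(seed)
    results = [one_run(rng, delta) for _ in range(runs)]
    avg_T = sum(T for T, _ in results) / runs
    avg_tau_hat = sum(t for _, t in results) / runs
    return avg_T, avg_tau_hat

avg_T, avg_tau_hat = simulate(delta=3.0, runs=200, seed=1)
```

Replacing the subgroup mean and range with a robust location and scale pair, and the normal generator with a contaminated or heavy-tailed one, would give the analogous sketch for the other charts and settings.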
For the outlier model, we modified the contaminated model of Davis and Adams [18]. Specifically, the in-control conditions with contaminated data values are determined by generating a random number from a Uniform(0, 1) distribution and by the corresponding frequency of contaminated data, $0 \le \beta \le 1$. A Uniform(0, 1) random value $p_{ij}$ is generated for observation $j$ of sample $i$, $x_{ij}$. Let $I_{a,b}(p)$ represent an indicator function with
$$I_{a,b}(p) = \begin{cases} 1, & \text{if } a \le p < b, \\ 0, & \text{otherwise}. \end{cases} \qquad (5.1)$$
A random observation $x_{ij}$ in the simulated data is described by (5.2), which expresses the different changes for each of the process states. For comparison, contamination frequencies of $\beta = 0.05$ and $\beta = 0.10$ with $C = 9.0$ are considered in order to create disturbances in the data. In (5.2), the uncontaminated component is $N(\mu, \sigma^2)\, I_{\beta,1}(p_{ij})$, for $i = 1, 2, \ldots, 10{,}000$ and $j = 1, 2, \ldots, n$.
To compute the sizes of the confidence sets obtained with a specific value of $D$, for each control chart and magnitude of change studied, a step change in the normal process mean was simulated following $\tau = 100$. The confidence set estimator was applied following a signal from the corresponding control chart. The size of the confidence set was recorded, as well as whether the confidence set covered the true process change point of $\tau = 100$. This procedure was repeated for a total of $N = 10{,}000$ simulation runs for each value of $\delta$ considered. The proportion of the 10,000 runs that covered the true process change point was also determined. This gives the estimates of the coverage probabilities obtained with $D = 2.97$, the value that provides at least 90% coverage for $\delta \ge 1.0$, as discussed in Section 4. The results are tabulated along with the average sizes of the confidence sets. For a given coverage probability, a smaller confidence set is preferred, so that process engineers can focus their search for the special cause more narrowly. In general, an increase in the magnitude of the shift is expected to be accompanied by an increase in the corresponding coverage probability.
From Tables 2, 3, 4, and 5, we can see that the performances of the three charts in terms of coverage probability are quite similar, especially when the process data come from a normal distribution. Generally, the Median chart and the proposed MM chart perform better in most cases, particularly for a larger proportion of contamination. For instance, for a change of magnitude $\delta = 3.0$ with $\beta = 0.05$ and $C = 9.0$, the coverage probability using the $\bar{X}$ chart is 0.588, while the coverage probabilities for the Median chart and the proposed MM chart are 0.626 and 0.632, respectively.
In Tables 2-5, the averages of the change point estimates $\hat{\tau}$ are also tabulated for various sizes of change in the process mean, together with the corresponding standard error estimates for the normal setting. As the actual change point for the simulation was at time 100, the average estimated time of the process change, $\hat{\tau}$, should be close to 100. With the $\bar{X}$ chart, for a process step change of standardized magnitude $\delta = 1$, the average estimated time of the process change was 100.00, which matches the actual change point of 100. For a standardized process location change of size $\delta = 2$, the average estimated time of the change is 99.60, and for $\delta = 3$ it is 99.47. Hence, on average, the estimate of the time of the process change is close to the actual time of the change, regardless of the magnitude of the change.
By the same token, with the Median chart, for a process step change of standardized magnitude $\delta = 1$, the average estimated time of the process change was 99.70, which is also close to the actual change point of 100. For a standardized process location change of size $\delta = 2$, the average estimated time of the change is 99.49, and for $\delta = 3$ it is 99.57.
Lastly, when the process is monitored with our proposed robust MM chart, the estimate of the time of the process change is, by and large, fairly close to the actual time of the change, regardless of the magnitude of the change. For a process step change of standardized magnitude $\delta = 1$, the average estimated time of the process change was 99.90. For a standardized process location change of size $\delta = 2$, the average estimated time of the change is 99.42, and for $\delta = 3$ it is 99.59. Overall, for all types of charts under study, the change point estimator locates the change point close to the actual time of the change, irrespective of the magnitude of the change.
Another benchmark for evaluating a control chart is the expected length of the signal. This is the expected time at which the control chart signals a change in the process mean that is supposed to have occurred at time 100. It is generally perceived that the Shewhart $\bar{X}$ control chart might issue a signal of a change in the process mean a considerable amount of time after the change actually occurred. Thus, estimating the time of the process change by the time at which the control chart issues a signal would lead to an unfavorably biased estimate, and consequently to a possibly misleading estimate of the time of the process change. This bias is due to the potentially large delay in generating a signal from the control chart. Hence, the criterion for evaluating the performance of the control chart is how quickly the chart signals (the expected length). Tables 2-5 and Figures 1-4 summarize the performance in terms of expected length for the three charts.
In the normal setting, for a step change in the process mean of magnitude $\delta = 1$, Figure 1 shows that the expected length for the Shewhart $\bar{X}$ chart is 57. The result suggests that, when the shift is small, the $\bar{X}$ chart is the best of the three: the Median chart and the proposed MM chart need 157 and 98, respectively. The situation improves as the magnitude of the shift $\delta$ increases to 2 or 3, where all three charts are comparable and perform quite closely.
On the other hand, the Shewhart $\bar{X}$ chart is inferior in the outlier model. With $\beta = 0.10$ and $C = 9$, according to Table 4, both the Median chart and the proposed MM chart appear to be better than the Shewhart $\bar{X}$ chart for the different magnitudes of shift. A similar situation arises when a very heavy-tailed distribution (the Slash distribution) is considered. Again, the Median chart and the proposed MM chart outperform the Shewhart $\bar{X}$ chart with respect to the expected length. Table 5 shows that the differences are quite apparent: when the magnitude of the shift is 1, the Shewhart $\bar{X}$ chart has an expected length of 24, whereas the Median chart and the proposed MM chart both need about 3 subgroups before the first signal.
We now turn to the observed frequency with which the estimates of the time of the step change were within $m$ observations of the actual time of the change, for $m = 0, 1, 2, \ldots, 10$. The results are tabulated in Tables 6, 7, 8, 9, 10, 11, 12, 13, and 14. This provides an indication of the precision of the estimator under the three different charts. The proportion of the 10,000 runs in which the estimated time of the change was within $\pm m$ of the actual change is expected to increase as $m$ increases, and Tables 6-14 confirm that the precision increases with $m$ for each value of $\delta$. Let us first focus on the normal setting with a process step change of magnitude $\delta = 2$. Monitoring the process with the traditional Shewhart $\bar{X}$ chart identified the change point correctly in 60.49% of the trials. The estimate was within one observation of the actual change point in 83.23% of the trials, and within two observations in 91.39% of the trials. For the Median chart, shown in Table 7, the change point was identified correctly in 59.59% of the trials. The estimate was within one observation of the actual change point in 83.46% of the trials, and within two observations in 91.66% of the trials. Based on Table 8, our proposed MM chart located the change point correctly in 59.61% of the trials. The estimate was within one observation of the actual change point in 82.88% of the trials, and within two observations in 91.25% of the trials. All three charts appear comparable and perform almost equally well.
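The precision measure described here is simply the proportion of runs whose estimate lies within $\pm m$ of the true change point. A small sketch, with an illustrative function name and made-up estimates, is:

```python
def precision_within(estimates, tau, max_m=10):
    """Proportion of estimates with |estimate - tau| <= m, for m = 0..max_m."""
    n = len(estimates)
    return [sum(1 for e in estimates if abs(e - tau) <= m) / n
            for m in range(max_m + 1)]

# Hypothetical change point estimates from five runs with tau = 100.
proportions = precision_within([100, 99, 101, 97, 100], tau=100, max_m=3)
```

The returned list is nondecreasing in $m$ by construction, which is the pattern the tables are expected to show.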
Next, we consider the outlier model setting, with $\beta = 0.10$ and $C = 9$. Consider again a process step change of magnitude $\delta = 2$. For a step change of this magnitude, the estimator used with the Shewhart $\bar{X}$ chart identified the time of the change exactly in just 23.65% of the trials and was within one (two) observations of the time of the actual process change in 45.03% (56.69%) of the trials. For the Median chart, 24.99% of the 10,000 simulation trials conducted for $\delta = 2$ identified the change point exactly, and the change point was estimated to be within $\pm 1$ and $\pm 2$ of the actual time of the process change in 46.92% and 58.11% of the trials, respectively. For the proposed MM chart, 25.25% of the simulation trials identified the change point correctly; in 47.53% of the trials the estimate was within $\pm 1$ observation, and in 58.61% of the trials it was within $\pm 2$ observations. Overall, the procedure using the proposed MM chart seems to perform slightly better than the other two charts in this respect.
Finally, we focus on the situation of nonnormal data without outliers, in which the Median chart and the proposed MM chart generally lead to narrower control limits than the traditional Shewhart $\bar{X}$ chart. Here, we limit ourselves to the Slash distribution (see [13]), which is a very heavy-tailed distribution. For a process step change of magnitude $\delta = 2$, monitoring the process with the traditional $\bar{X}$ chart identified the change point correctly in 5.31% of the trials. The estimate was within one observation of the actual change point in 14.44% of the trials, and within two observations in 20.64% of the trials. For the Median chart, shown in Table 13, the change point was identified correctly in 5.45% of the trials. The estimate was within one observation of the actual change point in 12.08% of the trials, and within two observations in 17.95% of the trials. Based on Table 14, our proposed MM chart identified the change point correctly in 5.49% of the trials. The estimate was within one observation of the actual change point in 12.28% of the trials, and within two observations in 18.02% of the trials.
On the whole, we would say that the Median and the proposed MM charts perform better and more consistently than the $\bar{X}$ chart in this setting.

Conclusions
Control charts are used to detect whether or not a process has changed. When a control chart signals that a process has changed, practitioners must initiate a search for the special cause. However, given a signal from a control chart, practitioners generally do not know what caused the process to change or when the process changed. Identifying the time of the process change would simplify the search for the special cause. If practitioners knew when the process changed, the search would be reduced to discovering what aspect of the process changed at that time. As a result, practitioners would increase their chances of identifying the special cause correctly and quickly. This allows them to take the appropriate actions immediately to improve quality.
In this paper, monitoring processes in the presence of data contamination and under nonnormal settings is of primary concern. We have applied an estimator that is useful for identifying the change point of a step change in the process mean under normal and nonnormal settings and when contamination may exist. We have discussed the performance of the change point estimator and other criteria when the process is monitored by the Shewhart $\bar{X}$, the Median, and the proposed MM control charts. The results show that the proposed MM robust control chart consistently performed well in a range of situations. It provides a useful and much better alternative to using the time of the signal from the conventional $\bar{X}$ control chart. Although the proposed MM chart and the Median chart are comparable under nonnormal and contaminated settings, the Median chart becomes worse in the normal setting. The proposed MM control chart has good properties in terms of expected length and coverage probability for contaminated data and for data that arise from heavy-tailed distributions, for moderate sample sizes. The proposed MM chart compares favorably with the traditional Shewhart $\bar{X}$ control chart in the normal setting, especially when the magnitudes of shift are 2 and 3. It is interesting to note that the proposed robust MM chart is more efficient than the Shewhart $\bar{X}$ chart when the process distribution is heavy-tailed.

Figure 1: Simulation result: expected length for change point of $\tau = 100$ (normal distribution).

Appendix
The maximum likelihood estimator (MLE) of $\tau$ is the value of $\tau$ that maximizes the likelihood function or, equivalently, its logarithm. Apart from an additive constant, the logarithm of the likelihood function is
$$\log L(\tau, \mu_1 \mid \bar{x}) = -\frac{n}{2\sigma_0^2}\left[\sum_{i=1}^{\tau}\left(\bar{x}_i - \mu_0\right)^2 + \sum_{i=\tau+1}^{T}\left(\bar{x}_i - \mu_1\right)^2\right]. \qquad (A.1)$$
There are two unknowns in the log-likelihood function: $\tau$ and $\mu_1$. If the change point $\tau$ were known, the MLE of $\mu_1$ would be $\hat{\mu}_1 = \bar{X}_{T,\tau} = (T - \tau)^{-1}\sum_{i=\tau+1}^{T} \bar{x}_i$, the average of the $T - \tau$ most recent subgroup averages. Substituting this back into (A.1), we obtain
$$\log L(\tau \mid \bar{x}) = -\frac{n}{2\sigma_0^2}\left[\sum_{i=1}^{\tau}\left(\bar{x}_i - \mu_0\right)^2 + \sum_{i=\tau+1}^{T}\left(\bar{x}_i - \bar{X}_{T,\tau}\right)^2\right]. \qquad (A.2)$$
It is straightforward to verify that this is equivalent to
$$\log L(\tau \mid \bar{x}) = -\frac{n}{2\sigma_0^2}\left[\sum_{i=1}^{T}\left(\bar{x}_i - \mu_0\right)^2 - (T - \tau)\left(\bar{X}_{T,\tau} - \mu_0\right)^2\right]. \qquad (A.3)$$
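The equivalence of the two forms of the log-likelihood after substituting the MLE of $\mu_1$ rests on a standard sum-of-squares decomposition. A short derivation is sketched below; the notation ($\bar{x}_i$ for the observed subgroup averages and $\bar{X}_{T,\tau}$ for the reverse cumulative average) is an assumption where the source text is garbled.

```latex
\begin{align*}
\sum_{i=\tau+1}^{T} \left(\bar{x}_i - \mu_0\right)^2
  &= \sum_{i=\tau+1}^{T} \left[\left(\bar{x}_i - \bar{X}_{T,\tau}\right)
     + \left(\bar{X}_{T,\tau} - \mu_0\right)\right]^2 \\
  &= \sum_{i=\tau+1}^{T} \left(\bar{x}_i - \bar{X}_{T,\tau}\right)^2
     + (T - \tau)\left(\bar{X}_{T,\tau} - \mu_0\right)^2 ,
\end{align*}
% The cross term vanishes because
% \sum_{i=\tau+1}^{T} (\bar{x}_i - \bar{X}_{T,\tau}) = 0.
\begin{align*}
\sum_{i=\tau+1}^{T} \left(\bar{x}_i - \bar{X}_{T,\tau}\right)^2
  &= \sum_{i=\tau+1}^{T} \left(\bar{x}_i - \mu_0\right)^2
     - (T - \tau)\left(\bar{X}_{T,\tau} - \mu_0\right)^2 .
\end{align*}
```

Adding $\sum_{i=1}^{\tau}(\bar{x}_i - \mu_0)^2$ to both sides of the last line shows that the only part of the log-likelihood that depends on $\tau$ is $(T - \tau)(\bar{X}_{T,\tau} - \mu_0)^2$, so maximizing the log-likelihood over $\tau$ amounts to maximizing this quantity.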

Table 1: Simulated values of $A$ for different scale estimators.
In this study, the three estimator pairs under investigation are as follows: $E_1$: $T$ = sample mean, $S$ = range; $E_2$: $T$ = median, $S$ = MAD = $\mathrm{Median}\{|x_i - T|\}$; $E_3$: $T = \tilde{\mu}_n = \arg\inf_{t \in \mathbb{R}} s_n(t)$, $S = \sigma_n = \inf_{t \in \mathbb{R}} s_n(t)$.

Table 2: Simulation result: estimates of expected length, average change point, standard error, average size of confidence set, and coverage probability for change point of $\tau = 100$ (normal distribution).

Table 3: Simulation result: estimates of expected length, average change point, standard error, average size of confidence set, and coverage probability for change point of $\tau = 100$ (contaminated distribution, $\beta = 0.05$, $C = 9$).

Table 4: Simulation result: estimates of expected length, average change point, standard error, average size of confidence set, and coverage probability for change point of $\tau = 100$ (contaminated distribution, $\beta = 0.10$, $C = 9$).

Table 6: Precision of the estimator when used with the $\bar{X}$ chart for different magnitudes of process change (magnitude of shift, $\delta$), based on 10,000 trials. Subgroup size $n = 5$, change point $\tau = 100$ (normal distribution).

Table 14: Precision of the estimator when used with the proposed MM chart for different magnitudes of process change (magnitude of shift, $\delta$), based on 10,000 trials. Subgroup size $n = 5$, change point $\tau = 100$ (slash distribution).

It follows that the value of $\tau$ that maximizes the log-likelihood function is
$$\hat{\tau} = \arg\max_{0 \le t < T}\,(T - t)\left(\bar{X}_{T,t} - \mu_0\right)^2 = \arg\max_{t}\{C_t\}, \qquad (A.4)$$
where $C_t = (T - t)\left(\bar{X}_{T,t} - \mu_0\right)^2$; that is, $\hat{\tau}$ is the value of $t$ in the range $0 \le t < T$ which maximizes
$$C_t = (T - t)\left(\bar{X}_{T,t} - \mu_0\right)^2. \qquad (A.5)$$