An Improvement of the Hotelling $T^2$ Statistic in Monitoring Multivariate Quality Characteristics

The Hotelling $T^2$ statistic is the most popular statistic used in multivariate control charts to monitor multiple quality characteristics. However, this statistic is easily affected by the existence of more than one outlier in the data set. To rectify this problem, robust control charts based on the minimum volume ellipsoid and the minimum covariance determinant have been proposed. Most researchers assess the performance of multivariate control charts based on the number of signals, without paying much attention to whether those signals are really outliers. We propose to evaluate control charts not only by the number of detected outliers but also by whether the detections fall at the correct positions. In this paper, an Upper Control Limit based on the median and the median absolute deviation is also proposed. The results of this study show that the proposed Upper Control Limit improves the detection of correct outliers but that it suffers from a swamping effect when the positions of outliers are not taken into consideration. Finally, a robust control chart based on the diagnostic robust generalised potential procedure is introduced to remedy this drawback.


Introduction
In statistical quality control, a process moves into an out-of-control state when outliers appear, and they can appear in two different ways: as outliers that are randomly distributed within a data set, or as outliers that occur sequentially after a specific observation during a specific period of time. The former and the latter situations are referred to as scatter outliers and a sustained step shift, respectively [1-4].
The detection of correct outliers in phase I of the monitoring scheme is crucial. If outliers are not correctly detected, the result is model misspecification and incorrect results during phase II [5]. The Hotelling $T^2$ statistic, which was first introduced by Hotelling [6], is the most popular statistic used in multivariate control charts to monitor multiple quality characteristics [7-11]. Vargas [12] demonstrated that a $T^2$ statistic based on the usual classical estimators fails to detect multiple scatter outliers for individual observations ($n = 1$), although this statistic is effective in the presence of a small number of outliers. It is now evident that the Hotelling $T^2$ statistic, which is based on the usual classical sample mean vector and variance-covariance matrix, is easily affected by the existence of more than one outlier in the historical data set (HDS). In addition, the $T^2$ statistic suffers from a masking or swamping effect [13, 14]. Sullivan and Woodall [10] showed that the $T^2$ statistic based on the usual sample variance-covariance matrix for individual observations is not only less effective in detecting scatter outliers in the HDS but also performs poorly in detecting sustained step shifts in the mean vector.
Robust methods for multivariate data, based on the minimum volume ellipsoid (MVE) and the minimum covariance determinant (MCD), have been widely used in regression contexts for diagnosing influential observations, high leverage points, and outliers [14, 15] but have only recently been applied to multivariate quality control. Vargas [12] employed the MVE and the MCD as two robust estimates of location and dispersion [16], instead of the usual classical sample mean vector and covariance matrix, in the Hotelling $T^2$ statistic. The application of robust control charts for individual observations based on the MVE and MCD has also been discussed extensively by Jensen et al. [17]. It is worth mentioning that there is no guarantee that the distribution of the $T^2$ statistic is preserved when the location and scale estimators are replaced with robust versions. To remedy this problem, Vargas [12] and Jensen et al. [17] used an empirical distribution of the robust $T^2$ statistic to calculate the empirical upper control limits (UCLs) of their proposed robust control charts.
We have seen the application of MVE and MCD methods in the development of robust control charts. Other methods, such as outlier identification in high dimensions [18] and some proposed multivariate outlier detection techniques [19], may also be considered in control process applications. Our main aim in this paper is to propose a robust multivariate control chart based on the diagnostic robust generalised potential (DRGP), which was introduced by Habshah et al. [20]. This is the first attempt to introduce another robust control chart as an alternative to the two existing robust MVE-based and MCD-based control charts. Hence, our focus in this paper is limited to the DRGP-based control chart and to a comparison of its performance with the two preceding charts. We do not compare these charts with charts based on the other methods mentioned in Wilcox [19] and Filzmoser et al. [18]. Vargas [12] and Jensen et al. [17] evaluated the performance of each control chart based on the number of detected outliers, regardless of whether the detections came from correct outlier positions. In other words, their work is devoted to the detection of outliers regardless of whether the detected points are true outliers (at a correct outlier position) or false outliers. Their work has motivated us to consider the identification of correct outliers when evaluating different control charts. In this regard, we introduce another empirical method for calculating the UCLs of the robust $T^2$ statistic based on the median and the median absolute deviation (MAD) of the estimators.

Diagnostics for the Identification of High Leverage Points
High leverage points are a type of influential observation that is substantially different from the others in one or more predictor variables [21, 22]. It is now evident that high leverage points are responsible for model misspecifications and misleading results [23-25].

Mathematical Problems in Engineering 3
There are several methods in the literature for identifying high leverage points in linear regression models [14, 22, 25-28]. The diagonal elements of the hat matrix $H = V(V^{T}V)^{-1}V^{T}$ are referred to as leverage values and are denoted by

$$h_{ii} = v_i^{T}(V^{T}V)^{-1}v_i, \quad i = 1, 2, \ldots, m, \tag{2.1}$$

where $V$ is an $m \times k$ matrix of predictor variables of the regression model and $v_i^{T}$ is its $i$th row. Hoaglin and Welsch [29] considered observations to be high leverage points when $h_{ii}$ is greater than $2k/m$. Hadi [30] introduced another measure to diagnose high leverage points, which is known as "potentials." According to Hadi [30], the $i$th potential is defined as follows:

$$p_{ii} = v_i^{T}\bigl(V_{(i)}^{T}V_{(i)}\bigr)^{-1}v_i, \tag{2.2}$$

where $V_{(i)}$ is the data matrix with its $i$th row deleted. By using simple matrix algebra, it is easy to obtain a relationship between the potentials and the diagonal elements of $H$, as follows:

$$p_{ii} = \frac{h_{ii}}{1 - h_{ii}}. \tag{2.3}$$

Hadi [30] suggested a confidence bound cutoff point for $p_{ii}$ as follows:

$$\operatorname{Median}(p_{ii}) + c\,\operatorname{MAD}(p_{ii}), \tag{2.4}$$

where $\operatorname{MAD}(p_{ii}) = \operatorname{Median}\{|p_{ii} - \operatorname{Median}(p_{ii})|\}/0.6745$ and $c$ is a constant that is chosen as 2 or 3, as appropriate. Robust versions of the Mahalanobis distance are also used to identify high leverage points [14, 18, 20]. Habshah et al. [20] pointed out that although these robust diagnostic techniques can rectify the masking problem, they are affected by the swamping effect, which is not desirable either.
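The leverage and potential diagnostics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes NumPy is available, the function names are hypothetical, and the small design matrix (an intercept plus one predictor with a single remote point) is invented for the example.

```python
import numpy as np

def hat_diagonals(V):
    """Diagonal elements h_ii of the hat matrix V (V'V)^{-1} V'."""
    V = np.asarray(V, dtype=float)
    G_inv = np.linalg.inv(V.T @ V)
    # Row-wise quadratic form v_i' (V'V)^{-1} v_i
    return np.einsum("ij,jk,ik->i", V, G_inv, V)

def potentials_by_deletion(V):
    """Potentials computed directly from the definition with the ith row deleted."""
    V = np.asarray(V, dtype=float)
    p = np.empty(V.shape[0])
    for i in range(V.shape[0]):
        V_del = np.delete(V, i, axis=0)
        p[i] = V[i] @ np.linalg.inv(V_del.T @ V_del) @ V[i]
    return p

def median_mad_cutoff(values, c=3.0):
    """Median + c * MAD, with MAD scaled by 0.6745 as in the text."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med)) / 0.6745
    return med + c * mad

# Invented example: intercept plus one predictor, last point far from the rest
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
V = np.column_stack([np.ones_like(x), x])
m, k = V.shape

h = hat_diagonals(V)
p_short = h / (1.0 - h)                 # potentials via the h_ii relationship
p_direct = potentials_by_deletion(V)    # potentials via row deletion

flag_twice_mean = h > 2 * k / m                       # Hoaglin and Welsch rule
flag_potential = p_short > median_mad_cutoff(p_short)  # Median + c*MAD rule
```

Computing the potentials both ways is only a consistency check on the relationship between $p_{ii}$ and $h_{ii}$; in practice the shortcut through $h_{ii}$ avoids $m$ separate matrix inversions.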
To remedy this problem, Habshah et al. [20] proposed a unified approach, called the diagnostic robust generalised potential (DRGP) method, which combines the diagnostic and robust approaches. The robust approach is used to detect the suspected high leverage points, and the diagnostic approach is then used to confirm the suspicion. The DRGP partitions the data into two sets. The first set, denoted by $D$, consists of the suspicious cases, which are deleted from the original observations, and the second set, denoted by $R$, contains the remaining data. If $d$ is the number of cases in $D$, then the $R$ set contains $m - d$ observations, with $d < m - k$. Without loss of generality, we assume that the $d$ cases are placed in the last $d$ rows of $Y$ and $V$ so that the hat matrix can be partitioned accordingly. The $i$th deletion leverage is defined as follows:

$$h_{ii}^{(-D)} = v_i^{T}\bigl(V_R^{T}V_R\bigr)^{-1}v_i, \tag{2.5}$$

where $V_R$ is the matrix of the remaining observations [31]. By considering (2.2), (2.3), and (2.5), the generalised potential is defined as follows:

$$p_{ii}^{*} = \begin{cases} \dfrac{h_{ii}^{(-D)}}{1 - h_{ii}^{(-D)}}, & i \in R, \\[2ex] h_{ii}^{(-D)}, & i \in D. \end{cases} \tag{2.6}$$

Similar to the potential values and with regard to (2.4), the cutoff point for $p_{ii}^{*}$ is

$$\operatorname{Median}(p_{ii}^{*}) + c\,\operatorname{MAD}(p_{ii}^{*}). \tag{2.7}$$

It is worth mentioning that the DRGP employs the robust Mahalanobis distance in the first step to detect high leverage points as preliminary suspicious observations; these points are placed in the $D$ set. In the second step, only the cases whose generalised potentials exceed the cutoff (2.7) are reported as the final detections. In the next section, the DRGP is employed to detect outliers in multivariate quality control charts for individual observations.
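The second (diagnostic) step of the DRGP can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure of Habshah et al. as published: the suspect set $D$ is passed in as an argument (in the paper it comes from a robust Mahalanobis distance, which is not implemented here), the function names are hypothetical, the data are invented, and $c = 3$ is chosen only for the example.

```python
import numpy as np

def generalised_potentials(V, suspect_idx):
    """Generalised potentials p*_ii given a suspect set D (row indices of V)."""
    V = np.asarray(V, dtype=float)
    in_D = np.zeros(V.shape[0], dtype=bool)
    in_D[np.asarray(list(suspect_idx), dtype=int)] = True
    V_R = V[~in_D]                                   # remaining observations
    G_inv = np.linalg.inv(V_R.T @ V_R)
    # Deletion leverages v_i' (V_R' V_R)^{-1} v_i for every row of V
    h_del = np.einsum("ij,jk,ik->i", V, G_inv, V)
    # Rows in R use h/(1-h); rows in D use the deletion leverage itself
    return np.where(in_D, h_del, h_del / (1.0 - h_del))

def drgp_confirm(V, suspect_idx, c=3.0):
    """Indices whose generalised potential exceeds Median + c * MAD."""
    p_star = generalised_potentials(V, suspect_idx)
    med = np.median(p_star)
    mad = np.median(np.abs(p_star - med)) / 0.6745
    return np.flatnonzero(p_star > med + c * mad)

# Invented example: the first step is assumed to have flagged rows 0 and 9,
# although only row 9 is genuinely remote.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
V = np.column_stack([np.ones_like(x), x])
confirmed = drgp_confirm(V, suspect_idx=[0, 9], c=3.0)
```

In this toy data set the confirmation step keeps row 9 and releases row 0, which is the behaviour the text describes: suspicious cases are only reported if the diagnostic step agrees.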

Multivariate Robust $T^2$ Control Charts
Suppose that there is an HDS in the phase I monitoring scheme that consists of $m$ time-ordered observation vectors of dimension $p$, which are observed independently, where $p$ is the number of quality characteristics that are measured ($p < m$). It is assumed that each vector comes from a $p$-variate normal distribution. Thus, if $X_i \in \mathbb{R}^p$ is the vector in the HDS for the $i$th time period, then $X_i \sim N_p(\mu, \Sigma)$, where $\mu$ and $\Sigma$ are the population mean vector and the variance-covariance matrix, respectively. As mentioned earlier, the Hotelling $T^2$ statistic is used to detect outliers in multivariate control charts. The general form of this statistic is

$$T_i^2 = (X_i - \mu)^{T}\,\Sigma^{-1}\,(X_i - \mu). \tag{3.1}$$

Because the parameters in (3.1) are usually unknown, the usual sample mean and variance-covariance matrix are used as the classical estimates of $\mu$ and $\Sigma$. In practice, these estimates are given by

$$\bar{X} = \frac{1}{m}\sum_{i=1}^{m} X_i, \qquad S = \frac{1}{m-1}\sum_{i=1}^{m}(X_i - \bar{X})(X_i - \bar{X})^{T}. \tag{3.2}$$
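As an illustration of the statistic with the classical estimates plugged in, the following Python sketch computes $T_i^2$ for every row of a data matrix. It assumes NumPy, and the function name and the optional `center` and `scatter` arguments are hypothetical hooks added here so that other location and scatter estimates could be supplied; the data are simulated for the example only.

```python
import numpy as np

def t2_statistics(X, center=None, scatter=None):
    """T^2 value for each row of X (m x p, p >= 2).

    With no arguments the classical sample mean and covariance (divisor m-1)
    are used; other estimates can be passed through center and scatter.
    """
    X = np.asarray(X, dtype=float)
    if center is None:
        center = X.mean(axis=0)
    if scatter is None:
        scatter = np.cov(X, rowvar=False)
    diff = X - center
    # Row-wise quadratic form (x_i - center)' scatter^{-1} (x_i - center)
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(scatter), diff)

# Simulated in-control data: m = 30 observations on p = 2 characteristics
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((30, 2))
t2_demo = t2_statistics(X_demo)
```

Two algebraic facts are useful as sanity checks when the classical estimates are used: the $T_i^2$ values sum to $(m-1)p$, and no single value can exceed $(m-1)^2/m$.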
In phase I, the parameters are retrospectively estimated from the current HDS; as a result, the vector $X_i$ is not independent of the estimators $\bar{X}$ and $S$. In this situation, the statistical distribution of (3.1), with $\mu$ and $\Sigma$ replaced by $\bar{X}$ and $S$, is given as

$$T_i^2 \sim \frac{(m-1)^2}{m}\,B\!\left(\frac{p}{2}, \frac{m-p-1}{2}\right), \qquad \mathrm{UCL} = \frac{(m-1)^2}{m}\,B_{\alpha,\,p/2,\,(m-p-1)/2}, \tag{3.3}$$

where $\alpha$ is the probability of a false alarm for each point plotted on the control chart and $B_{\alpha,\,p/2,\,(m-p-1)/2}$ is the $\alpha$th upper quantile of the beta distribution with parameters $p/2$ and $(m-p-1)/2$. The lower control limit (LCL) is often set to zero [8, 32]. It should be noted that the aforementioned UCL is exact when applied to a single point in phase I, whereas phase I is a retrospective analysis of all observations. Therefore, the value of $\alpha$ cannot be applied to a set of points. In this situation, if all of the statistics were distributed independently, then the overall probability of a false alarm would be

$$\alpha_{\text{overall}} = 1 - (1 - \alpha)^{m}, \tag{3.4}$$

where $\alpha$ indicates the probability of a false alarm assigned to each observation plotted on the control chart in a subgroup of size $m$. In practice, it is reasonable to determine the UCL by simulation to give a specified overall false alarm [5, 33, 34]. Hereafter, in this paper, we still refer to the overall false alarm, $\alpha_{\text{overall}}$, as $\alpha$ for simplicity.

The use of (3.1) is not effective in the presence of multiple outliers, so robust alternatives are proposed. In this regard, two of the recently proposed robust alternatives to $T^2$ are based on the MVE and the MCD estimators; they are denoted by $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$, respectively, and are defined as follows:

$$T^2_{\mathrm{mve},i} = (X_i - \bar{X}_{\mathrm{mve}})^{T}\,S_{\mathrm{mve}}^{-1}\,(X_i - \bar{X}_{\mathrm{mve}}), \tag{3.5}$$

$$T^2_{\mathrm{mcd},i} = (X_i - \bar{X}_{\mathrm{mcd}})^{T}\,S_{\mathrm{mcd}}^{-1}\,(X_i - \bar{X}_{\mathrm{mcd}}), \tag{3.6}$$

where $\bar{X}_{\mathrm{mve}}$ and $\bar{X}_{\mathrm{mcd}}$ are the robust estimates of the mean vector and $S_{\mathrm{mve}}$ and $S_{\mathrm{mcd}}$ are the corresponding robust estimates of the variance-covariance matrix. As previously mentioned in Section 1, because the distributions of $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ are unknown, empirical methods are used to determine the UCLs.
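The relationship between the per-point and the overall false alarm probability under independence is simple arithmetic, and a short Python sketch makes the size of the effect concrete. The function names are hypothetical, and the independence of the $m$ statistics is the same idealisation the text makes.

```python
def overall_false_alarm(alpha, m):
    """Overall false alarm probability for m independent points, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def per_point_alpha(alpha_overall, m):
    """Per-point level that would give the requested overall false alarm."""
    return 1.0 - (1.0 - alpha_overall) ** (1.0 / m)

# With m = 30 points each tested at alpha = 0.05, the overall false alarm
# is far above 0.05 (roughly 0.79).
overall_30 = overall_false_alarm(0.05, 30)

# Conversely, an overall level of 0.05 requires a much smaller per-point level.
alpha_point_30 = per_point_alpha(0.05, 30)
```

This is why a UCL that is exact for a single point cannot simply be reused for a whole phase I data set, and why simulation against a specified overall false alarm is preferred.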
The empirical simulated UCLs for (3.5) or (3.6) are usually determined by finding the $\alpha$th upper quantile of the empirical distribution of the corresponding statistic. For this situation, Jensen et al. [17] and Vargas [12] defined $\alpha$ as the overall false alarm. Following the idea of Habshah et al. [20], another empirical UCL is proposed as follows:

$$\mathrm{UCL} = \operatorname{Median}\bigl(T^2_{r,i}\bigr) + c\,\operatorname{MAD}\bigl(T^2_{r,i}\bigr), \tag{3.7}$$

where $T^2_{r,i}$ stands for either $T^2_{\mathrm{mve},i}$ or $T^2_{\mathrm{mcd},i}$, MAD is the median absolute deviation of these values, and $c$ is a constant, as defined in (2.4). For simplicity, the first empirical UCL is referred to as Empr, and the proposed UCL is denoted by Med-Mad. As will be shown later, (3.7) detects real outliers at their correct positions in the HDS better than the Empr UCL does, but it also tends to declare too many observations as outliers, including observations that are not at a true outlier position. In this regard, we propose to apply the DRGP based on the MVE and MCD with the Med-Mad UCLs to reduce the number of undue signals and, at the same time, to detect the correct outliers effectively. Vargas [12] and Jensen et al. [17] employed the probability of signals to evaluate and compare control charts based on $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$. Their work is based only on the number of detections, without considering whether those detections are correct outliers. It is worth mentioning that the probability of signals cannot be properly judged if there is a swamping effect in the monitoring scheme. In the following section, we present our proposed control scheme and explain how it can detect real outliers at their correct positions.
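A Med-Mad style limit of the form median plus a multiple of the MAD can be sketched with the Python standard library alone. This is an illustration under assumptions: the function name is hypothetical, the multiplier `c = 3.0` is an example value and not a value taken from the paper, and the list of $T^2$ values is invented.

```python
from statistics import median

def med_mad_ucl(t2_values, c=3.0):
    """Median + c * MAD of a collection of T^2 values (MAD scaled by 0.6745)."""
    med = median(t2_values)
    mad = median(abs(t - med) for t in t2_values) / 0.6745
    return med + c * mad

# Invented T^2 values for ten observations; index 6 is clearly larger than the rest
t2_values = [0.5, 1.2, 0.8, 2.0, 1.5, 0.3, 9.0, 1.1, 0.9, 1.7]
ucl = med_mad_ucl(t2_values, c=3.0)
signals = [i for i, t in enumerate(t2_values) if t > ucl]
```

Because both the median and the MAD are barely moved by a few extreme values, the limit stays close to the bulk of the data, which helps detection but also explains why such a limit can flag more points than a limit calibrated on simulated maxima.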

Simulation Study
In this section, a Monte Carlo simulation study is carried out to assess the performance of the control schemes discussed previously. The simulation is designed around three subgroup sizes, $m = 30$, 50, and 100, and four numbers of quality characteristics, $p = 2$, 3, 5, and 10. We assume that the in-control process is a $p$-variate normal distribution with mean vector $\mu_0$ and covariance matrix $\Sigma$.
The simulated empirical UCLs are obtained by generating 5000 in-control data sets for each combination of $m$ and $p$. Due to the affine equivariance of the $T^2$ statistics, these limits are applicable to any values of $\mu$ and $\Sigma$. Many researchers, such as Jensen et al. [17] and Vargas [12], have determined the Empr UCLs by calculating the $T^2$ statistic for each observation in each generated data set of size $m$ and recording the maximum value of the $T^2$ statistics. Subsequently, the upper $\alpha$th percentile of the 5000 recorded maximum values is declared as the Empr UCL. In this manner, they defined $\alpha$ as the overall false alarm. In this paper, we set the probability of the overall false alarm to $\alpha = 0.05$.
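The Empr UCL procedure described above (simulate in-control data sets, record the maximum $T^2$ in each, take an upper percentile of the maxima) can be sketched as follows. The sketch assumes NumPy and uses the classical estimators only; for the robust charts the mean and covariance inside the loop would be replaced by MVE or MCD estimates, which are not implemented here. The function name, the seed, and the reduced number of replications in the demonstration call are choices made for this example.

```python
import numpy as np

def empirical_ucl(m, p, alpha=0.05, n_sets=5000, seed=0):
    """Upper alpha-th percentile of the per-data-set maximum classical T^2 (p >= 2)."""
    rng = np.random.default_rng(seed)
    max_t2 = np.empty(n_sets)
    for r in range(n_sets):
        # In-control data: zero mean, identity covariance (affine equivariance
        # makes the limit valid for other mean vectors and covariance matrices)
        X = rng.standard_normal((m, p))
        diff = X - X.mean(axis=0)
        S_inv = np.linalg.inv(np.cov(X, rowvar=False))
        t2 = np.einsum("ij,jk,ik->i", diff, S_inv, diff)
        max_t2[r] = t2.max()
    # Overall false alarm alpha: only a fraction alpha of in-control data sets
    # have any point above this limit
    return float(np.quantile(max_t2, 1.0 - alpha))

# Fewer replications than the 5000 used in the paper, to keep the example quick
ucl_demo = empirical_ucl(m=30, p=2, alpha=0.05, n_sets=500)
```

Taking the percentile of the maxima, rather than of all individual $T^2$ values, is what ties the limit to the overall false alarm instead of the per-point false alarm.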
To make the overall false alarm of the control charts based on the Med-Mad UCLs equivalent to that of the Empr UCLs, which is equal to $\alpha = 0.05$, the following steps are considered. The  Table 1.
Then, a contaminated data set of size $m$ in dimension $p$ is generated for different values of the noncentrality parameter (ncp). The out-of-control process is a $p$-variate normal distribution with the same covariance matrix but with a shifted mean vector $\mu_1$. Thus, the variation here remains stable. The magnitude of the shift is measured by a scalar defined as follows:

$$\mathrm{ncp} = (\mu_1 - \mu_0)^{T}\,\Sigma^{-1}\,(\mu_1 - \mu_0). \tag{4.1}$$

This measure is called the noncentrality parameter and is hereafter referred to as ncp. Four outlier percentage levels are considered: 5%, 10%, 15%, and 20%. It is clear from (4.1) that the severity of the shift depends only on the values of $\mu_1$. Hence, without loss of generality, it can be assumed that $\mu_0$ is a zero vector and $\Sigma$ is a $p \times p$ identity matrix. The control charts are assessed based on the proposed criterion, which is the number of true detected outliers with regard to the correct positions of the generated out-of-control observations in the data set. The number of outliers detected without regard to their positions is also presented for comparison.
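A contaminated data set of the kind described above can be generated along the following lines. The sketch assumes NumPy, a zero in-control mean, and an identity covariance matrix, as in the text. Two choices are assumptions made only for this illustration: the shift is placed entirely on the first coordinate (the text specifies only the ncp, not the direction), and the outlier positions are drawn at random to mimic scatter outliers. All names are hypothetical.

```python
import numpy as np

def noncentrality(mu1, mu0, sigma):
    """ncp = (mu1 - mu0)' Sigma^{-1} (mu1 - mu0)."""
    d = np.asarray(mu1, dtype=float) - np.asarray(mu0, dtype=float)
    return float(d @ np.linalg.solve(np.asarray(sigma, dtype=float), d))

def contaminated_hds(m, p, ncp, outlier_frac, seed=0):
    """Data set with a fraction of rows shifted; returns data, positions, and mu1."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((m, p))              # in-control: N_p(0, I)
    n_out = int(round(outlier_frac * m))
    positions = np.sort(rng.choice(m, size=n_out, replace=False))
    mu1 = np.zeros(p)
    mu1[0] = np.sqrt(ncp)                        # assumption: shift on first coordinate
    X[positions] += mu1                          # same covariance, shifted mean
    return X, positions, mu1

X_c, true_positions, mu1 = contaminated_hds(m=100, p=2, ncp=25.0, outlier_frac=0.10)
ncp_check = noncentrality(mu1, np.zeros(2), np.eye(2))
```

Keeping the generated positions alongside the data is what makes it possible later to ask whether a signal on the chart falls at a truly contaminated observation.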

This process is repeated 5000 times, and the number of detections and the number of outliers detected at their correct positions are recorded for each replication. The number of detected outliers is determined by comparing each of the $T^2$ values with the respective UCL given in Table 1. Each detected outlier is checked against its position in the data set to determine whether it truly is one of the intentionally contaminated points. The true or correctly detected outliers refer to the outliers detected at the correct position. The number of detected outliers simply indicates the outliers that have been detected irrespective of their correct position. The average number of detections over 5000 iterations is presented for $p = 2$ and $m = 30$, 50, and 100 in Tables 2, 3, and 4. It is important to note that the presented values were rounded to two digits. The values in parentheses represent the number of outliers detected regardless of their correct positions. Let us first focus on the results of $T^2$, $T^2_{\mathrm{mve}}$, and $T^2_{\mathrm{mcd}}$ obtained by using the Empr and Med-Mad UCLs. Following these results, we will see why the DRGP approach is proposed.
As can be seen from these tables, the UCLs based on the Med-Mad approach perform better than the Empr UCL in detecting the real outliers at their correct positions for all three control charts. The classical $T^2$ chart based on the Empr UCL performs very poorly. The Med-Mad UCLs are more reliable than the Empr UCLs in detecting outliers at their correct positions for the robust control charts, particularly when the percentage of outliers increases. However, the results show that both $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ based on the Med-Mad UCLs suffer from a swamping effect, because they detect more outliers when the correct positions are disregarded.
For example, with 10% outliers in the HDS, $m = 100$, $p = 2$, and ncp $= 25$, both $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ based on the Med-Mad UCLs detect exactly 10 outliers at the correct positions, but they detect 15 outliers when the correct positions are disregarded. It is interesting to note that the performance of the robust control charts based on the Empr UCLs is reasonably close to that of the robust control charts based on the Med-Mad UCLs for very large values of ncp, such as ncp $\geq 45$. The $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ charts based on the Med-Mad UCLs are equally good at detecting correct outliers. Nonetheless, with increasing subgroup size, $T^2_{\mathrm{mcd}}$ based on the Med-Mad UCL is slightly better than $T^2_{\mathrm{mve}}$ based on the Med-Mad UCL, particularly for a large percentage of outliers.
We can see that although the classical $T^2$ chart with the Med-Mad UCL is better than the classical $T^2$ chart with the Empr UCL, it detects a smaller number of exact outliers as the percentage of outliers increases. These results are consistent for other values of $p$, which are not reported here due to space constraints. The findings of Tables 2 to 4 suggest that the Med-Mad UCLs are more reliable than the Empr UCLs in detecting correct outliers. However, the Med-Mad UCLs detect more outliers irrespective of their correct positions due to swamping effects. In other words, the robust control charts based on the Med-Mad UCLs effectively detect the correct outliers, but they overdetect outliers when positions are disregarded. As such, we need control charts that can reduce such undue detections. In this regard, we suggest applying the DRGP procedure discussed in Section 2. The same simulation procedure was then carried out, and the DRGP was applied to the data sets.
The results obtained by using the DRGP approach are exhibited in Table 5. Due to space limitations, the results are presented only for $m = 30$ and 100 and for 5% and 20% outliers. As can be seen from Table 5, there is a steady decrease in the number of undue observations when the DRGP approach is applied. For example, as shown in Table 4, the total number of detections by the Med-Mad UCL with ncp $= 55$ is 24 for the MVE and MCD, while it decreases to 21 in Table 5. To simplify the presentation of results, the proportion of correctly detected outliers to detected outliers is calculated and referred to as the correct detection rate. The values of the correct detection rates for $m = 50$ and $p = 2$ are shown in Table 6. The results indicate that the DRGP approach provides higher correct detection rates compared to the other methods.
However, these results were not very encouraging for small shifts (ncp $= 5$). It can be seen from Table 6  not tabulated due to space limitations. For more clarification, the correct detection rates for various values of $m$ and $p$ are plotted in Figures 1 and 2. These figures confirm that the DRGP approach gives a higher correct detection rate.
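The evaluation criterion used in this section, counting detections both with and without regard to position and then taking their ratio, can be written down in a few lines of standard-library Python. The function name and the numbers are invented for illustration; returning `None` when nothing is detected is a choice made here, since the ratio is undefined in that case.

```python
def detection_summary(t2_values, ucl, true_positions):
    """Return (number detected, number detected at correct positions, correct detection rate)."""
    detected = {i for i, t in enumerate(t2_values) if t > ucl}
    correct = detected & set(true_positions)
    rate = len(correct) / len(detected) if detected else None
    return len(detected), len(correct), rate

# Invented example: observations 1 and 3 were contaminated on purpose;
# observation 5 exceeds the limit without being contaminated (a swamped point).
t2_values = [1.0, 12.0, 0.5, 9.5, 2.0, 8.2]
n_detected, n_correct, rate = detection_summary(t2_values, ucl=8.0, true_positions=[1, 3])
```

A chart that swamps will show a high count of detections but a correct detection rate well below one, which is the distinction that a signal count alone cannot reveal.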

Numerical Example
In this section, a numerical example is introduced to assess the performance of our method. The example uses a bivariate data set taken from Shewhart [35], which presents measurements of the depth of sapwood and the depth of penetration of creosote in telephone poles. The subgroup size of the original data set is 10, and we focus only on the first column of the data set based on 20 subgroups. The $T^2$, $T^2_{\mathrm{mve}}$, $T^2_{\mathrm{mcd}}$, and DRGP statistics were then applied to the data set. The Empr UCLs and the Med-Mad UCLs for $T^2$, $T^2_{\mathrm{mve}}$, and $T^2_{\mathrm{mcd}}$ are 9.010, 31.723.738, and 113.140 and 7.010, 5.910, and 9.461, respectively. These UCLs are calculated by simulation, as discussed in Section 4, for $\alpha = 0.05$. Figure 3 shows the different $T^2$ control charts for the Empr UCLs. As can be seen from this graph, none of the control charts based on the Empr UCL is able to detect any outlier in the data set. On the other hand, the robust $T^2$ statistics based on the Med-Mad UCLs can identify several outliers (Figure 4).
Points 6 and 16 are detected as outliers by $T^2_{\mathrm{mve}}$, and $T^2_{\mathrm{mcd}}$ identifies cases 1, 6, 8, 16, and 21. The robust control charts based on the DRGP approach detect one observation as an outlier, namely observation number 6. The results are not included here due to space limitations.

Conclusions
Most research studies evaluate the performance of the robust multivariate $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ control charts based on the number of outliers detected, without regard to the correct position of outliers in the data set. The detection of real outliers (outliers at the correct position in the data set) is crucial to avoid making wrong inferences. This study has shown that although $T^2_{\mathrm{mve}}$ and $T^2_{\mathrm{mcd}}$ based on the Med-Mad UCLs are effective in the detection of correct outliers, they tend to declare undue observations as outliers, irrespective of whether they are true or false outliers, due to a swamping effect. In this respect, in the evaluation of robust control charts, we suggest considering not only the number of outlier detections but also the correct position of the detected outliers.
Our findings also suggest that the proposed Med-Mad UCLs have better performance than the commonly used Empr UCLs in detecting outliers with regard to the correct position of outliers, especially for higher proportions of outliers. The practical finding of this paper is that the robust control chart based on the DRGP with the proposed Med-Mad UCLs gives credible performance.
The limitations of our study are that inferences or conclusions are confined to the detection of multiple outliers for individual observations, scatter outlier situations, and moderate dimensional data sets ($p \leq 10$).