On the Effect of Estimation Error for the Risk-Adjusted Charts

Control charts are a popular statistical process control (SPC) technique for detecting unusual variations in different processes. In contrast to the classical charts, control charts have also been modified to include covariates using regression approaches. This study assesses the performance of risk-adjusted control charts under the complexity of estimation error by considering logistic and negative binomial regression models. More precisely, risk-adjusted Cumulative Sum (CUSUM) and Exponentially Weighted Moving Average (EWMA) charts are used to evaluate the impact of the estimation error. To compute the average run length (ARL), Markov Chain Monte Carlo simulations are conducted. Furthermore, a bootstrap method is used to compute the ARL assuming different Phase-I data sets to minimize the effect of estimation error on risk-adjusted control charts. The results for cardiac surgery and respiratory disease data sets show that the modified control charts improve the performance in detecting small shifts.


Introduction
The history of quality control is as old as the origin of industry. Statistical process control techniques can be applied to various kinds of information such as clinical outcomes, manufacturing processes, hazard management, and patient satisfaction; see, for example, Lowry and Montgomery [1], Woodall [2], Anderson and Thompson [3], Duncan [4], and Stoumbos et al. [5]. There is, however, a sharp distinction between industrial and healthcare applications. In industrial settings, items are produced under very controlled conditions, yielding mostly homogeneous products. In typical healthcare applications, on the other hand, patients vary widely in their personal profiles. This diversity, also known as risk factors in the literature, can have a substantial impact on the outcome and must not be ignored.
Control charts are used to detect any undesirable changes in the process. There are two main types of control charts: memoryless and memory-type control charts. The memoryless charts are also known as the Shewhart charts, and their major drawback is that they use only the current information while ignoring the history of the process. This disadvantage makes the Shewhart control chart quite insensitive to small shifts in the process. To overcome this problem, the CUSUM [6] and EWMA [7] control charts are suggested for detecting small shifts in the processes [8]. The CUSUM was developed on the basis of the total deviations of successive samples from the target value. Thus, each point plotted on the chart represents the sum of the deviations up to that particular point. The CUSUM has been proven to be efficient in detecting small shifts in the process.
Recently, growing interest has been shown in monitoring heterogeneous clinical processes. The goal is to monitor and control the outcome of a clinical process by taking into account the risk elements, i.e., patient covariates. The standard EWMA approach assumes equal risk for all patients, which makes the method ineffective when monitoring healthcare outcomes. For instance, an unexpected increase in the number of failures may be due to the treatment of numerous high-risk patients and not because of a change in the healthcare service. This would result in more false alarms in the monitoring. Likewise, treating many low-risk cases yields fewer failures and may result in an undetected deterioration of the service when using the traditional EWMA. Grigg and Spiegelhalter [9] have addressed these issues and developed various promising techniques to include risk factors in process monitoring and control of healthcare processes.
If the data are heterogeneous, the explainable variation present in the samples must be accounted for, and in such cases, risk-adjusted control charts are suggested. A regression model is applied to the sampled data, which takes into account the explainable variation or covariates. Chen [10] introduced a set of methods for monitoring adverse outcomes of the surveillance system. Later, Gallus et al. [11] proposed an optimal procedure for surveillance of congenital malformations and pointed out that the proposed procedure represents a clear improvement over the existing ones and is more efficient than the CUSUM scheme in certain conditions. Williams et al. [12] discussed an application of the CUSUM chart for monitoring surgical performance. Poloniecki et al. [13] proposed a cumulative risk-adjusted mortality chart for detecting changes in the death rate. Steiner et al. [14] discussed the problem of monitoring outcomes in pediatric cardiac surgery. Lovegrove et al. [15] described an alternative approach that accounts for an individual cardiac surgeon's case-mix by explicitly incorporating the inherent risk faced by patients due to a combination of factors relating to their age and the degree of disease they have. Steiner et al. [16] proposed a new CUSUM method that accounts for the preoperative risk of surgical failure through the use of a likelihood-based weighted scheme. Some more recent work on risk-adjusted charts can be seen in [17][18][19][20][21][22][23][24][25][26][27] and references cited therein.
Woodall [2] discussed industrial applications for practitioners which are similar to healthcare monitoring in public-health surveillance. Thor et al. [28] systematically reviewed the literature regarding how statistical process control has been applied to healthcare quality improvement and discussed the benefits, limitations, barriers, and facilitating factors related to such methods.
To account for the estimation effect on the risk-adjusted charts, Jones and Steiner [29] studied the binary CUSUM using data on patients undergoing coronary artery bypass surgery and assessed mortality up to 30 days postsurgery. Later, Gandy and Kvaløy [30] suggested an adjustment to overcome the estimation error problems. Urbieta et al. [31] compared the CUSUM and EWMA control charts for the negative binomial distribution using the daily number of hospitalizations due to respiratory diseases for people over 65 years old in São Paulo city, Brazil. Höhle and Paul [32] discussed control charts based on the Poisson and negative binomial distributions for monitoring time series of counts typically arising in the surveillance of infectious diseases. More relevant discussion can be seen in [27,33,34].
The aim of this study is to discuss the problem of estimation error on the risk-adjusted logistic and negative binomial CUSUM charts. The performance of the charts is evaluated in terms of average run length (ARL). To account for the estimation error, bootstrapping with different Phase-I periods of the cardiac surgery data set is used.
The rest of the article is organized as follows. Section 2 defines the standard CUSUM chart and the risk-adjusted CUSUM chart, whereas Section 3 presents the risk-adjusted EWMA chart. The negative binomial regression model is discussed in Section 4. The performance of the risk-adjusted CUSUM chart is presented in Section 5, whereas a bootstrap method to account for estimation error is given in Section 6. The control limits assuming estimation error and different Phase-I data sets are studied in Section 7. The risk-adjusted EWMA performance is presented in Section 8. The analysis of respiratory disease patients and cardiac patients assuming negative binomial regression is discussed in Section 9. Section 10 presents some concluding remarks and proposed extensions.

Risk-Adjusted CUSUM Chart
The general test statistic of the tabular or standard CUSUM is described as

$$C_i = \max\left(0,\; C_{i-1} + W_i\right), \quad i = 1, 2, \ldots, \tag{1}$$

where $C_0 = 0$ and $W_i$ is the sample weight assigned to the $i$th subgroup.
In surgical situations, the pre-estimated mortality risk varies from patient to patient, and the chart given in equation (1) does not consider the preoperative risk factor. For this reason, an adjustment is required to ensure that sudden unusual changes are not wrongly attributed to surgeons. This can be done on the basis of prior risk by adopting a weighting scheme which allots higher values to higher-risk patients and vice versa. The risk-adjusted (RA) CUSUM chart is used to detect a shift from the baseline rate to a specified alternative rate, where the shift is expressed in terms of the odds ratio (OR). In general, the OR under the alternative should correspond to the smallest change that the monitoring process needs to detect. Let $R_0$ and $R_1$ represent the odds ratio under the null and the alternative hypotheses, respectively, i.e., $H_0$: odds ratio $= R_0$, where $R_0 = 1$ shows the in-control state of the process, $R_1 > R_0$ refers to an increase in the odds ratio, and $R_0 > R_1$ indicates a decrease. Thus, for patient $i$ under $H_0$, the odds of failure equal $R_0 p_i/(1 - p_i)$ and the corresponding probability of failure equals $R_0 p_i/(1 - p_i + R_0 p_i)$. The weighting score can be obtained from $p^{y_i}(1 - p)^{1 - y_i}$ for patient $i$, whereas the log-likelihood ratio score in the presence of the estimation error is given by equation (3), where $\Delta$ represents the estimation error. This weighted score is used in the standard CUSUM of equation (1), and the resulting chart is generally known as the risk-adjusted logistic CUSUM chart. If $C_i$ is greater than the control limit $h$, the chart signals an out-of-control situation. Generally, the threshold $h$ is chosen to maintain a specific false alarm rate.

The data set considered here concerns cardiac surgery and was collected by a cardiac center in the United Kingdom during 1992-1998 [16,35]. It comprises 6,994 cardiac surgery patients, of whom 5,212 are male and 1,782 are female.
The data set also contains information on age, gender, blood pressure, diabetic condition, and other related information such as surgeon group, date, and the risk factor, which is stored as the Parsonnet score [35]. The average patient age is 62.50 years with a standard deviation of 11. Further, 461 patients died within 30 days of surgery, i.e., the death rate is 6.6%.

Logistic Regression.
For a binary response variable $Y_i$, $i = 1, \ldots, n$, the likelihood ratio is needed to construct a risk-adjusted chart. To find the likelihood ratio, a logistic regression model is fitted to the data to estimate the logarithm of the odds. Mathematically, it can be expressed as

$$\operatorname{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) = X_i\beta, \tag{4}$$

where $p_i$ is the probability that the $i$th sample is a success, i.e., $Y_i = 1$, $X_i$ is the vector of input variables, and $\beta$ is the vector of coefficients for the input variables in the regression model [16]. Equation (4) can be rearranged to give the probability of $Y_i = 1$ as follows:

$$p_i = \frac{\exp(X_i\beta)}{1 + \exp(X_i\beta)}.$$

Now, the variable $Y_i$ follows the Bernoulli distribution with the probability mass function $P(y_i) = p_i^{y_i}(1 - p_i)^{1 - y_i}$. The likelihood ratio (LR) can then be calculated by dividing the likelihood for the out-of-control situation, $L_1$, by the likelihood for the in-control situation, $L_0$. The likelihood $L_0$ can be calculated by

$$L_0 = p_i^{y_i}(1 - p_i)^{1 - y_i}.$$

To observe the change in the logit model, the logit model for an out-of-control situation can be given as

$$\operatorname{logit}(p_{1,i}) = X_i\beta + \Delta, \tag{7}$$

where $\Delta$ denotes the out-of-control change in the logit function. Using equation (7), the probability of success in an out-of-control situation can be calculated as

$$p_{1,i} = \frac{\exp(X_i\beta + \Delta)}{1 + \exp(X_i\beta + \Delta)}.$$

Thus, the likelihood ratio can be expressed as

$$LR = \frac{L_1}{L_0} = \frac{p_{1,i}^{y_i}(1 - p_{1,i})^{1 - y_i}}{p_i^{y_i}(1 - p_i)^{1 - y_i}}.$$

For patient $i$, the logarithm of the above equation can be written as

$$R_i = \log(LR_i) = y_i\Delta + \log\left(\frac{1 + \exp(X_i\beta)}{1 + \exp(X_i\beta + \Delta)}\right).$$

Now, for detecting a change in the logit model, an RA-CUSUM chart based on the log-likelihood of the out-of-control versus the in-control situation can be defined as

$$C_i = \max\left(0,\; C_{i-1} + R_i\right),$$

where $C_0 = 0$. When the out-of-control situation is more likely than the in-control situation, $R_i > 0$, which results in larger $C_i$ values than in the in-control case. To monitor an increase in the odds for the out-of-control situation, $\Delta$ must be set greater than zero. Similarly, $\Delta$ should be less than zero to detect a decrease in the odds.
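As a minimal sketch of the risk-adjusted logistic CUSUM described above, the following Python code computes the log-likelihood ratio score for a logit shift $\Delta$ and runs the CUSUM recursion. The function names, the toy outcomes and covariates, the limit `h=1.0`, and the choice $\Delta = \log 2$ (a doubling of the odds) are illustrative assumptions; the coefficients $-3.68$ and $0.077$ are borrowed from the fitted model reported later in the paper.

```python
import math

def failure_probability(x, beta0, beta1, delta=0.0):
    """Probability of failure from the logit model, optionally shifted by delta."""
    eta = beta0 + beta1 * x + delta
    return 1.0 / (1.0 + math.exp(-eta))

def llr_score(y, x, beta0, beta1, delta):
    """Log-likelihood ratio of the out-of-control vs. in-control Bernoulli model."""
    p0 = failure_probability(x, beta0, beta1)
    p1 = failure_probability(x, beta0, beta1, delta)
    return y * math.log(p1 / p0) + (1 - y) * math.log((1 - p1) / (1 - p0))

def ra_cusum(outcomes, covariates, beta0, beta1, delta, h):
    """Return the CUSUM path and the 1-based index of the first signal (or None)."""
    c, path, first_signal = 0.0, [], None
    for i, (y, x) in enumerate(zip(outcomes, covariates), start=1):
        c = max(0.0, c + llr_score(y, x, beta0, beta1, delta))
        path.append(c)
        if first_signal is None and c > h:
            first_signal = i
    return path, first_signal

# Assumed inputs: coefficients taken from the model quoted later in the paper,
# delta = log(2) encodes a doubling of the odds, data and h are toy values.
BETA0, BETA1 = -3.68, 0.077
DELTA = math.log(2.0)

score_death = llr_score(1, 0.0, BETA0, BETA1, DELTA)      # Parsonnet score 0, died
score_survival = llr_score(0, 0.0, BETA0, BETA1, DELTA)   # Parsonnet score 0, survived
path, first_signal = ra_cusum([0, 1, 0, 1, 1], [0, 10, 5, 20, 3],
                              BETA0, BETA1, DELTA, h=1.0)
```

For a patient with Parsonnet score zero, the two scores evaluate to roughly 0.669 (death) and -0.024 (survival), so each death moves the chart upward far more than a survival moves it downward.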

RA-EWMA Charts
In SPC, the exponentially weighted moving average (EWMA) charts are very popular and, like CUSUM charts, widely used to detect small shifts. The monitoring statistic can be expressed as

$$Z_i = \lambda R_i + (1 - \lambda) Z_{i-1},$$

where $R_i$ is the EWMA score assigned to the $i$th observation and $\lambda$ is a smoothing constant with $0 < \lambda \le 1$. Essentially, the statistic is a linear combination of all the previous observations, with weights that decay geometrically so that the most recent observations receive higher weights than older ones. For detecting small changes, the RA-EWMA charts have a similar performance to CUSUM charts. However, the main advantage of RA-EWMA over CUSUM charts lies in its intuitive interpretation, as the EWMA statistic can be seen as an estimate of the current level of the process. Moreover, by adjusting the weights rather than resetting the statistic as the CUSUM does, the influence of previous observations is removed gradually from the statistic, which is a more natural way to conduct monitoring and is acceptable for healthcare practitioners [36,37].
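The EWMA recursion and its geometrically decaying weights can be illustrated with a short Python sketch. The function names, the starting value `z0`, and the toy scores are assumptions made for illustration; which score $R_i$ is fed into the recursion depends on the risk-adjustment model.

```python
def ra_ewma(scores, lam, z0=0.0):
    """EWMA recursion Z_i = lam * R_i + (1 - lam) * Z_{i-1}, started at z0."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    z, path = z0, []
    for r in scores:
        z = lam * r + (1.0 - lam) * z
        path.append(z)
    return path

def ewma_weights(n, lam):
    """Weights placed on the last n scores, listed from oldest to newest."""
    return [lam * (1.0 - lam) ** (n - 1 - k) for k in range(n)]

# Toy scores: the newest score carries the largest weight.
path = ra_ewma([1.0, 0.0], lam=0.5)
weights = ewma_weights(3, lam=0.5)
```

With `lam=1` the statistic reduces to the latest score alone, while small values of `lam` spread the weight over a longer history.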

Negative Binomial Distribution Count Model
The most commonly used count models are the Poisson and the negative binomial models. The probability mass function of $X_t$ following the Poisson distribution can be written as a member of the exponential family:

$$P(X_t = x) = \frac{e^{-\mu_t}\mu_t^{x}}{x!} = \exp\left\{x\log\mu_t - \mu_t - \log(x!)\right\}, \quad x = 0, 1, 2, \ldots,$$

where $E(X_t) = \operatorname{Var}(X_t) = \mu_t$. By reparameterization of the above model for the negative binomial distribution, one can get the following model:

$$P(X_t = x) = \frac{\Gamma(x + \phi)}{\Gamma(\phi)\,x!}\left(\frac{\phi}{\phi + \mu_t}\right)^{\phi}\left(\frac{\mu_t}{\phi + \mu_t}\right)^{x},$$

with $E(X_t) = \mu_t$ and $\operatorname{Var}(X_t) = \mu_t + \mu_t^2/\phi$, where $\Gamma$ denotes the gamma function, i.e., $\Gamma(n) = (n - 1)!$ for a positive integer $n$. Based on the generalized linear model (GLM) assuming the negative binomial distribution with log odds link function [38], and with $\mu_{0,t}$ and $\mu_{1,t}$ as the in-control and out-of-control means, respectively, the CUSUM control chart can be written as

$$C_t = \max\left(0,\; C_{t-1} + W_t\right), \quad C_0 = 0, \tag{15}$$

where $W_t$ is the weighting score for week $t$. Thus, for the $t$th week under $H_1$, the log odds link counts are $R_1\phi p_t/(1 - p_t)$ and the corresponding probability of counts equals $R_1\phi p_t/(1 - p_t + R_1\phi p_t)$. The weighting score for the $t$th week, i.e., the log-likelihood ratio score in the presence of estimation error, is used to design the risk-adjusted negative binomial CUSUM chart. If $C_t \ge h$ in equation (15), then the chart signals an out-of-control situation, which means that sufficient evidence has been collected that the underlying parameters have changed.

Negative Binomial Regression.
A Poisson regression is used when there is no overdispersion in the data, whereas the negative binomial regression is preferred for overdispersed data. As discussed previously, for count response variables $Y_t$, $t = 1, \ldots, n$, the likelihood ratio is used to obtain a risk-adjusted chart. To find the likelihood ratio for the negative binomial regression model, we proceed as follows. The variable $Y_t$ follows the negative binomial distribution with the probability mass function

$$P(Y_t = y_t) = \frac{\Gamma(y_t + \phi)}{\Gamma(\phi)\,y_t!}\left(\frac{\phi}{\phi + \mu_t}\right)^{\phi}\left(\frac{\mu_t}{\phi + \mu_t}\right)^{y_t}.$$

The likelihood ratio (LR) is then calculated by dividing the likelihood for the out-of-control situation, $L_1$, by the likelihood for the in-control situation, $L_0$, where $L_0$ is the probability mass function evaluated at the in-control mean $\mu_{0,t}$. To observe the changes in the model, the model for an out-of-control situation is given by

$$\log(\mu_{1,t}) = \log(\mu_{0,t}) + \Delta, \tag{19}$$

where $\Delta$ is the change in the log function. Using equation (19), the likelihood of an out-of-control situation, $L_1$, is the probability mass function evaluated at $\mu_{1,t} = \mu_{0,t}e^{\Delta}$. The likelihood ratio is then written as $LR = L_1/L_0$. For the $t$th week, the logarithm of the above equation can be expressed as

$$R_t = \log(LR_t) = y_t\Delta + (y_t + \phi)\log\left(\frac{\phi + \mu_{0,t}}{\phi + \mu_{0,t}e^{\Delta}}\right).$$

Finally, for detecting a change in the model, an RA-CUSUM chart based on the log-likelihood of the out-of-control versus the in-control situation can be defined as

$$C_t = \max\left(0,\; C_{t-1} + R_t\right),$$

where $C_0 = 0$. A large value of $C_t$ is observed in the out-of-control situation. To monitor an increase in the event rate for the out-of-control situation, $\Delta$ must be set greater than 0.
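The negative binomial score can be sketched in Python using the mean-dispersion parameterization with $\operatorname{Var}(X_t) = \mu_t + \mu_t^2/\phi$ stated above. The function names and the numerical values are illustrative assumptions, and the out-of-control mean is assumed to be $\mu_{1,t} = \mu_{0,t}e^{\Delta}$, i.e., a shift of $\Delta$ on the log scale.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log pmf of the negative binomial with mean mu and dispersion phi."""
    return (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
            + phi * math.log(phi / (phi + mu))
            + y * math.log(mu / (phi + mu)))

def nb_llr_score(y, mu0, phi, delta):
    """Log-likelihood ratio for a shift of delta in the log mean."""
    mu1 = mu0 * math.exp(delta)
    return nb_logpmf(y, mu1, phi) - nb_logpmf(y, mu0, phi)

def nb_cusum(counts, means, phi, delta, h):
    """CUSUM path and 1-based index of the first week with C_t >= h (or None)."""
    c, path, first_signal = 0.0, [], None
    for t, (y, mu0) in enumerate(zip(counts, means), start=1):
        c = max(0.0, c + nb_llr_score(y, mu0, phi, delta))
        path.append(c)
        if first_signal is None and c >= h:
            first_signal = t
    return path, first_signal

# Toy check of the parameterization: the pmf sums to one and has mean mu.
MU, PHI = 4.0, 2.0
total = sum(math.exp(nb_logpmf(y, MU, PHI)) for y in range(500))
mean = sum(y * math.exp(nb_logpmf(y, MU, PHI)) for y in range(500))
score = nb_llr_score(7, MU, PHI, 0.3)
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function for large weekly counts.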

Performance of Risk-Adjusted CUSUM Charts
To evaluate the performance of control charts, Monte Carlo simulations and integral equation approaches are commonly used. Within the class of Monte Carlo simulation, the Markov Chain approach is computationally efficient. To use the Markov Chain approach, the state space of the CUSUM statistic $C_i$, which lies between 0 and $h$, where $h$ denotes the control limit, is partitioned into $m + 1$ states, with state 0 corresponding to the initial value and state $m$ corresponding to the absorbing state reached when $C_i > h$. The transition probability matrix for the $m + 1$ possible states of the CUSUM is given by

$$P = \begin{pmatrix} p_{00} & p_{01} & \cdots & p_{0m} \\ p_{10} & p_{11} & \cdots & p_{1m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m-1,0} & p_{m-1,1} & \cdots & p_{m-1,m} \\ 0 & 0 & \cdots & 1 \end{pmatrix},$$

which can be rewritten as

$$P = \begin{pmatrix} R & (I - R)\mathbf{1} \\ \mathbf{0}^{\top} & 1 \end{pmatrix},$$

where $I$ is the $m \times m$ identity matrix, $\mathbf{1}$ is an $m \times 1$ column of ones, $p_{ij}$ denotes the transition probability from state $i$ to state $j$, and $R$ contains the transition probabilities among the states without an out-of-control signal, i.e., the states visited just before the out-of-control situation. The last row and column of the matrix $P$ correspond to the absorbing state, which represents an out-of-control signal. If we remove them from the transition probability matrix, the remaining matrix is known as the $R$ matrix. In general, it is suggested to set $m$ approximately equal to 250 to obtain stable results. To elaborate further, consider patients whose Parsonnet score is zero. Using equation (4), such low-risk patients have a 2.5% estimated chance of death and a 97.5% chance of survival, so from equation (3), assuming $R_0 = 1$ and $R_1 = 2$, the possible scores of such patients are 0.669 and −0.024.
Let $\phi$ represent the CUSUM run length; then we can write

$$P(\phi \le t) = (I - R^{t})\mathbf{1}.$$

Thus, the expected value of the run length is

$$E(\phi) = (I - R)^{-1}\mathbf{1}.$$

Estimation Error Impact.
As mentioned previously, when the underlying distribution is unknown and the chart parameters have to be estimated, the estimation error affects the efficiency of the control chart. To study the impact of the estimated parameter(s) of the baseline distribution, the ARL is calculated through equation (3) using the Markov Chain approach for different values of the estimation error, $\Delta \in \{0, 0.01, \ldots, 0.2\}$. The results are listed in Tables 1-4. The choice $R_1 = 2$ is used to represent an increase in the event rate, while $R_1 = 0.5$ represents a decrease. From Tables 1 and 3, it is clear that as the estimation error increases, the corresponding ARL decreases. On the other hand, from Tables 2 and 4, the ARL increases when the estimation error increases using $R_1 = 0.5$. Hence, if $R_1$ is greater than one, the chart produces an out-of-control signal earlier than in the case of $R_1$ less than one, suggesting that the process improvement cannot be detected in the presence of estimation error. Hence, the estimation error has a significant effect on the chart performance.
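The Markov Chain calculation of the ARL can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes a homogeneous patient population in which the score takes only a few values with known probabilities, it places the $m$ transient states on an evenly spaced grid over $[0, h)$, and it rounds each updated CUSUM value to the nearest grid point. The function name, the limit `h=4.5`, and the probabilities are hypothetical.

```python
import numpy as np

def cusum_arl_markov(scores, probs, h, m=250):
    """Approximate zero-start ARL of C_i = max(0, C_{i-1} + W_i).

    Transient states are the grid points i * h / m, i = 0, ..., m - 1; any
    updated value that rounds to grid index m or beyond is absorbed (signal).
    """
    width = h / m
    R = np.zeros((m, m))
    for i in range(m):
        c = i * width
        for w, q in zip(scores, probs):
            j = int(np.floor(max(0.0, c + w) / width + 0.5))
            if j < m:
                R[i, j] += q
            # otherwise the probability mass moves to the absorbing state
    # Expected run length from every transient state: (I - R)^{-1} 1
    arl = np.linalg.solve(np.eye(m) - R, np.ones(m))
    return float(arl[0])

# Assumed example: scores of a Parsonnet-zero patient with a death
# probability of about 2.5%; h = 4.5 is a hypothetical control limit.
arl_in_control = cusum_arl_markov([0.669, -0.024], [0.0246, 0.9754], h=4.5)
```

Solving the linear system $(I - R)x = \mathbf{1}$ avoids forming the matrix inverse explicitly; the first entry of the solution is the ARL for a chart started at zero.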

Bootstrap Method
The best method for adjusting a monitoring scheme to account for the estimation error is the bootstrapping method proposed by Gandy and Kvaløy [30]. This section describes the bootstrapping procedure for obtaining the desired ARL on the basis of an adjusted threshold.
Assuming the availability of a stream of in-control independent observations $X_1, X_2, \ldots$ following a distribution $P$, a control chart is used to detect shifts when observations no longer come from $P$, which is known as the out-of-control situation. To construct a chart, the parameters of $P$ are estimated using the in-control data, and these estimates are subject to estimation error.
Common performance measures for control charts are the ARL and the hitting probability, i.e., the probability that the chart signals within a certain number of steps. To compute them, let $\tau$ denote the observation number at which the chart gives an out-of-control signal, i.e., when the chart statistic lies above the threshold $h$. The distribution of $\tau$ is also a function of the estimated parameters. If $\xi$ is the unknown parameter, then we can express the ARL as $\mathrm{ARL}(P; \xi) = E(\tau(\xi))$, where the expectation is with respect to $P$. The probability of signaling within $m$ time steps can be expressed as $\mathrm{hit}(P; \xi) = P(\tau(\xi) \le m)$, for some finite $m > 0$. The threshold $h$ is chosen to achieve a certain desired in-control performance of the chart: $h_{\mathrm{ARL}}$ denotes the threshold that attains the desired in-control ARL and $h_{\mathrm{hit}}$ the threshold that attains the desired hitting probability. Again, both $h_{\mathrm{ARL}}$ and $h_{\mathrm{hit}}$ depend on $P$ and $\xi$.
Let $\hat{P}$ and $\hat{\xi} = \xi(\hat{P})$ denote the estimated distribution and estimated parameters from the past in-control data $X_{-n}, \ldots, X_{-1}$. In practice, $h_{\mathrm{ARL}}(\hat{P}; \hat{\xi})$ or $h_{\mathrm{hit}}(\hat{P}; \hat{\xi})$ is used, but it can give a performance far from the nominal level. To overcome this problem, Gandy and Kvaløy [30] suggested a bootstrap method to calculate an adjusted threshold which, with high probability, guarantees a performance very close to the nominal value. To make a unified presentation of the bootstrap, let $h$ be a common notation for quantities of interest like $h_{\mathrm{ARL}}$ and $h_{\mathrm{hit}}$. Furthermore, let $p_\alpha$ be a constant such that $P[h(\hat{P}; \hat{\xi}) - h(P; \hat{\xi}) > p_\alpha] = 1 - \alpha$, which implies the following bound on the quantity of interest: $P[h(P; \hat{\xi}) < h(\hat{P}; \hat{\xi}) - p_\alpha] = 1 - \alpha$.

Since $P$ is unknown, we cannot calculate $p_\alpha$, but an approximation can be obtained by bootstrapping. Let $\hat{P}^*$ represent a parametric or nonparametric bootstrap replicate of the estimate of the in-control distribution $P$ based on the sample size $n$, and let $\hat{\xi}^* = \xi(\hat{P}^*)$. Then, we approximate $p_\alpha$ by $p^*_\alpha$ such that

$$P\left[h(\hat{P}^*; \hat{\xi}^*) - h(\hat{P}; \hat{\xi}^*) > p^*_\alpha \mid \hat{P}\right] = 1 - \alpha.$$

Then, $h(\hat{P}; \hat{\xi}) - p^*_\alpha$ is an approximate upper bound which guarantees a certain performance with an approximate probability of $1 - \alpha$, and $(-\infty, h(\hat{P}; \hat{\xi}) - p^*_\alpha)$ denotes the one-sided confidence interval for $h(P; \hat{\xi})$. Algorithm 1 is a generic algorithm for bootstrapping: use $h = h_{\mathrm{ARL}}$ or $h = h_{\mathrm{hit}}$, and then run the chart with the adjusted control limit $h(\hat{P}; \hat{\xi}) - p^*_\alpha$.

Bootstrap in Risk-Adjusted Charts.
Suppose we have pairs of independent observations $(Y_1, X_1), (Y_2, X_2), \ldots$, where $Y$ denotes the response variable and $X$ is the corresponding covariate. Let $P$ denote the joint distribution of $(Y_i, X_i)$ and $\hat{P}$ be the empirical distribution which assigns weight $1/n$ to each past observation $(Y_{-n}, X_{-n}), \ldots, (Y_{-1}, X_{-1})$. Then, by resampling from $\hat{P}$, the bootstrap can be applied to risk-adjusted charts. For example, consider the CUSUM chart with the in-control log odds ratio defined as

$$\operatorname{logit}\left(P(Y_i = 1 \mid X_i)\right) = X_i\beta.$$

To observe changes in the odds ratio, the CUSUM is defined as

$$C_i = \max\left(0,\; C_{i-1} + R_i\right), \quad C_0 = 0,$$

where $R_i$ is the log-likelihood ratio between the in-control and out-of-control model for observation $i$, and the chart gives a signal when $C_i > h$. In practice, $\beta$ and the distribution of residuals $P_\varepsilon$ are estimated, and then the CUSUM is run with $h_{\mathrm{ARL}}(\hat{P}_\varepsilon; \hat{\beta})$. The nonparametric bootstrap method can be used to account for the estimation error by calculating the adjusted threshold, i.e., $\exp(\log h_{\mathrm{ARL}}(\hat{P}_\varepsilon; \hat{\beta}) - p^*_\alpha) = h_{\mathrm{ARL}}(\hat{P}_\varepsilon; \hat{\beta})\exp(-p^*_\alpha)$. Using the adjusted threshold, there is an approximate $(1 - \alpha)$ probability that the actual ARL of the chart is at least as large as the desired ARL.
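The generic bootstrap adjustment can be sketched in Python. To keep the example short, it replaces the risk-adjusted CUSUM with a one-sided Shewhart-type chart on standardized observations, for which the threshold attaining a given per-observation signal probability is simply an empirical quantile. The chart, the function names, the simulated Phase-I data, and the values of `rate`, `alpha`, and `n_boot` are all assumptions made for illustration.

```python
import numpy as np

def threshold(sample, mean, sd, rate):
    """h(P; xi): (1 - rate) quantile, under the distribution given by `sample`,
    of the standardized statistic (X - mean) / sd with xi = (mean, sd)."""
    return float(np.quantile((sample - mean) / sd, 1.0 - rate))

def bootstrap_adjusted_threshold(data, rate=0.05, alpha=0.1, n_boot=500, seed=0):
    """Nonparametric bootstrap adjustment of the threshold."""
    rng = np.random.default_rng(seed)
    mean_hat, sd_hat = data.mean(), data.std(ddof=1)
    h_hat = threshold(data, mean_hat, sd_hat, rate)             # h(P_hat; xi_hat)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        star = rng.choice(data, size=data.size, replace=True)   # sample from P_hat
        m_star, s_star = star.mean(), star.std(ddof=1)          # xi_hat*
        diffs[b] = (threshold(star, m_star, s_star, rate)       # h(P_hat*; xi_hat*)
                    - threshold(data, m_star, s_star, rate))    # h(P_hat; xi_hat*)
    # p*_alpha: roughly a share 1 - alpha of the differences exceeds it
    p_alpha = float(np.quantile(diffs, alpha))
    return h_hat, h_hat - p_alpha, diffs, p_alpha

# Simulated in-control Phase-I data (assumption: standard normal).
phase1 = np.random.default_rng(1).normal(size=200)
h_unadj, h_adj, diffs, p_alpha = bootstrap_adjusted_threshold(phase1)
```

For the risk-adjusted CUSUM, the same loop would resample the pairs $(Y_i, X_i)$, refit the regression on each replicate, and compute the thresholds on the log scale as described above.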

Accounting for Estimation Error in Cardiac Data
This section presents the results for the effect of Phase-I data on the RA-CUSUM.
Considering the first two years of data as the Phase-I data, we calculated the control limits using the procedure discussed in the previous section, and the Phase-I data points are plotted against the constructed control limits using $R_1 = 2$. Out of 2,218 treated patients, 143 died within 30 days of surgery, which gives a mortality probability of $p = 0.06447$. Using $\operatorname{logit}(p_i) = \beta_0 + \beta_1 X_i$ to detect $R_1 = 2$, the average in-control score is $\operatorname{logit}(p) = \log(p/(1 - p))$; substituting $p = 0.06447$, the average in-control value is $-2.6749$. To observe the change, the out-of-control average risk score assuming the doubled mortality rate is calculated analogously, and $\Delta$ is obtained by substituting equation (35) in equation (34). To obtain the adjusted control limits, the targeted in-control ARL is taken as 10,000, as the population is generally measured per ten thousand. The Phase-I and Phase-II risk-adjusted CUSUM charts are depicted in Figure 1 for the cardiac surgical data. Figure 1(a) shows the RA-CUSUM chart for the Phase-I observations and indicates that, for both the unadjusted and the adjusted control limits, the Phase-I observations are in-control. Thus, we can use these limits for Phase-II data monitoring. Figure 1(b) shows the RA-CUSUM chart for the Phase-II cardiac surgical operation data and reveals a steep increase in the mortalities before July 1995 and between July 1995 and January 1996; hence, the chart produces out-of-control signals, which indicates that patients are at risk and that the reason for this must be investigated. Using the unadjusted control limit, the first out-of-control observation is 3,729, whereas it is 3,830 for the adjusted limit. Thus, the false alarm rate decreases and the efficiency of the chart increases by adjusting the control limit.
Next, considering the first three years as the Phase-I data, 3,187 patients were treated during this period, of whom 215 died within 30 days of surgery; hence, the probability of mortality is $p = 0.06746$. The average in-control score and the average out-of-control risk score are calculated as before, and $\Delta$ is then obtained from equations (37) and (38). The regression model becomes $\operatorname{logit}(p_i) = -3.69043 + 0.07721 X_i$. The Phase-I and Phase-II risk-adjusted CUSUM charts assuming the first three years as the base period are depicted in Figure 2.

From Figure 2(b), it can be seen that the unadjusted control limit gives the first out-of-control signal at position 3,729, while the adjusted limit gives it at position 3,818. Hence, with the adjusted limit, the false alarm rate decreases and the chart's efficiency increases. Further, it is noticed that when the Phase-I data increase, i.e., from two years to three years, the adjusted control limit charts are more efficient in detecting the change.

Next, we assess the performance of the RA-CUSUM in detecting a decrease in the odds of occurrence for $R_1 = 0.5$, assuming different Phase-I observations, as we did previously, to estimate the unknown parameters and construct control limits. The average out-of-control risk score in this case is $-3.401896$. Thus, $\Delta = -3.401896 + 2.6749 = -0.727025$, and the regression model is $\operatorname{logit}(p_i) = -3.6798 + 0.0768 X_i$. The results are depicted in Figure 3 for the Phase-I and Phase-II risk-adjusted CUSUM charts. Figure 3(b) shows the RA-CUSUM chart for the Phase-II cardiac surgical operation data. It can be seen that there is a steep increase before July 1997. Using the unadjusted limit, the first out-of-control signal is received at position 5,233, while it is received at position 5,355 for the adjusted limit chart. Thus, the adjusted chart produces fewer alarms than the unadjusted chart. Similarly, for the first three years of data as Phase-I, the average out-of-control risk score and the value of $\Delta$ are obtained in the same way. This leads to the regression model $\operatorname{logit}(p_i) = -3.69043 + 0.07721 X_i$. The Phase-I and Phase-II risk-adjusted CUSUM charts assuming the first three years of data as the Phase-I data are depicted in Figure 4. Figure 4(b) shows that the first out-of-control signal appears at 5,233 for the unadjusted chart and at 5,354 for the adjusted limit chart. Hence, the false alarm rate decreases by using the adjusted limit chart. Also, comparing the charts constructed from two years of Phase-I data with those from three years of Phase-I data, that is, Figures 3 and 4, we get very similar performance. In this case, the adjusted limit gives an out-of-control signal at observation 5,355 using two years of data as Phase-I and at observation 5,354 using three years of data.
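The arithmetic behind $\Delta$ can be reproduced with a few lines of Python. The sketch assumes, consistently with the reported calculation $\Delta = -3.401896 + 2.6749$, that the in-control score is the logit of the Phase-I mortality rate and that the out-of-control score is the logit of that rate multiplied by $R_1$; the function names are illustrative.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

def logit_shift(p_bar, r):
    """Delta = logit(r * p_bar) - logit(p_bar) for a mortality rate scaled by r."""
    return logit(r * p_bar) - logit(p_bar)

p_two_years = 0.06447                     # 143 deaths among 2,218 patients
in_control = logit(p_two_years)           # average in-control score
out_of_control = logit(0.5 * p_two_years) # halved mortality rate
delta_decrease = logit_shift(p_two_years, 0.5)
delta_increase = logit_shift(p_two_years, 2.0)
```

Under this assumption, `delta_decrease` matches the reported value of about -0.727 up to rounding, and `delta_increase` gives the corresponding positive shift for a doubled mortality rate.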
The Parsonnet scores are nonlinear on the logit scale, and to confirm this, we plot the histogram and normal probability plots in Figure 5.
Clearly, from the first row of the figure, the data do not follow the normal distribution. To transform the data to be approximately normal, the square root transformation $\sqrt{X}$ is used, and the result is plotted in the second row of Figure 5. Furthermore, the skewness of the original Parsonnet score is 1.9757 and the kurtosis is 8.4532, whereas the skewness and the kurtosis after the square root transformation are 0.2475 and 2.9376, respectively. Thus, the square root transformation reduced the magnitude of the skewness. Using the transformed data, the RA-CUSUM parameters are reestimated using different Phase-I data sets to study the performance of the chart in Phase-II. Considering the first two years of data as the Phase-I data, the Phase-I and Phase-II charts are depicted in Figure 6. Figure 6(b) shows the risk-adjusted CUSUM chart for the Phase-II cardiac surgical operation data. From the figure it can be seen that, as discussed earlier, there is a steep increase in the mortalities before July 1995. Using the unadjusted limit, the first out-of-control observation is 3,729, while with the adjusted limit, the first out-of-control signal is observed at 3,901. Thus, by adjusting the limit, the false alarm rate decreases.
Next, considering the first three-year data as the Phase-I period, the graphical presentation of Phase-I and Phase-II data is given in Figure 7.
From Figure 7, it is observed that the first out-of-control signal is at point 3,729 using the unadjusted limit, while it is at 3,839 using the adjusted limit chart. It is also noticed that by using a larger Phase-I data set, the adjusted limit chart is able to detect the change more quickly.
Furthermore, for $R_1 = 0.5$, assuming the first two years of data as the Phase-I data, the resulting charts are given in Figure 8. From Figure 8, it is noticed that with the unadjusted chart the first out-of-control signal appears at 5,142, whereas it appears at 5,227 with the adjusted chart. Similarly, using the first three years of data as the Phase-I data, the graphical presentation of the data is depicted in Figure 9.
With the unadjusted chart, the first out-of-control signal appears at 5,183, while it appears at 5,227 with the adjusted chart. Note that when we compare two years of Phase-I data with three years of Phase-I data, the results of both adjusted charts are very similar, as both show an out-of-control signal at 5,227. On comparing the Parsonnet score results for the square-root-transformed data with those for the nontransformed data, it is observed that the results of the two regressions differ, and we thus conclude that the identification of an appropriate distribution for the data is the most important step in designing control charts. Furthermore, to supplement our conclusion, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) of the models with original and transformed observations are calculated. The calculated values are 915 and 926.98 for the untransformed data and 897 and 908.44 for the transformed data. Hence, the transformed data are more appropriate.

RA-EWMA Charts
As in the case of RA-CUSUM charts, here we adjust EWMA control limits to minimize the estimation error and to check the effect of Phase-I data.
Beginning with the first two years of data as the Phase-I data, we calculated the adjusted control limits using the smoothing constant $\lambda = 0.1$. Next, we used $R_1 = 2$ and plotted the Phase-I and Phase-II data against the constructed control limits.
To obtain the adjusted control limit, 10,000 is assumed as the targeted in-control ARL. The regression model is then $\operatorname{logit}(p_i) = -3.68 + 0.077 X_i$, where $X_i$ is the Parsonnet score for the $i$th patient. The Phase-I and Phase-II risk-adjusted EWMA charts are depicted in Figure 10 for the cardiac surgical data. Figure 10(b) shows the risk-adjusted EWMA chart for the Phase-II data and indicates that there has been a steep increase in the mortalities since January 1996. With both the unadjusted and the adjusted limit, the first out-of-control observation is 4,325, which indicates that the adjusted limit has no significant effect on the EWMA chart.

Considering the first three years of data as the Phase-I period, the graphical presentation of the Phase-I and Phase-II data is given in Figure 11. From Figure 11(b), it is observed that the first out-of-control signal is at point 4,325 using both the unadjusted and the adjusted limits. It is also noticed that, despite using a larger Phase-I data set, the adjusted limit chart does not detect the change any more quickly.
Next, we take the first two years of data as the Phase-I data and calculate the adjusted control limits using the smoothing constant $\lambda = 0.3$. Then, we use $R_1 = 2$ and plot the Phase-I and Phase-II data against the constructed control limits. The resulting Phase-I and Phase-II risk-adjusted EWMA charts are depicted in Figure 12. Similarly, taking the first three years of data as the Phase-I data, the graphical presentation of the Phase-I and Phase-II data is given in Figure 13. From Figures 12(b) and 13(b), note that all the points of Phase-II are in-control. Furthermore, with the larger value of $\lambda$, the adjusted limit chart is unable to detect the small changes in the data.
Using the squared root transformation, the results assuming the first two-year data as the Phase-I data for different values of λ is presented in Figure 14.          same results which we obtained previously in RA-EWMA charts. Thus, we conclude that both the CUSUM and the EWMA give the same results.

Negative Binomial Risk-Adjusted Chart for Cardiac Surgery Patient
This section fits a GLM to the count data of weekly hospital admissions due to cardiac surgery in the UK. To this end, the logarithmic link function is used and population size is included as an offset to model the hospitalization rate per 100,000 inhabitants over time (Figure 16). The observed weekly admissions of cardiac surgery patients, along with the fitted model, are depicted in Figure 17. In particular, the model includes sine and cosine functions to account for seasonal behavior and the week number to explain the variability of weekly hospitalizations. Let X_t denote the number of hospitalizations at week t with X_t ∼ NegBin(μ_{0,t}, ϕ); then the model can be written as

$$\ln\left(\frac{\mu_{0,t}}{\mathrm{pop}_t/100{,}000}\right) = \beta_0 + \beta_1\,\mathrm{week}_t + \sin\left(\frac{2\pi\,\mathrm{week}_t}{52}\right) + \cos\left(\frac{2\pi\,\mathrm{week}_t}{52}\right). \tag{43}$$

The inclusion of the offset term g_t = pop_t/100,000 allows us to model the average weekly rate of admission to the hospital. The estimated coefficients of the model are presented in Table 5. The estimated dispersion parameter of the model is greater than one, indicating overdispersion.
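To make the role of the offset concrete, the sketch below evaluates the expected weekly count implied by a log-linear model of this form, written as μ_{0,t} = (pop_t/100,000) exp(β_0 + β_1 week_t + sin(2π week_t/52) + cos(2π week_t/52)). The coefficient values and the population size are hypothetical placeholders chosen for illustration; they are not the estimates reported in Table 5.

```python
import math

def expected_count(week, pop, beta0, beta1):
    """Expected weekly admissions under a log-linear model with a population offset."""
    seasonal = math.sin(2 * math.pi * week / 52) + math.cos(2 * math.pi * week / 52)
    rate_per_100k = math.exp(beta0 + beta1 * week + seasonal)
    # The offset rescales the rate per 100,000 inhabitants to a count
    return (pop / 100_000) * rate_per_100k

# Hypothetical values, for illustration only
beta0, beta1, pop = 1.0, 0.001, 500_000
for week in (1, 13, 26, 39, 52):
    print(week, round(expected_count(week, pop, beta0, beta1), 2))
```

Because the offset enters multiplicatively on the count scale, doubling the population doubles the expected count while leaving the rate per 100,000 unchanged.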

Effect of Estimation Error.
To study the impact of estimating the parameters of the baseline distribution, the ARL is calculated for different values of the estimation error Δ ∈ {0, 0.01, ..., 0.2}, and the results are listed in Tables 6-9. It is observed from the tables that the ARL decreases as the estimation error increases. Thus, the chart produces more signals than in the case Δ = 0, which shows the efficiency of the chart.
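The mechanism behind a decreasing ARL can be illustrated with a much simpler rule than the charts of this study. For a chart that signals whenever a single count exceeds a fixed upper limit, the run length is geometric and ARL = 1/P(X_t > UCL). The sketch below assumes that Δ acts as a relative understatement of the mean, so that the true mean is μ_0(1 + Δ) while the limit was designed for μ_0, and that the negative binomial is parameterized by its mean and a size parameter k with variance μ + μ²/k. Both assumptions, and all numerical values, are hypothetical.

```python
import math

def nb_pmf(x, mu, k):
    """Negative binomial pmf with mean mu and size k (variance mu + mu**2 / k)."""
    log_p = (math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
             + k * math.log(k / (k + mu)) + x * math.log(mu / (k + mu)))
    return math.exp(log_p)

def arl_upper_limit(mu, k, ucl):
    """ARL of a rule that signals when a single count exceeds ucl."""
    cdf = sum(nb_pmf(x, mu, k) for x in range(int(ucl) + 1))
    return 1.0 / (1.0 - cdf)

# Hypothetical design values: limit fixed for mean mu0, true mean mu0 * (1 + delta)
mu0, k, ucl = 20.0, 10.0, 45
for delta in (0.0, 0.05, 0.10, 0.20):
    print(delta, round(arl_upper_limit(mu0 * (1 + delta), k, ucl), 1))
```

Under these assumptions the exceedance probability grows with Δ, so the ARL shrinks. Part of the additional signalling is therefore attributable to the misspecified baseline rather than to a change in the process.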
Next, to account for the estimation error, the bootstrap technique with different Phase-I data is used. Assuming the first two years as the Phase-I data, the first out-of-control point appears at week 246 with the unadjusted chart (UCL = 435.1). However, using the adjusted limit (UCL = 428.6), the first out-of-control signal appears at week 248. Similarly, with the first three years of weekly data as the Phase-I data, the first out-of-control signal is detected at week 298 by the unadjusted limit (UCL = 424.8) and at week 299 by the adjusted limit (UCL = 422.6). Thus, using the adjusted and unadjusted control limits, no advantage is observed from increasing the Phase-I data.
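The general idea of using the bootstrap to reflect Phase-I sampling variability in a control limit can be sketched as follows. This is a simplified percentile bootstrap of a mean-plus-three-standard-deviations limit applied to hypothetical counts. It is not the adjustment procedure of this study, and the function names, the 90% level, the number of resamples, and the data are all assumptions.

```python
import random
import statistics

def upper_limit(sample, width=3.0):
    """Plug-in upper control limit: mean + width * standard deviation."""
    return statistics.mean(sample) + width * statistics.stdev(sample)

def bootstrap_limits(phase1, n_boot=1000, seed=1):
    """Sorted limits recomputed on resamples of the Phase-I data (with replacement)."""
    rng = random.Random(seed)
    n = len(phase1)
    return sorted(upper_limit(rng.choices(phase1, k=n)) for _ in range(n_boot))

def adjusted_limit(phase1, level=0.90, n_boot=1000, seed=1):
    """Upper `level` percentile of the bootstrap distribution of the limit."""
    limits = bootstrap_limits(phase1, n_boot, seed)
    return limits[min(n_boot - 1, int(level * n_boot))]

# Hypothetical weekly counts standing in for two years of Phase-I data
rng = random.Random(0)
phase1 = [rng.randint(300, 400) for _ in range(104)]
print(round(upper_limit(phase1), 1), round(adjusted_limit(phase1), 1))
```

The spread of the resampled limits narrows as the Phase-I sample grows, which is one way to see why the adjusted and unadjusted limits tend to move closer together with more Phase-I data.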

Respiratory Disease Patient Analysis.
This section presents the analysis of the weekly count data of hospital admissions due to respiratory diseases for people aged over 65 years in the city of São Paulo, Brazil [31]. Weekly data from January 2006 to December 2011 are used to fit the model, which includes the week as an explanatory variable. Here, we consider the negative binomial instead of the Poisson model because the dispersion parameter ϕ is significantly larger than one. Additionally, the logarithmic link function is used and population size is included as an offset to model the hospitalization rate per 100,000 inhabitants over time. The weekly admissions of the cardiac surgery patients are depicted in Figure 17. In the model, we included sine and cosine functions to account for seasonal behavior and the week number to explain the variability of weekly hospitalizations. Let X_t denote the number of hospitalizations at week t with X_t ∼ NegBin(μ_{0,t}, ϕ); then the model can be written as

$$\ln\left(\frac{\mu_{0,t}}{\mathrm{pop}_t/100{,}000}\right) = \beta_0 + \beta_1\,\mathrm{week}_t + \sin\left(\frac{2\pi\,\mathrm{week}_t}{52}\right) + \cos\left(\frac{2\pi\,\mathrm{week}_t}{52}\right).$$

The inclusion of the offset term g_t = pop_t/100,000 allows us to model the average weekly rate of admission to the hospital. The estimated coefficients of the model are presented in Table 10.

The ARL for different values of the estimation error Δ is listed in Tables 11 and 12.
From the tables, it is observed that the ARL decreases as the estimation error increases. Thus, the chart produces more out-of-control signals than in the case Δ = 0.
To account for the estimation error and to assess the effect of different Phase-I data, we considered the first two years and then the first three years as the Phase-I data and calculated the control limits. Using two years of weekly data as the Phase-I data, the first out-of-control point is observed at week 250 with the unadjusted chart (UCL = 759.4). However, using the adjusted limit (UCL = 764.1), the first out-of-control signal appears at week 251. Similarly, using the first three years of weekly data as the Phase-I data, the first point goes out of control at week 303 with the unadjusted limit (UCL = 756.3) and at week 304 with the adjusted limit (UCL = 760.2). Thus, we conclude that some improvement is observed with the adjusted control chart over the unadjusted chart in detecting out-of-control points for these data.

Conclusion
The main aim of this study is to assess the impact of the estimation error on risk-adjusted control charts. In practice, the estimation error occurs when the distribution of the underlying data is unknown and the parameters are estimated from the Phase-I data. To account for this problem, a bootstrap method with different Phase-I data is used to establish adjusted control limits, which reduce the false alarm rate. In addition, with a larger Phase-I data set, the chart is able to detect changes quickly. In particular, we considered risk-adjusted CUSUM charts assuming Bernoulli and negative binomial models. Further, the risk-adjusted EWMA chart is also discussed in the study. As the literature suggests, bootstrapping methods can be computationally costly; however, we found the bootstrap to be the most efficient method for minimizing the impact of the estimation error, with low computational cost due to modern computing power. In the future, other transformations, such as the logarithmic and Box-Cox transformations, can be considered to compare the performance of risk-adjusted CUSUM and EWMA charts.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.