The Recurrent Event Process of a Success Preceded by a Failure and its Application

C. D. Lai

We describe a simple discrete-time renewal process of an event in which a success is preceded by a failure. Its properties, especially the distributions of the counting and the interval processes, are investigated. We also propose an application to statistical process control based on the waiting time between two adjacent events. It is shown that the average number inspected under the new control scheme is larger than with the so-called CCC control chart.


Introduction
Suppose we have a sequence of independent Bernoulli trials with outcomes of success (S) or failure (F). Many recurrent events can be generated by the Bernoulli trials (Feller, 1968, Chapter XIII). For example, one may be interested in the recurrent events of having runs of S of length k. In fact, there are various ways of counting the number of runs of S of length k (see, for example, Wang and Ji, 1995). Here, we consider an interesting event derived from the Bernoulli sequence and defined by E = {a success preceded by one or more failures}.
The event E is somewhat reminiscent of an old popular Chinese saying, "Failure is the mother of success". Effectively, it means a success will arrive after a cumulative number of failures. Of course, this saying has a connotation of 'cause and effect'; however, the E defined here only describes a phenomenon, since the trials are assumed to be independent.
Consider an automated production process in which an item may be classified as either conforming (F) or nonconforming (S). Each individual item is inspected in the order of production. If this is not a 100% inspection scheme, a sampling device will be used for selecting an item. The event E defined above then corresponds to the appearance of the first nonconforming item after one or more conforming items have been inspected.

Consider also a familiar experiment of coin tossing in which we designate S = H and F = T, with p = q = 1/2. Then the event E may also constitute a betting game for a player who selects E against another player whose betting strategy is to obtain an HH first, for example.
This recurrent event process may be considered as a special case of a more general 'pattern in Bernoulli trials' studied in Hunter (1983, pp. 118-120).
The first part of the paper is devoted to the study of the counting process {N_n} and the interval process {S_k}, k ≥ 1. Since most of the properties can be obtained using standard theory, we omit many of the details. Instead, we approach our analyses from intuitive first principles that are often more interesting and reveal more about the process than derivations from the probability generating functions.
The main thrust of the paper is to demonstrate how the waiting time between two consecutive occurrences of E may be used for constructing a control chart in quality control.It is found that this new control scheme for fraction nonconforming has a larger 'average number inspected' than a more established chart known as the CCC control chart when the production process is in control.

Distribution of Waiting Time Between Two Consecutive Recurrent Events
Let T be the time from the origin to the first occurrence of E. In other words, T represents the waiting time for the first appearance of FS; this is also the waiting time between two consecutive occurrences of E. Let f_n = P(T = n). Clearly, f_1 = 0. The probability sequence {f_n} may be obtained as in Hombas (1997), who partitioned the sample space into sets that begin with F and S. However, that approach is cumbersome, as can be seen in Hombas' paper. Other methods may be used to derive f_n; see, for example, Chapter 3 of Hunter (1983). Here, we describe a more direct approach. First, T = n means that the event E occurs at n, n ≥ 2, for the first time. This is equivalent to having observed a sequence that begins with i S's, 0 ≤ i ≤ n − 2, followed by (n − 1 − i) F's and then by an S at n. In other words, E occurs at n if the outcome of the nth trial is the first S that is preceded by one or more failures.
Summing over i, it is now obvious that the distribution of T is given by

f_n = P(T = n) = pq (q^(n-1) - p^(n-1)) / (q - p), n ≥ 2, p ≠ q,  (2.1)

with f_n = (n - 1)(1/2)^n in the limiting case p = q = 1/2. We will soon see that (2.1) may also be derived by another easy but well-established method.
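As a numerical sanity check on (2.1) (this sketch is our own illustration, not part of the original derivation; the function names are ours), the closed form can be compared with a brute-force enumeration over all outcome sequences of length n:

```python
from itertools import product

def f(n, p):
    """P(T = n) from (2.1): waiting time for the first FS pattern."""
    q = 1.0 - p
    if n < 2:
        return 0.0
    if abs(p - q) < 1e-12:                 # limiting case p = q = 1/2
        return (n - 1) * 0.5 ** n
    return p * q * (q ** (n - 1) - p ** (n - 1)) / (q - p)

def f_brute(n, p):
    """Brute force: total probability of all length-n sequences whose
    first FS pattern is completed exactly at the nth trial."""
    q = 1.0 - p
    total = 0.0
    for seq in product("SF", repeat=n):
        # index (0-based) at which the first FS is completed, if any
        first = next((i for i in range(1, n)
                      if seq[i - 1] == "F" and seq[i] == "S"), None)
        if first == n - 1:                 # completed at the last trial
            total += p ** seq.count("S") * q ** seq.count("F")
    return total
```

The sum of f_n over n also comes out as 1 numerically, in agreement with the persistence of E noted below.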
It is straightforward to verify that

Σ_{n≥2} f_n = pq / ((1 - p)(1 - q)) = 1,

so E is indeed a persistent recurrent process (Hunter, 1983, p. 75). Now the probability generating function (p.g.f.) F(s) of the sequence {f_n} is given by

F(s) = Σ_{n≥2} f_n s^n = pq s^2 / ((1 - ps)(1 - qs)).  (2.2)

Next, define

u_n = P{E occurs at the nth trial}, n ≥ 2.  (2.3)

As E occurs at n (not necessarily for the first time) if and only if the outcomes at the (n-1)th and the nth trials are F and S, respectively, it is obvious that

u_n = pq for all n ≥ 2.  (2.4)

From (2.4), we would be able to find the distribution of T, i.e., the sequence {f_n}, through the well-known relationship between the generating function (g.f.) of {u_n} and the p.g.f. F(s). However, we omit the details here and refer the reader to Chapter 3 of Hunter (1983).

Mean Recurrence Time and Variance
We now proceed to find the first two moments of the waiting time random variable T. First, the mean recurrence time of E, i.e., E(T), may be obtained from (2.2) by differentiating F(s) with respect to s, giving

E(T) = F'(1) = 2 + p/q + q/p = (p + q)^2 / (pq) = 1/(pq).  (2.5)

As was observed earlier, E is persistent. Clearly it is also aperiodic, and since µ = E(T) = 1/(pq) is finite, E is positive persistent (Hunter, 1983, p. 119).
In contrast, the mean waiting time for the event {SS} (i.e., the mean recurrence time of a success run of length 2) is given by (Feller, 1968, p. 324)

(1 + p)/p^2.  (2.6)

When p = q = 0.5, it is easy to see from (2.5)-(2.6) that the mean waiting times for {FS} and {SS} are 4 and 6, respectively. This fact was noted in Hombas (1997). The question arises: for what value of p would the above two events end up with the same mean waiting time? Setting (1 + p)/p^2 = 1/(pq) and solving for p, we obtain p^2 + p - 1 = 0, i.e., p = (√5 - 1)/2 ≈ 0.618. In other words, if two players A and B are tossing a biased coin with the probability of getting an H (i.e., an S) being 0.618, and A bets on FS and B on SS, then the expected waiting time is the same for both players.
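The equal-mean value of p can be checked numerically. The following sketch (our own illustration; function names are not from the paper) evaluates (2.5) and (2.6) at the golden-ratio root:

```python
import math

def mean_wait_FS(p):
    """Mean recurrence time of FS, cf. (2.5)."""
    q = 1.0 - p
    return 1.0 / (p * q)

def mean_wait_SS(p):
    """Mean recurrence time of a success run of length 2, cf. (2.6)."""
    return (1.0 + p) / p ** 2

# The two means coincide when p^2 + p - 1 = 0, whose positive root is
# the golden-ratio conjugate:
p_star = (math.sqrt(5.0) - 1.0) / 2.0   # ≈ 0.618
```

At p = 0.5 the two functions return 4 and 6, the values quoted from Hombas (1997).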
Now the outcome SS is always preceded by FS except when SS appears on the first two trials. Therefore the probability that FS will beat SS, i.e., the probability that FS will appear before SS does in a sequence of independent trials, is 1 − p^2. When p = 0.618, 1 − p^2 = p = 0.618, which is greater than 0.5. In other words, at the value p = 0.618, it is more likely that FS will appear earlier than SS, although their mean waiting times are identical.
Since T is the sum of two independent geometric waiting times (to the first F, and thence to the next S), the variance of T is given by

Var(T) = p/q^2 + q/p^2.  (2.7)

It is apparent from (2.7) that Var(T) is minimized when p = q = 0.5.

Distribution of S_k
Let S_k denote the number of Bernoulli trials required to obtain the kth E, and N_n the number of occurrences of E up to and including the nth trial. The two variables are related by

N_n ≥ k if and only if S_k ≤ n.  (3.1)

We may represent the partial sum as

S_k = T_1 + T_2 + ... + T_k,

where T_i denotes the number of Bernoulli trials observed between the (i−1)th and the ith occurrence of FS, including the last trial that completes the pattern E. Note in particular that T_1 = T.
Clearly, the sequence {T_i} consists of independent and identically distributed random variables. It now follows from (2.2), together with the independence of the members of {T_i}, that the p.g.f. of S_k is simply given by

E(s^{S_k}) = [F(s)]^k = (pq)^k s^{2k} / ((1 - ps)^k (1 - qs)^k).

For k = 2 and k = 3, explicit expressions for P(S_k = n) can be obtained by using partial fractions and extracting the coefficient of s^n. It was pointed out by one of the referees that, in general, we have

P(S_k = n) = (pq)^k Σ_{i=0}^{n-2k} C(k+i-1, i) C(n-k-i-1, n-2k-i) p^i q^{n-2k-i}, n ≥ 2k.

A closed-form expression from this formula does not appear to be possible.
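Since the p.g.f. of S_k is [F(s)]^k, P(S_k = n) can be computed either by expanding the two negative-binomial series (1 − ps)^(−k) and (1 − qs)^(−k) and collecting the coefficient of s^n, or by convolving the waiting-time distribution {f_n} k times. The following sketch (ours, not from the paper) checks that the two routes agree:

```python
from math import comb

def f(n, p):
    """P(T = n), cf. (2.1)."""
    q = 1.0 - p
    if n < 2:
        return 0.0
    if abs(p - q) < 1e-12:
        return (n - 1) * 0.5 ** n
    return p * q * (q ** (n - 1) - p ** (n - 1)) / (q - p)

def pmf_Sk_series(k, n, p):
    """Coefficient of s^n in [F(s)]^k via the negative-binomial expansions."""
    q = 1.0 - p
    if n < 2 * k:
        return 0.0
    m = n - 2 * k
    return (p * q) ** k * sum(
        comb(k + i - 1, i) * comb(k + m - i - 1, m - i)
        * p ** i * q ** (m - i)
        for i in range(m + 1)
    )

def pmf_Sk_conv(k, n, p):
    """P(S_k = n) by direct k-fold convolution of the pmf of T."""
    pmf = {j: f(j, p) for j in range(2, n + 1)}
    cur = dict(pmf)
    for _ in range(k - 1):
        nxt = {}
        for a, pa in cur.items():
            for b, pb in pmf.items():
                if a + b <= n:
                    nxt[a + b] = nxt.get(a + b, 0.0) + pa * pb
        cur = nxt
    return cur.get(n, 0.0)
```

For k = 1 the series route reduces to (2.1), as it should.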

The Number of Occurrences of E, N_n
Let N_n be the number of recurrent events E that occur up to and including the nth trial (no occurrence is counted at the 0th trial). We now focus our attention on the study of the counting process {N_n} associated with E.
Recall that (3.1) provides a key link between S_k and N_n, that is,

P(N_n ≥ k) = P(S_k ≤ n).

In particular,

P(N_n = 0) = P(T > n) = (q^(n+1) - p^(n+1)) / (q - p).

Next, P{N_n = 1} and P{N_n = 2} may be evaluated through equations (3.3)-(3.4). However, we opt for the following more intuitive and direct approach. Since the process renews itself after each occurrence of E,

P{N_n = 1} = Σ_{i=2}^{n} P{E occurs at i for the first time and E does not occur again in the remaining n − i trials} = Σ_{i=2}^{n} f_i P(T > n − i),

and

P{N_n = 2} = Σ_{i=2}^{n} Σ_{j=2}^{n-i} f_i f_j P(T > n − i − j).

We may continue iteratively, yielding

P{N_n = k} = Σ_{i=2}^{n} f_i P{N_{n-i} = k − 1}, k ≥ 1.

A more explicit expression for the distribution of N_n can be derived simply by using Cor. 3.4.4A of Hunter (1983), which states (in our notation) that

Σ_{n≥0} P{N_n = k} s^n = [F(s)]^k (1 − F(s)) / (1 − s).

By expanding the above expression into a product of polynomials in s and then extracting the coefficient of s^n, we obtain P{N_n = k}. A closed-form expression for general k from this expansion is unknown.
For many practical purposes, we may only need to know about the first two moments of N n .
It follows from (3.4.9) of Hunter (1983) and our (2.4) above that the mean count of E's in (0, n] is given by

E(N_n) = Σ_{i=2}^{n} u_i = (n − 1)pq.  (4.7)

To find the variance of N_n, we may use (2.4), (4.7) and the following identity that holds for any recurrent event process:

E(N_n^2) = E(N_n) + 2 Σ_{i=2}^{n} u_i E(N_{n-i}).

Without having to rely on the theory of recurrent event processes, we now proceed to derive the mean and variance from first principles. These are more interesting and reveal more about the process than the derivations via the p.g.f. F(s).

Consider the sequence of independent Bernoulli trials from i = 1 to i = n. At each i, i ≥ 2, the event E either occurs, with probability p′ = pq, or does not occur, with probability q′ = 1 − p′ = 1 − pq. Form an associated sequence of binary variables {Y_i}, i ≥ 2, such that

Y_i = 1 if E occurs at the ith trial, and Y_i = 0 otherwise.  (4.9)

The counting process {N_n} may now be expressed in terms of {Y_i}:

N_n = Σ_{i=2}^{n} Y_i,  (4.10)

and hence

E(N_n) = (n − 1)pq.  (4.11)

Unlike the original Bernoulli sequence, {Y_i}, i ≥ 2, is not a sequence of independent random variables, although Y_i and Y_{i+k} are independent for k ≥ 2. Since Y_i Y_{i+1} = 0 (E cannot occur at two consecutive trials), Cov(Y_i, Y_{i+1}) = −(pq)^2. Hence

Var(N_n) = (n − 1)pq(1 − pq) − 2(n − 2)(pq)^2.  (4.12)

The ratio of variance to mean is

Var(N_n)/E(N_n) = 1 − pq − 2pq(n − 2)/(n − 1) < 1,

so that there is an apparent underdispersion for this process relative to the Poisson process.
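The mean and variance of N_n just derived can be confirmed by exact enumeration over all 2^n outcome sequences for small n. A short sketch (our own check, not from the paper):

```python
from itertools import product

def moments_Nn(n, p):
    """Exact E(N_n) and Var(N_n) by enumerating all 2^n outcome sequences."""
    q = 1.0 - p
    mean = second = 0.0
    for seq in product("SF", repeat=n):
        prob = p ** seq.count("S") * q ** seq.count("F")
        # count occurrences of the pattern FS (E occurring at trial i+1)
        count = sum(1 for i in range(1, n)
                    if seq[i - 1] == "F" and seq[i] == "S")
        mean += count * prob
        second += count * count * prob
    return mean, second - mean ** 2

def moments_formula(n, p):
    """Mean and variance as given by the first-principles derivation."""
    pq = p * (1.0 - p)
    return (n - 1) * pq, (n - 1) * pq * (1.0 - pq) - 2.0 * (n - 2) * pq ** 2
```

The enumeration is feasible only for small n, but that suffices to verify the algebra.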

A Control Chart Based on T

Production processes today are usually of very high quality. In particular, in electronics industries such as integrated-circuit manufacturing, the fraction nonconforming p is in the parts-per-million (PPM) range, e.g., 0.0001, or 100 PPM. The standard Statistical Process Control (SPC) techniques are not very suitable for such a near zero-defect environment, as detailed by Goh (1987). The main problems are that there are too many false alarms and that it is impossible to detect process improvement, as the lower control limit (LCL) for a p- or np-chart will not exist. In what follows, we assume that p < q; this is necessary for any viable production process. In fact, we may simply assume that p < 0.1 whenever applicable.

Cumulative Count of Conforming (CCC) Control Chart
A useful control procedure for such a high-quality process is based on the statistic Y, the number of conforming units between two successive occurrences of nonconforming items. The statistic Y was designated the Cumulative Count of Conforming (CCC) units by Goh (1987); see also Lucas (1989) and Xie and Goh (1992). Other discussions of decision procedures based on Y can be found in Nelson (1994) and Quesenberry (1995).
Let E_0 = {occurrence of an S}, where S here denotes the outcome of the inspected unit being nonconforming. It is clear that such a continuous sampling procedure may be regarded as a recurrent event process. However, it is necessary to clarify our notation here. In the references just cited, Y denotes the number of conforming units between two nonconforming units. In contrast, the waiting time between two consecutive occurrences of an event in recurrent event theory traditionally includes the trial that 'completes' the event. In other words, Y + 1 is the waiting time, instead of Y.
It should be noted that the cumulative count of conforming (CCC) control chart is useful for monitoring a high-quality process. It is equally effective for controlling other processes that have a moderate value of p, the proportion of nonconforming units of the production process.
It is easy to see that T_0, the number of units inspected to obtain the first nonconforming item, has the geometric distribution

P(T_0 = n) = q^(n-1) p, n ≥ 1,  (5.1)

with tail probability

P(T_0 > n) = q^n  (5.2)

and mean waiting time

E(T_0) = 1/p.  (5.3)

Suppose we are interested only in detecting an upward shift of p, i.e., we are concerned only with quality deterioration; then the resulting control chart has only a lower control limit (LCL). A small value of T_0 indicates that nonconforming units have occurred too often for a process that is in control. The control procedure for this scheme is to send an out-of-control signal if the value of T_0 falls below the LCL. We require the probability of this happening to be very small, say α, when the process is in control. In other words, α is the false alarm probability, which is also the Type I error.
Let L_0 be the lower control limit for the CCC control chart with a given α; then

P(T_0 ≤ L_0) = 1 − q^(L_0) = α.  (5.4)

It follows from (5.1) that

L_0 = ln(1 − α) / ln(1 − p).  (5.5)
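For a given p and α, (5.5) is evaluated directly. A minimal sketch (ours; since the paper does not state a rounding convention, we take the largest integer that keeps the false-alarm probability at most α):

```python
import math

def lcl_ccc(p, alpha):
    """Lower control limit L0 for the CCC chart, cf. (5.5):
    the largest integer n with P(T0 <= n) = 1 - (1-p)^n <= alpha."""
    return math.floor(math.log(1.0 - alpha) / math.log(1.0 - p))
```

For example, with p = 0.01 and α = 0.05 this yields L_0 = 5, so that P(T_0 ≤ 5) ≈ 0.049 ≤ α.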

Modified CCC Control Chart Based on T
It is obvious from (2.1) that

P(T > n) = (q^(n+1) − p^(n+1)) / (q − p).  (5.6)

It seems reasonable to propose a new control scheme based on the statistic T, the waiting time for the appearance of a nonconforming unit preceded by one or more conforming units. T may also be interpreted as the total number of items inspected during the time interval at each end of which a point is plotted on the control chart. Assuming we are concerned only with quality deterioration, we let L designate the lower control limit of the proposed chart, satisfying

P(T ≤ L) = α.  (5.7)

Equivalently,

(q^(L+1) − p^(L+1)) / (q − p) = 1 − α,  (5.8)

where α is the probability of false alarm, or the Type I error, with the fraction nonconforming at some acceptable level p = p_0.
Unlike L 0 in Section 5.1, it is not easy to solve the above functional equation for L .However, the following observations would be helpful in our search.
First, by comparing (2.1) with (5.1), it is clear that, for small p,

P(T = n) ≈ P(T_0 = n − 1).  (5.9)

(We note also that P(T = 1) = 0, whereas P(T_0 = 1) = p.) Equation (5.9) suggests that T has a longer tail probability than T_0. As a matter of fact, for any given n, the difference between the two tail probabilities (see (5.6) and (5.2)) is

P(T > n) − P(T_0 > n) = p(q^n − p^n)/(q − p) ≥ 0,

and so L ≥ L_0. Indeed, L is near L_0 for small values of p. In fact, for these small values of p, we note from (5.9) that T behaves approximately as T_0 + 1. For this range of p, L may therefore be approximated as

L ≈ L_0 + 1,  (5.12)

in other words,

L ≈ ln(1 − α)/ln(1 − p) + 1.  (5.13)

A simple spreadsheet can be used to find L. Table 1 gives the lower control limits of the modified CCC control chart when α = 0.025 and α = 0.05. With this new control scheme, the quality inspector does not take action (such as plotting a point on the chart) when two or more nonconforming units occur consecutively. This is perhaps the main difference between the new control chart and the existing CCC control chart based on the statistic T_0. Psychologically speaking, one may regard the phenomenon of observing consecutive nonconforming units as a fluke, because p is rather small. Looked at from another angle, T can be interpreted as a 'delayed' version of T_0.
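In place of a spreadsheet, L can be found by a direct search on the tail probability (5.6). A sketch (ours; we again take the largest integer keeping the false-alarm probability at most α):

```python
def tail_T(n, p):
    """P(T > n) = (q^(n+1) - p^(n+1)) / (q - p), cf. (5.6)."""
    q = 1.0 - p
    return (q ** (n + 1) - p ** (n + 1)) / (q - p)

def lcl_modified(p, alpha):
    """Largest integer L with P(T <= L) <= alpha for the modified chart."""
    n = 1                                  # P(T <= 1) = 0, always admissible
    while 1.0 - tail_T(n + 1, p) <= alpha:
        n += 1
    return n
```

For p = 0.01 and α = 0.05 this gives L = 6, consistent with the approximation L ≈ L_0 + 1 and with the value quoted in the illustration below.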
For illustration, suppose a sequence of items from a production line has been inspected for quality, with the acceptable quality level set at p = 0.01 and α = 0.05. The result of this inspection process is as follows:

FFFFFFFSSSFFFFFFFSFFS
Three points are then plotted on the control chart, with T_1 = 8, T_2 = 10, and T_3 = 3. From Table 1, we find that the lower control limit L is 6. Since T_3 falls below L, an out-of-control signal is sent immediately, so an action to examine the production line is called for. The result of the examination may indicate that the process is still operating properly; in that case, we would declare this out-of-control signal a false alarm.
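The segmentation of such an inspection record into the successive waiting times T_i is easily mechanized. A sketch (our own illustration) that recovers T_1 = 8, T_2 = 10, T_3 = 3 from the record above:

```python
def waiting_times(outcomes):
    """Split an inspection record (string of 'F'/'S') into successive
    waiting times T_i, each ending with the first S preceded by an F."""
    times, count, prev = [], 0, None
    for x in outcomes:
        count += 1
        if prev == "F" and x == "S":       # pattern FS completed: E occurs
            times.append(count)
            count, prev = 0, None          # the process renews here
        else:
            prev = x
    return times
```

Any trailing trials after the last completed pattern are simply not counted, matching the charting rule that a point is plotted only when E occurs.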

Comparison of Two Mean Waiting Times
Recall that E(T_0) = 1/p; also, E(T) = 1/(pq). Consider a Bernoulli sequence such as SSSSFFFSSS. Clearly, T is the number of trials required to obtain the first F plus the number of trials required to obtain the first S, counting from the second F. We note that the forward recurrence time to an S in an independent Bernoulli sequence is also a geometric random variable, just as the forward recurrence time to a Poisson event is again exponentially distributed. Therefore, T = Geometric(q) + T_0, with the two components independent.
Assume that E(T_0) = 1/p_0; we want to find the value of p so that

E(T) = 1/(pq) = 1/p_0, i.e., p(1 − p) = p_0.  (5.18)

In the context of quality control, both p_0 and p are much smaller than 0.5, so (5.18) has only one meaningful root, giving

p = (1 − √(1 − 4p_0))/2.

Average Number Inspected (ANI)
In order to evaluate the performance of a control chart, a measure such as the Average Run Length (ARL) is often used. Essentially, the ARL is the average number of points that must be plotted before a point indicates an out-of-control condition. Bourke (1991) used the term ANI to refer to the expected number of items inspected until a signal is produced. This is equivalent to the definition given in Page (1954, p. 101), where the ARL is defined as 'the expected number of articles sampled before action is taken'. In what follows, Bourke's (1991) terminology will be used.

ANI for CCC Control Chart
For the CCC control chart, the ANI is easy to obtain (Bourke, 1991) and is given by

ANI = E(T_0) / P(T_0 ≤ L_0) = 1 / (p P(T_0 ≤ L_0)),  (6.2)

where, for an in-control process, P(T_0 ≤ L_0) = α, the false alarm probability, which is also the Type I error. Note that our aim here is to detect an upward shift of the process quality level p only. However, if both control limits are required, then P(T_0 ≤ L_0) in (6.2) is to be replaced by the probability, evaluated at p = p_a, that T_0 falls outside the two limits. Here p_a is the acceptable quality level; that is, the process is assumed to be in control at this level.
We note in passing that (6.2) may be derived by using the approach to be developed in the following subsection.

ANI for the Modified CCC Control Chart
Let N be the number of nonconforming units observed that leads to an out-of-control signal, and let S_N be the total number inspected to obtain the first out-of-control signal. As N is a random variable, S_N is a compound random variable; N is effectively a 'stopping time' random variable. Therefore, E(S_N) is the average number inspected in the sense of Bourke's (1991) terminology.
Given N = n, S_n = T_1 + T_2 + ... + T_n, subject to T_i > L, i = 1, ..., n − 1, and T_n ≤ L, where L is the lower control limit satisfying

P(T ≤ L) = α.

Given N = n, we can show that

E(S_n | N = n) = (n − 1) E(T | T > L) + E(T | T ≤ L).

(Recall that T = T_1 and that the T_i are independent and identically distributed.)

We also note that N has a geometric distribution with mean 1/α.
It is well known (in fact, it can be shown easily) that

E(T) = E(T | T > L) P(T > L) + E(T | T ≤ L) P(T ≤ L).  (6.3)

Hence,

E(S_N) = Σ_{n≥1} P(N = n) E(S_n | N = n) = (E(N) − 1) E(T | T > L) + E(T | T ≤ L).  (6.4)

It follows from (6.3) and (6.4), with E(N) = 1/α and P(T ≤ L) = α, that the average number inspected for the control chart under discussion is

ANI = E(S_N) = E(N) E(T) = 1/(αpq),

which is Wald's identity in sequential analysis.
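The two ANIs are easily compared numerically. A sketch (ours) confirming that the excess of the modified scheme over the CCC scheme is 1/(αq):

```python
def ani_ccc(p, alpha):
    """ANI of the CCC chart when the process is in control, cf. (6.2)."""
    return 1.0 / (p * alpha)

def ani_modified(p, alpha):
    """ANI of the modified chart: E(N)E(T) = 1/(alpha*p*q), by Wald's identity."""
    q = 1.0 - p
    return 1.0 / (alpha * p * q)
```

For p = 0.01 and α = 0.05, the CCC chart gives ANI = 2000, while the modified chart gives about 2020.2; the difference is 1/(0.05 × 0.99) ≈ 20.2, which is small relative to either ANI, illustrating the remark below that the advantage is minimal for very small p.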
Comparing the two ANIs, we find that the ANI for the modified CCC scheme exceeds that for the CCC control chart by the amount 1/(αq). For a process that is in control, a control scheme giving a larger ANI has an advantage, as it sends fewer false alarm signals. However, for a high-quality production process (i.e., for very small p, and hence large q), this advantage becomes minimal. As we remarked in Section 5.1, the CCC control chart is also applicable for controlling the fraction nonconforming even when p is of moderate size. For such processes, the modified CCC control chart may be considered a serious alternative.
