Reliability Analysis of an Extended Shock Model and Its Optimization Application in a Production Line

First, a two-unit cold-standby shock model with multiple adaptive vacations is introduced, in which the startup and replacement of the repair facility are also considered. Second, using the supplementary variable method and the Laplace transform, some important reliability indices are derived, such as availability, failure frequency, mean vacation period, mean renewal cycle, mean startup period, and replacement frequency. Finally, a production line controlled by two cold-standby computers is modeled to give a numerical illustration and to determine its optimal part-time job policy for maximum profit.


Introduction
It is well known that shock models are used to study the external causes that may make a system fail. For example, a computer system may fail due to a virus infection or an attack by an intruder. Many authors have investigated various shock models under different assumptions. Among them, Esary et al. [1], Barlow and Proschan [2], Ross [3], and Fagiuoli and Pellerey [4] dealt with Poisson shock models. Eryilmaz [5,6] discussed a discrete-time shock model and its lifetime behavior. Also, Gottlieb [7], Aven and Gaarder [8], Wang and Zhang [9], Lam and Zhang [10], and Tang and Lam [11] obtained the optimal replacement policies for several shock models. Recently, from the reliability viewpoint, Li and Zhao [12] studied a δ-shock model consisting of n components, and Q. T. Wu and S. M. Wu [13] analyzed a two-unit cold-standby shock model with a single vacation.
However, in the existing shock model literature, the repair facility is assumed to be always available when a unit fails, although this assumption is evidently unrealistic. In fact, in many practical situations the repair facility needs a startup time of random length for preparatory work before a repair can begin. Furthermore, the busy repair facility is typically subject to lengthy and unpredictable breakdowns and has to be replaced (see [14-16]). On the other hand, to use the repairman's idle time effectively and increase profit, the system manager can assign secondary jobs to the idle repairman. But these additional tasks reduce system availability and can sometimes cause huge economic losses. Therefore, it is very important for the system manager to know how to assign additional jobs to the idle repairman so as to maximize profit while keeping availability high. In this paper, the period during which the repairman undertakes additional jobs is represented by the repairman's vacation time. A comprehensive study of vacation models can be found in Tian and Zhang's book [17].
Based on the above facts, in this paper we present an extended shock model for a cold-standby system with two identical units. Here, cold standby means that the redundant unit cannot fail while in standby. Our study differs from previous work [1-13] in four respects: (i) it considers the startup and breakdown of the repair facility; (ii) it introduces the multiple adaptive vacation policy (MAVP), first proposed by Tian and Zhang [17], which is more general than the single vacation, multiple vacations, and variant vacations policies and is useful for achieving high availability and optimizing profit (see Section 5); (iii) some new reliability indices are derived, such as the mean renewal cycle, mean vacation period, mean startup period, mean idle period, and mean busy period; (iv) as an application of our model, a production line controlled by two identical cold-standby computers is modeled to analyze its optimal part-time job policy.

Assumptions
The extended shock model we consider here consists of two identical cold-standby units, a repair facility, and a repairman. The model assumptions are as follows.
(1) The external shocks arrive according to a Poisson process with a positive rate. The magnitudes of the shocks are independent and identically distributed with a common distribution function. Shocks affect only the operating unit.
The operating unit fails if the magnitude of a shock exceeds a threshold, where the threshold is assumed to be a nonnegative random variable with its own distribution function.
(2) Shocks are the only cause of unit failure.
At each vacation completion instant, the repairman checks the system and decides which action to take according to the system state. There are three possible cases: (A) if there is a failed unit in the system, he immediately spends a startup time turning the repair facility on and then repairs until there are no failed units; (B) if there is no failed unit and the total number of vacations taken is still less than the maximum number of vacations, he takes another vacation; (C) if there is no failed unit and the total number of vacations taken equals the maximum number, he remains idle in the system until the first failed unit appears, which triggers a startup time and the subsequent repair. We assume that the startup time has a distribution function and a density function on the nonnegative half-line, a hazard rate function equal to the density divided by one minus the distribution function, and a mean startup time.
(4) The repair facility may break down during a repair, with breakdowns occurring at a Poisson rate. A broken facility is immediately replaced by the repairman. The replacement times are assumed to be i.i.d. random variables with a general distribution function and a density function on the nonnegative half-line, a hazard rate function equal to the density divided by one minus the distribution function, and a mean replacement time. After replacement, the repair facility resumes the interrupted repair. The repair time of the failed unit is cumulative, that is, the time already spent in repair is preserved.
(5) Initially, both units are new (one is operating and the other is in cold standby), and the repairman is idle.
After the first busy period is completed, he begins to follow the multiple adaptive vacation policy. All random variables are mutually independent.
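The three cases that arise at a vacation completion instant amount to a small decision rule. The sketch below only illustrates that rule; the function name `next_action`, its argument names, and the returned labels are hypothetical and are not part of the model's notation.

```python
def next_action(failed_units: int, vacations_taken: int, max_vacations: int) -> str:
    """Decide what the repairman does at a vacation completion instant.

    All names are illustrative assumptions, not notation from the paper.
    `vacations_taken` counts the vacation that has just ended, so it is at least 1.
    """
    if failed_units < 0 or not (1 <= vacations_taken <= max_vacations):
        raise ValueError("inconsistent system state")
    if failed_units > 0:
        # Case (A): start up the repair facility, then repair until no failed unit remains.
        return "startup_then_repair"
    if vacations_taken < max_vacations:
        # Case (B): nothing to repair and vacations remain, so take another vacation.
        return "take_another_vacation"
    # Case (C): nothing to repair and the maximum number of vacations has been reached.
    return "stay_idle_until_failure"
```

Under this reading, setting `max_vacations = 1` gives single-vacation behavior, while a very large `max_vacations` approaches the multiple vacations policy.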
Remark 1. From Assumptions (1) and (5), the probability that a shock causes the operating unit to fail is given by Pr(Ω), with Ω the event that the shock magnitude exceeds the threshold, where Pr(Ω) is the probability of event Ω.
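As a hedged numerical illustration of Remark 1, the sketch below evaluates the probability that a shock magnitude exceeds the threshold by integrating the survival function of the magnitude against the density of the threshold, which is valid because the two are independent. The exponential distributions, the rates `a` and `b`, and the function name `failure_probability` are assumptions made only for this example; the model itself allows general distributions.

```python
import math
import random


def failure_probability(shock_sf, threshold_pdf, upper=50.0, steps=100_000):
    """Approximate Pr(magnitude > threshold) on [0, upper].

    Computes the integral of shock_sf(t) * threshold_pdf(t) dt with the
    composite trapezoidal rule. `upper` truncates the infinite range and is
    an assumption of this sketch.
    """
    h = upper / steps
    total = 0.5 * (shock_sf(0.0) * threshold_pdf(0.0)
                   + shock_sf(upper) * threshold_pdf(upper))
    for k in range(1, steps):
        t = k * h
        total += shock_sf(t) * threshold_pdf(t)
    return total * h


# Hypothetical special case: exponential magnitudes (rate a), exponential threshold (rate b).
a, b = 1.0, 2.0
p_numeric = failure_probability(lambda t: math.exp(-a * t),
                                lambda t: b * math.exp(-b * t))
p_exact = b / (a + b)  # closed form for this exponential special case only

# Monte Carlo cross-check with a fixed seed for reproducibility.
rng = random.Random(0)
n = 200_000
hits = sum(rng.expovariate(a) > rng.expovariate(b) for _ in range(n))
p_mc = hits / n
```

If each shock is compared with an independent copy of the threshold, the thinning property of the Poisson process implies that failure-causing shocks arrive at the shock rate multiplied by this probability, so the lifetime of the operating unit would be exponential with that rate.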

The State Probability Equations and Solutions
We label the possible states of the system (0) through (9). In each state, a unit is either operating, in cold standby, waiting for repair, waiting for remaining repair (preserving the time already spent in repair), or under repair, and the repairman is either idle, turning the repair facility on, replacing the repair facility, or taking one of his vacations, which are counted from the first up to the maximum number of vacations; this maximum may be any positive integer. By definition, the system states (0), (1), (3), (5), (7), and (8) are operable, and states (2), (4), (6), and (9) are inoperable.
Consider the system state at each time t ≥ 0. To obtain a Markov process, we introduce four supplementary variables: the elapsed startup time of the repair facility at time t, the elapsed repair time of the failed unit at time t, the elapsed replacement time of the broken repair facility at time t, and the elapsed vacation time of the repairman at time t. The system state together with these four variables forms a vector process indexed by t ≥ 0.
In steady state, we define the state probabilities and probability densities as the limits, as t → ∞, of their time-dependent counterparts. Since the vector process is a Markov process in continuous time, one can write its equations in the usual way by considering the transitions occurring between t and t + Δt. Letting Δt tend to zero yields differential equations for the time-dependent state probabilities, and taking the limit t → ∞ gives the corresponding steady-state equations. In the same way, we readily get the steady-state equations for all state probabilities, together with the boundary conditions (including zero boundary conditions for states (8) and (9)) and the normalization condition. In order to derive important reliability indices, we define the Laplace transform of a nonnegative function $f(t)$ as $f^*(s) = \int_0^\infty e^{-st} f(t)\,dt$. With this notation, the solution of the steady-state equations is stated in Lemma 2.
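As a small numerical illustration of the Laplace transform defined above, the sketch below approximates the transform of a density by Simpson's rule and compares it with a known closed form. The exponential density, its rate `mu`, the truncation point `upper`, and the function name `laplace_transform` are assumptions of this example and are not taken from the model.

```python
import math


def laplace_transform(f, s, upper=60.0, steps=60_000):
    """Approximate the integral of exp(-s*t) * f(t) over [0, upper].

    Uses composite Simpson's rule; `upper` truncates the infinite range,
    which is adequate only when the integrand is negligible beyond it.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = upper / steps
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for k in range(1, steps):
        t = k * h
        weight = 4 if k % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3


mu = 2.0  # hypothetical rate of an exponentially distributed time


def density(t):
    return mu * math.exp(-mu * t)


approx = laplace_transform(density, s=1.0)
exact = mu / (mu + 1.0)  # closed-form transform of the exponential density at s = 1
total_mass = laplace_transform(density, s=0.0)  # a density's transform at s = 0 is 1
```

Evaluating the transform of a density at s = 0 returns its total mass, and the negative derivative of the transform at s = 0 gives the mean, which is how mean periods are typically read off from transformed solutions.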

Reliability Indices
Based on the results obtained in Lemma 2, some important reliability indices are easily derived as follows.

Theorem 3. In steady state, the following reliability indices are obtained from Lemma 2.
(1) The availability of the system, that is, the probability that the system is operating.
(2) The failure frequency of the system, that is, the rate of occurrence of system failures.
(3) The repairman's idle and vacation probabilities, and the startup and busy (repair and replacement) probabilities of the repair facility.
(4) The unavailability of the repair facility, that is, the probability that the repair facility is under replacement, and the replacement frequency of the repair facility, that is, the rate of occurrence of replacements.
Proof. According to the frequency formula in [18], the failure frequency and the replacement frequency are easily obtained. The remaining probabilities follow from their definitions, Lemma 2, and direct calculation.
Noting that the time points at which the repairman completes a repair and begins to take a vacation are regenerative points, we define the following.
(i) System renewal cycle: the length of time from the beginning of one vacation period to the beginning of the next vacation period.
(ii) Vacation period: the total vacation time of the repairman per renewal cycle.
(iii) Idle period: the total idle time of the repairman per renewal cycle.
(iv) Startup period: the total startup time of the repair facility per renewal cycle.
These equations can be solved in a manner similar to that of Section 3, which yields the Laplace transforms of the solutions.
Important reliability indices are derived, such as availability, failure frequency, mean renewal cycle, mean vacation period, and mean startup period. Some special cases are given. As an application, a production line is modeled to provide a numerical illustration and to study its optimal part-time job policy. For future research, one could consider discrete-time shock models with repairman vacations and their optimization applications.

Table 6: Availability and average profit per unit time Γ of the production line for different maximum numbers of part-time jobs.

Table 7: Availability and average profit per unit time Γ of the production line for different durations of each part-time job.