Uniqueness of Maximum Likelihood Estimators for a Backup System in a Condition-Based Maintenance

A parameter estimation problem for a backup system under condition-based maintenance is considered. We model the backup system by a hidden, three-state continuous-time Markov process. Data are obtained through condition monitoring at discrete time points. Maximum likelihood estimates of the model parameters are obtained using the EM algorithm. We establish conditions under which any sequence derived by the EM algorithm has at most one limit in the parameter space.


Introduction
Suppose a backup system is represented by a continuous-time homogeneous Markov chain X = {X_t : t ≥ 0} with state space S = {0, 1, 2}. States 0, 1, and 2 are the healthy state, the unhealthy state, and the failure state, respectively. Assume that the system is in the healthy state at time 0 and that the transition rate matrix is given by (1), where θ_i ∈ (α, +∞) for i = 1, 2 are unknown. Here α > 0 is a known lower bound of the parameters. Suppose the system is observed at time points 0, Δ, 2Δ, . . ., where [kΔ, (k + 1)Δ] is an inspection interval. If the system is found to have failed at an inspection, a new system replaces it. Let the two processes Y = {Y(k) : k = 1, . . . , n} and R = {R(k) : k = 1, . . . , n} be the record of the system. Here R(k) = 1 if the system fails during ((k − 1)Δ, kΔ), and R(k) = 0 otherwise, and Y(k) = X_{kΔ}. As a path of X is a right-continuous step function, Y(k) = 0 when a replacement occurs at time kΔ. Moreover, we set Y(0) = 0 for convenience. The process R represents the replacement of the system, and the process Y represents the observable information on the system collected through condition monitoring.
The maximum likelihood estimates (MLEs) of the model parameters for such models have been studied in [1,2]. As the stochastic processes Y and R are not Markov processes and the sample path of X is not observable, the likelihood function of the incomplete data (Y, R) is complex. Hence, it is difficult to obtain the MLE θ̂ = arg max_{θ∈Θ} ln L(θ | Y, R) directly, where θ = (θ_1, θ_2) and Θ = {(x, y) : x, y > β}. Here β > 0 is a prearranged constant. Both [1,2] suggest the EM algorithm (see, e.g., [3,4]). Let θ(0) be the initial values of the unknown parameters. The EM algorithm works as follows.
The E step. For n ≥ 0, compute the pseudolikelihood function Q(· | θ(n)) given by (2). Here D is the complete data set of the process X. The form of the complete data set may differ for different purposes; for example, the forms are different in [1,2]. The M step. Choose θ(n + 1) = M(θ(n)), where the operator M is defined by (3). The E and M steps are repeated. According to the theory of EM algorithms (Theorem 1 in [3]), L(θ(n + 1) | Y, R) ≥ L(θ(n) | Y, R) for any given initial value θ(0) and n = 0, 1, . . .. It is clear that an MLE θ̂ in Θ, when it exists, is a fixed point of M.
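In the generic EM formulation (see [3]), the two steps are usually written as below. This is a sketch in textbook notation, not a reproduction of this paper's displayed equations: the primed variable θ' and the conditional-expectation layout are assumptions.

```latex
% Generic EM steps in textbook form; \theta' and the layout are assumed notation.
\begin{align*}
Q(\theta' \mid \theta(n)) &= \mathbb{E}_{\theta(n)}\bigl[\ln L(\theta' \mid D) \,\big|\, Y, R\bigr], \\
\theta(n+1) = M(\theta(n)) &= \arg\max_{\theta' \in \Theta} Q(\theta' \mid \theta(n)).
\end{align*}
```

Written this way, a fixed point of M is a parameter value that the M step maps to itself.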

International Journal of Quality, Statistics, and Reliability

In this paper, we consider the uniqueness of the MLE θ̂. As the log-likelihood function ln L(θ | Y, R) of the incomplete data is complex, we do not follow the classical method, in which the uniqueness of an MLE is demonstrated by establishing the global concavity of the log-likelihood function (see, e.g., [5-7]). Instead, we investigate conditions under which the operator M is a contraction. These conditions ensure that the MLE is unique if it exists. Moreover, the conditions imply that the sequences derived by the EM algorithm from different initial values have at most one limit in Θ. For the complete data set we present in the next section, we have the following main theorem of this paper.
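For reference, the reason a contraction can have at most one fixed point is the standard one-line argument sketched below. The norm and the constant c are generic placeholders (any norm on Θ and any 0 ≤ c < 1), not quantities taken from this paper.

```latex
% Generic contraction argument; c and \|\cdot\| are placeholder notation.
\begin{align*}
&\|M(a) - M(b)\| \le c\,\|a - b\| \quad \text{for all } a, b \in \Theta, \qquad 0 \le c < 1, \\
&a = M(a),\; b = M(b)
  \;\Longrightarrow\; \|a - b\| = \|M(a) - M(b)\| \le c\,\|a - b\|
  \;\Longrightarrow\; (1 - c)\,\|a - b\| \le 0
  \;\Longrightarrow\; a = b.
\end{align*}
```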

Complete Data Set
To establish the expression of the operator Q, we present our construction of the path of X and the complete data set D of the process X. Suppose that the random variables T_{i,k}, i = 1, 2, k = 0, 1, . . ., are independent, that T_{i,k} has an exponential distribution, and that E T_{i,k} = θ_i/Δ. As every state of X has an exponentially distributed duration, we may construct a path of X through the approach introduced by Theorem 5.4 in [8]. The path of X restricted to t ∈ [0, Δ) has the following form: and X_t 1_{0≤t≤Δ} = 2 for Δ(T_{1,1} + T_{2,1}) ≤ t < Δ. If m = lim_{t↑kΔ} X_t ≠ 2 for k ≥ 1, we can construct the path of X restricted to t ∈ [kΔ, (k + 1)Δ) given by (5). If lim_{t↑kΔ} X_t = 2, that is, the system is failed during (kΔ, (k + 1)Δ), a new system replaces it at time (k + 1)Δ and a path of X restricted to t ∈ [kΔ, (k + 1)Δ) is obtained by setting m = 1 in (5). In this paper, the complete data set Our choice of the complete data set D ensures a simple form of the operator Q. The log-likelihood function has the following form: .
Moreover, the Markov property of the process X implies that where Y(0) = 0.
In Table 1, we denote by n_i the number of different values Here the following forms of M^m_i for m = 1, 2 follow from (7) and Table 1. Consider For any given θ, it is obvious that the function Moreover, it follows from the definition (3) . Therefore, every fixed point of M in Θ is a fixed point of the explicit operator M̃ given by (15), and vice versa. In this paper, we will prove Theorem 1 by studying the number of fixed points of M̃ in Θ. The form of M̃(θ) = (M̃_1(θ), M̃_2(θ)), derived from (14), is given in (15).

Two Lemmas
Lemma 2. Consider the following: we have M^m As 0 ≤ x_m ≤ 1 on σ, we have that 0 ≤ V ≤ D and 0 ≤ M^m_4 ≤ 1. Therefore, Moreover, we have As 0 ≤ x_i ≤ 1 on the region σ, it follows from the definition of V that there is 0 < η_1 < 1 such that Similarly, there is η_2 ∈ (0, 1) such that The result follows from (19), (20), and (22).
It follows from (29) to (35), (15), and Lemma 2 that Then it follows from Δ < √2 α/2 that S_1 ≤ 1 and S_2 ≤ 1. The record has at least one replacement; that is, there is k ≥ 1 such that R(k) = 1. As a new system replaces the failed system at time kΔ, we have Y(k) = 0. We now complete the proof of the theorem by considering two cases.

Example and Discussion
We apply the EM algorithm to a simulated dataset. This example illustrates the efficiency and accuracy of the EM algorithm, as well as some limitations and shortcomings of Theorem 1.
In this example, we simulate 10^3 consecutive inspections of a backup system defined by (1). The true parameters are θ_1 = 10 and θ_2 = 10/3, which are adopted from [2]. We first describe the iterations of the EM algorithm started from an initial value θ(0). If we are fortunate and can repeat the operation again and again, then we obtain a sequence θ(n) ∈ Θ, n = 1, 2, . . ..
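A record of this kind could be generated along the following lines. This is a minimal sketch under assumptions that this section does not state: sojourn times in states 0 and 1 are taken to be exponential with means θ_1 and θ_2 (as suggested by the path construction with E T = θ_i/Δ), the value Δ = 1 is an arbitrary choice, and the function name `simulate_inspections` is hypothetical.

```python
import random


def simulate_inspections(theta1, theta2, delta, n, seed=0):
    """Simulate n inspections of the three-state system; return lists (Y, R).

    Assumption (not confirmed by the text): sojourn times in states 0 and 1
    are exponential with means theta1 and theta2.
    """
    rng = random.Random(seed)
    Y, R = [], []
    state = 0  # the system starts in the healthy state
    for _ in range(n):
        remaining = delta
        # Resampling the sojourn time at each inspection is valid because
        # the exponential distribution is memoryless.
        while state < 2:
            mean = theta1 if state == 0 else theta2
            sojourn = rng.expovariate(1.0 / mean)
            if sojourn >= remaining:
                break  # no further jump before the next inspection
            remaining -= sojourn
            state += 1
        if state == 2:
            # Failure during the interval: record the replacement,
            # and the new system is observed in state 0.
            R.append(1)
            Y.append(0)
            state = 0
        else:
            R.append(0)
            Y.append(state)
    return Y, R


# Delta = 1.0 is an assumed value; the thetas are the true parameters quoted above.
Y, R = simulate_inspections(theta1=10.0, theta2=10.0 / 3.0, delta=1.0, n=1000)
```

Only Y and R are returned, mirroring the fact that the path of X between inspections is hidden from the estimator.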
It follows from the expression (8) of Q(θ' | θ) that Q is continuous with respect to both θ' and θ. As in the discussion of Theorem 1 in [4], we can prove that if the limit of the sequence θ(n) ∈ Θ, n = 1, 2, . . ., exists and is also in Θ, then the limit is a fixed point of the operator M. Theorem 1 shows that the limit is unique for all such sequences.
In the first experiment, we run the EM algorithm for different initial values. We set β = 1 and the parameter space Θ = {(x, y) : x, y > β}. We run the algorithm for 64 pairs of initial values chosen randomly between 2 and 12. For each pair of initial values, we run the algorithm for 200 iterations. Figure 1 shows the final estimates of the parameters for these initial values. We can see that the algorithm converges to the same result for a wide range of initial values. Since Theorem 1 states that there is at most one fixed point, we conclude that there is a unique fixed point, and hence it is the MLE of the model parameters on the parameter space Θ.
Sometimes, the above procedure for θ(n), n = 1, 2, . . ., must stop without producing estimated parameters. In general, for θ(n) ∈ Θ, if the value M̃(θ(n)) of the explicit operator derived from (15) is not an element of Θ, then M̃(θ(n)) ≠ M(θ(n)). In this case, the solution M(θ(n)) to (3) is on the boundary of Θ. As we do not have an explicit expression for M(θ(n)) in this case, the procedure is aborted.
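The multi-start procedure with its abort rule can be sketched as follows. The update used here, `toy_update`, is a hypothetical stand-in (an affine contraction with fixed point (10, 10/3)) and is not the operator derived in this paper; `run_em` is likewise an illustrative name. The sketch shows only the control flow: iterate, check membership in Θ, and abort when the explicit update leaves Θ.

```python
import random


def run_em(update, theta0, beta, max_iter=200):
    """Iterate theta <- update(theta) on Theta = {(x, y) : x, y > beta}.

    Returns (theta, iterations). theta is None if the explicit update
    left Theta, in which case iterations counts the completed steps.
    """
    theta = theta0
    for n in range(max_iter):
        candidate = update(theta)
        if not all(c > beta for c in candidate):
            return None, n  # aborted: candidate is outside Theta
        theta = candidate
    return theta, max_iter


def toy_update(theta):
    # Hypothetical stand-in, NOT the paper's operator:
    # an affine contraction whose unique fixed point is (10, 10/3).
    t1, t2 = theta
    return (0.5 * t1 + 5.0, 0.5 * t2 + 5.0 / 3.0)


rng = random.Random(0)

# First setting: beta = 1, 64 random starting pairs between 2 and 12.
starts = [(rng.uniform(2, 12), rng.uniform(2, 12)) for _ in range(64)]
finals = [run_em(toy_update, s, beta=1.0)[0] for s in starts]

# Second setting: beta = 12, 32 random starting pairs between 20 and 80.
starts2 = [(rng.uniform(20, 80), rng.uniform(20, 80)) for _ in range(32)]
aborted = [run_em(toy_update, s, beta=12.0) for s in starts2]
```

With this stand-in, every run in the first setting ends at the same point, while every run in the second setting is aborted because the fixed point lies outside the chosen parameter space.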
In the second experiment, using the same dataset as in the first experiment, we run the EM algorithm for another parameter space Θ = {(x, y) : x, y > β} with β = 12. We run the algorithm for 32 pairs of initial values chosen randomly between 20 and 80. As predicted, the procedure is aborted for every pair of initial values. For these initial values, Figure 2 shows the number of iterations completed before the procedure is aborted.
As we know, the MLE θ̂ in Θ, when it exists, is one of the fixed points of the operator M. However, there may be other fixed points of M, such as stationary points of Q. Theorem 1 provides a sufficient condition under which such additional fixed points do not exist. Our first experiment and Figure 1 illustrate this fact. However, in some cases, M has no fixed point in the parameter space Θ. This occurs when the parameter space is chosen unsuitably, as in the second experiment.

Conclusion
A parameter estimation problem for a backup system has been considered. We established an EM algorithm that can be used to iteratively determine the maximum likelihood estimators given observations of the system at discrete time points. We found that, for any initial values, the sequence derived by the EM algorithm converges to a unique point whenever its limit belongs to the specified parameter space.