Statistical Inference in Dependent Component Hybrid Systems with Masked Data

Complex systems are usually composed of simple hybrid systems. In this paper, we consider statistical inference for two fundamental hybrid systems: series-parallel and parallel-series systems based on masked data. Assuming dependent lifetimes of components modelled by Marshall and Olkin’s bivariate exponential distribution in the system, we present maximum likelihood and interval estimation of parameters of interest. Intensive simulation studies are performed to demonstrate the efficiency of the methods.


Introduction
In a system consisting of several components, reliability analyses are usually carried out by analyzing lifetime data. The system data include two parts: (i) the system's lifetime and (ii) the failure reason, that is, which component causes the system failure. In real situations, however, the failure reason may not be revealed because of a shortage of funds, limited time, recording errors, a lack of diagnostic tools, or destructive consequences caused by the failure of some components. For example, in reliability problems of computers and integrated circuits, the system failure is often attributed to a module containing several components, but one cannot determine exactly which component causes the system failure. Therefore, the observable data from the test include the failure time and a failure reason that is known only up to a subset of components. In these cases, the reason for the failure of the system is masked and the lifetime data are called masked data.
The statistical analysis of masked data has a long history. Usher and Hodgson [1] initially proposed parameter estimation under masked data. Since then, a significant amount of literature has emerged on various models. For series systems with constant, linear, and polynomial failure-rate components in the presence of masked data, maximum likelihood (ML) and other estimation methods have been studied by many researchers (e.g., [2][3][4][5][6]). Sarhan and El-Bassiouny [7] considered a parallel system using masked data. Bayes methods with various priors were also used for the estimation of parameters in series and parallel systems (see, e.g., Sarhan [8][9][10], Jiang and Zhang [11]). El-Gohary [12] discussed a series system with two dependent components in a Bayesian approach. So far, most research on masked data has focused on systems that are either purely series or purely parallel and has assumed independent and identically distributed component lifetimes. In many real situations, however, a "hybrid" system is often seen, in which the working components are connected through a combination of series and parallel structures. For example, current air supply systems are generally of modular design, where the power system consists of a number of semiconductor units combined in a series or hybrid manner [13,14].
Complex systems are usually composed of simple subsystems such as the three-component series-parallel and parallel-series systems illustrated in Figure 1. In this paper, we mainly focus on statistical inference for these two fundamental hybrid systems, in which the component lifetimes are nonindependent and nonidentically distributed. For the two systems, we first note that the system failure is attributed to one of four failure causes, namely components 1, 2, 3, and 12, where 12 denotes the simultaneous failure of components 1 and 2. Let $S$ be the set of all nine events causing the system failure; that is,
$$S = \{\{1\}, \{2\}, \{3\}, \{12\}, \{1,2\}, \{1,3\}, \{2,3\}, \{12,3\}, \{1,2,3\}\}. \tag{1}$$
If an observed event $s \in S$ consists of more than one element, then the reason for the system failure is not exact and the life data are masked. Notice that here we differentiate the two occurrences $\{1,2\}$ and $\{12\}$ by assuming, in the next section, different independent processes damaging component 1 only, component 2 only, and both components. We make statistical inference on the parameters with likelihood-based methods in the presence of masked data. Section 2 presents the life distribution and reliability for the hybrid systems. Section 3 concentrates on the parameter estimation for the series-parallel and parallel-series systems, respectively. In Section 4, we assess the performance of the methods in simulation studies. Lastly, we conclude the paper with a brief discussion in Section 5.

Model Specification
For the three-component hybrid system in Figure 1, there is a subsystem consisting of components 1 and 2. From a practical viewpoint, the lifetimes of the components in the subsystem are usually dependent on each other and independent of component 3 outside the subsystem. The unit lifetime model is addressed in the following.

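Life Distribution. A bivariate model was developed by Marshall and Olkin [15] to describe the correlated lifetimes of two units and is widely used for two-component systems. Basically, it is assumed that the two-component system is affected by "fatal shocks" governed by three different independent Poisson processes with parameters $\lambda_1$, $\lambda_2$, and $\lambda_{12}$, which damage component 1 only, component 2 only, and both components simultaneously, respectively. The joint density function of the lifetimes $(T_1, T_2)$ is
$$f(t_1, t_2) = \begin{cases} \lambda_1(\lambda_2 + \lambda_{12})\, e^{-\lambda_1 t_1 - (\lambda_2 + \lambda_{12}) t_2}, & 0 < t_1 < t_2, \\ \lambda_2(\lambda_1 + \lambda_{12})\, e^{-(\lambda_1 + \lambda_{12}) t_1 - \lambda_2 t_2}, & 0 < t_2 < t_1. \end{cases} \tag{2}$$
The probability of both components failing at time $t$ corresponds to the probability mass of the singular part
$$f(t, t) = \lambda_{12}\, e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t}, \quad t > 0. \tag{3}$$
Component 3 is shocked by another independent Poisson process with parameter $\lambda$, and so its lifetime is exponentially distributed with the density $f_3(t_3) = \lambda e^{-\lambda t_3}$, $t_3 > 0$, $\lambda > 0$.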
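The fatal-shock construction can be sketched in a few lines of Python. This is a minimal illustration under assumed values, not the authors' simulation code: the function name `sample_marshall_olkin`, the seed, and the rates are chosen only for the example. The two printed pairs compare simulated quantities with the values implied by the construction, namely the marginal mean $1/(\lambda_1 + \lambda_{12})$ of $T_1$ and the probability $\lambda_{12}/(\lambda_1 + \lambda_2 + \lambda_{12})$ of a simultaneous failure.

```python
import random

def sample_marshall_olkin(lam1, lam2, lam12, rng):
    """Draw (T1, T2) from the fatal-shock construction (illustrative helper)."""
    u1 = rng.expovariate(lam1)    # first shock that hits component 1 only
    u2 = rng.expovariate(lam2)    # first shock that hits component 2 only
    u12 = rng.expovariate(lam12)  # first shock that hits both components
    return min(u1, u12), min(u2, u12)

rng = random.Random(2024)                 # assumed seed, for reproducibility
lam1, lam2, lam12 = 1.0, 1.0, 0.5         # assumed rates
draws = [sample_marshall_olkin(lam1, lam2, lam12, rng) for _ in range(200_000)]

mean_t1 = sum(t1 for t1, _ in draws) / len(draws)
tie_fraction = sum(t1 == t2 for t1, t2 in draws) / len(draws)

print(mean_t1, 1.0 / (lam1 + lam12))                # marginal mean of T1
print(tie_fraction, lam12 / (lam1 + lam2 + lam12))  # mass of the singular part
```

Ties $T_1 = T_2$ occur with positive probability because both lifetimes equal the common shock time whenever that shock arrives first, which is exactly the singular part of the distribution.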

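Reliability and Density of Hybrid System. First we briefly introduce the concept of masked probability. Assume that a masked event $M = s \in S$ is observed while the exact failure cause is $K$ in the hybrid system; then the probability of failure due to the masked occurrence $s$ at time $t$ is
$$P(t < T \le t + dt,\, M = s) = \sum_{j \in s} P(M = s \mid t < T \le t + dt,\, K = j)\, P(t < T \le t + dt,\, K = j), \tag{4}$$
where $P(M = s \mid t < T \le t + dt,\, K = j)$ is called the masked probability and $P(t < T \le t + dt,\, K = j)$ is the probability of system failure caused by component(s) $j$ at time $t$, $j = 1, 2, 3, 12$. In statistical analysis of masked data, it is usually assumed that the masked occurrence is independent of the cause and failure time; that is,
$$P(M = s \mid t < T \le t + dt,\, K = j) = P(M = s) = p_s, \quad j \in s. \tag{5}$$
The lifetime of the series-parallel system in Figure 1(a) is $T = \min(\max(T_1, T_2), T_3)$. With the assumption that $(T_1, T_2)$ follows the Marshall-Olkin bivariate exponential distribution with parameters $(\lambda_1, \lambda_2, \lambda_{12})$ and an independent $T_3 \sim \exp(\lambda)$, the reliability at time $t$ is
$$R(t) = e^{-\lambda t}\left[e^{-(\lambda_1 + \lambda_{12}) t} + e^{-(\lambda_2 + \lambda_{12}) t} - e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t}\right], \tag{6}$$
and the probability densities of failure at time $t$ due to each event are
$$\begin{aligned} f_1(t) &= (\lambda_1 + \lambda_{12})\, e^{-(\lambda_1 + \lambda_{12} + \lambda) t}\,(1 - e^{-\lambda_2 t}), \\ f_2(t) &= (\lambda_2 + \lambda_{12})\, e^{-(\lambda_2 + \lambda_{12} + \lambda) t}\,(1 - e^{-\lambda_1 t}), \\ f_3(t) &= \lambda\, e^{-\lambda t}\left[e^{-(\lambda_1 + \lambda_{12}) t} + e^{-(\lambda_2 + \lambda_{12}) t} - e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t}\right], \\ f_{12}(t) &= \lambda_{12}\, e^{-(\lambda_1 + \lambda_2 + \lambda_{12} + \lambda) t}. \end{aligned} \tag{7}$$
Likewise, for the parallel-series system with three components as shown in Figure 1(b), the system life becomes $T = \max(\min(T_1, T_2), T_3)$. Therefore, the reliability is
$$R(t) = e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t} + e^{-\lambda t} - e^{-(\lambda_1 + \lambda_2 + \lambda_{12} + \lambda) t}, \tag{8}$$
and the probability densities of failure at time $t$ due to each case are
$$\begin{aligned} f_j(t) &= \lambda_j\, e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t}\,(1 - e^{-\lambda t}), \quad j = 1, 2, 12, \\ f_3(t) &= \lambda\, e^{-\lambda t}\left[1 - e^{-(\lambda_1 + \lambda_2 + \lambda_{12}) t}\right]. \end{aligned} \tag{9}$$
Finally, the density function for the system at $t$ due to the masked occurrence $s$ can be expressed as $P(t < T \le t + dt,\, M = s) = p_s \sum_{j \in s} f_j(t)\, dt$. The likelihood-based parameter inference for the two hybrid systems is presented in the following.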
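As a sanity check on this part of the model, the sketch below codes the reliability and the cause-specific failure densities of both systems and verifies numerically that the densities over the four causes sum to $-R'(t)$. The closed forms in the code are derived here from $T = \min(\max(T_1, T_2), T_3)$ and $T = \max(\min(T_1, T_2), T_3)$ under the fatal-shock model, so they should be read as a reconstruction under that assumption; the function names and parameter values are illustrative.

```python
import math

def sp_reliability(t, l1, l2, l12, lam):
    """R(t) for the series-parallel system, T = min(max(T1, T2), T3)."""
    pair_alive = (math.exp(-(l1 + l12) * t) + math.exp(-(l2 + l12) * t)
                  - math.exp(-(l1 + l2 + l12) * t))   # P(max(T1, T2) > t)
    return math.exp(-lam * t) * pair_alive

def sp_densities(t, l1, l2, l12, lam):
    """Cause-specific failure densities for the series-parallel system."""
    alive3 = math.exp(-lam * t)                        # component 3 still working
    return {
        "1": (l1 + l12) * math.exp(-(l1 + l12) * t) * (1 - math.exp(-l2 * t)) * alive3,
        "2": (l2 + l12) * math.exp(-(l2 + l12) * t) * (1 - math.exp(-l1 * t)) * alive3,
        "12": l12 * math.exp(-(l1 + l2 + l12) * t) * alive3,
        "3": lam * sp_reliability(t, l1, l2, l12, lam),
    }

def ps_reliability(t, l1, l2, l12, lam):
    """R(t) for the parallel-series system, T = max(min(T1, T2), T3)."""
    tot = l1 + l2 + l12
    return math.exp(-tot * t) + math.exp(-lam * t) - math.exp(-(tot + lam) * t)

def ps_densities(t, l1, l2, l12, lam):
    """Cause-specific failure densities for the parallel-series system."""
    no_shock = math.exp(-(l1 + l2 + l12) * t)          # pair untouched before t
    dead3 = 1 - math.exp(-lam * t)                     # component 3 already failed
    return {
        "1": l1 * no_shock * dead3,
        "2": l2 * no_shock * dead3,
        "12": l12 * no_shock * dead3,
        "3": lam * math.exp(-lam * t) * (1 - no_shock),
    }

def max_density_gap(reliability, densities, params, grid, h=1e-6):
    """Largest |sum_j f_j(t) + R'(t)| over the grid (central difference for R')."""
    gaps = []
    for t in grid:
        minus_r_prime = (reliability(t - h, *params) - reliability(t + h, *params)) / (2 * h)
        gaps.append(abs(sum(densities(t, *params).values()) - minus_r_prime))
    return max(gaps)

params = (1.0, 0.8, 0.5, 1.2)                          # assumed (l1, l2, l12, lam)
grid = [0.1 * k for k in range(1, 31)]
sp_gap = max_density_gap(sp_reliability, sp_densities, params, grid)
ps_gap = max_density_gap(ps_reliability, ps_densities, params, grid)
print(sp_gap, ps_gap)   # both should be at the level of numerical noise
```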

Parameter Estimation
In our statistical inference, two common censoring schemes are considered: type-I and type-II. For $n$ tested systems, after reordering the failure times, we assume that there are $n_k$ system failures due to the $k$th mechanism in $S$, $k = 1, \ldots, 9$, with the failure times
$$t_1, \ldots, t_{r_1},\; t_{r_1 + 1}, \ldots, t_{r_2},\; \ldots,\; t_{r_8 + 1}, \ldots, t_{r_9},$$
where $r_k = n_1 + \cdots + n_k$. There are in total $r = \sum_{k=1}^{9} n_k = r_9$ observed failure times and $n - r$ censored observations. For type-I censoring, the test continues until a prespecified time $\tau$ is reached, by which $r$ systems have been observed to fail; whereas for type-II censoring, the test is carried out until the prespecified $n_k$ system failures for the $k$th mechanism have occurred, and so the test termination time is $\tau = \max(t_1, \ldots, t_{r_1}, t_{r_1 + 1}, \ldots, t_{r_2}, \ldots, t_{r_9})$. For both cases, we express the observed life data as $D = \{t_1, \ldots, t_{r_1}, t_{r_1 + 1}, \ldots, t_{r_2}, \ldots, t_{r_9}, \tau\}$. The corresponding masked failure event is $s_i \in S$ for system $i$, with masked probability $p_i = P(M_i = s_i)$, $i = 1, 2, \ldots, r$. The probability density of system $i$ at $t_i$ for each case in (7) and (9) is expressed as $f_{ij} = f_j(t_i)$, $j = 1, 2, 3, 12$, indicating the failure due to components 1, 2, and 3 and both components 1 and 2, respectively. Finally, the density function for system $i$ at $t_i$ becomes $p_i \sum_{j \in s_i} f_{ij}$. Therefore, the applicable unified likelihood function for both hybrid systems and censoring schemes is
$$L(\theta \mid D) = C \prod_{i=1}^{r} \Big(\sum_{j \in s_i} f_{ij}\Big)\, [R(\tau)]^{n - r}, \tag{10}$$
where the constant $C = \prod_{i=1}^{r} p_i$ does not contain the parameters of interest in $f_{ij}$.
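The unified likelihood can be sketched as a small Python function. This is an illustration under stated assumptions rather than the authors' implementation: each observed failure contributes the sum of the cause-specific densities over its masked set, each of the $n - r$ censored systems contributes the reliability at the termination time, and the constant factor of masked probabilities is dropped. The helper names, the example data, and the specialization to the series-parallel system with equal rates are all hypothetical choices made for the demonstration.

```python
import math

def masked_log_likelihood(theta, failures, n, tau, densities, reliability):
    """Log-likelihood for masked data, without the constant log C.

    failures    : list of (t_i, s_i); s_i is a tuple of cause labels
    n           : number of systems on test
    tau         : test termination (censoring) time
    densities   : callable (t, theta) -> dict of cause-specific densities
    reliability : callable (t, theta) -> R(t)
    """
    total = 0.0
    for t, masked_set in failures:
        f = densities(t, theta)
        total += math.log(sum(f[j] for j in masked_set))
    censored = n - len(failures)
    if censored > 0:
        total += censored * math.log(reliability(tau, theta))
    return total

# Assumed special case: series-parallel system, all four shock rates equal to lam.
def sp_equal_densities(t, lam):
    base = lam * math.exp(-4 * lam * t)
    growth = math.exp(lam * t)
    return {"1": base * (2 * growth - 2), "2": base * (2 * growth - 2),
            "3": base * (2 * growth - 1), "12": base}

def sp_equal_reliability(t, lam):
    return math.exp(-4 * lam * t) * (2 * math.exp(lam * t) - 1)

# Hypothetical data: 4 observed failures among n = 6 systems, test stopped at 0.7.
failures = [(0.21, ("1",)), (0.35, ("1", "2")),
            (0.40, ("12", "3")), (0.62, ("1", "2", "3"))]
value = masked_log_likelihood(1.0, failures, 6, 0.7,
                              sp_equal_densities, sp_equal_reliability)
print(value)
```

Passing the densities and the reliability as callables keeps the same function usable for both hybrid systems and both censoring schemes, which mirrors the "unified" form of the likelihood.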
For simplicity, we only consider two special cases of failure rates: (1) the components are shocked by independent Poisson processes with the same parameter; that is, $\lambda_1 = \lambda_2 = \lambda_{12} = \lambda$; (2) the Poisson processes affecting the three components individually have the same parameter, which differs from that of the Poisson process acting on components 1 and 2 simultaneously; that is, $\lambda_1 = \lambda_2 = \lambda \ne \lambda_{12}$. The maximum likelihood estimation (MLE) approach is implemented for the inference. To simplify notation, we denote the log-likelihood function as $\ell(\theta) = \log L(\theta \mid D)$, where $\theta$ is the parameter of failure rates included in the life densities. We also apply the approximate chi-squared likelihood ratio statistic [16] to numerically obtain the confidence intervals of the parameters. In particular, for our case, the likelihood ratio statistic for the parameter $\theta$ satisfies
$$-2\,[\ell(\theta) - \ell(\hat{\theta})] \sim \chi^2_{\nu} \quad \text{approximately},$$
where $\theta = \lambda$ or $\theta = (\lambda, \lambda_{12})$ with MLE $\hat{\theta}$, and $\nu$ is the dimension of $\theta$. In general, this method works well even for small sample sizes; that is, the coverage probability of the constructed interval is very close to the nominal confidence level.
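For a scalar parameter, the likelihood ratio interval consists of all values whose log-likelihood lies within half the chi-squared critical value of the maximum, so its endpoints are the two roots of $2[\ell(\hat{\theta}) - \ell(\theta)] = \chi^2_{1,0.95}$. The sketch below finds those roots by bisection. It uses a stand-in log-likelihood for a complete exponential sample, not the hybrid-system likelihood, and the helper names and data are hypothetical; any unimodal log-likelihood of a positive scalar parameter could be plugged in instead.

```python
import math

CHI2_1_95 = 3.841458820694124  # 0.95 quantile of chi-squared with 1 d.f.

def bisect_root(func, lo, hi, iterations=200):
    """Root of func on [lo, hi], assuming func(lo) and func(hi) differ in sign."""
    lo_positive = func(lo) > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lr_interval(loglik, mle, crit=CHI2_1_95):
    """Likelihood ratio confidence interval for a positive scalar parameter."""
    def target(theta):
        return 2.0 * (loglik(mle) - loglik(theta)) - crit
    lo = mle / 2.0
    while target(lo) < 0:        # move left until the LR statistic exceeds crit
        lo /= 2.0
    hi = mle * 2.0
    while target(hi) < 0:        # move right until the LR statistic exceeds crit
        hi *= 2.0
    return bisect_root(target, lo, mle), bisect_root(target, mle, hi)

# Stand-in example: complete exponential sample (hypothetical data).
times = [0.3, 1.1, 0.7, 2.0, 0.9]

def exp_loglik(lam):
    return len(times) * math.log(lam) - lam * sum(times)

mle = len(times) / sum(times)
lower, upper = lr_interval(exp_loglik, mle)
print(mle, lower, upper)
```

For a two-dimensional parameter such as $(\lambda, \lambda_{12})$, the same idea applies with the critical value taken from the chi-squared distribution with the appropriate degrees of freedom, although the region is then a contour rather than a pair of endpoints.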

Series-Parallel System
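(1) $\lambda_1 = \lambda_2 = \lambda_{12} = \lambda$. Based on the reliability in (6) and the densities in (7), the likelihood function (10), for $r$ observed failures at times $t_i$ with masked events $s_i$ and $n - r$ observations censored at $\tau$, becomes
$$L(\lambda \mid D) = C\, \lambda^{r} \exp\Big\{-4\lambda\Big[\sum_{i=1}^{r} t_i + (n - r)\tau\Big]\Big\} \prod_{i=1}^{r} h_{s_i}(t_i)\; (2e^{\lambda \tau} - 1)^{n - r},$$
where $h_s(t) = \sum_{j \in s} h_j(t)$ with $h_1(t) = h_2(t) = 2e^{\lambda t} - 2$, $h_3(t) = 2e^{\lambda t} - 1$, and $h_{12}(t) = 1$. So, the log-likelihood can be simplified as
$$\ell(\lambda) = \log C + r \log \lambda - 4\lambda\Big[\sum_{i=1}^{r} t_i + (n - r)\tau\Big] + \sum_{i=1}^{r} \log h_{s_i}(t_i) + (n - r) \log(2e^{\lambda \tau} - 1),$$
where, for example, a failure masked as $\{1, 2\}$ contributes the term $\log(4e^{\lambda t_i} - 4)$, and its derivative with respect to $\lambda$ is
$$\ell'(\lambda) = \frac{r}{\lambda} - 4\Big[\sum_{i=1}^{r} t_i + (n - r)\tau\Big] + \sum_{i=1}^{r} \frac{\partial}{\partial \lambda} \log h_{s_i}(t_i) + \frac{2(n - r)\tau}{2 - e^{-\lambda \tau}}.$$
Since no analytical form of the MLE $\hat{\lambda}$ can be obtained from the equation $\ell'(\lambda) = 0$, a numerical method has to be implemented for specific data observations. The uniqueness of the MLE can be justified in the following way: the terms involving exponentials in $\ell'(\lambda)$ can be expressed in a unified functional form $g(\lambda) = a/(b - c e^{-\lambda t})$ with positive constants $a$, $b$, and $c$. Since $g'(\lambda) = -a c t e^{-\lambda t}/(b - c e^{-\lambda t})^2 < 0$, we have $\ell''(\lambda) < 0$. Hence, the log-likelihood function $\ell(\lambda)$ is strictly concave and therefore $\ell'(\lambda) = 0$ implies a unique MLE $\hat{\lambda}$. Additionally, $\ell'(\lambda) \to +\infty$ as $\lambda \to 0^{+}$ and $\lim_{\lambda \to \infty} \ell'(\lambda) < 0$, and so the MLE $\hat{\lambda}$ is a positive value.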
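Because the derivative of the log-likelihood is strictly decreasing in this case, its root can be located by simple bisection. The sketch below does this for the series-parallel system with all shock rates equal to $\lambda$. It is a minimal illustration and not the authors' code: the score function is derived here from the cause-specific densities under the equal-rate assumption, each masked set $s$ is summarized by coefficients $(A, B)$ so that its likelihood factor is proportional to $A e^{\lambda t} - B$, and the data are hypothetical.

```python
import math

# (A, B) per cause so that the factor is A * exp(lam * t) - B (equal-rate case).
COEF = {"1": (2, 2), "2": (2, 2), "3": (2, 1), "12": (0, -1)}

def score(lam, failures, n, tau):
    """Derivative of the log-likelihood with respect to lam."""
    r = len(failures)
    total = r / lam - 4 * (sum(t for t, _ in failures) + (n - r) * tau)
    for t, masked_set in failures:
        a = sum(COEF[j][0] for j in masked_set)
        b = sum(COEF[j][1] for j in masked_set)
        if a > 0:  # the set {12} alone has a constant factor and adds nothing
            total += a * t / (a - b * math.exp(-lam * t))
    total += (n - r) * 2 * tau / (2 - math.exp(-lam * tau))  # censored systems
    return total

def solve_mle(failures, n, tau):
    """Bisection on the strictly decreasing score function."""
    lo, hi = 1e-8, 1.0
    while score(hi, failures, n, tau) > 0:   # expand until the score is negative
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid, failures, n, tau) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: 4 observed failures among n = 6 systems, test stopped at 0.7.
failures = [(0.21, ("1",)), (0.35, ("1", "2")),
            (0.40, ("12", "3")), (0.62, ("1", "2", "3"))]
lam_hat = solve_mle(failures, 6, 0.7)
print(lam_hat, score(lam_hat, failures, 6, 0.7))
```

Bisection is chosen here only because monotonicity of the score guarantees convergence without tuning; Newton-type iterations would also work and typically converge faster.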

Simulation Study
In this section, we conduct a simulation study to investigate the performance of our methodology. We choose two parameter values of failure rates for each case in the two hybrid systems; that is, $\lambda = 1.0, 0.8$ in the case of the same failure rates, and $(\lambda, \lambda_{12}) = (1.0, 0.5), (0.8, 0.4)$ for the case of different failure rates. Under each setting of parameter values, we generate the lifetimes $T_j$, $j = 1, 2, 3$, following the construction described in Section 2.1 under two sample sizes $n = 24, 30$. For each sample size, two complete samples ($n = r = \sum_{k=1}^{9} n_k$) with two settings of failure numbers $n_k$ and two censored samples with two failure numbers ($r = 20, 21$ for $n = 24$ and $r = 26, 28$ for $n = 30$) are considered to determine the effects of the sample size $n$ and of the variation in $n_k$ on the estimation precision. We conduct 10,000 Monte Carlo simulations for each setting of parameter value, sample size, and failure number. The averaged MLE, mean squared error (MSE), length of the 95% confidence interval, and coverage probability are displayed in Tables 1 and 2 for the series-parallel system and Tables 3 and 4 for the parallel-series system.
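The data-generating step can be sketched as follows for the series-parallel system. This is only an illustration of the fatal-shock construction with assumed settings: it draws unmasked lifetimes and exact causes, and it does not reproduce the masking step or the fixed failure numbers $n_k$ used in the study. The function name, the seed, and the evaluation time are hypothetical. The simulated reliability at $t = 0.5$ is compared with the value obtained from the model when all rates equal 1.

```python
import math
import random
from collections import Counter

def simulate_series_parallel(lam1, lam2, lam12, lam3, rng):
    """Return (T, cause) for one system with T = min(max(T1, T2), T3)."""
    u1, u2 = rng.expovariate(lam1), rng.expovariate(lam2)
    u12, t3 = rng.expovariate(lam12), rng.expovariate(lam3)
    t1, t2 = min(u1, u12), min(u2, u12)
    pair = max(t1, t2)                   # life of the parallel pair
    if t3 < pair:
        return t3, "3"                   # component 3 fails first
    if t1 == t2:
        return pair, "12"                # common shock killed both at once
    return pair, ("1" if t1 > t2 else "2")   # the later of the two fails the pair

rng = random.Random(7)                   # assumed seed
n_sim = 100_000
sample = [simulate_series_parallel(1.0, 1.0, 1.0, 1.0, rng) for _ in range(n_sim)]

t0 = 0.5
empirical = sum(t > t0 for t, _ in sample) / n_sim
closed_form = math.exp(-4 * t0) * (2 * math.exp(t0) - 1)   # equal-rate reliability
cause_share = {c: k / n_sim for c, k in Counter(c for _, c in sample).items()}

print(empirical, closed_form)
print(cause_share)
```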
In each table, the estimation results in the upper panel correspond to the complete samples and those in the lower panel to the censored samples. The estimates appear reasonably good under these relatively small sample sizes, and all the coverage probabilities of the confidence intervals exceed the nominal confidence level, indicating that interval estimation by the chi-squared likelihood ratio statistic is conservative. As expected, under the same sample size $n$, the MSEs and interval lengths are smaller in complete samples than those in censored samples. Due to the scale of the true parameter values, we noticed that, given the same sample size $n$ and failure numbers $n_k$, the MSE and interval length of estimates under larger true parameter values are consistently larger than those under smaller true values. In Table 1, for example, given $n = 24$, $r = 24$, the MSE = 0.0267 and the 95% confidence interval length = 0.7164 when $\lambda = 1$, whereas the MSE = 0.0177 and the length = 0.5664 when $\lambda = 0.8$. However, for a fair comparison of the variability of estimates with different units or different parameter values, one should use a relative variability measure such as the coefficient of variation instead of a measure of dispersion like the MSE or interval length. In our case, we propose a "normalized" measure of dispersion, length/estimate, to remove the scale effect for the comparison. As a result, the estimation results mentioned above give normalized lengths of 0.7164/1.0481 = 0.6818 and 0.5664/0.8308 = 0.6817, respectively, which are very close to each other. Similar outcomes are obtained for other estimation results across the tables, indicating a consistent precision for the estimation procedure.
Additionally, other findings can be seen from the estimation results. (i) For the complete samples, the upper panels in the tables show that, given the same size $n$, the MSEs and interval lengths are consistently smaller in the setting with larger variation of the $n_k$ than those in the setting with less variation of the $n_k$. In other words, the estimates are more efficient under "unbalanced" failure numbers (the $n_k$ vary widely) than under "balanced" failure numbers (the $n_k$ are close to each other). A possible reason is that the likelihood function with "unbalanced" failure numbers is less dispersed and therefore carries more information about the parameters. (ii) For the censored samples, the MSE and interval length become smaller as the sample size $n$ and failure number $r$ become larger. For example, for the true parameter values $(\lambda, \lambda_{12}) = (1.0, 0.5)$ in the lower panel of Table 2, when $n = 30$, $r = 28$, the MSE$(\hat{\lambda}, \hat{\lambda}_{12}) = (0.0196, 0.0082)$ and the interval lengths for $\lambda$ and $\lambda_{12}$ are 0.7216 and 0.5545, respectively, while the corresponding MSE$(\hat{\lambda}, \hat{\lambda}_{12}) = (0.0257, 0.0182)$ and the interval lengths for $\lambda$ and $\lambda_{12}$ are 0.8288 and 0.5792 under $n = 24$, $r = 21$. Furthermore, given the sample size $n = 24$, the MSE and interval lengths under $r = 21$ are smaller than those under $r = 20$, where the MSE$(\hat{\lambda}, \hat{\lambda}_{12}) = (0.0260, 0.0188)$ and the interval lengths of $\lambda$ and $\lambda_{12}$ are 0.8344 and 0.5909. In summary, the results indicate that the estimates are more accurate if more failures are observed.

Conclusions and Discussions
In this paper, we have studied statistical inference for three-component hybrid systems based on masked data, for which the lifetimes of units are nonindependent and nonidentically distributed. Two common censoring schemes, type-I and type-II, were considered in the analysis. We have presented the maximum likelihood estimates of parameters when the failure rates of the three components in the hybrid system were assumed to be the same and different, respectively.
In addition, we obtained approximate interval estimates of the parameters by using the likelihood ratio statistic. We have assessed the performance of the estimation methods by simulation studies. The results have demonstrated that the procedure can achieve good estimation performance under small and moderate sample sizes, and that the estimates are more accurate if more failures are observed, indicating the efficiency of the estimation method. While the method can be extended to more complex systems in the presence of masked data, the representation and evaluation of the likelihood function would become cumbersome for large systems. There is an alternative method based on the system signature, which exploits the component topology. The system signature is the probability vector whose $i$th element is the probability that the $i$th ordered component failure results in the system failure, and it provides an elegantly simple representation of a system [17]. Some advances and various applications of the signature are discussed in [18][19][20]. Recently, using the system signature, a Bayesian inference for systems with masked lifetime data was proposed by Aslett [21]. The generic likelihood function for complex systems can be easily expressed by a data augmentation method; the parameter inference relies on samples from an iterative Markov chain Monte Carlo simulation of all the component failure times and parameters. This computationally intensive method provides an alternative to the traditional likelihood-based approach for dealing with general systems.

Appendices
Proof of existence of MLEs for the likelihood function under the case $\lambda_1 = \lambda_2 = \lambda \ne \lambda_{12}$ in both hybrid systems.

A. Series-Parallel System
In the log-likelihood function in (16), taking partial derivatives with respect to $\lambda$ and $\lambda_{12}$, respectively,

Figure 1: Hybrid systems of three components.