Parameter Estimation of a Reliability Model of Demand-Caused and Standby-Related Failures of Safety Components Exposed to Degradation by Demand Stress and Ageing That Undergo Imperfect Maintenance

The authors are grateful to the Spanish Ministry of Science and Innovation for the financial support received (Research Project ENE2016-80401-R) and the doctoral scholarship awarded (BES-2014-067602). The study also received financial support from the Spanish Research Agency and the European Regional Development Fund.


Introduction
The safety of nuclear power plants (NPPs) depends on the availability of safety-related components that are normally on standby and operate only in the case of a true demand. These components typically have two main failure modes that contribute to the probability of failure on demand: (a) demand-caused failure, associated with a demand failure probability ($\rho$), and (b) standby-related failure, associated with a standby hazard function ($h$).
Both are generally treated as constant values in standard Probabilistic Risk Assessment (PRA) models, that is, $\rho_0$ and $h_0$, respectively. In PRA, such parameters are assigned probability density functions derived from a priori generic probability distributions, for example, exponential, lognormal, Weibull, and beta, depending on the particular type of component, for example, motor-driven pump or motor-operated valve. A Bayesian approach is used to combine such generic probability density functions with plant-specific failure data for each particular component [1][2][3][4].
Mathematical Problems in Engineering
However, both failure modes are often affected by degradation such as demand-related stress and ageing, which cause the component to degrade with chronological time and ultimately to fail. Maintenance and test activities are performed to control degradation and the unreliability and unavailability of such components, although they have both positive and negative effects. Thus, different approaches have been proposed in the literature to model time-dependent $\rho$ and $h$ that take such effects into account either implicitly or explicitly.
Samanta et al. [5,6] proposed a well-organized foundation to account for ageing and the positive and adverse effects of testing components in modelling demand failure probability and standby hazard function models. However, this model does not take into account the positive effect of maintenance activities as a function of their effectiveness in managing component degradation due to demand-induced stress and ageing.
As regards the standby-related failure mode, Martorell et al. [7] provide an age-dependent reliability model associated only with standby-related failures, which explicitly takes into account the effect of equipment ageing and the positive and negative effects of maintenance activities, founded on imperfect maintenance modelling. Mullor [8] proposes an approach for parameter estimation of this kind of imperfect maintenance model. Martón et al. [9] propose an approach to modelling the unavailability of safety-related components associated with standby-related failures that explicitly addresses all aspects of the effect of ageing, maintenance effectiveness, and test efficiency. Other authors have proposed alternative approaches to modelling the effect of ageing and test and maintenance activities [10][11][12][13].
As regards the demand-caused failure mode, the demand failure probability of a safety component is normally considered to be mainly affected by demand-induced stress, for example, due to true demands, proof tests, and others. The demand-induced stress is modelled as a stochastic degradation jump in [14,15]. These studies consider that random shocks occur according to a Nonhomogeneous Poisson Process, leading to the immediate failure of the component. Shin et al. [16] propose an age-dependent model that considers, among others, the effect of "test stress" and maintenance effects. In general, previous studies have found that the demand failure probability should be considered as a function not only of the number of tests but also of the effectiveness of maintenance activities. Thus, recently, Martorell et al. [17] have proposed a new reliability model for the demand failure probability that explicitly addresses all aspects of the effect of demand-induced stress, maintenance effectiveness, and test efficiency.
In this context, the objective of this paper is to fit the best model to represent the real operation of safety-related equipment, dealing with the problem of estimating a significant number of parameters from a small amount of data. With this aim, a methodology for parameter estimation and model selection is developed. This methodology allows the joint estimation of reliability and maintenance-related parameters and provides a measure of goodness of fit to select the best imperfect maintenance model for each failure mode. This study considers a standby-related failure model assuming linear ageing and a demand-caused failure model assuming test-induced stress. In addition, it considers imperfect maintenance, adopting Proportional Age Setback and Proportional Age Reduction for preventive maintenance modelling. Then, maximum likelihood estimation (MLE) using a direct search algorithm based on the Nelder-Mead Simplex (NMS) method is used to estimate maintenance effectiveness and ageing rate simultaneously. A practical and realistic case study addresses the parameter estimation for a typical motor-operated valve in a nuclear power plant. Additionally, it is shown how the estimates obtained can be used, for example, in planning maintenance and surveillance test activities with the aim of minimizing equipment unavailability.
The rest of this paper is organized as follows: Section 2 briefly introduces the demand failure probability model and the standby-related failure model, which address component degradation because of demand-induced stress and ageing, respectively, and the positive effect of imperfect preventive maintenance. Section 3 describes the parameter estimation method used to fit plant data to the reliability models introduced in the previous section. Section 4 describes a case study involving a motor-operated valve of a pressurized water reactor nuclear power plant. Lastly, Section 5 presents the concluding remarks.

Reliability Models under Imperfect Maintenance
In this paper, the models presented by Martorell et al. [7,17] have been selected to model the standby hazard function and the demand failure probability, respectively. In the following subsections, both models are briefly described and the expressions involved in the parameter estimation and model selection are obtained under the following assumptions:
(1) Time-directed preventive maintenance whose effect depends on its effectiveness. The effectiveness is represented by an imperfect maintenance model with parameter $\varepsilon$, ranging in the interval [0, 1], adopting either the Proportional Age Setback (PAS) or the Proportional Age Reduction (PAR) model.
(2) Corrective maintenance with minimal repair. That is, repairing a failure does not improve the age of the equipment. Therefore, for corrective maintenance, we adopt the Bad As Old (BAO) model.
(3) A linear ageing model for the standby hazard function.
(4) Test-caused stress as the only degradation mechanism considered in the demand failure probability model.

Reliability Model of Standby-Related Failures.
In the context of safety-related equipment of NPPs, the most frequently used function in reliability analysis is the hazard function.
The standby hazard function of equipment depends on its age, which is a function of the chronological time elapsed since its installation and of the effectiveness of the maintenance activities performed on it. An age-dependent hazard function model for period $m$, that is, after maintenance number $m-1$, is given in [7] in terms of the initial hazard function of the equipment, $h_0$, and the age of the equipment immediately after maintenance activity $m-1$, $w^{+}_{m-1}$.
Adopting a linear model for the hazard function, the age-dependent hazard function after maintenance number $m-1$ can be written as
$$h_m(t) = h_0 + \alpha\, w_m(t), \tag{2}$$
where $\alpha$ is the linear ageing rate and
$$w_m(t) = w^{+}_{m-1} + (t - t_{m-1}), \tag{3}$$
with $t_{m-1}$ being the time at which the equipment undergoes maintenance activity $m-1$.
The cumulative hazard function in period $m$, after maintenance number $m-1$, equation (4), is obtained by integrating the hazard function given by equation (2). The age of the component immediately after maintenance number $m-1$, $w^{+}_{m-1}$, and therefore the hazard function and the cumulative hazard function, depend on the imperfect maintenance model selected (PAS or PAR). The following subsections particularize the previous equations to the PAS and PAR models.

Proportional Age Setback Model.
In the PAS approach, each maintenance activity is assumed to shift the origin of time from which the age of the equipment is evaluated. The PAS model considers that each maintenance activity reduces, in proportion to a factor $\varepsilon$, the age the equipment had immediately before it entered maintenance. If $\varepsilon = 0$, the PAS model reduces to the BAO situation, whereas $\varepsilon = 1$ corresponds to the Good As New (GAN) situation. Thus, this model is a natural generalization of both the GAN and BAO models that accounts for imperfect maintenance. Under the PAS approach, the age of the equipment immediately after maintenance activity $m-1$, $w^{+}_{m-1}$, is given by equation (5) [7]. Replacing the expressions for $w_m(t)$ and $w^{+}_{m-1}$ given by (3) and (5), respectively, into (2) yields the induced hazard function, equation (6). Similarly, the cumulative hazard function, $H_m(t)$, in period $m$, equation (7), is obtained by replacing (3) and (5) into (4).
Under the PAR model (Section 2.1.2), the age of the equipment at instant $t$ of period $m$, after maintenance activity $m-1$, is given by equation (8). Following the same process as for the PAS model, but adopting (8) instead of (5), yields the hazard function and the cumulative hazard function under imperfect maintenance at instant $t$ for the PAR approach, equation (9).
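The verbal definitions of the two imperfect maintenance models can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the component is assumed to be new at time zero, and the cumulative hazard is assumed to be the integral of the linear hazard from the last maintenance time up to $t$. The PAS update scales the whole age before maintenance by $(1-\varepsilon)$, whereas the PAR update scales only the age gained since the previous maintenance.

```python
def age_after_maintenance(times, eps, model):
    """Age right after each maintenance in `times` (increasing, component new at t = 0).

    Assumed recursions, derived from the verbal definitions of PAS and PAR:
      PAS: w+ = (1 - eps) * (previous w+ + age gained since last maintenance)
      PAR: w+ = previous w+ + (1 - eps) * (age gained since last maintenance)
    """
    ages = []
    age, previous_time = 0.0, 0.0
    for t in times:
        gained = t - previous_time
        if model == "PAS":
            age = (1.0 - eps) * (age + gained)
        elif model == "PAR":
            age = age + (1.0 - eps) * gained
        else:
            raise ValueError("model must be 'PAS' or 'PAR'")
        ages.append(age)
        previous_time = t
    return ages


def hazard(t, t_prev, w_plus, h0, alpha):
    """Linear hazard h0 + alpha * w(t), with w(t) = w_plus + (t - t_prev)."""
    return h0 + alpha * (w_plus + (t - t_prev))


def cumulative_hazard(t, t_prev, w_plus, h0, alpha):
    """Integral of the linear hazard from t_prev to t (assumed integration limits)."""
    dt = t - t_prev
    return h0 * dt + alpha * (w_plus * dt + 0.5 * dt * dt)


# Hypothetical example: maintenance every 10 time units, effectiveness 0.5.
print(age_after_maintenance([10.0, 20.0], 0.5, "PAS"))
print(age_after_maintenance([10.0, 20.0], 0.5, "PAR"))
```

With $\varepsilon = 0$ both recursions return the chronological time (BAO), and with $\varepsilon = 1$ both return zero (GAN), which matches the limiting cases stated for the two models.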

Reliability Model of Failures by Demand.
The demand failure probability of a component that is normally in standby and ready to perform a safety function on demand depends on the number of demands placed on the component, which are often associated with surveillance tests. In addition, it is necessary to consider the positive effect that the preventive maintenance activities performed on the equipment have on the degradation factor and, therefore, on the demand failure probability.
A time-dependent demand failure probability model that addresses the demand-induced stress and the effect of $m-1$ maintenance activities is formulated for period $m$ in [17] as equation (10), in terms of the residual demand failure probability, $\rho_0$, and the degradation function for period $m$.
Assuming the same degradation factor, $p_1$, for all types of demands, the evolution of the degradation function in period $m$, that is, between maintenance $m-1$ and $m$, is given by equation (11). It depends on the degradation function immediately after maintenance $m-1$, which in turn depends on the selected imperfect maintenance model (PAS or PAR), and on the floor function $\lfloor x \rfloor$, the largest integer less than or equal to $x$, which gives the number of tests performed in the interval $[t_{m-1}, t]$ with periodicity $T$.
The time-dependent evolution of the cumulative demand failure probability over period $m$ can be obtained by adding the cumulative distribution function at maintenance $m-1$ to the demand failure probabilities of each test performed over period $m$. In general, the cumulative demand failure probability does not have a closed-form expression.
The following subsections particularize the demand failure probability and the cumulative demand failure probability to the PAS and PAR models.

Proportional Age Setback Model.
If a PAS model is considered, the degradation function after maintenance number $m-1$, assuming preventive maintenance activities are performed on a regular basis with constant maintenance interval $M$, is given by equation (12) [17]. Substituting (11) and (12) into (10) gives the demand failure probability for period $m$. The cumulative demand failure probability in period $m$, after maintenance activity $m-1$, can be obtained, as mentioned above, by summing the distribution function immediately after maintenance activity $m-1$ and the probability functions of the tests performed between maintenance $m-1$ and time $t$.
2.2.2. Proportional Age Reduction Model. In the PAR approach, the degradation function immediately after maintenance number $m$, assuming preventive maintenance activities are performed on a regular basis with constant maintenance interval $M$, is given by equation (15). Following a procedure analogous to the one described for the PAS model, a time-dependent model for the demand failure probability is obtained by substituting (15) into (10) and (11). The cumulative demand failure probability under the PAR model is obtained in the same way.

Methodology of Parameters Estimation and Model Selection
Many methods for parameter estimation of reliability models have been proposed in the literature, such as maximum likelihood, the method of moments, and Bayesian estimators. In this paper, the maximum likelihood estimation (MLE) method has been selected to estimate the parameters of the reliability models presented in Section 2. For a given model and a set of observed data, the likelihood function $L$ is the product of the probabilities of the observed data as a function of the model parameters. It can be applied to reliability and imperfect maintenance models for standby-related failures and for demand-caused failures. Thus, a likelihood function can be formulated for standby-related failures, $L_1(\xi)$, and for demand-caused failures, $L_2(\xi)$. The MLE method provides estimators, called maximum likelihood estimators, of the parameters involved in the reliability and maintenance models.
The maximum likelihood estimates of these parameters are the values that make the likelihood function as large as possible, that is, that maximize the probability of the observed data. Since the natural logarithm is an increasing function, the likelihood function and its logarithm attain their maximum at the same parameter values. For computational purposes, it is preferable to maximize the log-likelihood function. By maximizing the expressions corresponding to $\log L(\xi)$, the maximum likelihood estimators of the parameters are obtained. In this paper, the Nelder-Mead Simplex algorithm [18,19] is used to maximize the likelihood functions for each proposed model. The MLE method provides, in addition to the parameter estimates, information on their variability through the Fisher information matrix, which is defined as the negative of the matrix of second partial derivatives of the log-likelihood, that is, the negative of its Hessian. So, for the set of estimated parameters, the variance-covariance matrix can be obtained as the inverse of the information matrix divided by the sample size.
In particular, taking advantage of the asymptotic normality of the maximum likelihood estimators, if the sample size is large enough, the standard deviations of the parameter estimates can be obtained as the square roots of the main diagonal of the variance-covariance matrix. These yield confidence intervals for each parameter, and the covariances give information on the relationship between the parameters.
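The estimation workflow described above (maximize the log-likelihood with a simplex search, then read standard deviations off the inverse information matrix) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the Nelder-Mead routine is a compact textbook variant with the standard coefficients, the data are hypothetical failure times fitted with a lognormal distribution (chosen only because its maximum likelihood estimates have a closed form to compare against), and the information matrix is the observed information of the full log-likelihood obtained by finite differences, so no further division by the sample size is applied.

```python
import math


def nelder_mead(f, x0, step=0.5, ftol=1e-12, xtol=1e-7, max_iter=5000):
    """Minimise f with a basic Nelder-Mead simplex search."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[i] + (step if i == k else 0.0) for i in range(n)] for k in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        size = max(abs(p[i] - simplex[0][i]) for p in simplex[1:] for i in range(n))
        if values[-1] - values[0] < ftol and size < xtol:
            break
        worst = simplex[-1]
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(c):
            # Point on the line through the centroid, away from the worst vertex.
            return [centroid[i] + c * (centroid[i] - worst[i]) for i in range(n)]

        reflected = along(1.0)
        f_ref = f(reflected)
        if f_ref < values[0]:  # try to expand further in the same direction
            expanded = along(2.0)
            f_exp = f(expanded)
            if f_exp < f_ref:
                simplex[-1], values[-1] = expanded, f_exp
            else:
                simplex[-1], values[-1] = reflected, f_ref
        elif f_ref < values[-2]:
            simplex[-1], values[-1] = reflected, f_ref
        else:  # outside or inside contraction, then shrink if that fails
            contracted = along(0.5 if f_ref < values[-1] else -0.5)
            f_con = f(contracted)
            if f_con < min(f_ref, values[-1]):
                simplex[-1], values[-1] = contracted, f_con
            else:
                best = simplex[0]
                simplex = [best] + [
                    [best[i] + 0.5 * (p[i] - best[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    return simplex[min(range(n + 1), key=lambda k: values[k])]


def hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of f at x."""
    n = len(x)

    def shifted(i, j, di, dj):
        y = list(x)
        y[i] += di * h
        y[j] += dj * h
        return f(y)

    return [[(shifted(i, j, 1, 1) - shifted(i, j, 1, -1)
              - shifted(i, j, -1, 1) + shifted(i, j, -1, -1)) / (4 * h * h)
             for j in range(n)] for i in range(n)]


times = [120.0, 340.0, 560.0, 800.0, 1500.0]  # hypothetical failure times
logs = [math.log(t) for t in times]


def log_likelihood(theta):
    """Lognormal log-likelihood in (mu, sigma), constants dropped."""
    mu, sigma = theta
    return sum(-math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2 for y in logs)


# Search over (mu, log sigma) so that sigma stays positive.
z_hat = nelder_mead(lambda z: -log_likelihood([z[0], math.exp(z[1])]), [0.0, 0.0])
theta_hat = [z_hat[0], math.exp(z_hat[1])]

# Observed information = negative Hessian; its inverse estimates the covariance.
info = [[-v for v in row] for row in hessian(log_likelihood, theta_hat)]
det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
cov = [[info[1][1] / det, -info[0][1] / det], [-info[1][0] / det, info[0][0] / det]]
std = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]

for name, est, s in zip(("mu", "sigma"), theta_hat, std):
    print(f"{name}: estimate {est:.4f}, two standard deviations {2 * s:.4f}")
```

A derivative-free search such as Nelder-Mead is convenient here because the log-likelihoods of the imperfect maintenance models are awkward to differentiate by hand. With only a handful of observations the asymptotic normal approximation is rough, which is consistent with the wide intervals reported in the case study.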

Likelihood Function for Standby-Related Failures, $L_1(\xi)$.
Consider the standby-related failures of component $i$ during maintenance period $m$, which occur at times $t_{i,m,1}, t_{i,m,2}, t_{i,m,3}, \ldots$, and let $t_{i,m}$ be the chronological time of maintenance $m$ for component $i$. The likelihood function $L_1(\xi)$ for a set of identical components under imperfect preventive maintenance is built from these data, where $\xi$ is the vector of unknown parameters, $(\varepsilon, \alpha)$. For each component $i$, $m_i$ is the number of preventive maintenance activities performed during the observation period, $h_{i,m}(t)$ and $H_{i,m}(t)$ are the induced hazard function and the cumulative hazard function in period $m$, respectively, and $H_{m_i+1}$ evaluated at the censoring time is the cumulative hazard function of the last, censored period.
The log-likelihood function is given by equation (21), which must be particularized depending on the imperfect maintenance model considered. If a PAS imperfect maintenance model is considered, the expressions for $h_{i,m}(t_{i,m,j})$, $H_{i,m}(t_{i,m})$, and $H_{m_i+1}$ at the censoring time are obtained from (6) evaluated at the failure times and from (7) evaluated at the preventive maintenance times and at the censoring time. In the case of the PAR imperfect maintenance model, the expressions for the failure rate, $h_{i,m}(t_{i,m,j})$, and the cumulative failure rates, $H_{i,m}(t_{i,m})$ and $H_{m_i+1}$ at the censoring time, are obtained from (9).
For demand-caused failures, the probability function, $\rho_{i,m}(t_{i,m,j})$, and the corresponding cumulative probability functions depend on the imperfect maintenance model considered. In the case of a PAS model, these functions are obtained from (14) and (15).
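The structure of the standby-related log-likelihood described above can be sketched for a single component. The sketch assumes the standard log-likelihood of a nonhomogeneous Poisson process under minimal repair: the sum of the log hazard at each failure time, minus the cumulative hazard accumulated over every maintenance period and over the final censored period. The function name, the argument names, and the treatment of $h_0$ as a known input are assumptions made for illustration.

```python
import math


def standby_log_likelihood(eps, alpha, h0, maint_times, failure_times, t_end, model="PAS"):
    """Log-likelihood of one component observed on [0, t_end] (new at t = 0)."""
    if model not in ("PAS", "PAR"):
        raise ValueError("model must be 'PAS' or 'PAR'")
    bounds = [0.0] + list(maint_times) + [t_end]
    age = 0.0  # age immediately after the latest maintenance
    total = 0.0
    for start, stop in zip(bounds[:-1], bounds[1:]):
        # Log hazard at each failure that falls inside this period.
        for t in failure_times:
            if start < t <= stop:
                total += math.log(h0 + alpha * (age + t - start))
        # Cumulative hazard over the whole period (integral of the linear hazard).
        dt = stop - start
        total -= h0 * dt + alpha * (age * dt + 0.5 * dt * dt)
        # Age reset at the maintenance that closes the period.
        if model == "PAS":
            age = (1.0 - eps) * (age + dt)
        else:
            age = age + (1.0 - eps) * dt
    return total


# Hypothetical data: two maintenances, three failures, 1500 days of observation.
maint = [500.0, 1000.0]
fails = [300.0, 800.0, 1200.0]
for model in ("PAS", "PAR"):
    print(model, standby_log_likelihood(0.5, 1e-6, 1e-4, maint, fails, 1500.0, model))
```

For several identical components, the per-component values would simply be added. Comparing the maximized values under the PAS and PAR updates is then the model selection step used in the case study.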

Case Study
This section covers the estimation of the parameters associated with the reliability models presented in Section 2 for a motor-operated valve (MOV) of a nuclear power plant. The parameters are estimated and the reliability models that best fit the plant data are selected using the methods presented in Section 3. Then, the estimates obtained are used to predict the performance of the MOV as a function of test and maintenance intervals. In particular, the MOV average unreliability contribution of each failure mode and the total MOV unavailability are computed and plotted as a function of maintenance and test intervals for a 10-year horizon.

Historical Maintenance and Testing Data.
Historical failure, maintenance, and test data have been collected for two identical MOVs of a nuclear power plant. Table 1 shows the failure times of the two MOVs studied, obtained from the plant operational data. Table 1 also provides a brief description of the corresponding failure cause and failure mode. The failures have been classified as either standby-related or demand-caused, taking into account the information available on the failure cause.
A total of 432 surveillance tests and 17 preventive maintenance tasks were performed on MOV1, distributed uniformly with periodicities of 22 and 572 days, respectively, over the 27-year period analysed. In addition, a total of 424 surveillance tests and 18 preventive maintenance tasks were performed on MOV2, distributed uniformly with periodicities of 22 and 528 days, respectively, within the same period.

Results of the Maximum Likelihood Estimation.
This section presents the results of the joint estimation of the maintenance effectiveness, $\varepsilon$, and the reliability parameters, $\alpha$ for standby-related failures and $p_1$ for demand-caused failures, under the PAS and PAR imperfect maintenance models, using the plant data introduced in the previous section. The model that provides the best fit is identified for each of the two failure modes.
The maximum likelihood estimates of the parameters $\varepsilon$, $\alpha$, and $p_1$ are obtained by maximizing the log-likelihood functions given by (20) for standby-related failures and (25) for demand-caused failures using the Nelder-Mead Simplex algorithm. Table 2 gives the MLEs of the parameters of the reliability model of standby-related failures under the PAS and PAR imperfect maintenance models, twice the standard deviations, $2\sigma$, which are obtained from the Fisher information matrix, and the values of the likelihood function $L$. Table 3 shows the same information for the reliability model of demand-caused failures.
The best model for both standby-related failures and demand-caused failures is the PAS model, since it provides the higher value of the likelihood function in Tables 2 and 3, respectively. Therefore, the reliability model with PAS imperfect maintenance is selected for both failure modes, with the values of the corresponding model parameters given in Tables 2 and 3.

Average Unreliability Contribution as a Function of Maintenance and Test Intervals.
The average unreliability contribution to the unavailability of a component normally in standby over its renewal period can be formulated, following [9,17], in terms of the standby-related unreliability contribution and the demand-caused unreliability contribution. On the one hand, adopting the PAS model to represent imperfect maintenance for the standby-related failures of the component, according to the results in the previous section, the standby-related contribution is given in [9]. On the other hand, adopting the PAS model for the demand-caused failures of the component, according to the results in the previous section, the demand-caused contribution is given in [17].
Figure 1 shows the evolution of the standby-related and demand-caused unreliability contributions as a function of the test interval, $T$, for different preventive maintenance intervals, $M$, over a 10-year renewal period. The standby-related contribution increases significantly for high $T$ and $M$ values. Nevertheless, the effect of maintenance is positive for both unreliability contributions. Moreover, an increase in test frequency between maintenances, that is, low $T$ values, has a very negative effect on the demand-caused contribution for very low $M$ intervals.
In addition, Figure 1 shows confidence intervals for the predicted unreliability contributions for different pairs of $T$ and $M$. The confidence intervals are wide and widen further as $T$ and $M$ increase, because of the reliability, availability, and maintainability (RAM) model and the uncertainty in the estimates of the model parameters shown in Tables 2 and 3.

Average Unavailability as a Function of Maintenance and Test Intervals Regarding Unreliability and Downtime Effects.
In accordance with [9], the averaged unavailability of a component is the sum of the average unreliability contribution and the unavailability contributions due to detected downtimes for performing testing and maintenance activities with the plant at power. These downtime contributions are the contribution due to testing, the contribution due to preventive maintenance, the contribution due to corrective maintenance conditional on detecting a failure during a previous test, and the contribution due to replacement of the equipment, if any. They can be evaluated using the equations given in [9], which depend on the downtime for testing, the downtime for preventive maintenance, the downtime for corrective maintenance or repair, and the downtime for replacement or renewal.
For the sake of simplicity, the last two contributions, those due to corrective maintenance and replacement, are not included in the sensitivity analysis because both are negligible compared with the downtime effect of preventive maintenance and testing activities. Therefore, the averaged unavailability of the component reduces to the sum of the average unreliability contribution and the testing and preventive maintenance downtime contributions (equation (36)). Figure 2 shows the evolution of the standby-related unreliability contribution versus the sum of the demand-caused unreliability contribution and the testing and preventive maintenance downtime contributions, as a function of the test interval for different preventive maintenance intervals over a 10-year renewal period. The first term quantifies the benefit of test and maintenance activities on the total component unavailability, while the sum represents their negative effect.
Figure 2 shows confidence intervals for the predicted unavailability contributions for different pairs of test and maintenance intervals. Again, the confidence intervals are wide and widen further as the intervals increase, because of the RAM model and the uncertainty in the estimates of the model parameters shown in Tables 2 and 3. Substituting (29), (30), (31), and (32) into (36) yields the total average unavailability of the component, equation (37). The last study analyses the total average unavailability of the component as a function of the pair of test and maintenance intervals for a 10-year horizon, which is shown in Figure 3. The highest unavailability values are reached with the longest maintenance and test intervals. The main contributor to the total unavailability (see (37)) is the standby-related unreliability contribution given by equation (29), as can be seen in Figure 2. This explains the direct and proportional dependence of the unavailability on the test and maintenance intervals. Nevertheless, the sum of the demand unreliability contribution and the downtime effects considered, that is, the downtime of preventive maintenance and testing activities, becomes more relevant for very short intervals. This can also be seen in Figure 2.

Concluding Remarks
This paper presents a methodology for parameter estimation and model selection for safety-related equipment. In the literature, complex reliability, availability, and maintainability (RAM) models have been proposed with the aim of capturing equipment performance more realistically, for example, by explicitly addressing the effect of component ageing and degradation, surveillance activities, and corrective and preventive maintenance policies. A major challenge for the adoption of the new models in practice is to estimate the reliability and maintenance parameters in order to select the best model for describing the real operation of safety-related equipment.
There is therefore a need to fit the best model to real data by estimating the model parameters with an appropriate tool, which can be a problem in some cases because the number of parameters is large and the available data are scarce. This may strongly influence the confidence intervals of the parameter values that best fit the data.
The paper considers a standby-related failure model assuming linear ageing and a demand-caused failure model assuming test-induced stress. In addition, it considers imperfect maintenance, adopting Proportional Age Setback and Proportional Age Reduction for preventive maintenance modelling. Maximum likelihood estimation (MLE) using a direct search algorithm based on the Nelder-Mead Simplex (NMS) method is used to estimate maintenance effectiveness and ageing rate simultaneously.
A practical and realistic case study addresses the parameter estimation for a typical motor-operated valve in a nuclear power plant, using real failure, test, and maintenance data. The results of the parameter estimation include confidence intervals and the selection of the best model.
Equipment RAM is quantified using the best fitted model to make clear the impact of such an estimation in a testing and maintenance planning context. Thus, the results of such a predictive model may help to plan the test and maintenance program more efficiently, which should provide an appropriate balance among the different contributions to the unavailability of the MOV, with the aim of minimizing its unavailability while assuring a low level of unreliability.
However, the uncertainties introduced in the estimation of the model parameters, due to the scarcity of data, can jeopardize decision-making. Thus, the application example shows wide confidence intervals for the unreliability and unavailability contributions for different pairs of test and maintenance intervals, which widen further as the intervals increase, because of the RAM model and the uncertainty in the estimates of the model parameters shown in Tables 2 and 3.
It can therefore be concluded that, by estimating the parameters and, consequently, fitting these models, it is possible to manage the test and maintenance program more efficiently, by providing an appropriate balance among the different contributions to the unreliability and unavailability of the component. However, the data set used needs to be enlarged to reduce the uncertainty in decision-making.

Figure 1: Standby-related and demand-caused unreliability contributions as a function of the test interval for different maintenance periods.

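Figure 2: Standby-related unreliability contribution and the sum of the demand-caused unreliability contribution and the testing and preventive maintenance downtime contributions as a function of the test interval for different maintenance periods.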

Figure 3: Unavailability for different maintenance and test intervals under PAS model.
2.1.2. Proportional Age Reduction Model. In the PAR approach, each maintenance activity is assumed to reduce proportionally the age gained since the previous maintenance. Thus, while the PAS model considers that each maintenance activity reduces the total equipment age, the PAR model assumes that maintenance reduces only a portion of the equipment age, the one gained since the previous maintenance, leaving the rest unaffected. The PAR model considers that maintenance reduces the age gained between two consecutive maintenance activities by a factor $\varepsilon$. Again, if $\varepsilon = 0$, the PAR model reduces to BAO, whereas if $\varepsilon = 1$ it reduces to GAN.

Table 1: Failure data collected for two identical motor-operated valves of a nuclear power plant.

Table 2: MLEs of parameters of the reliability model of standby-related failures under PAS and PAR models.

Table 3: MLEs of parameters of the reliability model of demand-caused failures under PAS and PAR models.
(): Time-dependent demand failure probability for the period    (): Cumulative demand failure probability in the period   +  : Degradation function of the component immediately after maintenance    (): Degradation function of the component associated with demand-related stress for the period  ℎ 0 : Residual standby-related hazard function   (): Cumulative hazard function in the period   1 (): Likelihood function for  identical components of an equipment under imperfect preventive maintenance for standby time-related hazard function  2 (): Likelihood function for  identical components of an equipment under imperfect preventive maintenance for demand failure probability : Preventive maintenance number  : Preventive maintenance interval (): Cumulative number of demands at time   1 : Test degradation factor associated with demand failures  , : Number of failures of component  during the maintenance period   , : Demand-caused unreliability contribution  , : Standby-related unreliability contribution   : Averaged unavailability contribution due to testing  + −1 : Age of the component immediately after the maintenance  − 1   (): Age of the component in the period  :