A New Reliability Model Based on Lindley Distribution with Application to Failure Data

Software reliability is an important feature that influences systems’ reliability. Software reliability models are a common tool to evaluate software reliability quantitatively. Various reliability models have been suggested based on the NHPP (nonhomogeneous Poisson process). In this article, a new NHPP model based on the Lindley distribution is proposed. The mathematical formulas for its measures of reliability are obtained and graphically illustrated. The proposed model’s parameters are estimated using both the NLSE (nonlinear least squares estimation) and the WNLSE (weighted nonlinear least squares estimation) methods. The model is then validated based on several different reliability datasets. The methods of estimation are evaluated and compared using three different criteria. The performance of the new model is also evaluated and compared, both objectively and subjectively, with three previously suggested models. The application results show that our new model demonstrates good performance in our selected failure data.


Introduction
The development of software systems is becoming more expensive and time-consuming because of their increasing complexity. Consequently, the reliable performance of software systems is becoming more important. Numerous SRGMs (software reliability growth models) with various assumptions have been proposed since the 1970s [1][2][3]. Several researchers have used reliability models based on the NHPP during the past years [4][5][6][7][8].
Kapur et al. [9] proposed a new SRGM based on an Itô-type stochastic differential equation; the proposed model performs comparatively better than the existing NHPP models. Xu and Yao [10] also suggested a novel NHPP model based on a partial differential equation, and their suggested model exhibits a closer fit to the observations. Li and Yi [11] proposed a modified SRGM to reconsider the reliability of open source software and showed that it fits the failure data well and provides powerful prediction capability. Ramasamy and Lakshmanan [12] proposed an SRGM with an infinite testing effort function. Recently, Al-Turk [13] proposed an NHPP model based on the two-parameter log-logistic distribution. The essential model characteristics were obtained, and the parameters of the model were estimated using the MLE (maximum likelihood estimation) and the NLSE methods. The results of the application indicate that the considered model gives a reasonable prediction capability for the real datasets studied. Hui and Liu [14] proposed an SRGM based on Gaussian new distribution. The proposed model was confirmed by experiments to have a better fit and prediction performance than other reliability models.
In this article, we propose a new model that belongs to the NHPP class and is based on the Lindley distribution. Several properties of the proposed model are outlined in Section 2 with graphical representations. These properties include the MVF (mean value function), failure intensity, number of remaining faults, error detection rate, MTBF (mean time between failures), and conditional reliability. The NLSE and WNLSE methods are used to estimate the proposed model's parameters in Section 3. Applications to real datasets are provided in Section 4. The last section concludes the article.

Model Construction.
A one-parameter Lindley distribution was suggested by Lindley [15] for the analysis of failure data. This model can capture failure data with different shapes of hazard rates. It has been studied by several authors as a good alternative to the exponential and Weibull distributions [16][17][18]. As with all statistical distributions, the Lindley distribution is specified by its PDF (probability density function),
$$f(t) = \frac{\theta^{2}}{1+\theta}\,(1+t)\,e^{-\theta t}, \tag{1}$$
or its CDF (cumulative distribution function),
$$F(t) = 1 - \frac{1+\theta+\theta t}{1+\theta}\,e^{-\theta t}, \tag{2}$$
where t > 0, θ > 0, and θ is the shape parameter.
The main aim of the NHPP model is to assess and predict the expected number of detected faults up to a specific point of time, which can be achieved using its MVF. Suppose m(t) denotes the expected cumulative number of faults discovered by time t, and F(t) is the distribution function of the time between two successive failures; then the MVF of the NHPP model can be expressed as follows [5]:
$$m(t) = a\,F(t), \tag{3}$$
while the corresponding intensity function is given by
$$\lambda(t) = \frac{dm(t)}{dt} = a\,f(t), \tag{4}$$
where a > 0 is the expected number of faults. By substituting equation (2) into equation (3) and equation (1) into equation (4), we get the MVF of the NHPP L (Lindley) model as follows:
$$m(t) = a\left[1 - \frac{1+\theta+\theta t}{1+\theta}\,e^{-\theta t}\right], \tag{5}$$
and its corresponding intensity function as follows:
$$\lambda(t) = \frac{a\,\theta^{2}}{1+\theta}\,(1+t)\,e^{-\theta t}. \tag{6}$$
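The construction above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard one-parameter Lindley density f(t) = θ²(1 + t)e^(−θt)/(1 + θ), its CDF F(t) = 1 − [(1 + θ + θt)/(1 + θ)]e^(−θt), and the NHPP relations m(t) = aF(t) and λ(t) = af(t). The function names and the parameter values are illustrative and are not taken from the article.

```python
import math


def lindley_pdf(t, theta):
    """Lindley density: theta^2 / (1 + theta) * (1 + t) * exp(-theta * t)."""
    return theta ** 2 / (1.0 + theta) * (1.0 + t) * math.exp(-theta * t)


def lindley_cdf(t, theta):
    """Lindley CDF: 1 - (1 + theta + theta * t) / (1 + theta) * exp(-theta * t)."""
    return 1.0 - (1.0 + theta + theta * t) / (1.0 + theta) * math.exp(-theta * t)


def mvf(t, a, theta):
    """Mean value function m(t) = a * F(t): expected faults detected by time t."""
    return a * lindley_cdf(t, theta)


def intensity(t, a, theta):
    """Failure intensity lambda(t) = a * f(t), the derivative of m(t)."""
    return a * lindley_pdf(t, theta)


if __name__ == "__main__":
    a, theta = 100.0, 0.5  # illustrative values only, not estimates from the article
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, round(mvf(t, a, theta), 3), round(intensity(t, a, theta), 3))
```

With these definitions, m(0) = 0 and m(t) approaches a as t grows, which matches the interpretation of a as the expected total number of faults.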

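Model Characteristics.
The NHPP model has very useful reliability measures for describing failure phenomena. In this section, the mathematical formulas of some of these measures for the new model are given, where m(t) is the MVF and λ(t) is the intensity function of the model. First, the number of remaining faults of the NHPP L model is given by
$$a - m(t) = a\,\frac{1+\theta+\theta t}{1+\theta}\,e^{-\theta t}, \tag{7}$$
and then the error detection rate can be defined as follows:
$$d(t) = \frac{\lambda(t)}{a - m(t)} = \frac{\theta^{2}(1+t)}{1+\theta+\theta t}, \tag{8}$$
while the MTBF is defined by equation (9). The conditional reliability R(x | t) is the probability that no fault is detected in the interval (t, t + x), given that the last fault occurred at time t ≥ 0. Here, x > 0 is the interval of operation time according to some practical or administrative requirements [19]. Mathematically, the conditional reliability of the NHPP L model can be obtained as follows:
$$R(x \mid t) = e^{-[m(t+x) - m(t)]} = \exp\left\{-a\left[\frac{1+\theta+\theta t}{1+\theta}\,e^{-\theta t} - \frac{1+\theta+\theta (t+x)}{1+\theta}\,e^{-\theta (t+x)}\right]\right\}. \tag{10}$$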
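The characteristics listed above can be evaluated numerically with a short Python sketch. It assumes the common NHPP definitions: remaining faults a − m(t), error detection rate λ(t)/(a − m(t)), and conditional reliability exp{−[m(t + x) − m(t)]}, together with the Lindley-based MVF m(t) = a[1 − ((1 + θ + θt)/(1 + θ))e^(−θt)]. The MTBF is left out because the text does not fix whether the instantaneous or the cumulative form is meant. All names and numeric values are illustrative.

```python
import math


def mvf(t, a, theta):
    """Assumed NHPP L mean value function m(t) = a * F(t) with Lindley F."""
    return a * (1.0 - (1.0 + theta + theta * t) / (1.0 + theta) * math.exp(-theta * t))


def intensity(t, a, theta):
    """Assumed intensity lambda(t) = a * f(t) with Lindley density f."""
    return a * theta ** 2 / (1.0 + theta) * (1.0 + t) * math.exp(-theta * t)


def remaining_faults(t, a, theta):
    """Expected number of faults still undetected at time t: a - m(t)."""
    return a * (1.0 + theta + theta * t) / (1.0 + theta) * math.exp(-theta * t)


def detection_rate(t, theta):
    """Error detection rate lambda(t) / (a - m(t)); a cancels out.

    The closed form avoids a 0/0 division for large t.
    """
    return theta ** 2 * (1.0 + t) / (1.0 + theta + theta * t)


def conditional_reliability(x, t, a, theta):
    """Probability of no failure in (t, t + x): exp(-(m(t + x) - m(t)))."""
    return math.exp(-(mvf(t + x, a, theta) - mvf(t, a, theta)))


if __name__ == "__main__":
    a, theta = 100.0, 0.5  # illustrative values only
    for t in (1.0, 5.0, 20.0, 40.0):
        print(t,
              round(remaining_faults(t, a, theta), 4),
              round(detection_rate(t, theta), 4),
              round(conditional_reliability(1.0, t, a, theta), 4))
```

Under these assumptions the detection rate increases with t and approaches θ, and the conditional reliability approaches 1 for large t, in line with the behaviour described for the plotted characteristics.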

Graphs of the Model Characteristics.
The plots of the NHPP L model's characteristics for different selected values of the parameters are shown in Figures 1-6. Figure 1 illustrates the MVF, which represents the variation of the number of faults detected with respect to time. From this figure, we can see that the number of faults detected during testing rises quickly at first and later stabilizes, and that larger values of the parameter a give a higher MVF. Figure 2 shows that the intensity function varies in shape over the different selected shape parameters, and it reaches a higher peak with larger values of the parameter a. The number of remaining errors in Figure 3 decreases as the testing time increases; smaller values of the parameter a give a lower number of remaining errors. In Figure 4, the error detection rate function increases as the testing time increases; a larger value of the shape parameter gives a higher error detection rate (the failure occurrence rate per fault). The MTBF function in Figure 5 increases with the progress of the testing time. In Figure 6, we can see that as t tends to infinity the conditional reliability becomes approximately 1.

Estimation of Model Parameters
In this section, the NLSE and WNLSE methods are applied for the estimation of parameters of our proposed model.

The NLSE and WNLSE Methods.
Assume that a software system is tested for T units of time and n faults are detected. Let 0 < t_1 < t_2 < ... < t_n < T be the times at which the failures were observed, let m(t_i; Θ) be the MVF, and let Θ denote its unknown parameters. The parameters Θ are estimated from the observed data pairs (m_0, t_0), (m_1, t_1), ..., (m_n, t_n), where m_i is the total number of faults detected within time (0, t_i).
Then, the NLSE method aims to minimize the following function:
$$S(\Theta) = \sum_{i=1}^{n}\left[m_i - m(t_i;\Theta)\right]^{2}, \tag{11}$$
while the WNLSE method aims to minimize the following function:
$$S_w(\Theta) = \sum_{i=1}^{n} w_i\left[m_i - m(t_i;\Theta)\right]^{2}, \tag{12}$$
where w_i > 0 (i = 1, ..., n) are positive weights satisfying $\sum_{i=1}^{n} w_i = n$ [20].
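The two objective functions can be written generically for any MVF. The sketch below assumes the usual sum-of-squares forms, Σ[m_i − m(t_i; Θ)]² for the NLSE and Σ w_i[m_i − m(t_i; Θ)]² for the WNLSE, with the weights rescaled so that they sum to n. The helper names, the toy linear MVF, and the data are hypothetical.

```python
def nlse_objective(mvf, params, times, counts):
    """Unweighted sum of squared residuals between observed and modelled counts."""
    return sum((m_i - mvf(t_i, *params)) ** 2 for t_i, m_i in zip(times, counts))


def wnlse_objective(mvf, params, times, counts, weights):
    """Weighted sum of squared residuals; weights are expected to sum to n."""
    return sum(w_i * (m_i - mvf(t_i, *params)) ** 2
               for t_i, m_i, w_i in zip(times, counts, weights))


def normalise_weights(raw_weights):
    """Rescale positive raw weights so that they sum to the number of observations."""
    n = len(raw_weights)
    total = sum(raw_weights)
    return [w * n / total for w in raw_weights]


if __name__ == "__main__":
    # Toy example with a linear "MVF" so the arithmetic is easy to follow.
    toy_mvf = lambda t, a: a * t
    times = [1.0, 2.0, 3.0]
    counts = [2.0, 4.0, 7.0]
    weights = normalise_weights([1.0, 1.0, 2.0])
    print(nlse_objective(toy_mvf, (2.0,), times, counts))
    print(wnlse_objective(toy_mvf, (2.0,), times, counts, weights))
```

With all weights equal to 1 the weighted objective reduces to the unweighted one, so the NLSE can be viewed as a special case of the WNLSE.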

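The NLSE and WNLSE Methods for the NHPP L Model.
For the NLSE, we substitute equation (5) in equation (11) as follows:
$$S(a,\theta) = \sum_{i=1}^{n}\left[m_i - a F_i\right]^{2}, \qquad F_i = 1 - \frac{1+\theta+\theta t_i}{1+\theta}\,e^{-\theta t_i}. \tag{13}$$
Taking the partial derivative of equation (13) with respect to a and θ, respectively, we get
$$\frac{\partial S}{\partial a} = -2\sum_{i=1}^{n}\left(m_i - a F_i\right)F_i, \qquad \frac{\partial S}{\partial \theta} = -2a\sum_{i=1}^{n}\left(m_i - a F_i\right)G_i, \tag{14}$$
where $G_i = \partial F_i/\partial\theta = \theta t_i\,(2+\theta+t_i+\theta t_i)\,e^{-\theta t_i}/(1+\theta)^{2}$. By setting the derivatives equal to zero, we get the following nonlinear equations:
$$a = \frac{\sum_{i=1}^{n} m_i F_i}{\sum_{i=1}^{n} F_i^{2}}, \tag{15}$$
$$\sum_{i=1}^{n}\left(m_i - a F_i\right)G_i = 0. \tag{16}$$
A closed-form expression for the NLS estimate of θ cannot be obtained. Consequently, an estimate of the parameter θ can be obtained by numerically solving the nonlinear equation (16), with a expressed through equation (15), and then by substituting this estimate in equation (15), the estimate of the parameter a can be obtained.
For the WNLSE, we substitute equation (5) in equation (12), and thus we obtain
$$S_w(a,\theta) = \sum_{i=1}^{n} w_i\left[m_i - a F_i\right]^{2}, \tag{17}$$
with F_i and G_i as defined for the NLSE. Taking the partial derivative of equation (17) with respect to a and θ, respectively, we get
$$\frac{\partial S_w}{\partial a} = -2\sum_{i=1}^{n} w_i\left(m_i - a F_i\right)F_i, \qquad \frac{\partial S_w}{\partial \theta} = -2a\sum_{i=1}^{n} w_i\left(m_i - a F_i\right)G_i. \tag{18}$$
By setting the derivatives equal to zero, we have the following nonlinear equations:
$$a = \frac{\sum_{i=1}^{n} w_i m_i F_i}{\sum_{i=1}^{n} w_i F_i^{2}}, \tag{19}$$
$$\sum_{i=1}^{n} w_i\left(m_i - a F_i\right)G_i = 0. \tag{20}$$
A closed-form expression for the WNLS estimate of θ cannot be obtained. By solving equation (20) using the Gauss-Newton method, we obtain the value of the estimate, and then by substituting this estimate in equation (19), the estimate of the parameter a can be obtained.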
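The two-step procedure described above (solve for θ numerically, then recover a) can be sketched as follows. This is not the authors' implementation: it assumes the MVF m(t) = aF(t) with the Lindley CDF, uses the fact that for a fixed θ the minimizing a has the closed form Σ w_i m_i F_i / Σ w_i F_i², and replaces the Gauss-Newton iteration with a coarse grid followed by a golden-section search over θ. The search bounds, grid size, and function names are assumptions chosen for illustration.

```python
import math


def lindley_cdf(t, theta):
    """Lindley CDF F(t) = 1 - (1 + theta + theta t) / (1 + theta) * exp(-theta t)."""
    return 1.0 - (1.0 + theta + theta * t) / (1.0 + theta) * math.exp(-theta * t)


def profile_a(theta, times, counts, weights):
    """Closed-form minimiser in a for a fixed theta (weighted normal equation)."""
    cdf = [lindley_cdf(t, theta) for t in times]
    num = sum(w * m * f for w, m, f in zip(weights, counts, cdf))
    den = sum(w * f * f for w, f in zip(weights, cdf))
    return num / den


def profile_sse(theta, times, counts, weights):
    """Weighted sum of squares with a replaced by its closed-form minimiser."""
    a = profile_a(theta, times, counts, weights)
    return sum(w * (m - a * lindley_cdf(t, theta)) ** 2
               for w, m, t in zip(weights, counts, times))


def fit_nhpp_l(times, counts, weights=None, lo=1e-3, hi=5.0, grid=200, tol=1e-10):
    """Return (a_hat, theta_hat); unit weights give the unweighted NLSE fit."""
    if weights is None:
        weights = [1.0] * len(times)
    # Step 1: coarse grid over theta to locate a bracket around the minimum.
    step = (hi - lo) / grid
    thetas = [lo + i * step for i in range(grid + 1)]
    best = min(range(grid + 1),
               key=lambda i: profile_sse(thetas[i], times, counts, weights))
    left = thetas[max(best - 1, 0)]
    right = thetas[min(best + 1, grid)]
    # Step 2: golden-section search inside the bracket (assumes one local minimum).
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while right - left > tol:
        c = right - inv_phi * (right - left)
        d = left + inv_phi * (right - left)
        if profile_sse(c, times, counts, weights) < profile_sse(d, times, counts, weights):
            right = d
        else:
            left = c
    theta_hat = 0.5 * (left + right)
    return profile_a(theta_hat, times, counts, weights), theta_hat
```

Calling `fit_nhpp_l(times, counts)` gives an unweighted fit, while passing `weights` that sum to n gives the weighted variant. Profiling out a reduces the problem to a one-dimensional search, which is simple to implement but relies on the profile objective having a single minimum inside the chosen bounds.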

Application to Failure Data
In this section, examples of real data are used to compare the two considered methods of estimation for the proposed model. Also, we perform a comparative study to evaluate the effectiveness of the proposed model with three of the previously existing models. Useful results based on the studied real datasets are presented and discussed at the end of this section. To facilitate mathematical computation, a software tool was developed using R language version 3.6.1.

Datasets.
Nine published datasets with different sizes were chosen for our evaluation study. References for the selected datasets are shown in Table 1.

Models.
In addition to our proposed model (NHPP L), three other well-known reliability models are considered, and the names and MVFs of these models are listed in Table 2.
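For orientation, the three comparison models can be sketched in Python using their commonly cited MVF forms: the Goel-Okumoto model a(1 − e^(−bt)), the delayed S-shaped model a[1 − (1 + bt)e^(−bt)], and the inflection S-shaped model a(1 − e^(−bt))/(1 + βe^(−bt)). These forms are assumed from the general SRGM literature, and the function names and parameter values are illustrative.

```python
import math


def go_mvf(t, a, b):
    """Goel-Okumoto model: exponential growth towards a."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_mvf(t, a, b):
    """Delayed S-shaped model: slow start, then growth towards a."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def inflection_s_mvf(t, a, b, beta):
    """Inflection S-shaped model; beta > 0 is the inflection factor."""
    return a * (1.0 - math.exp(-b * t)) / (1.0 + beta * math.exp(-b * t))


if __name__ == "__main__":
    a, b, beta = 100.0, 0.2, 2.0  # illustrative values only
    for t in (1.0, 5.0, 20.0):
        print(t,
              round(go_mvf(t, a, b), 3),
              round(delayed_s_mvf(t, a, b), 3),
              round(inflection_s_mvf(t, a, b, beta), 3))
```

All three functions start at 0 and approach a, the expected number of faults eventually detected, so they can be fitted to the same cumulative failure data as the proposed model.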

Evaluation Criteria.
To check the performance of the considered models, we used the following three criteria based on equations (21)-(23). The mean square error (MSE) measures the deviation between the predicted values and the actual observations. It is defined as [19]
$$\mathrm{MSE} = \frac{\sum_{i=1}^{n}\left(m(t_i) - m_i\right)^{2}}{n-k}, \tag{21}$$
where m(t_i) is the estimated number of faults at time t_i obtained from the considered model; m_i is the total number of faults detected within time (0, t_i), (i = 1, ..., n); n is the number of observations; and k is the number of parameters. A lower value of the MSE indicates more confidence in the model and thus better performance. The variance is defined as follows [29,30]:
$$\mathrm{Variance} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(m_i - m(t_i) - \mathrm{Bias}\right)^{2}}, \tag{22}$$
where the bias is defined as $\mathrm{Bias} = \left|\sum_{i=1}^{n}\left(m(t_i) - m_i\right)/n\right|$. The average of the prediction errors is referred to as the prediction bias, and its standard deviation is often used as a measure of the variance in the predictions. A small value of the variance indicates that the model fits the data well. The coefficient of determination (R²) measures how well the fit describes the variation in the data. It is defined as [19]
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(m_i - m(t_i)\right)^{2}}{\sum_{i=1}^{n}\left(m_i - \bar{m}\right)^{2}}, \tag{23}$$
where $\bar{m}$ is the mean of the observed values m_i. Values for this coefficient range from 0 to 1. The value of R² closest to 1 indicates the best model.
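The three criteria can be computed with a few helper functions. The sketch below assumes the forms commonly used in the SRGM literature: MSE as the sum of squared errors divided by n − k, bias as the absolute mean prediction error, variance as the square root of Σ(m_i − m(t_i) − Bias)²/(n − 1), and R² as one minus the ratio of the residual to the total sum of squares. The function names and the small dataset are hypothetical.

```python
import math


def mse(observed, fitted, k):
    """Sum of squared errors divided by (n - k), with k model parameters."""
    n = len(observed)
    return sum((f - o) ** 2 for o, f in zip(observed, fitted)) / (n - k)


def bias(observed, fitted):
    """Absolute value of the mean prediction error."""
    n = len(observed)
    return abs(sum(f - o for o, f in zip(observed, fitted)) / n)


def variance(observed, fitted):
    """Spread of the prediction errors around the bias (assumed n - 1 divisor)."""
    n = len(observed)
    b = bias(observed, fitted)
    return math.sqrt(sum((o - f - b) ** 2 for o, f in zip(observed, fitted)) / (n - 1))


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]  # hypothetical cumulative fault counts
    fitted = [1.0, 2.0, 3.0, 5.0]    # hypothetical model predictions
    print(mse(observed, fitted, k=2), bias(observed, fitted),
          variance(observed, fitted), r_squared(observed, fitted))
```

Reporting several criteria side by side is useful because, as the comparison of estimation methods shows, different criteria can rank the same fits differently.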

Comparative Study of the Estimation Methods.
This section evaluates the performance of the NLSE and WNLSE methods for the NHPP L model based on eight datasets. The results are shown in Table 3. From the evaluation criteria values in Table 3, we derived the following conclusions: (i) The NHPP L model provides values indicative of a better model for most of the evaluation criteria in most cases when using the WNLSE method. (ii) The different evaluation criteria gave different results, and this indicates the necessity to study several criteria during the comparison. Figures 7 and 8 illustrate the actual and fitted curves of software failures using the NLSE and WNLSE methods. According to these figures, we can see that our new model provides a good fit for all considered datasets when using either the NLSE or the WNLSE method. In particular, the proposed model is more suitable for modeling the failure datasets when using the WNLSE method rather than the NLSE method.

Comparing the Performance of Various SRGMs for Some Real Datasets.
Since the proposed model is new concerning the prediction/estimation of software reliability, we compared its accuracy with some well-known and widely used SRGMs, namely, the GO model, the delayed S-shaped model, and the inflection S-shaped model. Our comparative study is based on five datasets, DS2, DS3, DS4, DS5, and DS9, and used the WNLSE as the method of estimation. The Kolmogorov-Smirnov test was used to check and compare the fit between these datasets and our studied reliability models. The results are presented in Table 4. From the table, we can observe the following: (i) The MSE values for all studied models are very close, indicating that all studied models have the ability to describe the five selected systems effectively with minor differences between them in terms of their performance. The NHPP L model ranked second for DS2, DS3, and DS5, first for DS4, and third for DS9. (ii) The values for the coefficient of determination (R²) for all studied models are close to 1. Therefore, it can be said that all studied models are suitable for modeling the considered software projects. Figure 9 illustrates the actual and predicted results based on the four considered models. According to this figure, we can see that all the selected models are well-fitted to the studied failure data. In particular, the proposed model is one of the most suitable for modeling the selected datasets.

Table 2: Names and MVFs of the considered models.
Model name | MVF
Goel-Okumoto (GO) model [4,27] | m(t) = a(1 − e^(−bt))
Delayed S-shaped model [5,28] | m(t) = a[1 − (1 + bt)e^(−bt)]
Inflection S-shaped model [6] | m(t) = a(1 − e^(−bt))/(1 + βe^(−bt))
where β > 0 is the inflection factor; a > 0 is the expected number of software faults to be eventually detected; b is a constant of proportionality; and θ is the shape parameter for the Lindley distribution.

Conclusions
In this article, we propose a new reliability model based on the Lindley distribution. Several essential characteristics of our proposed model, the NHPP L model, were obtained. The considered model parameters were estimated using the NLSE and WNLSE methods. The performance of the estimators for each studied method was evaluated using different criteria based on eight datasets. A comparative study between the proposed model and three other common models was conducted based on five real datasets. The WNLSE method was determined to have better performance than the NLSE method for the chosen failure datasets. Thus, it is recommended that the WNLSE method be used with the NHPP models. The performance of the NHPP L model is encouraging in comparison with other selected models. The present study can be extended by incorporating SRGMs with learning effects to increase the flexibility of models and to enhance their capability for accurately describing software failure phenomena.

Abbreviations
SRGMs: Software reliability growth models
NHPP: Nonhomogeneous Poisson process
PDF: Probability density function
CDF: Cumulative distribution function
MVF: Mean value function
MTBF: Mean time between failures
MSE: Mean of squared errors
R²: Coefficient of multiple determination
NLSE: Nonlinear least square estimation
WNLSE: Weighted nonlinear least square estimation
MLE: Maximum likelihood estimation

Data Availability
Previously published data were used to support this study. These prior studies are cited at relevant places within the text as references.