Several software reliability growth models (SRGMs) have been developed to track and measure the growth of reliability. When a software system is large and the number of faults detected during the testing phase becomes large, the change in the number of faults detected and removed through each debugging becomes sufficiently small compared with the initial fault content at the beginning of the testing phase. In such a situation, we can model the software fault detection process as a stochastic process with continuous state space. In this paper, we propose a new software reliability growth model based on an Itô-type stochastic differential equation (SDE). We consider an SDE-based generalized Erlang model with logistic error detection function. The model is estimated and validated on real-life data sets cited in the literature to show its flexibility. The proposed model, integrated with the concept of stochastic differential equations, performs comparatively better than the existing NHPP-based models.

Software reliability engineering is a fast-growing field. More than 60% of critical applications are dependent on software. The complexity of business software applications is also increasing.

Customers need products with high performance that can be sustained over time. Because of the high cost of fixing failures, safety concerns, and legal liabilities, organizations need to produce software that is reliable. There are several methodologies for developing software, but the questions that need to be addressed are how many times the software will fail and when, how to estimate testing effort, when to stop testing, and when to release the software. Also, for a software product we need to predict or estimate the maintenance effort: for example, how long the warranty period must be, how many defects can be expected and at what severity levels once the software is released, how many engineers are required to support the product and for how long, and so forth. Software reliability engineering (SRE) addresses all these issues, from design to testing to maintenance phases.

The Software Reliability Growth Model (SRGM) is a tool of SRE that can be used to evaluate the software quantitatively, develop test status, schedule status, and monitor the changes in reliability performance [

Ohba [

Systems with distributed computing improve the performance of a computing system and of individual users through parallel execution of programs, load balancing and sharing, and replication of programs and data. Ohba [

A number of faults are detected and removed during the long testing period before the system is released to the market. However, the users then find a number of faults, and the software company releases an updated version of the system. Thus, in this case, the number of faults that remain in the system can be considered to be a stochastic process with continuous state space [

For the estimation of the parameters of the proposed model, the Statistical Package for Social Sciences (SPSS) is used. The goodness of fit of the proposed model is compared with the NHPP-based Generalised Erlang Model [

The software fault-detection process is modeled as a stochastic process with a continuous state space.

The number of faults remaining in the software system gradually decreases as the testing procedure goes on.

Software is subject to failures during execution caused by faults remaining in the software.

The faults existing in the software are of three types: simple, hard, and complex. They are distinguished by the amount of testing effort needed to remove them.

During the fault isolation/removal, no new fault is introduced into the system and the faults are debugged perfectly.
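To make the first assumption concrete, the following is a minimal sketch of a continuous-state fault-detection process. It assumes a Yamada-style Itô SDE for a single fault type with a constant detection rate, dN(t) = (b − σ²/2)(a − N(t)) dt + σ(a − N(t)) dW(t), whose solution is N(t) = a(1 − exp(−bt − σW(t))). This form, the function names, and the parameter values are illustrative assumptions and are not taken from the proposed model.

```python
import math
import random


def simulate_fault_path(a, b, sigma, t_end, n_steps, rng):
    """Sample one path of N(t) = a * (1 - exp(-b*t - sigma*W(t))).

    Assumed form: closed-form solution of a Yamada-style Ito SDE with
    constant detection rate b and noise magnitude sigma (illustrative only).
    """
    dt = t_end / n_steps
    w = 0.0                       # Brownian motion W(t), starting at 0
    path = [0.0]                  # N(0) = 0: no faults detected yet
    for k in range(1, n_steps + 1):
        w += rng.gauss(0.0, math.sqrt(dt))   # independent N(0, dt) increment
        t = k * dt
        path.append(a * (1.0 - math.exp(-b * t - sigma * w)))
    return path


def expected_faults(a, b, sigma, t):
    """E[N(t)] for the assumed process: a * (1 - exp(-(b - sigma^2/2) * t))."""
    return a * (1.0 - math.exp(-(b - 0.5 * sigma ** 2) * t))


if __name__ == "__main__":
    rng = random.Random(1)        # fixed seed so the run is repeatable
    path = simulate_fault_path(a=100.0, b=0.1, sigma=0.05,
                               t_end=10.0, n_steps=50, rng=rng)
    print("simulated N(10):", path[-1])
    print("expected  N(10):", expected_faults(100.0, 0.1, 0.05, 10.0))
```

Each simulated path stays below the initial fault content a, and the average over many paths approaches the expected value, which is the quantity a mean value function would describe.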

Several SRGMs are based on the assumption of NHPP, treating the fault detection process during the testing phase as a discrete counting process. Recently Yamada et al. [

However, the behavior of

We now briefly describe the Generalised Erlang model with logistic error detection function. The proposed model combines this model with an SDE, as described below.

The model assumes that the testing phase consists of three processes, namely, failure observation, fault detection, and fault removal. The software faults are categorized into three types according to the amount of testing effort needed to remove them. The time delay between the failure observation and the subsequent fault removal is assumed to represent the testing effort. The faults are classified as simple if the time delay between the failure observation, fault detection, and fault removal is negligible. For the simple faults, the fault removal phenomenon is modeled by the exponential model of Goel and Okumoto [

From (

Now in the proposed model considering the three forms of

The total fault removal phenomenon of the proposed model is the sum of the mean removal phenomena for simple, hard, and complex faults, that is,

For proposed SRGM,

In this section, we present expressions for various software reliability measures. Information on the current number of detected faults in the system is important for assessing the progress of the software testing procedures. Since this number is a random variable in our models, its expected value is a useful measure. We have already calculated the expected value for our models in (

The instantaneous MTBF (denoted by

For simple faults,

For hard faults,

For complex faults,

The cumulative MTBF (denoted by MTBF_{C}) is the average time between failures from the beginning of the test for the proposed models:

The cumulative MTBF of the model is given as follows.

Simple faults:

Hard faults:

Complex faults:
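As a rough numerical illustration of the two measures in this section, the sketch below assumes the usual definitions: instantaneous MTBF as the reciprocal of the rate of change of the expected number of detected faults, and cumulative MTBF as elapsed testing time divided by the expected number of detected faults. It uses a single fault type with a constant detection rate b and noise magnitude sigma, for which the expected value under a Yamada-style SDE is a(1 − exp(−(b − σ²/2)t)). That mean, the function names, and the parameter values are assumptions made for illustration and are not the paper's expressions for simple, hard, or complex faults.

```python
import math


def mean_faults(t, a, b, sigma):
    """Assumed expected number of detected faults at time t."""
    k = b - 0.5 * sigma ** 2          # effective detection rate
    return a * (1.0 - math.exp(-k * t))


def instantaneous_mtbf(t, a, b, sigma):
    """Reciprocal of d E[N(t)] / dt for the assumed mean."""
    k = b - 0.5 * sigma ** 2
    detection_intensity = a * k * math.exp(-k * t)
    return 1.0 / detection_intensity


def cumulative_mtbf(t, a, b, sigma):
    """Elapsed time divided by expected detected faults (requires t > 0)."""
    return t / mean_faults(t, a, b, sigma)


if __name__ == "__main__":
    for t in (5.0, 10.0, 20.0):
        print(t,
              instantaneous_mtbf(t, 100.0, 0.1, 0.05),
              cumulative_mtbf(t, 100.0, 0.1, 0.05))
```

Under these assumptions both measures increase with testing time, which is the behaviour expected of a software product whose reliability is growing.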

Parameter estimation and model validation are important aspects of modeling. The mathematical equations of the proposed SRGM are nonlinear. Finding a least-squares solution for a nonlinear model is more difficult than for a linear one and requires numerical algorithms. Statistical software packages such as the Statistical Package for Social Sciences (SPSS) help to overcome this problem. For the estimation of the parameters of the proposed model, the method of least squares (nonlinear regression) has been used. Nonlinear regression is a method of finding a nonlinear model of the relationship between the dependent variable and a set of independent variables. Unlike traditional linear regression, which is restricted to estimating linear models, nonlinear regression can estimate models with arbitrary relationships between independent and dependent variables.
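The idea of nonlinear least squares can be sketched in a few lines. The example below is not the SPSS procedure used in this paper. It is a plain Gauss-Newton iteration with step halving, written in pure Python, that fits the two-parameter exponential mean value function a(1 − exp(−bt)) to synthetic, noise-free data. The function names, starting values, and data are hypothetical.

```python
import math


def go_mean(t, a, b):
    """Exponential (Goel-Okumoto type) mean value function."""
    return a * (1.0 - math.exp(-b * t))


def sse(times, counts, a, b):
    """Sum of squared residuals between data and the model."""
    return sum((y - go_mean(t, a, b)) ** 2 for t, y in zip(times, counts))


def fit_least_squares(times, counts, a0, b0, max_iter=100, tol=1e-12):
    """Gauss-Newton with step halving for the two parameters (a, b)."""
    a, b = a0, b0
    current = sse(times, counts, a, b)
    for _ in range(max_iter):
        # Accumulate the normal equations J^T J * step = J^T r.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for t, y in zip(times, counts):
            e = math.exp(-b * t)
            r = y - a * (1.0 - e)      # residual
            da = 1.0 - e               # d(model)/da
            db = a * t * e             # d(model)/db
            s_aa += da * da
            s_ab += da * db
            s_bb += db * db
            g_a += da * r
            g_b += db * r
        det = s_aa * s_bb - s_ab * s_ab
        if abs(det) < 1e-300:
            break                      # singular system, give up
        step_a = (s_bb * g_a - s_ab * g_b) / det
        step_b = (s_aa * g_b - s_ab * g_a) / det
        # Halve the step until the sum of squares actually decreases.
        scale = 1.0
        while scale > 1e-8:
            trial_a = a + scale * step_a
            trial_b = b + scale * step_b
            trial = sse(times, counts, trial_a, trial_b)
            if trial < current:
                break
            scale *= 0.5
        else:
            break                      # no improving step found
        improvement = current - trial
        a, b, current = trial_a, trial_b, trial
        if improvement < tol:
            break
    return a, b


if __name__ == "__main__":
    times = list(range(1, 21))
    counts = [go_mean(t, 100.0, 0.2) for t in times]   # synthetic data
    print(fit_least_squares(times, counts, a0=120.0, b0=0.1))
```

For the full model, with several rates and proportions, a library routine or a statistical package is the more practical choice; the sketch only shows why an iterative algorithm is needed once the model is nonlinear in its parameters.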

The performance of SRGM is judged by its ability to fit the past software fault data (goodness of fit).

The term goodness of fit is used in two different contexts. In one context, it refers to the question of whether a sample of data came from a population with a specific distribution. In the other, it refers to the question of how well a mathematical model (e.g., a linear regression model) fits the data.

The model under comparison is used to simulate the fault data, the difference between the expected values, _{i}

We define this coefficient as 1 minus the ratio of the sum of squares resulting from the trend model to that from the constant model [

The difference between the observation and prediction of number of failures at any instant of time

The average of the PEs is known as bias. The lower the value of bias, the better the goodness of fit [

The standard deviation of PE is known as variation.

It is a measure of closeness with which a model predicts the observation.
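The comparison criteria described above can be computed as in the following sketch. It assumes the commonly used definitions: prediction error PE_i as observed minus predicted, MSE as the mean of the squared errors, bias as the mean PE, variation as the sample standard deviation of the PEs (divisor n − 1), RMSPE as the square root of bias² + variation², and the coefficient of multiple determination as 1 minus the residual sum of squares over the corrected total sum of squares. The sign convention for PE, the function name, and the sample data are assumptions.

```python
import math


def goodness_of_fit(observed, predicted):
    """Return MSE, bias, variation, RMSPE and R^2 under the assumed definitions."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]   # PE_i (assumed sign)
    residual_ss = sum(e * e for e in errors)
    mse = residual_ss / n
    bias = sum(errors) / n
    variation = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    rmspe = math.sqrt(bias ** 2 + variation ** 2)
    mean_obs = sum(observed) / n
    total_ss = sum((o - mean_obs) ** 2 for o in observed)   # constant-model SS
    r_squared = 1.0 - residual_ss / total_ss
    return {"mse": mse, "bias": bias, "variation": variation,
            "rmspe": rmspe, "r_squared": r_squared}


if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0, 40.0]     # hypothetical cumulative fault counts
    predicted = [12.0, 18.0, 33.0, 39.0]    # hypothetical model estimates
    for name, value in goodness_of_fit(observed, predicted).items():
        print(name, round(value, 4))
```

Under these definitions RMSPE² equals bias² + variation². The values reported for the proposed SRGM on DS-I (bias −0.06975, variation 9.20469, RMSPE 9.204957) are consistent with this relation to within rounding.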

To check the validity of the proposed model and to assess its software reliability growth, it has been tested on three data sets. The proposed model has been compared with the NHPP-based Generalised Erlang Model [

This data is cited from Brooks and Motley (1980) [

Goodness of Fit Curve for DS-I.

This data is cited from Misra [

Goodness of Fit Curve for DS-II.

This data is cited from Fedora Core Linux [

Schedule of release candidate version in Fedora core 7.

Date | Event |
---|---|
1 February 2007 | Test 1 release |
29 February 2007 | Test 2 release |
27 March 2007 | Test 3 release |
24 April 2007 | Test 4 release |
31 May 2007 | Fedora 7 general availability |

Parameter for DS-I (Brooks DS-2 1301 faults)

Models under comparison | Parameter estimation | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|
Proposed SRGM | 1339 | .089 | .248 | .251 | 48 | .264 | .669 | .067 | .194 | .001 | .111 |
Generalised Erlang model [ | 1453 | .376 | .000 | .165 | — | .011 | .000 | .989 | — | — | — |

Comparison criteria for DS-I

Models under comparison | R^{2} | MSE | Bias | Variation | RMSPE |
---|---|---|---|---|---|
Proposed SRGM | 1.00 | 81.3734 | −0.06975 | 9.20469 | 9.204957 |
Generalised Erlang model [ | .994 | 1200.522 | 0.939148 | 35.14148 | 35.15403 |

Parameter for DS-II (Misra 231 faults)

Models under comparison | Parameter estimation | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|
Proposed SRGM | 420 | .059 | .104 | .378 | 66.593 | .64 | .342 | .018 | .048 | .185 | .599 |
Generalised Erlang model [ | 561 | .022 | .012 | .041 | — | .64 | .342 | .018 | — | — | — |

Comparison criteria for DS-II

Models under comparison | R^{2} | MSE | Bias | Variation | RMSPE |
---|---|---|---|---|---|
Proposed SRGM | .998 | 7.22 | −0.7104 | 2.626231 | 2.720631 |
Generalised Erlang model [ | .995 | 22.09 | 0.6943 | 4.711796 | 4.762687 |

In this paper, the test data for the end of the Test 3 release version are considered, where 164 faults were detected. The parameter estimation results and the goodness-of-fit results for the proposed SRGM are given in Table

Parameter for DS-III

Models under comparison | Parameter estimation | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|
Proposed SRGM | 215 | .189 | .135 | .113 | 8 | .220 | .640 | .140 | .328 | .072 | .346 |
Generalised Erlang model [ | 195 | .063 | .011 | .075 | — | .212 | .004 | .784 | — | — | — |

Comparison criteria for DS-III

Models under comparison | R^{2} | MSE | Bias | Variation | RMSPE |
---|---|---|---|---|---|
Proposed SRGM | .998 | 5.88010 | 0.070583 | 2.541294 | 2.54227 |
Generalised Erlang model [ | .997 | 7.95905 | 0.149816 | 2.84224 | 2.84618 |

Goodness of Fit Curve for DS-III.

The values of initial fault contents

Tables

Tables

The curves given in Figures

This paper presents an SRGM for different categories of faults based on Itô type Stochastic Differential Equations. In this paper, we have extended the SDE approach adopted by Yamada et al. [

MLE: Maximum likelihood estimate

DS: Data set

R^{2}: Coefficient of multiple determination

SPSS: Statistical package for social sciences

MSE: Mean square error

PE: Prediction error

RMSPE: Root mean square prediction error

FDR: Fault detection rate.

The first author acknowledges the financial support provided by the Defence Research and Development Organization, Ministry of Defence, Government of India under Project no. ERIP/ER/0703635/M/01/977.