The reliability modeling of a module in a turbine engine requires knowledge of its failure rate, which can be estimated by identifying statistical distributions describing the percentage of failures per component within the turbine module. Correctly defining the statistical failure behavior of each component depends heavily on the engineer's skill and may show significant discrepancies with respect to the historical data. There is no formal methodology for approaching this problem, and a large number of labor hours are spent trying to reduce the discrepancy by manually adjusting the distributions' parameters. This paper addresses this problem and provides a simulation-based optimization method that minimizes the discrepancy between the simulated and the historical percentages of failures for turbine engine components. The proposed methodology optimizes the parameter values of the components' failure distributions within each component's likelihood confidence bounds. A complete test of the proposed method is performed on a turbine engine case study. The method can serve as a decision-making tool for maintenance, repair, and overhaul companies: it can reduce the cost of labor associated with finding appropriate values of the distribution parameters for each component/failure mode in the model and increase the accuracy of the predicted mean time to failure (MTTF).
Gas turbine engines exhibit several failure modes. Typically, the failure rates of different components are recorded throughout the life of the engine. The first step is then to fit a statistical distribution to the initial data for each component or failure mode. The mean percentage of failures computed from the fitted distribution may differ from the historical data because of the limited quantity or quality of the data. A common practice, therefore, is to adjust the distribution parameters based on the intelligence and experience of the engineers. Currently, there is no formal methodology for approaching this complex problem.
The reliability model of a turbine engine is a complex stochastic system, and the randomness embedded in such a system makes its analysis a difficult task. Unfortunately, this uncertainty is a common and unavoidable characteristic of real-world systems. The development of simulation as a means to evaluate stochastic systems enhances the ability to obtain performance-measure estimates under many given conditions. Furthermore, these estimates are considerably more accurate than those of analytical techniques, which often make crude assumptions about the system's condition and operation.
Computer simulations are highly effective at answering evaluative questions about a stochastic system. However, it is often necessary to determine the values of the model's decision parameters such that a performance measure is maximized or minimized. This type of problem can be tackled by simulation-based optimization methods [
This paper develops a simulation-based optimization method to reduce the discrepancy between simulated and historical failures. The method not only significantly reduces the cost of the labor hours associated with performing this activity manually but also finds appropriate values of the distribution parameters for each turbine component/failure mode so that the simulation's outcome better agrees with the real-life data.
This research was done in collaboration with an aerospace maintenance, repair, and overhaul company, referred to here as ABC Company. A 54-component turbine engine is considered as a case study. For confidentiality reasons, the names of the components, their distributions, and the distributions' parameters are not disclosed.
The structure of the paper is organized as follows. Section
The engine model is developed using the following assumptions:
Series system for the mock engine model.
Equation (
The proposed simulation-based optimization method searches for a high-quality solution (i.e., values of the components' distribution parameters within their feasible regions) using the simulated annealing (SA) metaheuristic. The method is broken down into two phases.
Phase I consists of a Monte Carlo simulation that obtains the simulated percentage of failures per component, given an initial set of distribution parameters. These simulated percentages are used to compute the initial discrepancy between the simulated and the historical percentages of failures, expressed as the sum of squared errors (SSE). Other combinations of feasible parameter values may, however, lead to lower SSE values; that is, other high-quality solutions could exist in nearby neighborhoods. These neighborhoods are explored in the next phase of the method.
In Phase II, an SA-based search procedure explores the neighborhood within the feasible region of the distribution parameters in order to locate solutions of potentially higher quality. The objective is to minimize the SSE obtained in Phase I. The SA-based procedure works well for Phase II because, unlike gradient methods, it can move away from local minima. The SA neighborhood function is defined by the bounds on the distribution parameters of each component. The following section discusses the implementation in further detail.
A Monte Carlo simulation is often used to obtain not only the simulated failure times per component, but also the distribution of the statistical estimates of the system’s TTFs: MTTF and STTF.
At this stage of the proposed methodology, the distribution and the values of the distribution parameters for each component (obtained from the initial data) are the primary inputs. The data set for each component contains suspensions (censored data), which represent units that have not failed by the failure mode in question; they may have failed by a different failure mode or not failed at all [
In the case of life data, these data sets are composed of units that did not fail. For example, if five units were tested and only three had failed by the end of the test, it would have
A random number, drawn from a uniform distribution over the interval
Inverse CDF equations for different distributions in the case study.
Distribution | Inverse CDF |
---|---|
Normal | t = μ + σΦ⁻¹(U) |
Log normal | t = exp(μ′ + σ′Φ⁻¹(U)) |
Weibull (2 parameters) | t = η[−ln(1 − U)]^(1/β) |
Exponential | t = −μ ln(1 − U) |

Here U is the uniform random number drawn in the previous step and Φ⁻¹(·) is the inverse standard normal CDF.
For the normal and lognormal distributions there is no closed-form expression for the inverse CDF, but good mathematical approximations exist for that purpose. The approximation utilized in this paper is the one developed by Weisstein [
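As an illustrative sketch (not the paper's exact code), the inverse-transform sampling of a component's time to failure can be written as follows; the parameter names (`eta`, `beta`, `mu`, `sigma`) follow the paper's symbol list, the formulas are the standard inverse CDFs tabulated above, and `statistics.NormalDist.inv_cdf` stands in for the cited normal-quantile approximation:

```python
import math
import random
from statistics import NormalDist

def sample_ttf(dist, params, u):
    """Inverse-CDF (inverse transform) sampling of a time to failure.

    u is a uniform(0, 1) random number; params holds the component's
    distribution parameters in the paper's notation.
    """
    if dist == "weibull":      # t = eta * (-ln(1 - u))^(1/beta)
        return params["eta"] * (-math.log(1.0 - u)) ** (1.0 / params["beta"])
    if dist == "exponential":  # t = -mu * ln(1 - u), mu = mean TTF
        return -params["mu"] * math.log(1.0 - u)
    if dist == "normal":       # t = mu + sigma * Phi^-1(u)
        return NormalDist(params["mu"], params["sigma"]).inv_cdf(u)
    if dist == "lognormal":    # t = exp(mu' + sigma' * Phi^-1(u))
        return math.exp(NormalDist(params["mu"], params["sigma"]).inv_cdf(u))
    raise ValueError(f"unknown distribution: {dist}")

random.seed(1)
t = sample_ttf("weibull", {"eta": 1000.0, "beta": 2.5}, random.random())
```

For example, drawing u = 1 − e⁻¹ from a Weibull component returns exactly its scale parameter η, a convenient sanity check.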
Since the engine is modeled as a series system, the system's TTF is computed by taking the component with the lowest TTF as the one that fails the system (
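A minimal sketch of this Phase I Monte Carlo loop is shown below, assuming for brevity that every component is Weibull-distributed (the case-study mix of distributions would use the appropriate inverse CDF per component); the component parameter values are illustrative only:

```python
import math
import random
from statistics import mean, stdev

def weibull_ttf(eta, beta, u):
    # Inverse CDF of a 2-parameter Weibull: t = eta * (-ln(1 - u))^(1/beta)
    return eta * (-math.log(1.0 - u)) ** (1.0 / beta)

def monte_carlo(components, n_samples, rng):
    """One Phase I run over a series system.

    components: list of (eta, beta) pairs, one per component.
    Returns (MTTF, STTF, simulated percentage of failures per component),
    where the system TTF of each replication is the minimum component TTF.
    """
    fail_counts = [0] * len(components)
    system_ttfs = []
    for _ in range(n_samples):
        ttfs = [weibull_ttf(eta, beta, rng.random()) for eta, beta in components]
        k = min(range(len(ttfs)), key=ttfs.__getitem__)  # component that fails first
        fail_counts[k] += 1
        system_ttfs.append(ttfs[k])
    pct = [100.0 * c / n_samples for c in fail_counts]
    return mean(system_ttfs), stdev(system_ttfs), pct

rng = random.Random(42)
mttf, sttf, pct = monte_carlo([(1000, 2.0), (800, 1.5), (1200, 3.0)], 5000, rng)
```

The per-component percentages returned here are what get compared against the historical percentages in the next step.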
The simulated annealing (SA) algorithm is a well-known local search metaheuristic used to address discrete, continuous, and multiobjective optimization problems. Its ease of implementation, convergence properties, and ability to escape local optima have made it a popular technique during the past two decades. The application of the SA algorithm to optimization problems emerges from the work of Kirkpatrick et al. [
The SA algorithm is based on the principles of statistical mechanics whereby the annealing process requires heating and then a slow cooling of a substance to obtain a strong crystalline structure. The strength of the structure depends on the rate of cooling of the substance. If the initial temperature is not sufficiently high or a fast cooling is applied, imperfections (metastable states) are obtained. In this case, the cooling solid will not attain thermal equilibrium at each temperature. Strong crystals are grown from careful and slow cooling. The SA algorithm simulates the energy changes in a system subjected to a cooling process until it converges to an equilibrium state (steady frozen state). This scheme was developed in 1953 by Metropolis et al. [
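The Metropolis scheme referenced above reduces to a simple acceptance rule; as a sketch (function and parameter names are illustrative):

```python
import math
import random

def accept(delta_sse, temperature, rng):
    """Metropolis criterion: always accept an improving move; accept a
    worsening move with probability exp(-delta/T). At high temperature
    almost anything is accepted; as T cools, uphill moves become rare,
    which is what lets SA escape local minima early and converge later."""
    if delta_sse <= 0:
        return True
    return rng.random() < math.exp(-delta_sse / temperature)
```

Here `delta_sse` would be the change in the objective (SSE) between the candidate and current parameter vectors.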
The SA-based procedure has been successfully implemented to address problems in manufacturing enterprises such as [
After all iterations of the Monte Carlo simulation, the percentage of failures obtained for each component is compared to the historical data to obtain the squared error. These per-component squared errors between the simulated and historical percentages of failures are summed to form the sum of squared errors (SSE).
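The SSE computation itself is a one-liner; the example values below are taken from the first two rows of the results table (historical 0.86 and 1.82 versus simulated 0.46 and 1.66):

```python
def sse(simulated, historical):
    """Sum of squared errors between the simulated and historical
    percentages of failures, taken over all components."""
    return sum((s - h) ** 2 for s, h in zip(simulated, historical))

value = sse([0.46, 1.66], [0.86, 1.82])  # 0.16 + 0.0256
```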
The objective function for the optimization process is shown in (
The simulation-based optimization method will propose new values of the distribution parameters (
Feasible region (contour plot) for the distribution parameters of one component.
In the case of infeasibility, that is, when the distribution parameters at the new state do not satisfy (
The shaded region in Figure
A schematic diagram of the SA-based procedure, used to determine the best value of the distributions’ parameters, is presented in Figure
Schematic diagram of the SA-based optimization procedure.
Obtaining the confidence bounds of the distribution parameters is an integral part of the proposed methodology. Multiple methods exist for obtaining these bounds. The following section provides the background behind the selection of likelihood ratio-based confidence bounds as the preferred method, as well as the steps for obtaining the bounds of the distribution parameters.
Our data set contains suspensions (censored data); therefore, the procedure for computing confidence bounds must include the censored data in the parameter estimation. This section presents the methodology for obtaining the likelihood ratio-based confidence bounds of the distribution parameters.
In the general case, the distribution parameters are estimated using the maximum likelihood estimation (MLE) method with a likelihood function modified for censored data. Using these estimated parameters, confidence bounds can be calculated. The parameters are estimated using the "mlecustom" function of MATLAB. The general form of the likelihood function for censored samples is [
For example, if the time to failure distribution is Weibull-distributed, the modified likelihood function for censored samples would be
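As a sketch of that modified likelihood in its log form (failed units contribute the log-density ln f(t); suspended units contribute the log-survival ln S(t) = ln(1 − F(t)); the data below are hypothetical):

```python
import math

def weibull_censored_loglik(failures, suspensions, eta, beta):
    """Log-likelihood of a 2-parameter Weibull with right-censored data.

    ln f(t) = ln(beta/eta) + (beta - 1) ln(t/eta) - (t/eta)^beta
    ln S(t) = -(t/eta)^beta
    """
    ll = 0.0
    for t in failures:     # failed units: log-density
        ll += math.log(beta / eta) + (beta - 1) * math.log(t / eta) - (t / eta) ** beta
    for t in suspensions:  # suspended units: log-survival
        ll -= (t / eta) ** beta
    return ll

ll = weibull_censored_loglik([120.0, 340.0, 510.0], [600.0, 600.0], eta=500.0, beta=1.8)
```

With β = 1 this reduces to the exponential case, which gives a quick check of the implementation.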
The "mlecustom" function of MATLAB was used to compute the bounds of the distribution parameters using the normal approximation method. This method is used in most commercial statistical packages because of the relative ease of computing the bounds. However, its performance can be poor when the sample size is not large or when heavy censoring is present [
Since the input data set contains heavy censoring, computing confidence bounds on the parameters via the normal approximation is not recommended. Fisher matrix bounds, in turn, tend to be more optimistic than nonparametric rank-based bounds, which is a particular concern for small sample sizes; in such cases, other techniques for calculating confidence bounds, such as the likelihood ratio bounds, are usually preferred [
Without loss of generality, the use of the likelihood ratio statistic for developing confidence intervals about a parameter of interest (
As its name implies, the LR statistic test is a ratio of likelihood functions. However, it is more convenient to work with the log form, which is computed as the difference between two log-likelihood expressions. Specifically,
An asymptotically correct confidence interval or region on
The LR confidence intervals can be graphically identified with the use of the likelihood contours, which consist of all the values of
Solutions to
Load the experimental data and define separate sets of information containing the failure and the censored data points from the input data set.
Define the custom probability density function (PDF) and cumulative density function (CDF) equations based on the distribution of the component.
Use the “mlecustom()” function in MATLAB to estimate the distribution’s parameters values.
Compute the output value of the likelihood function (i.e., the modified likelihood function for a component with censored data points) for the parameter estimates obtained in Step
Obtain the graphical estimate of confidence intervals of the parameters satisfying (
Obtain the chi square statistic value and substitute that value in (
Using the graphical estimates of the confidence intervals on the parameters as the initial guess, two nonlinear optimization problems are solved to obtain accurate LR limits. Since there are two unknowns (e.g., the Weibull shape and scale parameters) and only one equation, an iterative method is used to obtain the parameter values (i.e., for given values of the scale, obtain the maximum and minimum values of the shape parameter that satisfy (
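One slice of that iterative search can be sketched as follows: with the scale held fixed at an estimate, find the two shape values where the log-likelihood drops by half the chi-square critical value. The data, the fixed scale value, and the grid-search stand-in for the MLE step are all illustrative assumptions, not the paper's actual inputs:

```python
import math

def loglik(beta, eta, failures, suspensions):
    # Censored 2-parameter Weibull log-likelihood (failures contribute
    # ln f(t), suspensions contribute ln S(t) = -(t/eta)^beta).
    ll = 0.0
    for t in failures:
        ll += math.log(beta / eta) + (beta - 1) * math.log(t / eta) - (t / eta) ** beta
    for t in suspensions:
        ll -= (t / eta) ** beta
    return ll

# Hypothetical data and a fixed scale estimate (stand-ins for the MLE step).
failures = [50.0, 80.0, 120.0, 160.0]
suspensions = [200.0, 200.0]
eta_hat = 150.0

# Coarse grid search for the shape estimate with the scale held fixed
# (a stand-in for MATLAB's mlecustom).
betas = [0.05 * i for i in range(1, 200)]
beta_hat = max(betas, key=lambda b: loglik(b, eta_hat, failures, suspensions))

def lr_bounds_beta(chi2=3.841):
    """Shape-parameter LR bounds at fixed scale: solve
    lnL(beta) = lnL(beta_hat) - chi2/2 on each side of beta_hat,
    where chi2 = 3.841 is the 95% critical value with 1 d.o.f."""
    target = loglik(beta_hat, eta_hat, failures, suspensions) - chi2 / 2.0
    def bisect(a, b):  # invariant: loglik(a) < target <= loglik(b)
        for _ in range(100):
            m = 0.5 * (a + b)
            if loglik(m, eta_hat, failures, suspensions) < target:
                a = m
            else:
                b = m
        return 0.5 * (a + b)
    return bisect(1e-3, beta_hat), bisect(50.0, beta_hat)

lower, upper = lr_bounds_beta()
```

Repeating this over a grid of scale values traces out the likelihood contour described above.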
The following steps summarize the proposed methodology.
The initial data, consisting of failure times and operating times of unfailed units, are provided. The input data are used to determine the distribution describing each component's failure behavior and the initial values of the distribution parameters for each component using Weibull++.
The components’ distributions and their parameters are used to simulate the expected failure time (
The simulated percentages of failure are obtained from the Monte Carlo simulation for all the components in the system. The discrepancies (e.g., SSE) between the simulated and historical percentages of failures are computed.
The simulated annealing algorithm minimizes the sum of square errors (SSE) subject to constraints (e.g., the confidence bounds on distribution parameters).
A schematic diagram of the proposed method is shown in Figure
Schematic flowchart of the proposed method.
The turbine engine model consists of 54 components/failure modes. Table
Distribution fittings.
Distribution | Parameters | Component numbers |
---|---|---|
Weibull (2P) | Shape (β), scale (η) | 1, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 19, 20, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54 |
Log normal | Log-mean, log-standard deviation | 2, 3, 17, 18, 22, 44 |
Normal | Mean (μ), standard deviation (σ) | 4 |
Exponential | Mean (μ) | 21 |
A statistical design of experiments can be defined as a series of tests in which purposeful changes are made to the input variables of a process or system so that changes in the output response can be observed and quantified [
In order to tune the parameters of the optimization algorithm, a design of experiments (DOE) is performed to understand the cause-and-effect relationships between the SA's parameters and the output responses.
The factors that were selected during the DOE development are
Factors and levels during the DOE development.
Factor name | Level 1 | Level 2 | Level 3 |
---|---|---|---|
Number of accepted trials | 50 | 100 | 200 |
Number of Markov chains | 1 | 2 | 4 |
Cooling rate | 0.9 | 0.95 | 0.98 |
The responses are (1) the sum of squared errors (SSE), (2) the mean of squared errors (MSE), and (3) the average CPU time.
The DOE analysis was performed using Minitab. A total number of
From the ANOVA results, it was observed that the cooling rate (
For the sake of brevity, only selected plots are presented. Figure
Main effects plot for SSE. The gray background represents terms that were not significant in the ANOVA.
From Figure , the best levels were identified for the cooling rate, the number of accepted trials, and the number of Markov chains.
The initial system temperature was fixed at 10. To obtain this value, the temperature is initialized at a very low value and an experimental run is performed in which the temperature is raised by a fixed ratio until a minimum acceptance ratio defined by the practitioner (0.8) is met; the acceptance ratio is computed as the number of accepted trials (
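This calibration run can be sketched as follows, using the expected Metropolis acceptance probability over a batch of observed uphill SSE changes; the batch of deltas, the starting temperature, and the growth ratio are illustrative assumptions:

```python
import math

def expected_acceptance(deltas, t):
    # Expected Metropolis acceptance probability exp(-delta/T),
    # averaged over a batch of observed uphill (worsening) moves.
    return sum(math.exp(-d / t) for d in deltas) / len(deltas)

def initial_temperature(deltas, t0=1e-3, ratio=1.5, min_accept=0.8):
    """Start from a very low temperature and raise it geometrically
    until the acceptance ratio meets the practitioner's minimum (0.8)."""
    t = t0
    while expected_acceptance(deltas, t) < min_accept:
        t *= ratio
    return t

t_init = initial_temperature([1.0, 2.0, 5.0])
```

Because the acceptance probability increases monotonically with temperature, the loop always terminates.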
A sensitivity analysis was conducted to determine the best Monte Carlo sample size (the sample size is not an SA algorithmic parameter, but it is a parameter of the overall solution procedure). The SSE becomes stable for Monte Carlo sample sizes of 10,000 and above; that is, the SSE is not sensitive to changes in the sample size beyond 10,000 samples. Conversely, if the sample size of the Monte Carlo loop is below 10,000, the SSE increases considerably for this particular problem. Hence, the number of Monte Carlo samples was fixed at 10,000.
The evaluation of the test case was performed based on the outcomes of the previously presented DOE analysis. After conducting a computational evaluation of the test case, the value of the SSE was found to be 35.1834 having an average CPU time of 55,239 seconds. Table
Comparison of the responses (before and after the optimization).
 | Sum of squared errors (SSE) | Mean of squared errors (MSE) |
---|---|---|
Before optimization | 50.6541 | 0.9380 |
After optimization | 35.1834 | 0.6515 |
The SA-based procedure was executed five times considering the best combination of algorithmic parameters. Figure
The average performance of the SA-based procedure.
Table
Comparison of the responses (before and after the optimization).
Component number | Historical percentage of failures | Simulated percentage (before) | Discrepancy (before) | Squared error (before) | Simulated percentage (after) | Discrepancy (after) | Squared error (after) |
---|---|---|---|---|---|---|---|
1 | 0.86 | 0.5211 | −0.3389 | 0.1149 | 0.46 | −0.4 | 0.1600 |
2 | 1.82 | 1.5673 | −0.2527 | 0.0639 | 1.66 | −0.16 | 0.0256 |
3 | 3.53 | 3.6 | 0.07 | 0.0049 | 3.76 | 0.23 | 0.0529 |
4 | 0.88 | 0.5947 | −0.2853 | 0.0814 | 0.63 | −0.25 | 0.0625 |
5 | 2.4 | 2.5271 | 0.1271 | 0.0162 | 2.37 | −0.03 | 0.0009 |
6 | 1.31 | 1.2096 | −0.1004 | 0.0101 | 1.22 | −0.09 | 0.0081 |
7 | 0.76 | 0.6553 | −0.1047 | 0.0110 | 0.78 | 0.02 | 0.0004 |
8 | 0.4 | 3.5392 | 3.1392 | 9.8546 | 3.04 | 2.64 | 6.9696 |
9 | 1.04 | 0.1024 | −0.9376 | 0.8791 | 0.09 | −0.95 | 0.9025 |
10 | 5.35 | 5.0636 | −0.2864 | 0.0820 | 5.09 | −0.26 | 0.0676 |
11 | 4.56 | 4.21 | −0.35 | 0.1225 | 4.26 | −0.3 | 0.0900 |
12 | 8.32 | 7.9168 | −0.4032 | 0.1626 | 7.87 | −0.45 | 0.2025 |
13 | 2.5 | 2.0153 | −0.4847 | 0.2349 | 2.38 | −0.12 | 0.0144 |
14 | 1.39 | 1.1423 | −0.2477 | 0.0614 | 1.18 | −0.21 | 0.0441 |
15 | 2.8 | 2.5067 | −0.2933 | 0.0860 | 2.45 | −0.35 | 0.1225 |
16 | 0.3 | 0.259 | −0.041 | 0.0017 | 0.27 | −0.03 | 0.0009 |
17 | 0.51 | 0.499 | −0.011 | 0.0001 | 0.47 | −0.04 | 0.0016 |
18 | 0.91 | 1.8601 | 0.9501 | 0.9027 | 1.81 | 0.9 | 0.8100 |
19 | 1.34 | 0.9799 | −0.3601 | 0.1297 | 1.24 | −0.1 | 0.0100 |
20 | 0.68 | 0.4956 | −0.1844 | 0.0340 | 0.68 | 0 | 0.0000 |
21 | 0.45 | 0.4621 | 0.0121 | 0.0001 | 0.53 | 0.08 | 0.0064 |
22 | 0.51 | 0.4554 | −0.0546 | 0.0030 | 0.4 | −0.11 | 0.0121 |
23 | 0.25 | 0.1628 | −0.0872 | 0.0076 | 0.17 | −0.08 | 0.0064 |
24 | 11.07 | 6.3014 | −4.7686 | 22.7395 | 7.03 | −4.04 | 16.3216 |
25 | 4.03 | 4.1075 | 0.0775 | 0.0060 | 4.26 | 0.23 | 0.0529 |
26 | 0.55 | 0.0571 | −0.4929 | 0.2430 | 0.05 | −0.5 | 0.2500 |
27 | 0.7 | 0.6549 | −0.0451 | 0.0020 | 0.72 | 0.02 | 0.0004 |
28 | 5.8 | 4.4421 | −1.3579 | 1.8439 | 4.31 | −1.49 | 2.2201 |
29 | 0.38 | 0.2823 | −0.0977 | 0.0095 | 0.22 | −0.16 | 0.0256 |
30 | 3.32 | 2.1621 | −1.1579 | 1.3407 | 2.35 | −0.97 | 0.9409 |
31 | 0.5 | 0.4702 | −0.0298 | 0.0009 | 0.43 | −0.07 | 0.0049 |
32 | 0.99 | 0.7867 | −0.2033 | 0.0413 | 0.91 | −0.08 | 0.0064 |
33 | 0.45 | 0.3763 | −0.0737 | 0.0054 | 0.33 | −0.12 | 0.0144 |
34 | 0.4 | 0.3421 | −0.0579 | 0.0034 | 0.38 | −0.02 | 0.0004 |
35 | 0.63 | 0.822 | 0.192 | 0.0369 | 0.81 | 0.18 | 0.0324 |
36 | 0.85 | 1.0452 | 0.1952 | 0.0381 | 0.98 | 0.13 | 0.0169 |
37 | 0.4 | 0.561 | 0.161 | 0.0259 | 0.47 | 0.07 | 0.0049 |
38 | 0.38 | 0.5465 | 0.1665 | 0.0277 | 0.49 | 0.11 | 0.0121 |
39 | 0.23 | 0.3274 | 0.0974 | 0.0095 | 0.36 | 0.13 | 0.0169 |
40 | 6.02 | 6.3466 | 0.3266 | 0.1067 | 6.39 | 0.37 | 0.1369 |
41 | 2.98 | 4.0667 | 1.0867 | 1.1809 | 3.88 | 0.9 | 0.8100 |
42 | 0.61 | 0.9346 | 0.3246 | 0.1054 | 0.93 | 0.32 | 0.1024 |
43 | 0.6 | 0.8256 | 0.2256 | 0.0509 | 0.91 | 0.31 | 0.0961 |
44 | 0.76 | 0.2627 | −0.4973 | 0.2473 | 0.27 | −0.49 | 0.2401 |
45 | 6.28 | 9.0659 | 2.7859 | 7.7612 | 7.84 | 1.56 | 2.4336 |
46 | 1.23 | 1.8439 | 0.6139 | 0.3769 | 1.55 | 0.32 | 0.1024 |
47 | 0.85 | 1.1438 | 0.2938 | 0.0863 | 1.1 | 0.25 | 0.0625 |
48 | 1.66 | 2.1355 | 0.4755 | 0.2261 | 2.37 | 0.71 | 0.5041 |
49 | 1.29 | 1.8673 | 0.5773 | 0.3333 | 1.76 | 0.47 | 0.2209 |
50 | 0.71 | 0.9715 | 0.2615 | 0.0684 | 1.01 | 0.3 | 0.0900 |
51 | 2.11 | 2.9768 | 0.8668 | 0.7513 | 2.92 | 0.81 | 0.6561 |
52 | 0.38 | 0.558 | 0.178 | 0.0317 | 0.68 | 0.3 | 0.0900 |
53 | 0.5 | 0.7342 | 0.2342 | 0.0548 | 0.82 | 0.32 | 0.1024 |
54 | 0.45 | 0.6368 | 0.1868 | 0.0349 | 0.66 | 0.21 | 0.0441 |
Total | | | | 50.6541 | | | 35.1834 |
Figure
Histogram of percentages of failures before and after optimization.
In order to observe the shifting of the distribution’s parameters at the different evaluations of the DOE, contour plots were constructed for each component. Figure
Distribution parameters for the 45th component for different evaluation cases.
Original distribution’s parameters values and best found combination of distribution’s parameters values for the 45th component.
This paper presented a simulated annealing-based optimization method to minimize the discrepancy between historical and simulated percentages of failures in a turbine engine model. A DOE was performed to tune the algorithmic parameters. The results showed a 30% reduction of the SSE (from 50.6541 to 35.1834) for the engine model. The average CPU time was approximately 15 hours, mainly due to the calculations involved in the likelihood function. Alternative neighborhood and feasibility functions could be investigated by studying the trends in the shifting of the parameter values per component; for instance, it was observed that the distribution parameters shifted around the edge of the contour plot for some components.
The proposed simulation-based optimization method can serve as a decision-making tool for maintenance, repair, and overhaul companies: it can reduce the cost of labor associated with finding appropriate values of the distribution parameters for each component/failure mode in the model and increase the accuracy of the predicted mean time to failure (MTTF).
Future research lines involve parallelization of the algorithm to solve larger models (e.g., thousands of components) and comparing the performance of the simulated annealing with other metaheuristics such as Evolutionary Algorithms, Tabu Search, and Particle Swarm Optimization, among others.
The complementary error function or
Mean time to failure of the system
Standard deviation of time to failure of the system
Time to failure of the system
System TTF
Vector of
Failure time
Cumulative distribution function of failure time (
Number of components in the engine model
Scale parameter of a Weibull distribution
Shape parameter of a Weibull distribution
Mean of
Standard deviation of
Mean of a normal distribution or an exponential distribution
Standard deviation of a normal distribution.
The authors declare that there is no conflict of interest regarding the publication of this paper.
This research was supported by the Air Force Research Laboratory through General Dynamics Information Technology, Task Order F5702-11-04-SC63-01 (Prime Contract no. FA8650-11-D-5702/0004). This support is gratefully acknowledged. The authors wish to acknowledge Carlos Torres, Alan Lesmerises, and Eric Vazquez of StandardAero for their technical assistance (Distribution A: approved for public release, distribution unlimited; Case no. 88ABW-2014-4202).