Reliability is an important consideration in the design of durable systems, specifically in the early phase of product development. In this paper, a new methodology is proposed for the design for reliability of complex systems. The scarcity of specific test and field failure data is treated here as a challenge to implementing design for reliability for a new product. In the developed approach, the system is modeled and simulated using the reliability block diagram (RBD) method. Generic data are corrected to account for the effects of design and environment on the application. The integrated methodology evaluates the reliability of the system and assesses the importance of each component. In addition, the availability of the system is evaluated using Monte Carlo simulation. Available design alternatives with different components are analyzed for reliability optimization. Evaluating the reliability of complex systems in competitive design efforts is one application of this method. Its advantage is that it is applicable in the early design phase, where only limited failure data are available. Horizontal drilling equipment is used as a case study to assess the proposed method. Benchmarking the results against a system with more failure and maintenance data available verifies the effectiveness and performance of the presented method.
Today’s competitive world and increasing customer demand for highly reliable products make reliability engineering a more challenging task. Reliability analysis is one of the main tools for ensuring agreed delivery deadlines, which in turn sustain tangible factors such as customer goodwill and company reputation [
Design for reliability is an important research area, specifically in the early design phase of product development. In fact, reliability should be designed and built into products and systems at the earliest possible stages of product/system development. Reliability-targeted design is the most economical approach to minimizing the lifecycle costs of the product or system. One can achieve better product or system reliability at much lower cost by using these techniques. Otherwise, the majority of lifecycle costs are locked in phases other than design and development; one pays later in the product life for poor reliability consideration at the design stage. As an example, typical percentage costs in various lifecycle phases are given in Table
Lifecycle costs [
Lifecycle phases  Percentage costs 

Concept/feasibility  3 
Design/development  12 
Manufacture  35 
Operation/use  50 
In most recent research on design for reliability, field and test data were used as the main source of component reliability data; moreover, only a part of a system (e.g., the electrical or mechanical part) was studied, and hybrid electromechanical systems were not analysed as a whole.
Summary of research on design for reliability.
Reference  Year  Used method for modeling and simulation of system
Avontuur and van der Werff [  2001  ETA, FTA, FMEA
Youn and Choi [  2004  FORM, RIA, PMA
Yadav et al. [  2006  FMEA
Kumar et al. [  2007  Replacement and design change
Carrarini [  2007  MC
Cho and Lee [  2011  MC, FORM, SORM
Abo AlKheer et al. [  2011  MC & FORM
Tarashioon et al. [  2012  FMMEA
O’Halloran et al. [  2012  RBD, EDRPM
Soleimani [  2013  RBD, MC
Morad et al. [  2013  RBD, MC
This work examines a design for reliability methodology for complex systems in the early design phase. One of the main advantages of this method is that it considers other significant factors to correct the generic failure rates collected for different components. Typical factors include temperature factor
The main aims of this research are (i) to present an integrated methodology for design for reliability of complex systems for which sufficient experimental data are not available and (ii) to estimate the reliability parameters and optimize the reliability of the system by increasing the quality of components and changing its design (e.g., redundancy).
In Section
In this research, a methodology is developed for reliability evaluation of electromechanical systems. The proposed method’s flowchart is shown in Figure
Flowchart of the new method as an early design reliability tool.
Load stress factor (
In this paper, generic databases, for example, MIL-HDBK-217F, OREDA, and NPRD-95, are used as the primary source of component reliability data when specific reliability data for the system are inadequate. Expert judgment is used to estimate failure data for specific components for which no generic failure data are available.
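The correction of a generic failure rate can be illustrated with a minimal sketch. It assumes a multiplicative model in the style of handbook part-stress predictions, in which a base rate is scaled by dimensionless factors. The function name, the factor names, and all numeric values below are hypothetical and are not taken from the case study.

```python
from math import prod

def corrected_failure_rate(base_rate, factors):
    """Scale a generic (base) failure rate by multiplicative correction factors.

    base_rate: generic failure rate from a handbook, in failures per hour.
    factors:   mapping of factor name -> dimensionless multiplier (all > 0).
    """
    if base_rate < 0 or any(f <= 0 for f in factors.values()):
        raise ValueError("base rate must be >= 0 and all factors must be > 0")
    return base_rate * prod(factors.values())

# Hypothetical values, for illustration only.
generic_rate = 2.0e-6
factors = {"temperature": 1.5, "load_stress": 2.0, "environment": 4.0}
print(corrected_failure_rate(generic_rate, factors))
```

A factor above 1 represents a harsher condition than the one behind the generic data, and a factor below 1 represents a milder one.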
Basically, trend testing is accomplished using either a graphical method (i.e., probability plotting and time test on plot) or an analytical method (i.e., Mann test, Laplace test, and Military Handbook test). Nonparametric methods are alternatives for the analysis of the failure and repair data trend [
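As an illustration of an analytical trend test, the following sketch computes the Laplace statistic for a time-truncated failure record. The function name and the failure times are hypothetical. Under a homogeneous Poisson process the statistic is approximately standard normal, so an absolute value above about 1.96 suggests a trend at the 5% significance level.

```python
from math import sqrt

def laplace_statistic(failure_times, observation_end):
    """Laplace trend statistic for failures observed on (0, observation_end].

    Positive values indicate failures concentrated late in the record
    (deterioration); negative values indicate reliability growth.
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    mean_time = sum(failure_times) / n
    return (mean_time - observation_end / 2) / (observation_end * sqrt(1 / (12 * n)))

# Hypothetical cumulative failure times (hours), observed up to 850 hours.
times = [100, 300, 450, 560, 640, 700, 750, 790, 820, 840]
u = laplace_statistic(times, 850)
print(u, "trend suspected" if abs(u) > 1.96 else "no significant trend")
```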
The Monte Carlo simulation method is an artificial sampling method which may be used for solving complicated problems in analytic formulation and for simulating purely statistical problems [
The Monte Carlo computer procedure.
The sampling accounts for the dependency among variables if the trend analysis identifies a significant correlation between them. This process is repeated for a sufficient sample size to estimate availability values. Typical sampling for
Reliability and availability are two suitable metrics for the quantitative evaluation of system survival. Reliability is defined as the probability that the system accomplishes its mission without failure over a specified time period [
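A minimal sketch of such a sampling loop is given here for a single repairable item. It assumes independent, exponentially distributed times to failure and repair times, which is a simplification of the dependent sampling described above. The function name and parameter values are hypothetical.

```python
import random

def simulate_mean_availability(mttf, mttr, mission_time, n_runs, seed=0):
    """Estimate mean availability over a mission by Monte Carlo sampling.

    Each run alternates sampled up times (mean mttf) and repair times
    (mean mttr) until the mission time is reached, then the fraction of
    time spent in the up state is averaged over all runs.
    """
    rng = random.Random(seed)
    total_uptime = 0.0
    for _ in range(n_runs):
        clock, uptime = 0.0, 0.0
        while clock < mission_time:
            up = rng.expovariate(1.0 / mttf)
            uptime += min(up, mission_time - clock)  # truncate at mission end
            clock += up
            if clock >= mission_time:
                break
            clock += rng.expovariate(1.0 / mttr)     # repair (down) time
        total_uptime += uptime
    return total_uptime / (n_runs * mission_time)

# Hypothetical item: 100 h mean time to failure, 5 h mean repair time.
print(simulate_mean_availability(100.0, 5.0, 10_000.0, n_runs=200))
```

For a long mission the estimate should approach the steady-state value mttf / (mttf + mttr), which provides a simple sanity check on the sampling.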
According to the system-level load-strength interference relationship [
Reliability of the series system
Reliability of the parallel system
Reliability of the
If the strength does not degrade or the degradation can be ignored, the reliability that a system survives
A load-sharing system refers to a parallel system whose units equally share the system function. For a simple load-sharing system with two identical items, initially both units share the load, with times to failure distribution being
For exponential distribution,
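For the exponential case, the two-unit load-sharing reliability has a well-known closed form, sketched here. This is the textbook expression and may differ in notation from the one used in the paper. The names rate_half (failure rate of each unit while both share the load) and rate_full (failure rate of the surviving unit under full load) are illustrative assumptions.

```python
from math import exp, isclose

def load_sharing_reliability(t, rate_half, rate_full):
    """Reliability at time t of two identical units sharing a load.

    The system survives if both units survive to t, or if one unit fails
    at some earlier time and the survivor carries the full load until t.
    """
    both_survive = exp(-2 * rate_half * t)
    if isclose(2 * rate_half, rate_full, rel_tol=1e-12):
        # Limiting case of the general expression when 2*rate_half == rate_full.
        return both_survive * (1 + 2 * rate_half * t)
    ratio = 2 * rate_half / (2 * rate_half - rate_full)
    return both_survive + ratio * (exp(-rate_full * t) - both_survive)

# Hypothetical rates per hour: the survivor is stressed more after a failure.
print(load_sharing_reliability(1000.0, 1.0e-4, 3.0e-4))
```

When rate_full equals rate_half the expression reduces to the reliability of an ordinary two-unit active parallel system, which is a convenient check.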
Most practical systems are neither parallel nor series but exhibit some hybrid combination of the two. These systems are often referred to as parallel-series systems. Another type of complex system is one that is neither series nor parallel alone, nor parallel-series. For the analysis of all types of complex systems, Shooman [
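The basic RBD structures can be combined numerically as in the sketch below. It assumes statistically independent blocks with known reliabilities at a fixed time, which is simpler than the load-strength interference formulation referred to above. The function names and values are illustrative.

```python
from math import comb, prod

def series(reliabilities):
    """All blocks must work."""
    return prod(reliabilities)

def parallel(reliabilities):
    """At least one block must work."""
    return 1 - prod(1 - r for r in reliabilities)

def k_out_of_n(k, n, r):
    """At least k of n identical blocks, each with reliability r, must work."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))

# Parallel-series example: two redundant branches, each a series of two blocks.
branch = series([0.9, 0.9])
print(parallel([branch, branch]))
print(k_out_of_n(2, 3, 0.9))
```

Nesting these functions evaluates any parallel-series layout; structures that are neither series nor parallel need a different technique, such as decomposition or path enumeration.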
In this research, the RP method is used for nonrepairable but exchangeable [
Among the models for repairable systems, GRP is attractive for reliability analysis modelling, since it covers not only the RP and the NHPP, but also the intermediate “younger than old but older than new” repair assumption. GRP has been used in many applications, such as automobile industry [
The introduced GRP results in the so-called
For the GRP, the expected number of failures in
Kijima et al. [
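A GRP can be simulated with a virtual-age model, as in the following sketch. It assumes a Weibull distribution for the time to first failure and the Kijima type I update, in which each repair removes only part of the age accumulated since the previous failure. The repair effectiveness parameter q and all numeric values are assumptions for illustration: q = 0 reproduces the RP (as good as new) and q = 1 reproduces the NHPP (as bad as old).

```python
import random
from math import log

def simulate_grp_failures(alpha, beta, q, horizon, seed=0):
    """Sample failure times of a GRP on (0, horizon].

    alpha, beta: Weibull scale and shape of the time to first failure.
    q:           repair effectiveness (Kijima type I virtual age).
    """
    rng = random.Random(seed)
    clock, virtual_age, times = 0.0, 0.0, []
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        # Invert the conditional survival function R(v + x) / R(v) = u.
        x = alpha * ((virtual_age / alpha) ** beta - log(u)) ** (1 / beta) - virtual_age
        x = max(x, 0.0)         # guard against round-off
        clock += x
        if clock > horizon:
            return times
        times.append(clock)
        virtual_age += q * x    # Kijima type I update

for q in (0.0, 0.5, 1.0):
    print(q, len(simulate_grp_failures(100.0, 2.0, q, 1000.0)))
```

Averaging the failure counts over many seeds gives a Monte Carlo estimate of the expected number of failures, which has no closed form for intermediate values of q.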
Availability is defined as the probability that a repairable system is operating satisfactorily at any random point in lifecycle time [
Due to the application of both failures and maintenance downtime data, availability is generally used for measuring performance of the repairable items [
The importance measure is a means of identifying the most critical items. By ranking the items, a prioritization policy is planned so that the weakest items are identified and improved [
Importance measure
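One common choice, shown here only as an assumed example since the paper's exact expression is not reproduced above, is the Birnbaum importance: the sensitivity of system reliability to the reliability of one component. For independent components it equals the system reliability with the component perfect minus the system reliability with the component failed. The function names and values are hypothetical.

```python
from math import prod

def birnbaum_importance(system_reliability, reliabilities):
    """Birnbaum importance of each component of a system.

    system_reliability: function mapping a list of component
                        reliabilities to the system reliability.
    """
    importances = []
    for i in range(len(reliabilities)):
        works = list(reliabilities)
        works[i] = 1.0   # component i is perfect
        fails = list(reliabilities)
        fails[i] = 0.0   # component i has failed
        importances.append(system_reliability(works) - system_reliability(fails))
    return importances

def series_system(reliabilities):
    return prod(reliabilities)

# Hypothetical three-block series system.
print(birnbaum_importance(series_system, [0.99, 0.95, 0.80]))
```

In a series system the least reliable block receives the highest importance, which matches the intuition that the weakest item should be improved first.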
The allocation process translates overall system performance into subsystem- and component-level requirements. The process of assigning reliability requirements to individual components so as to attain the specified system reliability is called reliability allocation [
Well-balanced usually refers to approximate relative equality of development time, difficulty, and risk or to the minimization of overall development cost.
From a mathematical point of view, the reliability allocation problem is a nonlinear programming problem. It is formulated as follows [
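A simple allocation rule is the ARINC apportionment technique, which the case study applies later. The sketch below shows its standard constant-failure-rate form for subsystems in series: each subsystem receives a share of the target system failure rate in proportion to its current failure rate. The function name and numbers are hypothetical, and the Weibull-based variant used in the case study is not reproduced here.

```python
def arinc_allocation(current_rates, target_system_rate):
    """Apportion a target system failure rate among series subsystems.

    Returns (weights, allocated_rates), where each weight is the
    subsystem's share of the current total failure rate.
    """
    total = sum(current_rates)
    if total <= 0:
        raise ValueError("current failure rates must sum to a positive value")
    weights = [rate / total for rate in current_rates]
    allocated = [w * target_system_rate for w in weights]
    return weights, allocated

# Hypothetical subsystem failure rates (per hour) and system target.
weights, allocated = arinc_allocation([1.0e-4, 3.0e-4, 6.0e-4], 5.0e-4)
print(weights)
print(allocated)
```

The allocated rates always sum to the target, so the series system meets the requirement if every subsystem meets its own.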
Maximize
Since the research done by [
Uncertainty ranges are derived to demonstrate the confidence in the obtained results. There are various sources of input and model uncertainty in the calculations and results. These include approximations, assumptions, sampling errors, the selection of probability distribution functions, and the models used for estimating statistical parameters and for the simulation process. Methods for the estimation of input uncertainty include maximum likelihood estimation, Bayesian updating, and maximum entropy. Propagation of uncertainty also affects the results. Several methods exist for uncertainty propagation, including Monte Carlo simulation, the response surface method, the method of moments, and bootstrap sampling [
The confidence interval method is used to present the uncertainty of the estimated results. In this method, bounds with an acceptable confidence level are associated with the estimated response variable. The confidence bounds are calculated by the Fisher matrix approach on censored data [
reducing the complexity of the system;
using highly reliable components through component improvement programs;
using structural redundancy;
putting in practice a planned maintenance, repair schedule, and replacement policy;
decreasing the downtime by reducing delays in performing the repair. This can be achieved by optimal allocation of spares, choosing an optimal repair crew size and so forth.
In addition, the use of burn-in procedures may also enhance system reliability by eliminating early failures in the field for components having high infant mortality [
In the final step, according to the estimated results, the reliability of the system is optimized by increasing the quality of critical components and by evaluating design alternatives. The term design alternative refers to a combination of components (or candidate solutions) which forms a design. In this method, design alternatives are used for reliability improvement by eliminating available components and selecting the optimal combination of components.
Horizontal drilling equipment in the reverse engineering stage is considered as a case study for evaluating the present method. Only limited failure and maintenance data for this system are available to the design group. The horizontal drilling equipment is a repairable complex system with more than 4000 components, only some of which are repairable. Also, this system has several configurations in its design, such as series, parallel, load-sharing, and complex systems [
In this research, correction factor is considered in failure data collection. As an example, corrected failure rate value for an electromechanical relay that is used in this case study is (see more details for other components in [
In the modelling of this system, Weibull and exponential distributions [
In the previous works [
Figure
Decomposition of horizontal drilling equipment [
As mentioned earlier in the modelling of the system, Weibull and exponential distributions are used here because of their capability for modelling components reliability in different phases of lifecycle. Thus, all reliability parameters are calculated for these distributions.
As shown in the process flowchart (Figure
Horizontal drilling equipment has five types of RBD structures in its design including series, parallel,
The reliability of the horizontal drilling system and its subsystems is estimated by the selection of Weibull distribution (Table
The reliability value of subsystems with Weibull distribution.
Subsystem/operational time (hr)  Frame  Cab  Engine  Hydraulic  Rod loader  Vise  Control and electrical  Water pump  The whole system
[Numeric entries of this table are missing from the source; only the ≈0 values in its last rows survive.]
The reliability value of subsystems with Weibull distribution.
Subsystem/operational time (hr)  Frame  Cab  Engine  Hydraulic  Rod loader  Vise  Control and electrical  Water pump  The whole system
[Numeric entries of this table are missing from the source; only the ≈0 values in its last rows survive.]
According to Tables
Figure
Measuring reliability importance for all subsystems at 1000 operation hours.
In this research, ARINC technique is used to estimate the results of reliability allocation. Table
Initial reliability and target reliability for subsystems of drilling equipment with Weibull distribution.
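Subsystem  Reliability importance  Initial reliability  Weighting factors  Target reliability
[Numeric entries of this table are missing from the source; subsystem rows are listed by name only.]
Frame
Cab
Engine
Hydraulic
Rod loader
Vise
Control and electrical
Water pump
The whole system  —  (missing)  —  (missing)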
In a repairable system, because of the renewal process in the components, the system reliability value is not a good metric for decision making about the system lifecycle. Therefore, the availability measure is used as a combination of reliability and maintainability parameters [
Simulation results for estimating availability features of horizontal drilling system.
Feature  Value 

Mean availability time (all events)  0.951408 
Point availability (all events) at 32000  0.938 
Expected number of failures  211.498 
MTTFF (hr)  766.550264 
Uptime (hr)  30445.05127 
Total downtime (hr)  1554.948732 
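The mean availability in the table can be cross-checked from the reported uptime and downtime. The short sketch below uses only values from the table and assumes that mean availability is the fraction of the simulated period spent in the up state; the function name is illustrative.

```python
def mean_availability(uptime_hr, downtime_hr):
    """Fraction of the total time during which the system is up."""
    return uptime_hr / (uptime_hr + downtime_hr)

uptime_hr = 30445.05127      # from the table
downtime_hr = 1554.948732    # from the table

print(round(uptime_hr + downtime_hr))                        # 32000
print(round(mean_availability(uptime_hr, downtime_hr), 6))   # 0.951408
```

The uptime and downtime add up to the 32000 operation hours of the simulation, and their ratio reproduces the tabulated mean availability.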
Figure
Boundary intervals for mean availability time function at 32000 operation hours.
If additional reliability improvement is required, either higher quality components are selected or the design configuration is changed, that is, redundancy is added at the weak reliability points. Design alternatives are used here for improving the reliability of drilling equipment. Figure
Combined failure rates for final design alternatives.
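Component  Failure rate (×10^{−6})  Component  Failure rate (×10^{−6})  Combined failure rate for final design (×10^{−6})
Inductive drive motor  6.6  Hydraulic pump  34.1  226
Inductive drive motor  6.6  Electrical pump  34.0  226
Inductive drive motor  6.6  Pneumatic pump  25.8  171
Inductive drive motor  6.6  Vacuum pump  45.4  (missing)
Diesel drive motor  (missing)  Hydraulic pump  34.1  (missing)
Diesel drive motor  (missing)  Electrical pump  34.0  (missing)
Diesel drive motor  (missing)  Pneumatic pump  25.8  (missing)
Diesel drive motor  (missing)  Vacuum pump  45.4  (missing)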
Water pump subsystem.
According to the results of Table
For the validation of the presented methodology, a benchmarking study was carried out against the available results of a similar project, copper mining dump trucks [
The case study of dump truck had plenty of field reliability and maintenance data. Table
Comparison of drilling equipment and dump truck reliability value.
Time (hours)  Reliability of drilling equipment  Reliability of dump truck 

0  1  1 
50  0.7  0.55 
100  0.49  0.26 
200  0.24  0.07 
500  0.029  0.001 
1000  0.001  ≈0 
In this research, a design for reliability methodology was developed for evaluating the performance of electromechanical systems. It overcomes the drawbacks of other reliability evaluation approaches, which are not suitable for complex systems with limited failure data. The method is applicable in the early design phase, and the reliability of a complex system in the reverse engineering design phase can be evaluated with it. The main steps of the approach were presented and its application was demonstrated with drilling equipment as a case study. The availability analysis indicates that the mean availability of the drilling equipment is 95.1% at 32000 operation hours. The reliability importance analysis shows that the hydraulic and motor subsystems are the critical elements from a reliability point of view. In addition, among all components of the system, the motor starter has the highest failure rate and reliability importance. The reliability of the system is improved by increasing the quality of components in the subsystems or by changing the design (e.g., redundancy). Finally, a benchmark study of the results of this research against a similar project shows the effectiveness of the presented method.
Reliability block diagram
First-order reliability method
Second-order reliability method
Failure mode, mechanism, and effect analysis
Reliability index approach
Performance measure approach
Markov chain Monte Carlo
Cumulative density function
Cumulative intensity function
Probability density function
Cumulative distribution function
Time to first failure
Mean time to failure
Mean time between maintenance actions
Mean downtime
Single pole single throw
Identical and independent distribution
Generalized renewal process
Nonhomogeneous Poisson process
Homogeneous Poisson process
Renewal process
Failure mode and effect analysis
Event tree analysis
Fault tree analysis
Monte Carlo
Early design reliability prediction method
The authors declare that there is no conflict of interests regarding the publication of this paper.