A Simulation Model for Machine Efficiency Improvement Using Reliability Centered Maintenance: Case Study of Semiconductor Factory

The purpose of this study was to increase product quality by improving machine efficiency. The principle of reliability centered maintenance (RCM) was applied to increase machine reliability. The objective was to create a preventive maintenance plan under the reliability centered maintenance method and to reduce defects. The study target was to reduce the Lead PPM of a test machine by simulating the proposed preventive maintenance plan. A simulation optimization approach based on evolutionary algorithms was employed in the preventive maintenance technique selection process to select the PM interval that gave the best total cost and Lead PPM values. The research methodology included prioritizing the critical components of the test machine, analyzing the damage and risk level using Failure Mode and Effects Analysis (FMEA), calculating a suitable replacement period through reliability estimation, and optimizing the preventive maintenance plan. The results show that the Lead PPM of the test machine can be reduced. The cost of preventive maintenance and the cost of lost product decreased, while the cost of good product increased.


Introduction
As competition among businesses and industries intensifies, the battle for survival has become harder. Many businesses and industries have to improve their performance in order to remain in the competitive world. The semiconductor industry is one that requires effective production planning to improve the production process and become more competitive. Reliability and maintainability play a crucial role in ensuring successful operation of production processes because they can be used to determine production availability and to increase quality. In addition, maintenance policy plays an important role in increasing operational effectiveness at minimum cost [1]. Reliability centered maintenance (RCM) is a widely accepted methodology that has been available in industry for over 30 years and has proved to offer an efficient strategy for preventive maintenance optimization [2].
Designing a preventive maintenance schedule for a production line is not always easy. Predicting the outcome of a schedule without reliable reasoning or evidence creates uncertainty for the design engineers. New technological developments have made it possible to use simulation models to test the performance of manufacturing process lines even before they exist, and to define and implement the scheduling plan [3]. The concept of simulation is to imitate the real system, use the model to simulate different conditions, and study the effects in order to evaluate possible strategies for improving the current system. A simulation model can present the results and effects of various conditions according to the assumptions made in its testing stage. The results help the analyst understand the transient stage of the system and predict the effects that occur when the condition(s) of the system change [4]. This study considered the case of a test machine that caused defects (i.e., Lead defects) in a semiconductor factory. According to the historical corrective maintenance data, various components of the test machine, such as the test socket, inserter, and Lead pusher, deteriorated over time and affected the Lead quality of products. To stabilize the test machine and reduce defects in the process, an effective preventive maintenance plan was required. Thus, this study aimed to establish a preventive maintenance plan based on the reliability data of the test machine and applied discrete event simulation to select the preventive maintenance intervals that gave the best performance values.

Identify Components and Analyze Failure Modes and Effects of Each Component.
Failure analysis of the machine's mechanical and change of kit (COK) components reveals the impact of each type of failure on the Lead quality. Failure mode and effects analysis (FMEA) was carried out on the test machine components under study to evaluate the various failure modes of each component. After brainstorming with test machine experts, the FMEA worksheet was obtained. The worksheet defines what can fail, the way it can fail (failure mode), and the effect of each failure mode on the components. Severity was ranked according to the seriousness of the failure mode's effect on product quality, as defined in Table 1. Occurrence was scored according to the likely failure rate, as defined in Table 2.
Detection is an assessment of the ability of a method to detect the failure of the component, as defined in Table 3. The risk priority number (RPN) was computed for each failure mode identified in the study, as shown in Table 4.
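As an illustration of the ranking step, the sketch below computes the risk priority number using the conventional FMEA product of the severity, occurrence, and detection scores. The 1 to 10 scales, the failure mode names, and the scores are hypothetical and are not taken from Table 4; the class name `FailureMode` is likewise introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of an FMEA worksheet (illustrative structure only)."""
    component: str
    mode: str
    severity: int    # assumed 1-10 scale
    occurrence: int  # assumed 1-10 scale
    detection: int   # assumed 1-10 scale

    @property
    def rpn(self) -> int:
        # Conventional FMEA definition: RPN = severity * occurrence * detection
        return self.severity * self.occurrence * self.detection


# Hypothetical scores for illustration; not the study's data.
worksheet = [
    FailureMode("test socket", "worn contact", 7, 5, 4),
    FailureMode("inserter", "misalignment", 6, 3, 5),
    FailureMode("Lead pusher", "deformed tip", 8, 4, 3),
]

# Rank failure modes from highest to lowest RPN.
ranked = sorted(worksheet, key=lambda fm: fm.rpn, reverse=True)
for fm in ranked:
    print(f"{fm.component:12s} {fm.mode:14s} RPN={fm.rpn}")
```

Sorting by RPN gives the order in which failure modes would be passed to the prioritization step.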

Analyze Priority of Components.
A Pareto chart is a graphical tool for ranking the causes of problems from the most significant to the least significant. The 80-20 rule was applied to identify the most critical failure components of the test machine under study. The cumulative percentages of occurrence of the RPN values obtained for the various failure modes of the components were considered in the failure analysis of the test machine, as displayed in Table 5. The Pareto chart constructed for the RPN values of the failure modes that cause Lead defects is presented in Figure 1.
Based on the 80-20 rule, the critical failure modes of components that caused complete or partial failure of Lead quality were considered in the reliability evaluation of the test machine.
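The selection of critical failure modes by cumulative RPN share can be sketched as follows. This is a minimal example under the assumption that modes are added in descending RPN order until the cumulative share reaches 80%. The function name `pareto_critical` and the RPN values are hypothetical.

```python
def pareto_critical(rpn_by_mode, threshold=0.80):
    """Return (name, rpn, cumulative_share) for the modes that together
    account for at least `threshold` of the total RPN."""
    total = sum(rpn_by_mode.values())
    ranked = sorted(rpn_by_mode.items(), key=lambda kv: kv[1], reverse=True)
    critical = []
    cumulative = 0
    for name, rpn in ranked:
        if cumulative >= threshold * total:
            break  # the 80% share is already covered
        cumulative += rpn
        critical.append((name, rpn, cumulative / total))
    return critical


# Hypothetical RPN values for illustration; not the study's data.
rpn_values = {
    "socket contact wear": 140,
    "pusher tip deformation": 96,
    "inserter misalignment": 90,
    "sensor drift": 40,
    "loose fastener": 24,
    "label error": 10,
}

for name, rpn, share in pareto_critical(rpn_values):
    print(f"{name:24s} RPN={rpn:4d} cumulative={share:.1%}")
```

With these illustrative values, three of the six failure modes cover just over 80% of the total RPN and would be carried into the reliability evaluation.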

Collect and Analyze System Data.
Statistical software automatically chose appropriate continuous distributions to fit to the input data, calculated maximum likelihood estimates for those distributions, tested the results for goodness of fit, and displayed the distributions in order of their relative rank, as shown in Figure 2. The relative rank was determined by an empirical method that uses effective goodness-of-fit calculations. While a good rank usually indicates that the fitted distribution is a good representation of the input data, an absolute indication of the goodness of fit is also given.
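The fit-and-rank procedure can be illustrated with a small standard-library sketch. It is not the statistical software used in the study: it fits only three candidate distributions by their closed-form maximum likelihood estimates and ranks them by the Kolmogorov-Smirnov statistic, which is an assumed stand-in for the software's ranking method. The data values and function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def fit_candidates(data):
    """Fit candidate distributions by maximum likelihood; return their CDFs."""
    mean = fmean(data)
    logs = [math.log(x) for x in data]
    normal = NormalDist(mean, pstdev(data))          # MLE uses population std
    log_normal = NormalDist(fmean(logs), pstdev(logs))
    return {
        "exponential": lambda x: 1.0 - math.exp(-x / mean),
        "normal": normal.cdf,
        "lognormal": lambda x: log_normal.cdf(math.log(x)),
    }


def rank_fits(data):
    """Return (distribution name, KS statistic), best fit first."""
    scores = [(name, ks_statistic(data, cdf))
              for name, cdf in fit_candidates(data).items()]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical cycles-to-failure observations; not the study's data.
ttf_cycles = [26100, 27400, 29150, 30500, 31200,
              33800, 35100, 36900, 39400, 42000]

for name, d in rank_fits(ttf_cycles):
    print(f"{name:12s} KS={d:.3f}")
```

A smaller KS statistic indicates a closer match between the fitted distribution and the data, so the first entry plays the role of the top-ranked distribution.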

Build Reliability Centered Maintenance Model.
To clarify and analyze the logic of the simulation model, a flowchart describing the logic of the reliability centered maintenance (RCM) model of the test machine is presented in Figure 3. The overall scenarios of the RCM model were summarized in Table 6 before the simulation model was created. Once a unit arrives at the test machine, the simulation model checks whether the machine component has reached its preventive maintenance (PM) plan. If so, the unit is blocked and waits until PM is completed. If the component has not reached its PM plan, a second condition checks whether its cycle count is still below the time-to-failure (TTF). If so, the tested unit becomes a good unit; otherwise, it becomes a lost unit. A third condition checks whether the test machine has produced 13 consecutive lost units. If so, the machine is stopped for corrective maintenance (CM).
Detection evaluation criteria (fragment): Very high, rank 1: Discrepant parts cannot be made because the item has been error-proofed by process/product design. Source: [5].
The following assumptions were made in the simulation model of the RCM model.
(1) Since the reliability centered maintenance model determines preventive maintenance schedules from the reliability of each component, the simulation model assumes that a component has 100% reliability when the simulation starts and that its reliability decreases over processing cycles.
(2) One repair technician is continuously available.
(3) Once a repair action begins on a component, it is fully completed without interruption.
(4) Once the repair is completed, the component resumes operation with 100% reliability.
(5) Once a component has failed, the test machine can still run, but its output is produced as defective.
(6) Units arrive at a constant rate and feed the tester without idling.
The simulation model was run for 1 year (365 days). Once the simulation run was completed and the results were obtained, the report allowed the user to see how the machine utilization, the component reliability, and the cost had changed. The complete simulation model for PM scheduling optimization is shown in Figure 4.
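The three conditions of the model logic and the assumptions listed above can be sketched in Python as follows. This is a minimal illustration rather than the simulation model of the study: it tracks a single component, counts cycles instead of clock time, and ignores the duration and cost of PM and CM. The names `simulate_component` and `draw_ttf` are hypothetical.

```python
def simulate_component(n_units, pm_interval, draw_ttf, cm_trigger=13):
    """Process `n_units` through one component and count the outcomes.

    draw_ttf: callable returning the cycles-to-failure of a fresh component.
    cm_trigger: consecutive lost units that stop the machine for CM.
    """
    cycles = 0              # cycles since the last PM or CM (100% reliability at 0)
    ttf = draw_ttf()
    consecutive_lost = 0
    stats = {"good": 0, "lost": 0, "pm": 0, "cm": 0}

    for _ in range(n_units):
        # Condition 1: the PM plan is reached, so the unit waits for PM.
        if cycles >= pm_interval:
            stats["pm"] += 1
            cycles, ttf, consecutive_lost = 0, draw_ttf(), 0

        # Condition 2: below TTF the unit is good, otherwise it is lost.
        if cycles < ttf:
            stats["good"] += 1
            consecutive_lost = 0
        else:
            stats["lost"] += 1
            consecutive_lost += 1
        cycles += 1

        # Condition 3: a run of lost units triggers corrective maintenance.
        if consecutive_lost >= cm_trigger:
            stats["cm"] += 1
            cycles, ttf, consecutive_lost = 0, draw_ttf(), 0

    return stats


# Illustrative run with a constant, hypothetical TTF of 100 cycles.
def constant_ttf():
    return 100

print(simulate_component(1130, pm_interval=200, draw_ttf=constant_ttf))
print(simulate_component(1000, pm_interval=80, draw_ttf=constant_ttf))
```

In the first call the PM interval exceeds the TTF, so the component fails before PM and lost units appear until CM restores it. In the second call PM is performed before failure and no units are lost. Replacing `constant_ttf` with a sampler from a fitted distribution would make the run stochastic.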

Verify and Validate System Model.
During implementation, the simulation model was verified with test runs to ensure that it had been implemented with the correct algorithms and used the intended data at the intended points in time. Moreover, a debugger tool was used to point out programming errors when they appeared. Counters were also inserted throughout the model for local result measurement. In this way, mistakes could be found and corrected.
Before the experiments, the executable model was configured with all parameters. It was then validated using hypothesis testing. The model was considered valid if it did not produce statistically different values from the real system for two selected performance measures: throughput and Lead ppm. The results are shown in Table 7. To determine statistical significance, $\alpha = 0.05$ was set and the following hypothesis test was performed: $H_0: \mu_r - \mu_s = 0$ versus $H_1: \mu_r - \mu_s \neq 0$. The subscripts $r$ and $s$ denote the real system data and the simulation results, respectively. A confidence interval was computed for comparing the two systems. This approach does not require that the two populations have equal variances. With 95% confidence, the conclusion is that there is no significant difference between the throughput and Lead ppm of the two systems (real system and simulation model), given that the confidence interval includes zero. While throughput and Lead ppm are two prominent performance measures for analyzing manufacturing systems, other performance measures can be tested in a similar manner.
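For reference, a confidence interval for the difference of two means that does not assume equal variances is commonly constructed with Welch's approach. The source does not state the exact formula used, so the following is a sketch under that assumption. Here the subscripts $r$ and $s$ stand for the real system and the simulation (letters chosen for this example), and $\bar{x}$, $s^2$, and $n$ denote the sample mean, sample variance, and number of observations of each.

```latex
\[
(\bar{x}_r - \bar{x}_s) \;\pm\; t_{\hat{\nu},\,1-\alpha/2}
\sqrt{\frac{s_r^2}{n_r} + \frac{s_s^2}{n_s}},
\qquad
\hat{\nu} =
\frac{\left(\dfrac{s_r^2}{n_r} + \dfrac{s_s^2}{n_s}\right)^{2}}
     {\dfrac{\left(s_r^2/n_r\right)^{2}}{n_r - 1}
      + \dfrac{\left(s_s^2/n_s\right)^{2}}{n_s - 1}}
\]
```

If the resulting interval contains zero, the null hypothesis of equal means is not rejected at level $\alpha$, which corresponds to the validation criterion described above.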

Analyze Output of System.
To analyze the output of the simulation model, the simulation results were examined in terms of total cost and Lead ppm, which were used to determine the most appropriate preventive maintenance scheduling scheme. The cost structure and Lead ppm formulation are shown in Tables 8 and 9. The cost rates were input into the computational simulation model. The real data were coded for the sake of industrial confidentiality.

Table 9: Lead ppm for performance measures.
Performance measure: Lead PPM. Formulation: Lead PPM = (Total Lost Units / Total Units Produced) × 10^6.

The optimization could be executed considering the total cost of the system and the Lead ppm. For this case study, the objective of optimization was to find the optimal preventive maintenance interval that maximized the total cost, minimized Lead ppm, and maximized the number of preventive maintenance intervals.
To do this, the simulation model sought optimal values for the decision variables (i.e., the preventive maintenance interval of each component). The solutions were generated by varying the values of the decision variables according to their data type, lower bounds, and upper bounds.
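The two performance measures can be written as small helper functions. The Lead PPM formula follows the formulation given in the text. The net form of the total cost (cost of good product minus PM cost minus lost product cost) is an inference from the figures reported in the Conclusions, since the cost structure table is not reproduced here; it would also explain why the total cost is maximized. Function names are hypothetical.

```python
def lead_ppm(total_lost_units, total_units_produced):
    """Lead PPM = (total lost units / total units produced) x 10^6."""
    if total_units_produced <= 0:
        raise ValueError("total_units_produced must be positive")
    return total_lost_units * 1_000_000 / total_units_produced


def net_total_cost(good_product, preventive_maintenance, lost_product):
    """Assumed structure: value of good product minus PM and lost-product costs."""
    return good_product - preventive_maintenance - lost_product


# Figures (in Baht) reported for the current system in the Conclusions.
current_total = net_total_cost(2_616_120, 117_095, 5_421)
print(current_total)

# Illustrative unit counts only: 15 lost units per million produced.
print(lead_ppm(15, 1_000_000))
```

Under this assumed structure the reported component costs reproduce the reported total cost of the current system.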

Findings of the Study
3.1. The Optimal Preventive Maintenance Interval.
To solve the optimization problem, the optimization software generated solutions by varying the values of the decision variables according to their data type, lower bounds, and upper bounds. After the decision variables were selected, an objective function was defined to measure the utility of the solutions tested by the optimization software. The PM intervals were defined as the decision variables. The upper bound was obtained from the current PM interval of each component, as shown in Table 10. For example, the current PM interval of the test socket was 6 months or 43,200 cycles, so the upper bound of the test socket in the simulation model was set to 43,200 cycles.
In contrast, the lower bound of each component was set from its minimum time-to-failure value, as shown in Table 11. For example, the minimum time-to-failure value of the test socket was 3.59 months or 25,871 cycles, so the lower bound of the test socket in the simulation model was set to 3 months or 21,600 cycles. In this study, five optimization strategies were proposed, each defining a set of test machine components and a multiterm objective function. The purpose of these strategies was to cover all scenarios that could affect the total cost, Lead PPM, and PM interval of each component. The details of each strategy are provided in Table 12.
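To illustrate how an evolutionary search explores a PM interval between its bounds, the sketch below uses a simple (1+1) evolution strategy. This technique is swapped in for illustration and is not the algorithm of the optimization software used in the study. The bounds are the test socket values from the text, while the objective function, its cost coefficients, and all names are hypothetical stand-ins for a simulation run.

```python
import random

LOWER, UPPER = 21_600, 43_200  # test socket bounds in cycles, from the text


def toy_objective(pm_interval):
    """Hypothetical stand-in for one simulation run; higher is better."""
    annual_cycles = 86_400   # 43,200 cycles per 6 months, per the text
    min_ttf = 25_871         # minimum TTF of the test socket, per the text
    pm_cost = 1_000 * annual_cycles / pm_interval    # assumed cost per PM
    lost_cost = 1.0 * max(0, pm_interval - min_ttf)  # assumed penalty past TTF
    return -(pm_cost + lost_cost)


def evolve(objective, lower, upper, generations=200, step=2_000, seed=0):
    """(1+1) evolution strategy over an integer decision variable."""
    rng = random.Random(seed)
    best = upper                    # start from the current PM interval
    best_score = objective(best)
    for _ in range(generations):
        # Mutate the parent and clamp the child to the bounds.
        candidate = round(best + rng.gauss(0, step))
        candidate = min(upper, max(lower, candidate))
        score = objective(candidate)
        if score >= best_score:     # keep the child only if it is no worse
            best, best_score = candidate, score
    return best, best_score


best_interval, best_score = evolve(toy_objective, LOWER, UPPER)
print(best_interval, round(best_score, 1))
```

Because the parent is replaced only by a candidate that scores at least as well, the returned interval is never worse than the starting (current) interval under the chosen objective. In practice each objective evaluation would be one or more replications of the simulation model.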
Table 13 and Figure 5 show the current and proposed PM interval for each critical component.The proposed PM interval resulted in a lower number of cycles than the current PM interval.

Comparison of Simulation Results.
After running the simulation with 10 replications (i.e., 10 years) using the proposed PM interval, the test machine performance of the current and proposed models was compared as follows. Lead ppm decreased from 1,087 ppm to 15 ppm, a reduction of 98.6%. The total cost of the system increased by 59,229 Baht. Furthermore, the cost of preventive maintenance decreased by 49,059 Baht, the cost of good product increased by 4,827 Baht, and the cost of lost product decreased by 5,343 Baht. The total cost of each strategy compared with the current system is presented in Table 14. Figure 6 illustrates the comparison of results.

Conclusions
The objective of the study was to create a preventive maintenance plan under the reliability centered maintenance method and to reduce the defects of the TS056 package occurring during the TMP process. The critical components of the test machine were examined as the case study, where the machine behavior and outcomes were obtained using a ProModel-based simulation model. A simulation optimization approach based on evolutionary algorithms was employed in the preventive maintenance technique selection process to select the PM interval that gave the best total cost and Lead PPM values. Five distinct optimization strategies were identified, and their effects on the performance measures were described. According to the results of the study, optimization strategy 1 provided the highest total cost and the lowest Lead PPM for the case study. The total cost increased from 2,493,604 Baht to 2,552,833 Baht (an increase of 59,229 Baht). Lead PPM was reduced from 1,087 ppm to 15 ppm, a decrease of 98.6 percent. Furthermore, the cost of preventive maintenance decreased from 117,095 Baht to 68,036 Baht (42 percent), the cost of good product increased from 2,616,120 Baht to 2,620,947 Baht (0.2 percent), and the cost of lost product decreased from 5,421 Baht to 78 Baht (98.6 percent). Some important conclusions from the study are as follows.
(1) To improve equipment reliability, the critical components required immediate attention to quantitatively evaluate reliability centered maintenance based on essential historical data.The study showed that simulation technique could be used as a computer-aided solving tool in reliability engineering area.It assists in decision making regarding maintenance and Lead defect reduction.
(2) The critical components were selected based on the risk priority number (RPN) from failure mode and effects analysis (FMEA). The time-to-failure (TTF) and time-to-repair (TTR) of each critical component were collected from the maintenance reports, failure observations, and daily reports prior to creating the simulation model.
(3) Simulation model was used for the entire process to define characteristics of components and to imitate the machine behavior under different preventive maintenance intervals and different reliability constraints.Total cost and Lead ppm were evaluated to obtain the most suitable preventive maintenance schedule for the case study.

Figure 2: Various distributions fitted to the input data.

Figure 4: Complete simulation model for PM scheduling optimization.

Figure 5: Current and proposed PM interval for each critical component.

Figure 6: Comparison of total cost of current system and five strategies.

Table 3: Detection evaluation criteria. Control is based on variable gauging after parts have left the station, or go/no go gauging performed on 100% of the parts after parts have left the station.

Table 4: FMEA worksheet for test machine components.

Table 5: Cumulative percentage of occurrence of RPN values.

Table 6: Overall scenarios of RCM model.

Table 8: Cost structure for performance measures.

Table 10: Upper bound setting for each component.

Table 11: Lower bound setting for each component.

Table 12: Overall scenarios of RCM model.