Data Mining for Material Feeding Optimization of Printed Circuit Board Template Production

Improving the accuracy of material feeding for printed circuit board (PCB) template orders can reduce the overall cost for factories. In this paper, a data mining approach based on multivariate boxplot, multiple structural change model (MSCM), neighborhood component feature selection (NCFS), and artificial neural networks (ANN) was developed for scrap rate prediction and material feeding optimization. Scrap rate related variables were specified, and 30,117 order samples were exported from a PCB template production company. A multivariate boxplot was developed for outlier detection. MSCM was employed to explore the structural change of the samples, which were finally partitioned into six groups. NCFS and ANN were utilized to select scrap rate related features and to construct prediction models for each group of the samples, respectively. The performance of the proposed model was compared to manual feeding and to ANN alone; the results indicate that the approach is clearly superior to the other two methods, reducing the surplus rate and the supplemental feeding rate simultaneously and thereby reducing the comprehensive cost of raw material, production, logistics, inventory, disposal, and delivery tardiness compensation.


Introduction
Printed circuit boards (PCBs) are found in practically all electrical and electronic equipment (EEE) and form the base of the electronics industry [1]. Due to increased competition and market volatility, demand for highly individualized products promotes a rapid growth of PCB orders designed with many specialized features but short delivery times [2, 3]. Factories with many PCB template orders therefore rely on customer-oriented small batch production, which differs from mass production and poses serious challenges to companies. Optimization of material feeding is one of the critical problems.
The scrap rate and material feeding area of each PCB template order are difficult to determine accurately in advance of production. In practice, many factories feed material manually, relying heavily on experience and knowledge, and therefore suffer violent fluctuations in both the surplus rate and the supplemental feeding rate. Individualized surplus template products can only be placed in inventory or directly destroyed, while frequent supplemental feeding brings additional production cost and delivery tardiness compensation. This motivates us to explore the pattern of historical orders through a data mining (DM) approach, so that material feeding for the orders can be determined more reasonably and automatically, reducing the comprehensive cost caused by excessive or underestimated material feeding before production.
The general process of DM, also known as knowledge discovery in databases (KDD), includes problem clarification, data preparation, preprocessing, DM in the narrow sense, and interpretation and evaluation of results [4]. DM in the narrow sense, as a step in the KDD process, consists of applying data analysis and particular discovery algorithms within an acceptable computational efficiency limit [4]. DM tasks can be classified into two groups, descriptive and predictive [4, 5]. The descriptive function of DM mainly aims to explore the potential or hidden rules, characteristics, and relationships (dependency, similarity, etc.) that exist in the data, through tasks such as generalization, association, sequence pattern mining, and clustering [4-8]. The predictive functions of DM are usually selected to analyze the relevant trends or laws of the data in order to predict the future state; they include classification, prediction, and time series analysis [4-8]. To achieve these goals, DM solutions employ a wide variety of enabling techniques and specific mining techniques to both predict and describe interpretable and valuable information [4, 5, 8-10]. The enabling techniques mainly refer to the methods for data cleaning, data integration, data transformation, and data reduction that support the implementation of DM in the narrow sense, while specific mining techniques, such as regression, support vector machine, and artificial neural network (ANN), are the approaches used to extract useful knowledge from massive data [4, 9]. The scrap rate prediction and material feeding optimization of PCB template production can be taken as an application of the predictive function of DM; the specification of scrap rate related features, the identification of features that affect the scrap rate significantly, and the related mining techniques should be carefully studied. Moreover, according to empirical knowledge, many features (e.g., required panel) have a structural change influence on the scrap rate, and therefore the enabling and mining techniques, interpretation, and evaluation steps in DM should be adjusted accordingly.
The details of enabling techniques, DM applications for different manufacturing tasks and industries, patterns in the use of specific mining techniques, application performance, and software used in these applications have been widely studied, and one can refer to [4-10] for comprehensive reviews. Electronic product manufacturing industries have also exploited several DM methods for summarization, clustering, association, classification, prediction, and so on [5, 8], and many of these studies are closely related to the PCB manufacturing industry. Tseng et al. employed Kohonen neural networks, decision tree (DT), and multiple regression to improve the accuracy of work hours estimation based on PCB design data, and the performance clearly exceeds the conventional method of regression equations [11]. Tsai et al. developed three hybrid approaches, namely, ANN-genetic algorithm (GA), fuzzy logic-Taguchi, and regression analysis-response surface methodology (RSM), to predict two responses (volume and centroid offset) and optimize parameters for micro ball grid array (BGA) packages during the stencil printing process (SPP) for component assembly on PCB; the confirmation experiments show that the proposed fuzzy logic-based Taguchi method outperforms the other two methods in terms of the signal-to-noise ratios and process capability index [12, 13]. Some other approaches, like support vector regression (SVR) and mixed-integer linear programming, have also been developed for the parameter optimization of SPP [14]. Haneda et al. [15] employed variable cluster analysis and the k-means approach to help engineers determine appropriate drilling conditions and parameters for PCB manufacturing. DM-based defect (fault) diagnosis and quality control during manufacturing have also been widely studied [5, 7], and many algorithms, like adaptive genetic algorithm-ANN [16] and DT [17], have been developed for the defect (fault) diagnosis of PCB manufacturing.
Marketing and sales is another widely investigated direction of DM application in the PCB industry. Success in forecasting and analyzing sales for given goods or services can mean the difference between profit and loss for an accounting period [18]. Many DM-based methods, like k-means clustering combined with fuzzy neural network, fuzzy case-based reasoning, and weighted evolving fuzzy neural network, have been developed to select a combination of key factors that have the greatest influence on PCB marketing and then forecast future PCB sales. Tavakkoli et al. [22] combined SVR, the Bat metaheuristic, and the Taguchi method to predict future PCB sales, and the performance comparison indicates that the accuracy of the proposed hybrid model is better than that of GA-SVR, particle swarm optimization-SVR, and classical SVR. Hadavandi et al. hybridized fuzzy logic with GA and k-means to extract useful information patterns from sales data, and the results show that the proposed approach outperforms previous approaches [18].
However, the quality related research for PCB manufacturing mainly focuses on one operation of the manufacturing process for the purpose of yield improvement [12-15], and, to the best of our knowledge, there are few studies on material feeding optimization for PCB production using a DM mechanism. Moreover, the structural change of the studied problem and the corresponding change of relevant features have seldom been considered during the mining procedure. The ANN-based approach, a most frequently used DM method that is also employed in this study, exploits nonlinear patterns in different problems with reasonable results; however, a problem divided into subproblems according to structural change generally requires a different ANN architecture and different learned link weights based on different input features for each subproblem, which is difficult for an ANN to learn without reasonable preprocessing.
In this paper, a data mining approach (MSCM-ANN) is presented to establish the scrap rate prediction model and optimize material feeding of PCB orders considering the structural change influence, based on the use of multivariate boxplot, multiple structural change model (MSCM), neighborhood component feature selection (NCFS), and ANN. MSCM-ANN is compared to ANN and manual feeding to verify the performance of the proposed approach. The rest of the paper is organized as follows. In Section 2, variables specification and sample data are described. Methodology, including multivariate boxplot, MSCM, NCFS, ANN, and performance indicators, is presented in Section 3, followed by experimental results and discussion in Section 4. Lastly, conclusions are drawn in Section 5.

Variables and Sample
The data used in this study were collected from Guangzhou FastPrint Technology Co., Ltd. A total of 56 variables, inherited from the enterprise resource planning system and combined with derived variables, were selected and specified in Table 1, in which variables 40 to 56 are the statistical results of manual feeding adopted by FastPrint. The unit in a panel, required quantity/panel/area, and delivery unit area serve not only as statistical items but also as feature candidates for establishing the MSCM-ANN and ANN models.
Set and unit are two types of delivery unit, whereas panel is the production unit that will be partitioned into either set or unit before delivery depending on the requirement of customers. The relation between set, unit, and panel specified in Table 1 is illustrated in Figure 1, in which each panel consists of 10 units in the PCB order. Suppose the customer's required quantity and required panel of the PCB order given in Figure 1 are 90 units and 9 panels, respectively. If the initial feeding is 100 units (10 panels) but only 95 qualified units remain after production due to scrap (i.e., the scrap rate is $(100 - 95) \times a / (100 \times a) \times 100\% = 5\%$ in this example, where $a$ denotes the area of one unit and cancels out), then the surplus quantity is 5 units (= 95 − 90), and feeding 10 panels is therefore reasonable for keeping the redundancy of the customized order low. Conversely, feeding only 9 panels initially will result in supplemental feeding due to the scrap rate.

Table 1 (excerpt). Hquar: the Qualr for the same order number in the past 2 years; range 8.824%-100%. Note. New orders having no Hquar are replaced by the Qualr for orders having the same layer number and surface finishing operation during the past 2 years.
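The arithmetic of this example can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and not taken from the paper, and the scrap rate is computed on unit counts under the assumption that all units of an order have the same area.

```python
import math

def scrap_rate(fed_units: int, qualified_units: int) -> float:
    """Share of fed units that were scrapped (unit area cancels if all units are equal)."""
    return (fed_units - qualified_units) / fed_units

def required_panels_for(required_units: int, units_per_panel: int) -> int:
    """Required panels: required quantity rounded up to whole panels."""
    return math.ceil(required_units / units_per_panel)

# Example from the text: 10 units per panel, 90 units required, 10 panels fed.
rate = scrap_rate(fed_units=100, qualified_units=95)      # 5% scrap
surplus = 95 - 90                                         # 5 surplus units
required_panels = required_panels_for(90, 10)             # 9 panels

# Feeding only the 9 required panels at the same scrap rate falls short of 90 units,
# which is why supplemental feeding would be needed.
qualified_from_9 = 9 * 10 * (1 - rate)
needs_supplement = qualified_from_9 < 90
```

With a 5% scrap rate, 9 panels yield about 85.5 qualified units, so the order of 90 units cannot be fulfilled without a second feeding.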

Figure 1: Relation between panel, set, and unit.
On this basis, 30,117 samples of the orders placed between October 31, 2015, and October 31, 2016, were exported and carefully audited for erroneous and missing values. The number of orders for each required panel is illustrated in Figure 2. It can be seen that the required panel is less than 30 in most cases, which represents a typical customer-oriented small batch template production in the PCB industry.

Methodology
The main flow of the proposed approach (MSCM-ANN) is presented in Figure 3, and various aspects of MSCM-ANN are discussed in detail in the following subsections. The multivariate boxplot, MSCM-based partition, and neighborhood component feature selection are the enabling techniques of DM considering the structural change influence, and the ANN-based prediction model is the mining technique used to predict the scrap rate; the predicted scrap rate is then transformed into the surplus rate and the supplemental feeding rate. The performance of the proposed MSCM-ANN is compared to ANN and manual feeding based on the same 29,157 samples that remain after outlier detection on the original 30,117 samples. Some statistical results of manual feeding are given in Table 1 by variables 40 to 56. MSCM-ANN and ANN were implemented in Matlab version 2017a.

Figure 4: Description of the boxplot (minimum, lower quartile, median, upper quartile, maximum).

Multivariate Boxplot-Based Outlier Detection.
Identification of outliers and their consequent removal are part of the data screening process, which should be done routinely before analyses [23, 24]. There are various methods of outlier detection. Some are graphical, such as the normal probability plot, and others are model-based approaches which assume that the data are from a normal distribution [24]. The boxplot is a hybrid of these two mechanisms for exploring both symmetric and skewed quantitative data, and it can also identify infrequent values in categorical data. Figure 4 shows a description of the boxplot. One could define an outlier as any observation outside the range $[Q_1 - 1.5\,\mathrm{IQR},\ Q_2 + 1.5\,\mathrm{IQR}]$, where $Q_1$ and $Q_2$ denote the lower and upper quartiles and $\mathrm{IQR} = Q_2 - Q_1$ is the interquartile range, which contains 50% of the data. A value $x$ is a lower outlier if $x < Q_1 - 1.5\,\mathrm{IQR}$ and an upper outlier if $x > Q_2 + 1.5\,\mathrm{IQR}$. Detecting outlier samples according to the scrap rate, the target variable of the prediction here, can reduce the impact of accidents caused by machine breakdown, wrong operation, and so on. However, the scrap rate is influenced by multiple variables with structural change, so it cannot be guaranteed to follow a normal distribution. A modification of the boxplot, called multivariate boxplot here, is therefore developed to identify and discard outliers. The main procedure is described in Algorithm 1.
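Since Algorithm 1 is not reproduced in this text, the following is only a minimal sketch of one way a boxplot rule can be applied per feature value rather than to the pooled data. It assumes that samples are grouped by a label (for example, a combination of required panel and layer number) and that the 1.5 IQR rule is applied inside each group; the function names and the toy data are hypothetical.

```python
import numpy as np

def iqr_bounds(values, whisker=1.5):
    """Lower and upper boxplot fences from the lower and upper quartiles."""
    q_low, q_up = np.percentile(values, [25, 75])
    iqr = q_up - q_low
    return q_low - whisker * iqr, q_up + whisker * iqr

def grouped_outlier_mask(scrap, groups, whisker=1.5):
    """Flag scrap rates outside the fences of their own group (assumed grouping key)."""
    scrap = np.asarray(scrap, dtype=float)
    groups = np.asarray(groups)
    mask = np.zeros(scrap.shape, dtype=bool)
    for g in np.unique(groups):
        idx = groups == g
        low, high = iqr_bounds(scrap[idx], whisker)
        mask[idx] = (scrap[idx] < low) | (scrap[idx] > high)
    return mask

# Toy data: two groups with different typical scrap rates.
scrap = [0.02, 0.03, 0.04, 0.03, 0.60, 0.20, 0.22, 0.21, 0.19, 0.23]
groups = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
mask = grouped_outlier_mask(scrap, groups)
cleaned = np.asarray(scrap)[~mask]
```

In this toy data only the value 0.60 in the first group is flagged; the values around 0.2 are ordinary for the second group even though they would look extreme next to the first group's values, which is the motivation for computing the fences per group.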

Multiple Structural Change Model-Based Sample Partition.
The required panel has a significant influence on the scrap rate according to expert experience and initial analysis. The average scrap rate of the orders with different required panels is illustrated in Figure 5. The curve shows a declining tendency when the required panel is less than 9 but fluctuates strongly when the required panel is larger than 30. The multiple structural changes of the average scrap rate versus the required panel may require separate features and prediction models to improve the prediction accuracy.
MSCM was employed to explore multiple structural changes of the samples and to partition them. MSCM was initially developed by Bai and Perron [25, 26] to address the problem of online (time series related) multiple linear regression (MLR) with multiple structural changes over time. MSCM uses the least squares method to detect the number of break points and estimate the change positions. Here, the required panel sorted in ascending order is considered as the time series related data, and the scrap rate is taken as the regression objective. The MLR of the scrap rate with $m$ breaks ($m + 1$ regimes) can then be expressed as $y_t = x_t^{\top} \beta_j + u_t$, $t = T_{j-1} + 1, \ldots, T_j$, for $j = 1, \ldots, m + 1$, with the convention $T_0 = 0$ and $T_{m+1} = T$. In this model, $y_t$ is the observed scrap rate and $x_t$ is the vector of independent variables, of which only the layer number, required panel, and number of operations are considered here. $\beta_j$ ($j = 1, \ldots, m + 1$) is the corresponding vector of coefficients, and $u_t$ is the disturbance. The break points $T_1, \ldots, T_m$ are explicitly treated as unknowns. The purpose is to estimate the unknown coefficients together with the break points when $T$ observations (samples) on $(y_t, x_t)$ are available.
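To make the least squares idea concrete, the sketch below searches for a single break point by minimizing the total sum of squared residuals of two separately fitted regimes. It is a deliberately simplified illustration on synthetic data, not the Bai and Perron procedure (which handles several breaks with dynamic programming and hypothesis tests) and not the Matlab code used in the paper; the function names and the minimum regime size are assumptions.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals of an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def best_single_break(X, y, min_size):
    """Return the break position T1 (first regime = rows 0..T1-1) with minimal total SSR."""
    T = len(y)
    best_T1, best_total = None, np.inf
    for T1 in range(min_size, T - min_size + 1):
        total = ssr(X[:T1], y[:T1]) + ssr(X[T1:], y[T1:])
        if total < best_total:
            best_T1, best_total = T1, total
    return best_T1, best_total

# Synthetic ordered data with one regime change after the 25th observation.
rng = np.random.default_rng(0)
T = 60
x = np.linspace(0.0, 1.0, T)
X = np.column_stack([np.ones(T), x])            # intercept + one regressor
y = np.where(np.arange(T) < 25, 1.0 + 0.5 * x, 3.0 - 1.0 * x)
y = y + 0.01 * rng.standard_normal(T)           # small disturbance u_t

T1, total_ssr = best_single_break(X, y, min_size=5)
```

In the setting of the paper, the rows would be the samples sorted by required panel, and each estimated break position would translate into a boundary between groups of required panel values.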
Significance tests for the structural changes are conducted based on the statistics $\sup F_T(l)$, $UD\max$, $WD\max$, and $\sup F_T(l + 1 \mid l)$. The $\sup F_T(l)$ test is the $F$ test with null hypothesis $m = 0$ and alternative hypothesis $m = l$. Then $UD\max = \max_{1 \le l \le M} \sup F_T(l)$ and $WD\max = \max_{1 \le l \le M} w_l \sup F_T(l)$ are introduced to check whether there is structural change in the model, in which $w_l$ is the weight of the $l$ value based on the $\sup F_T(l)$ hypothesis test and $M$ is the maximum number of breaks $l$. The number of break points of the model is determined according to $\sup F_T(l + 1 \mid l)$ ($l \ge 1$), where $F_T(l + 1 \mid l)$ is the $F$ test with the null hypothesis of $l$ break points and the alternative hypothesis of $l + 1$ break points. The samples can therefore be partitioned into subgroups according to the break points and the corresponding change positions [25, 26]. All the estimations and hypothesis tests are conducted based on the Matlab code provided by Qu [27].

Neighborhood Component Feature Selection.
It is necessary to employ feature selection methods to remove irrelevant and redundant features in order to reduce the complexity of the analysis and of the generated models and to improve the efficiency of the whole modeling process [28-30]. Wrapper, embedded, and filter methods are three types of approaches developed for feature selection [31]. In this study, neighborhood component feature selection (NCFS) was applied to each group of the samples. NCFS is an embedded method for feature selection with regularization that learns feature weights by minimizing an objective function measuring the average leave-one-out regression loss over the training data [32].
Consider $n$ observations $S = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$, where $x_i \in \mathbb{R}^p$ are the feature vectors and $y_i \in \mathbb{R}$ are the responses (scrap rate). In this study, the aim is to predict the response given the training set $S$. Consider a randomized regression model that randomly picks a point $\mathrm{Ref}(x)$ from $S$ as the "reference point" for $x$ and sets the response value at $x$ equal to the response value of the reference point $\mathrm{Ref}(x)$. Now consider the leave-one-out application of this randomized regression model, that is, predicting the response for $x_i$ using the data in $S^{-i} = S \setminus \{(x_i, y_i)\}$. The probability that point $x_j$ is picked as the reference point for $x_i$ is $p_{ij} = k(d_w(x_i, x_j)) / \sum_{j' = 1, j' \ne i}^{n} k(d_w(x_i, x_{j'}))$ (1), where $d_w(x_i, x_j) = \sum_{r=1}^{p} w_r^2 |x_{ir} - x_{jr}|$, $w_r$, $r = 1, 2, \ldots, p$, are the feature weights, and $k$ is the kernel function. Suppose $k(z) = \exp(-z/\sigma)$ as suggested in [32], where $\sigma$ is set to 1 after standardizing the dependent value to have zero mean and unit standard deviation.
Let $\hat{y}_i$ be the response value of the randomized regression model and let $y_i$ be the actual response for $x_i$. Let $l$ be a loss function that measures the disagreement between $\hat{y}_i$ and $y_i$. Then, the average value of $l(y_i, \hat{y}_i)$ is $l_i = \sum_{j = 1, j \ne i}^{n} p_{ij}\, l(y_i, y_j)$ (2). After adding the regularization term $\lambda \sum_{r=1}^{p} w_r^2$, the objective function for minimization is $f(w) = \frac{1}{n} \sum_{i=1}^{n} l_i + \lambda \sum_{r=1}^{p} w_r^2$ (3). The loss function $l(y_i, y_j)$ here is the mean absolute deviation defined as $\sum_{i=1}^{n} |y_i - y_j| / n$. The main procedure of NCFS for regression feature selection is summarized in Algorithm 2.

Algorithm 2 (excerpt). Procedure NCFS with four inputs: the samples, the initial step length $\alpha$, the regularization parameter $\lambda$, and a small positive constant $\eta$. Initialization: standardize features to have zero mean and unit standard deviation; $w^{(0)} = (1, \ldots, 1)$, $\varepsilon^{(0)} = -\infty$, $t = 0$. For $i = 1, \ldots, n$: compute $p_{ij}$ and $l_i$ according to (1) and (2).
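As a rough illustration of how the NCFS quantities fit together, the sketch below evaluates the reference-point probabilities, the leave-one-out loss, and the regularized objective for a given weight vector. It assumes the exponential kernel with width 1 and uses the absolute difference $|y_i - y_j|$ as the pairwise loss; the function name `ncfs_objective` and the toy data are hypothetical, and the optimization over the weights (the gradient steps of Algorithm 2) is intentionally left out.

```python
import numpy as np

def ncfs_objective(w, X, y, lam, sigma=1.0):
    """Average leave-one-out loss plus lam * sum(w_r^2) for feature weights w."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Weighted L1 distance d_w(x_i, x_j) = sum_r w_r^2 |x_ir - x_jr|
    dist = np.abs(X[:, None, :] - X[None, :, :]) @ (w ** 2)
    kernel = np.exp(-dist / sigma)
    np.fill_diagonal(kernel, 0.0)                 # leave-one-out: a point never picks itself
    p = kernel / kernel.sum(axis=1, keepdims=True)
    pair_loss = np.abs(y[:, None] - y[None, :])   # assumed pairwise loss |y_i - y_j|
    l_i = (p * pair_loss).sum(axis=1)
    return float(l_i.mean() + lam * np.sum(w ** 2))

# Toy data: one informative feature, three samples.
X = np.array([[0.0], [0.0], [10.0]])
y = np.array([1.0, 1.0, 5.0])
loss_unweighted = ncfs_objective([0.0], X, y, lam=0.0)   # feature ignored
loss_weighted = ncfs_objective([1.0], X, y, lam=0.0)     # feature used
```

Giving the informative feature a nonzero weight lowers the leave-one-out loss, because near neighbors in that feature then dominate the choice of reference point; the regularization term counteracts this for features that bring no such gain, so their weights shrink toward zero.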

Neural Network-Based Prediction Model and Transformation.
The most frequently used DM method for prediction is ANN. An ANN is a network of neurons, each consisting of a propagation function and an activation function; a neuron receives the input, changes its internal state (activation) according to the input, and produces an output depending on the input and activation [33]. Despite its black box mechanism, ANN has been widely used in prediction problems with reasonable results, as scrutinized in the literature [34]. With their successful record in forecasting diverse problems, ANNs are among the most accurate and trusted models in use. Their ability to learn from incomplete datasets in order to predict the unseen part of the data, their capability of modeling a problem with little available data, and their ability to approximate almost all continuous functions have made them attractive for prediction problems [34]. Köksal et al. [5] reviewed the reported performance of DM methods and pointed out that ANN performance is mostly compared to that of classical statistical modeling methods such as multiple linear regression (MLR); better performance of ANN is naturally observed on multidimensional data, since ANNs are powerful tools for modeling nonlinear relationships.
In this study, a three-layer back propagation ANN was used to predict the scrap rate, which was then transformed to determine the predicted surplus rate and supplemental feeding rate, the two performances of most concern for material feeding optimization. The architecture of the ANN is set by trial and error, and the number of nodes in the hidden layer is set to $\max(3, (n + 1)/20)$, in which $n$ is the number of input features. The ANN-based architecture for the scrap rate prediction and material feeding optimization is illustrated in Figure 6, in which a neuron $j$ (in the hidden layer or the output layer) receives the outputs $a_1, a_2, \ldots, a_{n^{[l-1]}}$ of the neurons $1, \ldots, n^{[l-1]}$ that are connected to $j$, with bias $b_j$, and the propagation function of neuron $j$ is defined as $z_j = \sum_{i=1}^{n^{[l-1]}} w_{ij} a_i + b_j$, in which the superscript $[l]$ denotes the $l$th layer and $n^{[l-1]}$ is the number of units of the $(l - 1)$th layer. The results of the propagation function are further processed by the sigmoid activation function; that is, $a_j = \sigma(z_j) = 1/(1 + e^{-z_j})$.
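The propagation and activation functions above amount to the following forward pass, shown here as a small NumPy sketch rather than the Matlab implementation used in the study. The weights are random placeholders instead of trained values, and rounding the hidden layer size up to an integer is an assumption, since the text does not state how $(n + 1)/20$ is rounded.

```python
import math
import numpy as np

def hidden_nodes(n_features: int) -> int:
    """Hidden layer size max(3, (n + 1) / 20); rounding up is an assumption."""
    return max(3, math.ceil((n_features + 1) / 20))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Three-layer network: input -> sigmoid hidden layer -> sigmoid output."""
    a1 = sigmoid(W1 @ x + b1)     # z_j = sum_i w_ij a_i + b_j, then a_j = sigmoid(z_j)
    a2 = sigmoid(W2 @ a1 + b2)
    return a2

n = 12                             # number of selected input features (illustrative)
h = hidden_nodes(n)
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((h, n)), np.zeros(h)
W2, b2 = rng.standard_normal((1, h)), np.zeros(1)
x = rng.standard_normal(n)         # one standardized sample
predicted = forward(x, W1, b1, W2, b2)
```

Because the output neuron is a sigmoid, the prediction always lies strictly between 0 and 1, which suits a scrap rate expressed as a fraction.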
The transformation can be conducted according to (4)-(5) following the hypothesis that each feeding panel of an order has the same scrap probability and that the scrap rate of an order does not change with the predicted feeding area. Therefore, the predicted surplus rate and supplemental feeding rate can be calculated according to (6)-(8), and related variables are presented in Table 2.

Figure 6: ANN-based architecture for the prediction of scrap rate and material feeding optimization (one can refer to Tables 1 and 2 for the notations).

Table 2 (notation excerpt): Scraa Pd; predicted surplus rate, Surpr Pd; predicted supplemental feeding frequency for an order, Supff Pd; predicted supplemental feeding rate, Supfr Pd.
The predicted feeding quantity and panel for each order are described as
in which Duap is the delivery unit in a panel given in Table 1.
Then the predicted feeding quantity, area, and scrap area should be revised accordingly:
Thus the predicted surplus rate of each order can be defined as
The predicted supplemental feeding frequency for each order is specified as
in which Reqa is the required area defined in Table 1. Therefore, the predicted surplus rate for the orders with Supff Pd = 0 can be calculated by
in which the count is the number of samples $i$, $1 \le i \le n$, with Supff Pd = 0. The Surpr Pd for the orders with Supff Pd = 1 is not considered here because the surplus area cannot be determined before the supplemental feeding is finished. In practice, however, their surplus rate is always lower than the Surpr Pd defined in (10).
The supplemental feeding rate for all the samples can be defined as
The surplus rate and supplemental feeding rate of the manual feeding can be computed by

Performance Indicators.
In order to evaluate the effectiveness of the model, the following evaluation indicators are used [35]. The mean squared error (MSE) is the average of the squared differences between the predicted data $\hat{y}_i$ and the original data $y_i$, which can be described as $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$. The mean absolute error (MAE) is the average of the absolute differences between observed values and estimated values. It can be expressed as $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|$. The mean absolute percentage error (MAPE) is the average of the normalized absolute differences between observed values and estimated values. The formula is written as $\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100\%$, where $n$ is the number of samples. The final purpose is to determine the feeding panel for each order on the basis of the predicted scrap rate, so $y_i$ and $\hat{y}_i$ are replaced by the least feeding panel and the predicted feeding panel, respectively.
The deviations of the predicted feeding and of the manual feeding can then be computed for each sample $i$, $1 \le i \le n$, as the predicted feeding panel minus the least feeding panel and the manual feeding panel minus the least feeding panel, respectively. The error diagram can be drawn as the distribution of these deviations over all samples. Combined with the aforementioned predicted surplus rate and supplemental feeding rate for material feeding optimization, the final performance is evaluated by these five indicators.
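The three error indicators and the per-sample deviation can be written compactly as below. This is a plain NumPy sketch with hypothetical function names and made-up panel counts, where `y` stands for the least feeding panel and `y_hat` for the predicted feeding panel.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y_hat - y) ** 2))

def mae(y, y_hat):
    """Mean absolute error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y_hat - y)))

def mape(y, y_hat):
    """Mean absolute percentage error in percent; requires nonzero y."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs((y_hat - y) / y)) * 100.0)

least_panel = [10, 20, 5]          # illustrative values, not data from the paper
predicted_panel = [11, 18, 5]
deviation = np.asarray(predicted_panel) - np.asarray(least_panel)   # basis of the error diagram
scores = (mse(least_panel, predicted_panel),
          mae(least_panel, predicted_panel),
          mape(least_panel, predicted_panel))
```

A positive deviation means more panels were fed than strictly necessary (surplus), and a negative deviation means too few were fed (supplemental feeding).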

Results and Discussion
According to the multivariate boxplot approach described in Section 3.1, 960 outliers were trimmed and 29,157 samples were left. Figure 7 shows the boxplot of the scrap rate for different values of the required panel and the layer number. It illustrates that the outlier thresholds shift with the values of the required panel and the layer number, and therefore outlier detection that takes different feature values into account is more reasonable.
Significance tests for the break points of the samples according to $UD\max$, $WD\max$, and $\sup F_T(l + 1 \mid l)$ were conducted based on the default parameters given in [27], and the results are given in Table 3. The values of $UD\max$ and $WD\max$ indicated that the samples have significant structural change at the 5% level, and the values of $\sup F_T(l + 1 \mid l)$ showed that 5 break points are significant. The final break positions of the samples sorted in ascending order of the required panel were 8,935, 13,995, 17,003, 21,791, and 27,491. Therefore the samples were partitioned into 6 groups with indexes 1-6, which correspond to the samples with required panels of 1, 2, 3, 4-6, 7-19, and greater than 19, respectively. The samples in group 6 could still be segmented, since $\sup F_T(5 \mid 4)$ is greater than the critical value at the 5% level [25]. However, the sample size in group 6 was small (1,666 samples) and the average scrap rate fluctuated greatly, which can be seen from Figures 1 and 6(a). Thus no further partition of the samples in group 6 was conducted.
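The mapping from required panel to group index, and the group sizes implied by the break positions, can be reproduced with a short NumPy sketch. The interval boundaries and break positions are those quoted above; the function and variable names are hypothetical.

```python
import numpy as np

# Upper required-panel bound of groups 1-5; anything above 19 falls into group 6.
GROUP_UPPER_BOUNDS = np.array([1, 2, 3, 6, 19])

def group_index(required_panel):
    """Group index (1-6) for a required panel count or an array of counts."""
    return np.searchsorted(GROUP_UPPER_BOUNDS, required_panel, side="left") + 1

# Break positions in the ascending-sorted sample list, bracketed by 0 and the sample count.
positions = np.array([0, 8935, 13995, 17003, 21791, 27491, 29157])
group_sizes = np.diff(positions)

example_groups = group_index(np.array([1, 2, 3, 4, 6, 7, 19, 20, 45]))
```

The last entry of `group_sizes` is 1,666, which agrees with the size of group 6 stated in the text.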
Then NCFS was conducted for each group of the samples, with the initial step length set to 0.9 and the small positive constant $\eta$ set to $10^{-4}$. Five-fold cross-validation, instead of a single test, was conducted to optimize the regularization parameter $\lambda$, initialized with 20 randomly selected values between 0 and $1.2 \times 10^{-3}$ according to [32], and the $\lambda$ value that minimizes the mean loss across the cross-validation was selected to fit NCFS. Figure 8(a) shows the loss for the 20 different $\lambda$ values for the group of samples with required panels between 7 and 19; the fourth $\lambda$, corresponding to the lowest mean loss, was selected as the regularization parameter for NCFS. Figure 8(b) illustrates the indexes of the features selected with this $\lambda$. The final selected features for the different groups of samples and for all samples as a whole are given in Table 4. The differences among the selected features indicate that samples with different features may be distributed in different regimes. However, layer number, Rogers material, number of operations, Huawei standard, plug hole with resin, second drilling, back drilling, Cu/Ni/Au pattern plating, gold finger plating, gold plating, delivery unit area, and historical qualified rate are critical features for most of the samples, which means that the values of these features generally have a great influence on the scrap rate; these selected features also match well with the experience of experts from the factory.
For each partitioned group, 70%, 15%, and 15% of the samples were randomly selected as mutually exclusive training, validation, and test data, and the sample sizes for each group are given in Table 5. The prediction models of MSCM-ANN were trained, validated, and tested for each group of the samples with 5 runs based on the corresponding selected features, while the ANN was trained, validated, and tested.
The average results of each group of the samples achieved by MSCM-ANN, ANN, and manual feeding are given in Table 7. The surplus rate and supplemental feeding rate obtained by the three approaches are given in Table 8. The following results can be drawn accordingly:
(1) Both MSCM-ANN and ANN reduce the surplus rate and the supplemental feeding rate simultaneously compared to the manual feeding. MSCM-ANN achieved 11.96% predicted surplus rate and 11.91% predicted supplemental feeding rate, while ANN achieved 15.16% and 12.69% for the two performance indicators, respectively. The better performance of MSCM-ANN may result from more precisely selected features, a more reasonable ANN architecture, and well-trained models for each sample group partitioned by MSCM considering the structural change influence, whereas ANN alone could not explore the pattern in each partitioned group.
(2) The results in Table 8 indicate that, for both MSCM-ANN and ANN, the surplus rate decreases but the supplemental feeding rate becomes relatively higher as the required panel interval of the sample group increases. The main reason is that the required panel, which is the required quantity rounded up to the nearest whole panel, carries high redundancy when the number of required panels is small, which in turn lowers the supplemental feeding rate. Taking the PCB order in Figure 1 as an example, if the required quantity is only 4 units, then feeding one panel with a 20% scrap rate causes a 100% ((10 − 2 − 4)/4 × 100%) surplus rate, and supplemental feeding is not needed unless the scrap rate exceeds 60%. In contrast, when the required panel increases, the surplus is lower, but large fluctuations of the scrap rate may leave the feeding panel insufficient and therefore bring about a high supplemental feeding frequency.
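The numbers in this small-order example follow from two simple relations, sketched below. The definitions used here (surplus rate as surplus units over required units, and supplemental feeding being needed when the qualified units fall below the required quantity) are assumptions chosen to match the worked example, not the paper's formal equations, and the function names are hypothetical.

```python
def surplus_rate(fed_units: float, scrap_rate: float, required_units: float) -> float:
    """Surplus units relative to the required quantity (assumed definition)."""
    qualified = fed_units * (1.0 - scrap_rate)
    return (qualified - required_units) / required_units

def max_scrap_without_supplement(fed_units: float, required_units: float) -> float:
    """Largest scrap rate at which the fed units still cover the required quantity."""
    return 1.0 - required_units / fed_units

# One panel of 10 units fed for an order of 4 units, with a 20% scrap rate.
small_order_surplus = surplus_rate(fed_units=10, scrap_rate=0.20, required_units=4)
small_order_threshold = max_scrap_without_supplement(fed_units=10, required_units=4)

# For comparison: 10 panels (100 units) fed for an order of 90 units.
large_order_threshold = max_scrap_without_supplement(fed_units=100, required_units=90)
```

The small order tolerates a scrap rate of up to 60% before a second feeding is needed, whereas the 90-unit order fed with 10 panels tolerates only 10%, which illustrates why larger orders show a lower surplus rate but a higher supplemental feeding rate.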
The predicted scrap rate and predicted supplemental feeding rate on average obtained by MSCM-ANN for the training, validation, and test samples in each group are illustrated in Figure 9, which indicates that MSCM-ANN is stable in determining the surplus rate and supplemental feeding rate for each group of samples in most cases. The relatively large deviation of the predicted supplemental feeding rate between training and test for the samples in group 6 may be caused by the large fluctuation of the scrap rate across orders. Meanwhile, the relatively small sample size is harmful to the stability of the model. Figures 10(a) and 10(b) present the error diagrams of the results obtained from the manual feeding and of the predicted results of MSCM-ANN, respectively, for the samples in group 5. The errors of MSCM-ANN tend to be distributed with a mean value of 0.3 and a short tail for the training, validation, and test samples, while most of the errors of the manual feeding are distributed with a mean value of 1.725 (Figure 10(a)); the large positive tail in Figure 10(a) indicates that the manual feeding can easily lead to high redundancy after delivery of the order. The deviations of the manual feeding panel and the predicted feeding panel from the least feeding panel for the samples in group 5 are illustrated in Figure 11. The predicted results in Figure 11(b) show lower deviation in most cases than the manual feeding results in Figure 11(a), which leads to fewer surplus panels and therefore reduces the cost of material, production, inventory, and disposal.
Note. Surpr Pd and Supfr Pd are the predicted surplus rate and supplemental feeding rate, respectively, and they can be obtained according to the definition specified in Section 3.5 and the data provided in Table 7.
Figures 12(a) and 12(b) present the regression of manual feeding panel and predicted feeding panel versus least feeding panel, respectively.Results indicate that the predicted feeding panel coincides better with the least feeding panel in Figure 12(b) compared to the manual feeding panel illustrated in Figure 12(a), and therefore the waste of surplus quantity and area can be reduced.The same coefficients and similar regression expressions obtained by MSCM-ANN for training, validation, test, and all samples mutually verify the stability of the proposed approach.

Conclusions
Accurate determination of the number of feeding panels for each PCB template order can reduce the cost of material, production, logistics, inventory, disposal, and delivery tardiness compensation. In this paper, a data mining approach (MSCM-ANN) involving the use of multivariate boxplot, MSCM, NCFS, and ANN was developed to establish a scrap rate prediction model and optimize material feeding for PCB template orders, taking into account the influence of structural change on the predicted scrap rate. The various aspects of the approach have been discussed in detail. Three prediction performance indicators (mean squared error, mean absolute error, and mean absolute percentage error), combined with surplus rate and supplemental feeding rate, the two performance indicators of greatest practical concern for material optimization, were used to evaluate the established model. The multivariate boxplot was adopted for scrap rate outlier detection considering the influence of structural changes in different input features, while the MSCM was applied to explore the multiple structural changes of the samples and thereby partition the samples into 6 different subgroups. NCFS and ANN were utilized for feature selection and scrap rate prediction model establishment for each group of the samples, respectively. After comparing MSCM-ANN with ANN and the manual feeding, the following conclusions and contributions are highlighted.
(1) The proposed MSCM-ANN shows superior prediction accuracy on the training, validation, and test datasets, with the lowest MSE, MAE, MAPE, surplus rate, and supplemental feeding rate compared to ANN and the manual feeding. MSCM-ANN reduces the surplus rate and supplemental feeding rate from 27.44% and 17.91% obtained by the manual feeding to 11.96% and 11.91%, respectively, whereas ANN reduces them only to 15.16% and 12.69%, respectively. The same coefficients and similar regression expressions of the predicted feeding panel versus the least required panel for the training, validation, test, and all samples mutually verify the stability of the proposed MSCM-ANN.
(2) The established model provides a new DM-based mechanism for the material feeding optimization of PCB template production, which, to the best of our knowledge, has seldom been studied. The application of the developed approach can replace the empirical manual feeding and cut

Figure 1: Structure of a PCB panel.

Figure 2: Number of orders for each required panel.

Figure 5: Average scrap rates with different required panel.

Figure 7: (a) Boxplot for different values of required panel. (b) Boxplot for different values of layer number.

Figure 8: (a) Mean loss performance for 20 different lambda (λ) values for samples with required panel between 7 and 19. (b) Feature selection based on NCFS with the lowest-loss lambda for samples with required panel between 7 and 19.

Figure 9: (a) Predicted scrap rate of MSCM-ANN for different samples. (b) Predicted supplemental feeding rate for different samples.

Figure 10: (a) Error diagram of the results from manual feeding for the samples in group 5. (b) Error diagram of the predicted results obtained by MSCM-ANN for the samples in group 5.

Figure 11: (a) Deviation of samples between manual feeding panel and least feeding panel. (b) Deviation between predicted feeding panel and least feeding panel.

Table 2: Prediction results related variables.

Table 3: Significance test of break points.
*Statistically significant at the 5% level.

Table 4: Selected features for different groups of the samples. Note. Selected features are marked with ◊. The description of features (variables) has been specified in Table 1.

tested from all samples based on the selected features as listed in the last column of Table 4. A comparison of the average MSE, MAE, and MAPE of 5 runs for MSCM-ANN, ANN, and the manual feeding is given in Table 6. The results indicate that both MSCM-ANN and ANN are clearly superior to the manual feeding in reducing the three indicators. Moreover, MSCM-ANN achieves smaller MSE, MAE, and MAPE than ANN, which means that the established models considering structural change can further improve the precision.
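The three indicators compared here follow standard definitions, which can be sketched as below. This is an illustrative computation only: the function name `prediction_errors` and the sample values are hypothetical, the MAPE is expressed in percent, and it assumes that no actual value is zero (the handling of zero scrap rates in the paper is not known from this section).

```python
import numpy as np

def prediction_errors(actual, predicted):
    """Return MSE, MAE, and MAPE (in percent) for two equal-length series.

    Assumes every entry of `actual` is nonzero, since MAPE divides by it.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - actual
    return {
        "MSE": float(np.mean(err ** 2)),            # mean squared error
        "MAE": float(np.mean(np.abs(err))),         # mean absolute error
        "MAPE": float(np.mean(np.abs(err / actual)) * 100.0),
    }

# Hypothetical scrap rates (percent) for a handful of orders.
actual_scrap = [12.0, 8.0, 20.0, 15.0]
predicted_scrap = [10.0, 9.0, 18.0, 15.0]
print(prediction_errors(actual_scrap, predicted_scrap))
```

Averaging these values over several training runs, as done for the 5 runs reported in Table 6, reduces the influence of random weight initialization on the comparison.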

Table 5: Sample sizes of training, validation, and test data.

Table 6: Performance indicators achieved by different approaches.

Table 7: Predicted and real results of each group of samples.

Table 8: Comparison of surplus rate and supplemental feeding rate.