Study on Discrete Manufacturing Quality Control Technology Based on Big Data and Pattern Recognition

To address the quality control problems in the discrete manufacturing of large and superlarge equipment, where existing methods cannot meet urgent production needs, a quality control method based on big data and pattern recognition is proposed. A large amount of data is collected through test equipment developed for the discrete manufacturing process, and a database of typical working conditions and an information tracking system relying on a cloud platform are formed. The working conditions are divided by principal component analysis (PCA) and an improved K-means algorithm. A Markov prediction model predicts the working conditions, which are then matched against the typical working conditions by pattern recognition so that the processing parameters can be regulated and quality control achieved. The method is verified experimentally on the manufacturing process of hydraulic cylinders above 5 m. The experiments indicate that working conditions can be automatically identified and classified through pattern recognition technology. The process capability index $C_{pk}$ increased from 0.6 to 1, which demonstrates the effectiveness of the quality control and the improvement in processing capability.


Introduction
Large and superlarge equipment is indispensable to important manufacturing industries such as construction machinery, aerospace, and steel. It is also an important foundation for the development of manufacturing and defense equipment. Because of complex systems, harsh environments, high loads, long-term operation, and relatively backward maintenance systems, such equipment often encounters various failures, which seriously affect the progress and benefits of construction projects and the safety of people's property. The manufacture of large and superlarge equipment is a typical discrete manufacturing process with multiple varieties and small batches. The production process is complex, the manufacturing is expensive, and the maintenance process is cumbersome. It is therefore of great significance to improve the manufacturing quality control of large and superlarge equipment.
Nowadays, the order-driven production model has gradually transformed the manufacturing process of enterprises from traditional mass production to customized production of multiple varieties and small batches. How to reduce costs and shorten the production cycle while meeting customer needs is a key issue. Manufacturing process optimization is essential to manufacturing companies, and process management and quality control rely increasingly on intelligent decision-making methods [1]. Scholars at home and abroad have conducted research on quality control during processing. Jiang XY [2] used similar manufacturing theory, statistical process control, neural networks, and other theories and technologies to establish an intelligent process quality control system. Wang [3] proposed a multiagent-based modeling method for intelligent quality control systems. Zheng [4] proposed a method integrating Bayesian networks and big data analysis for manufacturing process quality analysis and control. Wu [5] proposed a quality control method for complex product assembly processes based on digital twin technology. Lee [6] developed a multivariate dynamic quality control theory and proposed an optimal design scheme based on the Hotelling $T^2$ control chart and a control method with a variable sampling interval. Celano [7] considered the limited number of samples in small-batch production processes and studied the monitoring performance and statistical properties of the t-control chart. Castagliola [8] studied the design method of the EWMA control chart in the case of unknown sample data, compared the detection performance of the CUSUM and EWMA control charts, and summarized the situations to which each chart applies. Jiang PY [9] proposed a multiprocess machining quality prediction modeling method based on an assignment error transfer network to perform machining quality control.
Xiang C [10] analyzed several key technologies of data monitoring and quality control systems based on Internet of Things technology from the perspective of system integration. Hosseininasab S M E [11] used a clustering algorithm to mine and classify quality data for quality control. In addition, some scholars have introduced data mining, brittleness theory, and other technologies into process quality control [12,13].
Because the detection and quality control systems for key quality control points in the manufacturing process are immature, quality control for large-scale discrete manufacturing cannot yet be applied in actual production. We take the actual production of hydraulic cylinders above 5 m as an example, establish the typical working conditions and pattern recognition for the actual machining process, and carry out quality control over the whole process. This is of great significance for improving processing accuracy and production efficiency and for saving processing time and production costs.

Quality Control Principles
With the development of information technology, big data has brought new opportunities to all walks of life. As a typical discrete manufacturing process, the manufacture of large and superlarge equipment generates a large amount of data and information. At the same time, there is often a high degree of correlation between variables. The data relationships are often nonlinear, and the subprocesses are coupled with each other, which makes it more difficult to establish a quality process control model. The lack of two technologies, typical working condition construction and complex processing condition recognition, is the main difficulty facing current quality control technology for large and superlarge equipment. To address these difficulties, this paper installs various sensors, cameras, and other data collection equipment throughout the process, records all factors that affect processing quality, accumulates big data, and uses the improved K-means algorithm to divide the working conditions and form typical working conditions. The quality of key quality control points is then controlled through pattern recognition technology. During production, the quality characteristics of the production system are measured continuously; with the help of various quality statistical analysis and control methods, they are compared with the quality control objectives of the key processes for quality decision-making and quality evaluation; and the decision-making information and evaluation results are continuously fed back to the various stages of the production system to perform quality control of the production process. The principle of quality control of the product manufacturing process is shown in Figure 1.

Working Condition Pattern Recognition
3.1. Information Traceability. Network collaborative manufacturing is the only way toward the transformation and upgrading of the manufacturing industry. It plays an important role in making full use of limited resources, improving the agility of manufacturing, and shortening the production cycle [14]. Making full use of advanced technologies such as cloud computing and big data analysis, and with the help of various sensors, the data of the entire processing chain of large and superlarge equipment, from the raw material plant through the external plant to the equipment manufacturing plant, are acquired through wireless and network transmission technology to build a database. The general large and superlarge equipment database system should include subdatabases such as the intrinsic parameter database of the device, the knowledge base, the real-time dynamic database, and the historical database. The subdatabases are integrated into a whole database system for quality control. After the database is established, its settings can be configured according to different application permissions, and the performance of operations such as querying and modifying the database can be repeatedly verified.
Through the continuous adjustment and improvement of the database system, the data collected in the database will be further enriched and optimized, which provides a solid guarantee that the database can play its role. The data are shared on the cloud platform in the form of a database system. All processes within a company use a database shared over a local area network to establish the company's internal quality control system. Different companies can rely on integrated information networks, Beidou satellite communication, and so on to form a globally connected secure network system that provides on-demand access from fixed sites and under mobile conditions by means of transmission encryption, isolation switching, and other technologies. The cloud platform constructed on the basis of the information integration of the quality control system is shown in Figure 2.
To better control quality and find key quality control points, manufacturing information traceability is required during the manufacturing of large and superlarge equipment. Each product or material has a unique identification code. Each code corresponds to a quality control point, and the stations at the quality control points are connected to form a production line. The collection of production lines of all products or materials constitutes a manufacturing network, which is a traceability model of the processing and manufacturing process information. All information generated during the manufacturing of large and superlarge equipment can be traced through this model and the cloud platform architecture.
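The traceability model described above can be pictured as a small data structure. The following is only an illustrative sketch, not the system used in this study: the class names, the identification code `CYL-0001`, the station names, and the measurement keys are all hypothetical, and each coded item is simplified to an ordered list of records, one per quality control point.

```python
from dataclasses import dataclass, field


@dataclass
class QualityRecord:
    """Data captured at one quality control point (one station)."""
    station: str
    measurements: dict


@dataclass
class ProductionLine:
    """Ordered quality control points passed by one coded product or material."""
    code: str
    records: list = field(default_factory=list)


class ManufacturingNetwork:
    """Collection of production lines, indexed by identification code."""

    def __init__(self):
        self._lines = {}

    def record(self, code, station, **measurements):
        # Create the line on first use, then append the new record in order.
        line = self._lines.setdefault(code, ProductionLine(code))
        line.records.append(QualityRecord(station, dict(measurements)))

    def trace(self, code):
        # Return the full history of one item; empty if the code is unknown.
        line = self._lines.get(code)
        if line is None:
            return []
        return [(r.station, r.measurements) for r in line.records]


# Hypothetical usage: one cylinder passing two stations.
network = ManufacturingNetwork()
network.record("CYL-0001", "boring", straightness_mm=0.6)
network.record("CYL-0001", "honing", spiral_pattern_mm=0.3)
history = network.trace("CYL-0001")
```

Keeping the records ordered per code is what allows a quality deviation found at a later station to be traced back through the earlier ones.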

3.2. PCA.
Because a large number of material, processing, equipment, and other parameters are involved in the manufacturing of large and superlarge equipment, the correlation between the parameters is relatively complex, and it is not easy to predict and optimize the processing quality. To perform quality control more effectively, this study uses PCA [15] to discard some of the influencing factors according to the correlation between the parameters and their degree of impact on straightness, which reduces the number of analysis indices and greatly reduces the subsequent workload.
From the quality big data, the original parameter matrix is established from the material, processing, and equipment data (rows correspond to straightness observations, and columns to influencing parameters):
$$R = [x_{ij}]_{m \times n} \quad (1)$$
Following the PCA procedure, the matrix $R = [x_{ij}]_{m \times n}$ is standardized, the covariance matrix is established, the eigenvalues are solved, the principal components are sorted in order, the cumulative contribution rate is calculated, and the principal components are determined.
The correlation matrix $C$ of the standardized matrix $X$ is solved according to equation (3). The eigenvalues of matrix $C$ are solved and arranged from large to small, and the contribution rate of the $i$-th principal component is calculated. In general, a cumulative contribution rate exceeding 85% can meet the requirements.
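The PCA steps above can be sketched in a few lines of NumPy. This is a minimal illustration using the common textbook conventions (column-wise standardization with the sample standard deviation, correlation matrix $X^{T}X/(m-1)$, contribution rate equal to each eigenvalue divided by the sum of all eigenvalues), which are assumptions and may differ in normalization from the equations used in this study. The function name `pca_select` and the default threshold argument are hypothetical.

```python
import numpy as np


def pca_select(R, threshold=0.85):
    """Return contribution rates, number of components kept, and their loadings."""
    R = np.asarray(R, dtype=float)
    m = R.shape[0]
    # Standardize each column (influencing parameter) to zero mean, unit variance.
    X = (R - R.mean(axis=0)) / R.std(axis=0, ddof=1)
    # Correlation matrix of the standardized data.
    C = X.T @ X / (m - 1)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Smallest number of components whose cumulative rate reaches the threshold.
    n_keep = min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))
    return contribution, n_keep, eigvecs[:, :n_keep]


# Synthetic example: the second column nearly duplicates the first.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
R_demo = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), rng.normal(size=200)])
contribution, n_keep, loadings = pca_select(R_demo)
```

In the synthetic example two of the three columns carry almost the same information, so two principal components are enough to pass the 85% threshold.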

3.3. Division of Operating Conditions Based on Improved K-Means Algorithm.
With the information tracing system established above, a large amount of data has been accumulated during the manufacturing of large and superlarge equipment. The data present a nonlinear relationship, which makes data processing cumbersome. The improved K-means clustering algorithm is simple in principle, easy to implement, and able to process a large amount of data quickly. Its clustering effect is good, and it is suitable for dividing the data into working conditions. The evaluation criterion function is defined by formula (6), in which $x_{ij}$ is the $j$-th data of the $i$-th category, $n_i$ is the number of data in the $i$-th category, $m_i$ is the center of the $i$-th category, and $x_{lr}$ is the $r$-th data of the $l$-th category. In formula (6), the numerator is the distance between the data within the class and the center of the class, indicating the closeness of the class; the denominator is the sum of the distances between the data outside the class and the center of the class, indicating the dispersion between different classes. The smaller the ratio, the more concentrated the data within each class and the more dispersed the data among different classes, and hence the more reasonable the division. The specific steps of the K-means algorithm are as follows:
(1) Select the initial value $k = n$.
(2) Cluster all the data to get all the clustering centers $m_i$ ($i = 1, 2, \ldots, k$) and the data $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$).
(3) Calculate $G_n(k)$ according to equation (6).
(4) When $k = n + 1$, repeat step (2) and calculate $G_{n+1}(k)$ according to equation (6).
(5) If $G_n(k) > G_n(k+1)$, the iteration ends; otherwise, continue to repeat steps (3)-(5).
The selection of the initial clustering centers $m_i$ ($i = 1, 2, \ldots, k$) has a great influence on the clustering result and running time, so it is necessary to select appropriate initial clustering centers.
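The ratio described for the evaluation criterion can be sketched as follows. This is one possible reading of the verbal description only, and the exact form of formula (6) may differ: here the numerator sums the Euclidean distances from each point to its own class center, and the denominator sums the distances from each class center to all points outside that class. The function name `clustering_criterion` is hypothetical.

```python
import numpy as np


def clustering_criterion(X, labels, centers):
    """Within-class distance divided by out-of-class distance (smaller is better)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    within = 0.0
    between = 0.0
    for i, m_i in enumerate(centers):
        dist = np.linalg.norm(X - m_i, axis=1)  # distance of every point to center i
        within += dist[labels == i].sum()       # points belonging to class i
        between += dist[labels != i].sum()      # points belonging to other classes
    return within / between


# Two tight, well-separated groups: a sensible division versus a poor one.
X_demo = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = clustering_criterion(X_demo, [0, 0, 1, 1], [[0, 0.5], [10, 0.5]])
bad = clustering_criterion(X_demo, [0, 1, 0, 1], [[5, 0], [5, 1]])
```

Under this reading the sensible division gives a much smaller ratio than the poor one, which matches the statement that a smaller ratio indicates a more reasonable division.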
In view of this shortcoming, this paper replaces the random initialization of the cluster centers. The specific steps by which the improved K-means algorithm selects the cluster centers are as follows:
(1) Randomly select one data point from the input data set as the first clustering center $m_1$.
(2) For each data point $x_{ij}$ in the data set, calculate its distance $D(x_{ij}) = \min \lVert x_{ij} - m_i \rVert_2$ to the nearest of the cluster centers already selected.
(3) Select a new data point as the next clustering center; the larger $D(x)$ is for a point, the greater its probability of being selected.
(4) Repeat steps (2)-(3) until $k$ cluster centers are selected.
The improved K-means algorithm is used to divide the quality data of the key quality control points in the discrete manufacturing process of large and superlarge equipment into working conditions, and the processing parameters corresponding to the key quality control points are mined to obtain typical processing conditions. By dividing the machining process into working conditions, a database of typical processing conditions is constructed, which lays the foundation for subsequent quality control.
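The center selection steps above resemble the seeding procedure commonly known as k-means++. The sketch below follows that procedure and is only an illustration: sampling with probability proportional to $D(x)^2$ is the k-means++ convention and is an assumption here, since the text only states that a larger $D(x)$ gives a greater selection probability. The function name `init_centers` and its `seed` argument are hypothetical.

```python
import numpy as np


def init_centers(X, k, seed=None):
    """Pick k initial centers, favoring points far from those already chosen."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Step (1): the first center is a uniformly random data point.
    centers = [X[rng.integers(n)]]
    for _ in range(1, k):
        # Step (2): distance of every point to its nearest selected center.
        diff = X[:, None, :] - np.asarray(centers)[None, :, :]
        d = np.linalg.norm(diff, axis=2).min(axis=1)
        # Step (3): sample the next center with probability proportional to D(x)^2
        # (assumed weighting; requires at least one point with D(x) > 0).
        weights = d ** 2
        centers.append(X[rng.choice(n, p=weights / weights.sum())])
    return np.asarray(centers)


# Three groups of coincident points: each group must supply exactly one center,
# because points at an already selected location have zero selection probability.
X_demo = [[0, 0]] * 5 + [[10, 10]] * 5 + [[20, 0]] * 5
centers_demo = init_centers(X_demo, k=3, seed=1)
```

The resulting centers would then be passed to an ordinary K-means iteration in place of purely random starting points.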

3.4. Markov Prediction Model.
On the basis of the big data on the processing quality of large and superlarge equipment, the dynamic response characteristics of processing quality were analyzed, and prediction methods such as the neural network and the response surface method were compared. Markov prediction theory was finally chosen to establish a prediction model for the key quality control points, in order to improve the prediction accuracy of quality parameters and save calculation time [15].
Take each working condition divided in Part 3 as a state, recorded as $E_1, E_2, \ldots, E_n$, for a total of $n$ possible states; the working condition at time $t$ is denoted as $x_t$, $x_t \in \{E_1, E_2, \ldots, E_n\}$. Let $P_{ij}$ be the state transition probability from state $E_i$ to state $E_j$. Then
$$P = \begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1n} \\ P_{21} & P_{22} & \cdots & P_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{n1} & P_{n2} & \cdots & P_{nn} \end{pmatrix}$$
is called the state transition probability matrix.
If the predicted straightness is currently in state $E_i$, then at the next moment it may change from state $E_i$ to any one of $E_1, E_2, \ldots, E_i, \ldots, E_n$. To calculate the state transition probability matrix $P$, the transition probability $P_{ij}$ from each state to every state must be found. Each $P_{ij}$ is estimated by approximating the probability with the observed frequency:
$$P_{ij} = \frac{a_{ij}}{\sum_{k=1}^{n} a_{ik}} \quad (8)$$
where $a_{ij}$ is the number of observed transitions from state $E_i$ to state $E_j$. The state probability $\pi_j(k)$ represents the probability that the process is in state $E_j$ at the $k$-th moment, after $k$ state transitions, given that the initial state ($k = 0$) is known.
The Markov model is as follows:
$$\pi(n) = \pi(0) P^{n}$$
In the formula, $\pi(n)$ is the state probability vector at time $n$, and $\pi(0)$ is the initial state probability vector. According to the Markov model, the future state of straightness can be predicted.
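The frequency estimate of the transition probabilities and the prediction $\pi(n) = \pi(0)P^{n}$ can be sketched as follows. This is a minimal illustration, assuming that the observed working conditions are available as a sequence of integer state indices starting at 0; the function names and the short example sequence are hypothetical and are not data from this study.

```python
import numpy as np


def transition_matrix(states, n_states):
    """Estimate P_ij as (transitions i -> j) / (all transitions leaving i)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that were never left are kept as zeros instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def predict(pi0, P, n):
    """State probability vector after n transitions: pi(n) = pi(0) P^n."""
    return np.asarray(pi0, dtype=float) @ np.linalg.matrix_power(P, n)


# Hypothetical observation sequence over two states.
sequence = [0, 1, 0, 1, 1, 0]
P_demo = transition_matrix(sequence, n_states=2)
pi_1 = predict([1.0, 0.0], P_demo, 1)
pi_2 = predict([1.0, 0.0], P_demo, 2)
```

In the example, state 0 is always followed by state 1, while state 1 returns to state 0 in two of its three observed transitions, so `P_demo` is `[[0, 1], [2/3, 1/3]]`.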

3.5. Pattern Recognition and Optimization.
The quality of large and superlarge equipment, a typical discrete manufacturing product, is mainly affected by raw material data, processing parameters, and equipment parameters during the manufacturing process. According to the quality data of the key quality control points after processing, the working conditions are divided and the typical working condition database is established. The data that affect the processing quality parameters are collected in real time and compared with the typical working condition database to check whether they conform to an existing historical typical working condition. If they do, the category of the working condition is reported; otherwise, the current processing parameter data are saved and judged automatically afterwards. If the processing quality results meet the processing requirements, a new typical working condition is established; otherwise, the parameters are adjusted to conform to a typical working condition. A typical working condition database is constructed as shown in Figure 3.
$X = (x_1, x_2, \ldots, x_n)$ represents an $n$-dimensional processing condition, and the $W$ typical processing conditions are $\omega_1, \omega_2, \ldots, \omega_W$. The basic problem of pattern recognition is to find $W$ discriminant functions $d_1(X), d_2(X), \ldots, d_W(X)$ based on the attributes. If working condition $X$ belongs to typical processing condition $\omega_i$, then $d_i(X) > d_j(X)$ for all $j \neq i$. That is, an unknown processing condition $X$ is said to belong to the $i$-th typical processing condition only if substituting $X$ into all the discriminant functions yields the maximum value for $d_i(X)$.
The prototype of a typical processing condition is defined as the average value of the patterns of that condition.
In the prototype formula, $W$ is the number of typical processing conditions, $N_j$ is the number of parameters of the typical processing condition $\omega_j$, and the sum is taken over all of these typical processing conditions. The category of an unknown processing condition $X$ is determined by assigning it to the nearest condition. Using the Euclidean distance as the measure of closeness simplifies the distance calculation.
Let $D_j(X)$ be the distance between the working condition and the average value of the $j$-th typical processing condition. When $D_j(X)$ is the minimum distance, $X$ is classified as typical processing condition $\omega_j$; the minimum distance indicates the best match.
Choosing the minimum distance is equivalent to evaluating a discriminant function for each typical processing condition: when $d_i(X)$ attains the maximum value, $X$ is classified as class $\omega_i$.
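The minimum distance rule described above can be sketched as follows. The discriminant used here, $d_j(X) = X^{T} m_j - \tfrac{1}{2} m_j^{T} m_j$ with $m_j$ the mean vector of condition $\omega_j$, is the standard textbook form of the minimum distance classifier and is an assumption; the symbol $m_j$, the function names, and the example data are hypothetical and are not taken from this study.

```python
import numpy as np


def prototypes(samples_by_class):
    """Mean vector of each typical condition (one array of samples per class)."""
    return np.array([np.mean(np.asarray(s, dtype=float), axis=0) for s in samples_by_class])


def classify_min_distance(X, m):
    """Index of the prototype with the smallest Euclidean distance D_j(X)."""
    D = np.linalg.norm(m - np.asarray(X, dtype=float), axis=1)
    return int(np.argmin(D))


def classify_discriminant(X, m):
    """Index of the largest d_j(X) = X.m_j - 0.5 * m_j.m_j (assumed textbook form)."""
    X = np.asarray(X, dtype=float)
    d = m @ X - 0.5 * np.sum(m * m, axis=1)
    return int(np.argmax(d))


# Two hypothetical typical conditions described by two parameters each.
m_demo = prototypes([[[0, 0], [0, 2]], [[4, 4], [6, 4]]])
label_a = classify_min_distance([1, 1], m_demo)
label_b = classify_discriminant([4, 3], m_demo)
```

The two functions always agree, because $D_j(X)^2 = X^{T}X - 2\,d_j(X)$ and the term $X^{T}X$ is the same for every class; the discriminant form simply avoids computing a square root for each prototype.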

Experimental Verification
Taking the actual production process of hydraulic cylinders of more than 5 m as an example, the data in the information traceability system established over the entire machining process are divided into working conditions to form typical working conditions. At the same time, the entire process data of 30 hydraulic cylinders were tracked for quality control.
During this period, there was no obvious abnormality in the external ambient temperature or failure of the production equipment. Through the quality control methods proposed in this paper, quality control of the key processes is carried out to verify the feasibility of the algorithm.

4.1. Division of Typical Working Conditions.
The cylinder barrel and the piston rod are the key parts of the hydraulic cylinder and the key to its overall performance. The spiral pattern and straightness are the key quality control points that affect the processing quality of the cylinder and piston.
Many parameters, such as those of materials, processing, and equipment, are involved; the correlation between the parameters is relatively complex, and it is not easy to predict and optimize the processing quality during the manufacturing process. Therefore, the PCA in Section 3.2 is used for dimensionality reduction. The improved K-means algorithm is then used to divide the spiral pattern and straightness data in the database, and five cluster centers are selected. The results of the division of processing conditions are shown in Figure 4. From the clustering results, the range of the quality control points for each typical working condition is obtained, as shown in Table 1.

4.2. Pattern Recognition and Optimization.
In the actual production of hydraulic cylinders, the straightness and spiral pattern are measured with detection equipment such as sensors, and a large amount of data is accumulated. The Markov model can be used to predict the working conditions based on the above typical working conditions. If the working conditions can meet the quality requirements, the next process can be carried out normally; otherwise, the quality parameters should be optimized to reduce the rate of repair and scrapping.
Taking each typical working condition in Table 1 as a state, the state transition matrix $P$ can be obtained according to formula (8). The measured straightness data are initially in the fourth working condition, so the initial state probability vector is $\pi(0) = (0, 0, 0, 1, 0)$. According to the Markov prediction model, the probability of occurrence of each working condition over the next 30 steps is predicted. The first working condition is taken as an example, as shown in Figure 5.
The stabilized prediction of the Markov model is not related to the initial state. It can be seen from Figure 5 that the probability of occurrence of the working condition has completely stabilized at a fixed value by the 21st step. The probability vector of the 21st prediction is
$$\pi(21) = \pi(0) P^{21} = (0.0504,\ 0.1765,\ 0.2017,\ 0.4034,\ 0.1681) \quad (17)$$
The Markov model can predict the probability of occurrence of typical working conditions, which provides a basis for process improvement and theoretical analysis at production and processing sites. To verify the feasibility of quality prediction and of pattern recognition and optimization, the entire production and processing process of 30 hydraulic cylinders was tracked, and the data of the two key quality control points, spiral pattern and straightness, were extracted, as shown in Table 2. Figure 6 compares the predicted and actual probabilities of the working condition prediction model. The prediction error rate of the Markov working condition prediction model is 9.40%; the difference is small and within the acceptable range.
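The stabilization of the state probabilities noted for Figure 5 can be illustrated with a small example. The sketch below uses a hypothetical two-state transition matrix, not the matrix estimated in this study, and assumes a chain in which every state can be reached from every other state; the function name `steps_to_stabilize` and the tolerance are also hypothetical.

```python
import numpy as np


def steps_to_stabilize(pi0, P, tol=1e-4, max_steps=1000):
    """Iterate pi <- pi P until successive vectors differ by less than tol."""
    pi = np.asarray(pi0, dtype=float)
    for n in range(1, max_steps + 1):
        nxt = pi @ P
        if np.max(np.abs(nxt - pi)) < tol:
            return n, nxt
        pi = nxt
    raise RuntimeError("state probabilities did not stabilize")


# Hypothetical transition matrix; its stationary vector is (2/7, 5/7).
P_demo = np.array([[0.5, 0.5],
                   [0.2, 0.8]])
steps_a, limit_a = steps_to_stabilize([1.0, 0.0], P_demo)
steps_b, limit_b = steps_to_stabilize([0.0, 1.0], P_demo)
```

Both starting vectors approach the same limit, which is why the long-run prediction does not depend on the initial state; only the number of steps needed to get there can differ.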
The Markov prediction method gives the probability that any working condition may appear in future processing, and the working condition corresponding to the straightness and spiral pattern at the next moment can also be obtained from the one-step transition matrix. Because the straightness or spiral pattern of some workpieces was out of tolerance, the relevant parameters among the factors influencing straightness and spiral pattern were adjusted during hydraulic cylinder processing. According to the PCA results, only the parameters determined by the PCA need to be adjusted to regulate straightness. Pattern recognition technology is used to recognize the working conditions after processing. The distribution of typical processing conditions of the 30 sets of data is shown in Figure 7. According to this distribution, the quality parameters of the processing procedures are optimized according to the solution in Table 1. This study uses process capability analysis to compare the process before and after quality control. Table 3 shows the results of the process capability analysis before and after quality control.
Comparing the process capability analysis results before and after quality control shows that, before quality control, the key index of the spiral pattern is 0.3-0.5 mm, the key index of straightness is 0.5-1.0 mm, and the process capability index $C_{pk}$ of the processing equipment is about 0.6, which is at the level of 3σ. After quality control, the key index of the spiral pattern is 0.2-0.4 mm, the key index of straightness is 0.3-0.9 mm, and the process capability index $C_{pk}$ of the processing equipment is about 1, which is at the level of 4σ; the processing quality has been significantly improved.
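For reference, the process capability index can be computed from measured samples as sketched below. This uses the usual definition $C_{pk} = \min\left(\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right)$ with the sample mean and sample standard deviation; the function name `cpk`, the specification limits, and the sample values are hypothetical and are not the data behind Table 3.

```python
import numpy as np


def cpk(samples, lsl=None, usl=None):
    """Process capability index for one-sided or two-sided specification limits."""
    x = np.asarray(samples, dtype=float)
    mu = x.mean()
    sigma = x.std(ddof=1)  # sample standard deviation
    candidates = []
    if usl is not None:
        candidates.append((usl - mu) / (3 * sigma))
    if lsl is not None:
        candidates.append((mu - lsl) / (3 * sigma))
    if not candidates:
        raise ValueError("at least one specification limit is required")
    return min(candidates)


# Hypothetical measurements with symmetric limits around the mean.
demo_samples = [1.0, 2.0, 3.0, 4.0, 5.0]
cpk_two_sided = cpk(demo_samples, lsl=-3.0, usl=9.0)
cpk_upper_only = cpk(demo_samples, usl=9.0)
```

A characteristic such as straightness, which has only an upper tolerance, would use the one-sided form with `usl` alone.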
To further verify the effectiveness of the quality control, the hydraulic cylinders produced after quality control were tracked in the market, and the market fault feedback was counted. The feedback rate of hydraulic cylinder leakage and noise caused by deviation of straightness and spiral pattern fell from 3.85% in 2018 to 2.4% in 2019 and then to 1.8% in 2020. As shown in Table 4, the market feedback rate continued to decline steadily.

Conclusions
This research focuses on the quality control of discrete manufacturing of large and superlarge equipment and carries out work including the establishment of an information traceability system, the construction of typical working conditions, pattern recognition, quality prediction, and parameter optimization. The following conclusions are drawn:
(1) The processing parameters that affect quality are filtered through the information tracing system and the PCA method, and the improved K-means algorithm is used to divide the quality data in the database to establish typical working conditions.
(2) The probability of occurrence of each working condition can be predicted through the Markov prediction model. Comparing the predicted results with the measured values gives an error rate of only 9.40%, which is within the acceptable range and meets actual production and processing requirements.
(3) The effectiveness of quality control has been verified by process capability analysis: the process capability index $C_{pk}$ increased from about 0.6 to about 1, a significant improvement.
(4) This research provides technical support for the optimization of discrete manufacturing quality control parameters and for the high-precision machining of large and superlarge equipment.

$N_j$: The number of parameters of a typical processing condition.
$D_j(X)$: The distance between the working condition and the average value of this type of mode.