Robust Locally Weighted Regression for Profile Measurement of Magnesium Alloy Tube in Hot Bending Process

Section flattening often occurs during the hot bending of magnesium alloy tubes with large curvature. To control the forming quality of the tube, the section profile must be measured online. In this paper, a laser vision system is used to measure the profile of a magnesium alloy tube. Owing to the environment and the surface quality of the tube, the profile data contain clearly isolated outliers, which seriously affect the accuracy and precision of the measurement. An outlier identification algorithm based on robust locally weighted regression and the PauTa criterion is proposed. The algorithm is used to identify the typical isolated outliers that arise in the measurement process, and its identification ability is discussed and compared with that of the moving mean identifier and the Hampel identifier. Subsequently, ellipse fitting of the profile data is carried out to obtain the fitted ellipse parameters and the fitting precision of the curved section, and the fitting results before and after outlier elimination are compared. The experiment shows that the outlier identification method based on robust locally weighted regression and the PauTa criterion can effectively identify outliers in profile data, especially spot outliers. The algorithm is a robust, accurate, and efficient outlier identification method that can effectively improve the accuracy of laser profile measurement of the tube section and is of great significance for the quality control of magnesium alloy tubes.


Introduction
Magnesium alloy has the advantages of low density, high specific strength, and high specific stiffness, so it is gradually receiving attention from various industries [1][2][3][4]. In particular, various types of magnesium alloy tubes are widely used in aviation, aerospace, transportation, and other fields [5][6][7]. However, various defects often occur in the hot bending of magnesium alloy tubes with large curvature, such as section flattening, excessive thinning, and wrinkling or fracture [8][9][10][11]. Section flattening is an unavoidable problem in magnesium alloy tubes with large curvature. As shown in Figure 1, the section flattening of the magnesium alloy tube is mainly due to the action of the bending moments $M_1$ and $M_2$, the compressive stress $F_c$ on the inner side, and the tensile stress $F_t$ on the outer side; the pipe bends because of the uneven force on the two sides. As a result, the section deforms and becomes approximately elliptical. For the same specification, the smaller the bending radius, the larger $F_t$ and $F_c$ and the more pronounced the section flattening. In bending without a mandrel, the flattening is more serious, which affects the forming quality of the pipe. In order to control the forming quality of the tube, the section profile must be measured online, and the flattening of the magnesium alloy section must be detected in real time.
Pipe profile measurement generally uses contact or noncontact methods [12][13][14][15]. Machine vision is an important noncontact method. Compared with traditional manual contact measurement, machine vision measurement offers high accuracy and speed and causes no damage [16][17][18][19]. In this study, a laser vision system based on line-structured light was used to measure the flattened section of a magnesium alloy tube produced by large-curvature hot bending and to obtain accurate profile data of the tube section. Online vision measurement of tube bending is in strong demand as a means of improving dimensional accuracy. However, the quality of the profile data deteriorates sharply owing to the environment and the surface quality of the pipe. In particular, many isolated outliers appear in the measured data. Outliers are generally observations that differ significantly from most other observations, that is, measured data that do not obey the statistical distribution of the rest of the data [20]. In this experiment, the outliers appear in isolation as dots or points and are not necessarily related to the quality of the neighboring data. Therefore, they are also called isolated outliers. Outliers can arise for a variety of reasons, such as sensor noise, channel interference, and human factors [12,21]. These outliers distort the data, leading to miscalculated model parameters and wrong analysis results. Therefore, it is necessary to identify and eliminate the outliers in the laser measurement process to improve the precision and accuracy of the measurement.
Commonly used identification methods for outliers include the Nair, Grubbs, and Dixon tests [22,23]. These methods are difficult to apply in laser profile measurement because of their restrictions on data distribution and sample size. This paper presents an outlier identification method based on robust locally weighted regression (RLWR), which is applied to laser profile detection. RLWR is developed from locally weighted regression (LWR) and belongs to nonparametric estimation. Nonparametric estimation is an important research direction of modern statistical analysis. It can adapt to complex nonlinear changes without assuming a specific form for the population distribution or the error distribution and without requiring an explicit data model, which makes it more flexible, robust, and widely applicable than parametric estimation [24,25].
RLWR is a robust fitting process that integrates local polynomial estimation and locally weighted regression with excellent smoothing performance.
This method was first proposed by Cleveland [26], further elaborated in [27,28], and subsequently improved by Jacoby [29] and Loader [30]. It has gradually been applied in different fields of scientific research and engineering. Ma et al. [31] used RLWR to reduce the impact of high-frequency noise on superresolution enhancement of multiangle remote sensing imagery. Leonor et al. [32] minimized the influence of the inhomogeneity effect on the tree reradiation pattern by using RLWR. Chen et al. [33] proposed an algorithm based on RLWR and robust z-scores for the construction of pit-free canopy height models. Nurunnabi et al. [34,35] and Liu et al. [36] used RLWR techniques to study ground filtering of 3D point cloud data. Yu et al. [37] applied the method to the smoothing of combustion kinetics data of pine sawdust biochar. This study relies on the robust smoothing performance of RLWR, which is used to smooth the laser measurement data and identify the isolated outliers. Profile fitting is then carried out to verify the effectiveness of the algorithm in profile inspection.
This paper is organized as follows. Section 2 introduces the laser vision measurement system and its basic principles and analyzes the profile data of the magnesium alloy tube, pointing out the typical outliers in the data. In Section 3, the moving mean algorithm, the Hampel algorithm, and the RLWR algorithm are each used to identify the outliers, with a focus on the recognition performance of RLWR combined with the PauTa criterion, the median criterion, and the quartile criterion. In Section 4, ellipse fitting of the profile data is performed after RLWR-based outlier identification, and the ellipse parameters and fitting error are obtained and compared with those of the original data. Lastly, a conclusion is given in Section 5.

Laser Vision Measurement System.
The laser vision measurement system adopted in this experiment mainly consists of a line-structured light sensor, an image processing unit, and a control unit, as shown in Figure 2. The system works as follows: the laser source in the sensor projects line-structured light onto the measured object. The reflected light is captured by the camera in the sensor. After image processing, the profile data are recorded and displayed by an industrial PC. The measuring platform, which is controlled by the motion control system, can move along the axial direction of the pipe, so that the profile of each section of the pipe can be measured.
Laser triangulation is the basic principle of line-structured light vision measurement. The line-structured light projected by the laser source forms a scattered light band on the pipe surface, and the scattered light is imaged on the CCD array. The distance and coordinate data of the measured point can be obtained through triangular geometry [38,39].
As shown in Figure 3, the point $p(x, y, z)$ is a measurement point on the pipe surface, the point $p'(u, v)$ is its image point in the camera, $b$ is the distance between the center of the light source and the center of the camera lens, and $\theta$ is the angle between the X axis and the construction line formed by the measured point and the center of the light source. The precise coordinates of $p(x, y, z)$ on the measured profile can be obtained from the spatial geometry in the figure.

Typical Profile Outliers.
Consider the profile data of each section obtained by the measuring system. There are two typical kinds of isolated outliers in the measured profile, namely, spot outliers and point outliers. The spot outliers, or speckle outliers, are composed of multiple point outliers, as shown on the right side of Figure 4. The double-point isolated outliers are shown in the middle of Figure 5.
To facilitate analysis, the double-point outliers in Figure 5 are superimposed on the profile in Figure 4, as shown in Figure 6. Subsequently, the recognition ability of the different identification algorithms is investigated for the two types of outliers.

Mathematical Problems in Engineering 3
The moving mean identifier, or movmean identifier, is the simplest way to identify isolated outliers [40,41]. The basic definition is as follows. For the data sequence $x_1, x_2, x_3, \ldots, x_n$ and a moving window of length $wl$, a point $x_i$ is judged to be an outlier when
$$|x_i - \mu_i| > 3\sigma_i,$$
where $\mu_i$ is the local mean and $\sigma_i$ is the local standard deviation (LSD) within the moving window. In other words, the moving mean identifier applies the 3σ criterion inside a data window of length $wl$: when the difference between the measured value and the local mean is greater than three times the local standard deviation, the value is considered to be an outlier. The moving mean identifier is used to identify the outliers in Figure 6, and the recognition results are shown in Figures 7 and 8. The moving window length is gradually increased. When the window length increases to 19, only one point outlier is identified, as shown in Figure 7. When the window length is increased to 25, the threshold range narrows in the middle of the profile so that the double-point outliers in the middle are completely identified (see Figure 8). However, the spot outliers on the right are never identified, regardless of the length of the moving window.
It can be seen from Figures 7 and 8 that the identification threshold at the location of the outliers is greatly increased by the presence of outliers in the middle of the profile. As the window length increases, the threshold curve in the middle of the profile becomes gradually smoother and the threshold range gradually narrows. However, the upper and lower threshold ranges on both sides of the profile increase significantly. This indicates that the ability to identify outliers on both sides of the profile decreases as the window width increases.
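The moving mean rule above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the implementation used in the experiment: the function name `movmean_outliers`, the centred window that is truncated at the ends of the sequence, and the use of the sample standard deviation (`ddof=1`) are all assumptions.

```python
import numpy as np

def movmean_outliers(x, wl, n_sigma=3.0):
    """Flag x[i] when |x[i] - local mean| > n_sigma * local standard deviation."""
    x = np.asarray(x, dtype=float)
    half = wl // 2
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        # Centred window of up to wl points, truncated at the ends (assumption).
        window = x[max(0, i - half): i + half + 1]
        mu = window.mean()
        sigma = window.std(ddof=1)  # sample standard deviation (assumption)
        flags[i] = abs(x[i] - mu) > n_sigma * sigma
    return flags

# Synthetic profile with a single isolated spike at index 30.
x = 0.01 * np.sin(np.linspace(0.0, 3.0, 60))
x[30] += 1.0
print(np.flatnonzero(movmean_outliers(x, 21)))
```

On this synthetic profile the spike is flagged with a 21-point window but not with a 5-point window, because a single outlier inflates the local standard deviation of a short window. That is in line with the observation above that the window must be lengthened before the point outliers are caught.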

Hampel Identifier Recognizes Isolated Outliers.
The Hampel identifier is a median-based identification method that uses the median and the median absolute deviation (MAD) as robust estimates of the location and spread of the data, and it therefore has good robustness [42][43][44][45].
The Hampel identifier is defined as follows: for the data sequence $x_1, x_2, x_3, \ldots, x_n$, the number of neighbors on either side of $x_i$ is $k$; then, the moving window length is $2k + 1$, the local median is $m_i$, and the local median absolute deviation within the window is $\mathrm{MAD}_l$.
When the difference between the measured value and the local median is greater than $t$ times $\mathrm{MAD}_l$, the measured value is considered to be an outlier, as shown in equation (4):
$$|x_i - m_i| > t \cdot \mathrm{MAD}_l. \qquad (4)$$
The Hampel identifier is used to recognize the outliers in Figure 6. The identification effect is observed by changing the length of the moving window with $t = 3$. As shown in Figure 9, when the window length is 5, the identification threshold range is relatively narrow. The Hampel identifier is able to identify the double-point outlier in the middle of the profile, but it is unable to identify the spot outliers on the right side. At the same time, the identification threshold fluctuates significantly around the spot outliers, and some normal points are misidentified as outliers. In addition, because of the discontinuous data on both sides of the profile, the identification threshold fluctuates greatly there and the endpoint data are identified as outliers.
When the window length is increased to 25, as shown in Figure 10, the upper and lower identification thresholds on both sides of the profile increase significantly and the central double-point outliers are effectively identified, but the spot outliers on the right side of the profile remain unrecognized. It is worth noting that the misidentification seen in Figure 9 no longer appears. Meanwhile, several discontinuous data points at the right end of the profile are identified as outliers.
If the moving window length continues to increase, the identification threshold on both sides of the profile will increase accordingly. Although the double-point outliers can be identified, the spot outliers are still unrecognizable.
Compared with the moving mean identifier, the Hampel identifier significantly reduces the threshold range and its fluctuation and can effectively identify point outliers, but it is still unable to identify spot outliers. In addition, large gaps in the profile data increase the difference between the data and the local median, which causes large fluctuations of the threshold and impairs the ability to identify outliers.
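A minimal Python sketch of the Hampel rule is given below for illustration. The function name `hampel_outliers`, the truncation of the window at the ends of the sequence, and the scale factor 1.4826 applied to the MAD are assumptions; the scaling actually used in the experiment is not stated in this section.

```python
import numpy as np

def hampel_outliers(x, k, t=3.0, scale=1.4826):
    """Flag x[i] when |x[i] - local median| > t * scaled local MAD.

    k is the number of neighbours on either side, so the window holds 2k + 1
    points (fewer at the ends, where it is truncated).
    """
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        window = x[max(0, i - k): i + k + 1]
        m = np.median(window)
        mad = scale * np.median(np.abs(window - m))  # scale factor is an assumption
        flags[i] = abs(x[i] - m) > t * mad
    return flags

# A linear ramp with one isolated point outlier.
x_point = np.linspace(0.0, 1.0, 41)
x_point[20] += 1.0
print(np.flatnonzero(hampel_outliers(x_point, k=2)))

# The same ramp with a cluster ("spot") of three consecutive outliers.
x_spot = np.linspace(0.0, 1.0, 41)
x_spot[20:23] += 1.0
print(hampel_outliers(x_spot, k=2)[21])
```

In the second case the cluster occupies most of the 5-point window, so the local median itself sits inside the cluster and the middle point of the cluster is not flagged. This is one plausible reading of why the spot outliers escape the Hampel identifier in the experiment.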

Recognition of Isolated Outliers Based on RLWR.
In this section, the RLWR algorithm is combined with the PauTa criterion, the median criterion, and the quartile criterion in turn to identify the isolated outliers. An appropriate smoothing window length is chosen, and the identification effect of each combination is observed.

Identification of Outliers Based on RLWR and PauTa Criterion.
The RLWR smoothing algorithm is combined with the PauTa criterion, i.e., the 3σ criterion, to identify the laser profile outliers; this algorithm is referred to as the RLWRP identifier. The basic approach is first to smooth the profile data with the RLWR algorithm, then to calculate the residual between the smoothed data and the original data, and finally to apply the PauTa criterion to the residuals to identify outliers. The algorithm of the RLWRP identifier is as follows. The measured data sequence is $(x_i, y_i)$, $i = 1, 2, \ldots, n$. The data model is assumed to be
$$y_i = g(x_i) + \varepsilon_i, \qquad (5)$$
where $g(x_i)$ is a smooth function of $x_i$ and the $\varepsilon_i$ are independent and normally distributed with zero mean and constant variance. Set the smoothing coefficient $f$, where $0 < f \le 1$, and round $f \cdot n$ to get the data width $r = [f \cdot n]$. Taking each observation point $x_i$ as the center, select an appropriate $f$ to determine the smoothing window length $wl$, with the window spanning $x_i \pm r$.
Subsequently, a weight function is selected for the locally weighted regression (LWR). LWR typically uses the tricube weight function $W$ for the weighted least squares fit, defined for each $x_j$ in the smoothing window as
$$W(x_j) = \left(1 - \left(\frac{d(x_i, x_j)}{\max_{x_k \in N(x_i)} d(x_i, x_k)}\right)^{3}\right)^{3}, \quad x_j \in N(x_i),$$
where $N(x_i)$ is the local neighborhood of points in the smoothing window that are closest to $x_i$ and $d(x_i, x_j)$ is the distance between $x_i$ and $x_j$. The weight is largest for the point closest to $x_i$ and falls to zero for the point farthest from $x_i$ in the smoothing window.
The parameters are then estimated by the weighted least squares method. The parameter estimates of equation (5) are the values that minimize
$$\sum_{x_j \in N(x_i)} W(x_j)\,\bigl(y_j - g(x_j)\bigr)^2 .$$
The coefficients from each local neighborhood are used to estimate the fitted value $\hat{y}_i = g(x_i)$ at $x_i$.
For the robust step, RLWR generally uses the bisquare weight function $Q(z)$:
$$Q(z) = \begin{cases} (1 - z^2)^2, & |z| < 1, \\ 0, & |z| \ge 1, \end{cases}$$
evaluated at $z_i = e_i/(6 \cdot s)$, where $e_i = y_i - \hat{y}_i$ is the fitting residual and $s = \mathrm{median}(|e_i|)$ is the median of the absolute residuals.
Replace $W(x_j)$ with the product $Q(z_j) \cdot W(x_j)$ as the new weight, and estimate a new set of RLWR coefficients by minimizing the weighted error sum of squares
$$\sum_{x_j \in N(x_i)} Q(z_j)\,W(x_j)\,\bigl(y_j - g(x_j)\bigr)^2 .$$
The new RLWR fitted value $\hat{y}_i$ is calculated by the weighted least squares method. The robust reweighting steps are repeated, and the final robust locally weighted fitted values are obtained.
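The smoothing procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiment: the function names, the choice of a local linear fit (`degree=1`), the nearest-neighbour window of `span` points, and the default of five robust iterations are all hypothetical choices.

```python
import numpy as np

def tricube(u):
    """Tricube weight (1 - |u|^3)^3 for scaled distances u in [0, 1]."""
    u = np.clip(np.abs(u), 0.0, 1.0)
    return (1.0 - u**3) ** 3

def bisquare(z):
    """Bisquare weight (1 - z^2)^2 for |z| < 1, zero otherwise."""
    z = np.abs(z)
    return np.where(z < 1.0, (1.0 - z**2) ** 2, 0.0)

def rlwr_smooth(x, y, span=11, degree=1, n_robust=5):
    """Robust locally weighted regression evaluated at every x[i]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    robust_w = np.ones(n)           # first pass is plain LWR
    fitted = np.empty(n)
    for _ in range(n_robust + 1):
        for i in range(n):
            dist = np.abs(x - x[i])
            idx = np.argsort(dist)[:span]      # the span nearest neighbours
            h = dist[idx].max()                # farthest neighbour gets weight 0
            w = tricube(dist[idx] / h) * robust_w[idx]
            # Weighted least squares for a local polynomial centred at x[i].
            A = np.vander(x[idx] - x[i], degree + 1, increasing=True)
            sw = np.sqrt(w)
            coef, *_ = np.linalg.lstsq(A * sw[:, None], y[idx] * sw, rcond=None)
            fitted[i] = coef[0]                # value of the local fit at x[i]
        resid = y - fitted
        s = np.median(np.abs(resid))
        if s == 0.0:
            break
        robust_w = bisquare(resid / (6.0 * s))
    return fitted

# Noisy parabola with one large spike at index 50.
x = np.linspace(0.0, 1.0, 101)
y = x**2 + 0.01 * np.sin(12.9898 * np.arange(101))
y[50] += 1.0
smooth = rlwr_smooth(x, y, span=21)
print(smooth[50])
```

With `n_robust=0` the function reduces to plain LWR, in which the spike pulls the fitted curve toward itself; the robust iterations drive the weight of the spike to zero so that the curve follows the underlying profile.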
Next, the outliers are identified according to the 3σ criterion.
The residual $g_i$ between the original measurement and the smoothing value obtained by the RLWR algorithm is calculated, and the mean of the residuals is $\mu = \frac{1}{n}\sum_{i=1}^{n} g_i$. The standard deviation $\sigma$ is then obtained:
$$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (g_i - \mu)^2}.$$
When the difference between a residual and the mean is greater than three times the standard deviation, the point is considered to be an outlier:
$$|g_i - \mu| > 3\sigma.$$
The RLWRP identifier is used to identify the outliers in Figure 6, and its identification ability is observed under different smoothing windows. The smoothing window length is increased gradually from 5, and the method shows good recognition when the length reaches 11.
As shown in Figure 11, the blue dots are the original data and the red dash-dotted line is the RLWR smoothing curve. The RLWR smoothing curve retains the characteristics of the original profile without the risk of excessive smoothing. In Figure 12, the trend of the original data has been removed by taking the residual between the smoothed data and the original data, and the outliers in the residual are identified by the 3σ criterion, as shown by the boxes. The identification result is also plotted in Figure 11. It can be seen from the figure that the method successfully identifies both the double-point outlier and the spot outlier. Meanwhile, a few discontinuous data points at the right end of the profile are recognized as outliers.
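The residual-based 3σ test can be sketched as below. The function name `pauta_outliers` is hypothetical, the smoothed curve is taken as given (here a synthetic stand-in for the RLWR output), and the use of the sample standard deviation is an assumption.

```python
import numpy as np

def pauta_outliers(y, smoothed, n_sigma=3.0):
    """Apply the 3-sigma (PauTa) criterion to the residuals y - smoothed."""
    g = np.asarray(y, dtype=float) - np.asarray(smoothed, dtype=float)
    mu = g.mean()
    sigma = g.std(ddof=1)  # sample standard deviation (assumption)
    return np.abs(g - mu) > n_sigma * sigma

# Synthetic stand-in: a smooth trend plus small residuals and one large one.
smoothed = np.linspace(0.0, 1.0, 100)
residual = 0.01 * np.sin(np.arange(100))
residual[40] = 1.0
y = smoothed + residual
print(np.flatnonzero(pauta_outliers(y, smoothed)))
```

Because the mean and standard deviation are computed over the whole residual sequence after the trend has been removed, the threshold is a single band rather than a locally varying one, which is the main difference from the moving mean identifier.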

Identification of Outliers Based on RLWR and Median Criterion.
This section combines the RLWR smoothing algorithm with the median criterion to identify outliers; the result can be called the RLWRM identifier. The algorithm is as follows.
From the residuals $g_i$ obtained in the previous section, the median of the residual sequence is calculated, i.e., $m_g = \mathrm{median}(g_i)$.
The scaled median absolute deviation (MAD) is defined as
$$\mathrm{MAD} = k \cdot \mathrm{median}(|g_i - m_g|),$$
where $k = 1/(\sqrt{2}\,\mathrm{erfc}^{-1}(1/2)) \approx 1.4826$. When a residual differs from the median by more than three times the MAD, it is considered to be an outlier:
$$|g_i - m_g| > 3\,\mathrm{MAD}.$$
The RLWRM identifier is used to identify the outliers in Figure 6. The RLWR smoothing data for a smoothing window length of 11 are used. The identification effect is shown in Figures 13 and 14. In Figure 13, the blue points are the residuals between the RLWR smoothing data and the original data, and the red boxes are the outliers identified by the median criterion. The identification result and the smoothed values are plotted in Figure 14 for observation. It can be seen from the figure that although the RLWRM identifier recognizes the double-point outliers and the spot outliers, it also misidentifies many normal points as isolated outliers.
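The median criterion on the residuals can be sketched as follows. The function name `mad_outliers` is hypothetical. The scale factor is computed from the standard normal quantile function in the Python standard library, using the identity $\sqrt{2}\,\mathrm{erfc}^{-1}(1/2) = \Phi^{-1}(0.75)$, rather than from an inverse complementary error function.

```python
import numpy as np
from statistics import NormalDist

# k = 1 / (sqrt(2) * erfcinv(1/2)) = 1 / Phi^{-1}(0.75), approximately 1.4826
K_MAD = 1.0 / NormalDist().inv_cdf(0.75)

def mad_outliers(g, t=3.0):
    """Flag residuals that lie more than t scaled MADs from their median."""
    g = np.asarray(g, dtype=float)
    m_g = np.median(g)
    mad = K_MAD * np.median(np.abs(g - m_g))
    return np.abs(g - m_g) > t * mad

# Small residuals plus one large one at index 40.
g = 0.01 * np.sin(np.arange(100))
g[40] = 1.0
print(round(K_MAD, 4), np.flatnonzero(mad_outliers(g)))
```

Unlike the standard deviation, the MAD is barely inflated by the outliers themselves, so the resulting threshold band is narrower. On real residuals with heavier tails than this synthetic example, that narrower band is a plausible cause of the extra misidentifications reported above.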

Identification of Outliers Based on RLWR and Quartile Criterion.
This section combines the RLWR smoothing algorithm with the quartile criterion to identify outliers; the result can be called the RLWRQ identifier. The quartile criterion is a relatively robust identification method. The algorithm divides the sorted data into quarters, with $Q_1$, $Q_2$, and $Q_3$ as the break points: $Q_1$ is the lower quartile (25th percentile), $Q_2$ is the median (50th percentile), and $Q_3$ is the upper quartile (75th percentile). The interquartile range (IQR) is introduced here as a statistic for checking outliers, i.e., $\mathrm{IQR} = Q_3 - Q_1$.
Outliers are defined as elements more than 1.5 IQR above $Q_3$ or below $Q_1$. The RLWRQ identifier is used to identify the outliers in Figure 6. As before, the smoothing window length of RLWR is 11. The identification results are shown in Figures 15 and 16. It can be seen from the figures that the identification effect of the RLWRQ identifier is similar to that of the RLWRM identifier.
This algorithm also misidentifies outliers and flags even more normal data as outliers. If these values were removed, the profile would no longer be faithfully represented, which would introduce new errors and affect the measurement accuracy.
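For completeness, the quartile rule can be sketched in the same style. The function name `iqr_outliers` is hypothetical, and `np.percentile` with its default linear interpolation is only one of several conventions for computing quartiles, so the fences may differ slightly from those of other tools.

```python
import numpy as np

def iqr_outliers(g, factor=1.5):
    """Flag elements more than factor * IQR above Q3 or below Q1."""
    g = np.asarray(g, dtype=float)
    q1, q3 = np.percentile(g, [25, 75])  # default linear interpolation
    iqr = q3 - q1
    return (g < q1 - factor * iqr) | (g > q3 + factor * iqr)

# Small residuals plus one large one at index 40.
g = 0.01 * np.sin(np.arange(100))
g[40] = 1.0
print(np.flatnonzero(iqr_outliers(g)))
```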

Profile Fitting and Error Analysis
According to the previous section, the RLWRP identifier gives the better identification results for the isolated outliers. The identification results in Figure 13 are used to remove outliers, and the ellipse fitting experiment is performed by the least squares method. The least squares method is one of the most important methods of data fitting; it is simple, effective, and widely applicable. Therefore, this method is selected for the ellipse fitting.
This paper uses the algebraic least squares method for ellipse fitting, which determines the ellipse parameters by minimizing the sum of squared algebraic distances from the data points to the fitted ellipse. The elliptic algebraic equation is expressed as
$$\alpha_1 x^2 + \alpha_2 xy + \alpha_3 y^2 + \alpha_4 x + \alpha_5 y + \alpha_6 = 0.$$
According to the principle of the least squares method, the objective function to be minimized is
$$F(\alpha) = \sum_{i=1}^{n} \left(\alpha_1 x_i^2 + \alpha_2 x_i y_i + \alpha_3 y_i^2 + \alpha_4 x_i + \alpha_5 y_i + \alpha_6\right)^2 .$$
To minimize $F(\alpha)$, the extreme value principle requires
$$\frac{\partial F}{\partial \alpha_j} = 0, \quad j = 1, \ldots, 6.$$
Thus, a system of linear equations is obtained. By solving the linear equations in combination with the constraints, the values of the equation coefficients are obtained. Finally, the ellipse equation is formed and the fitted ellipse is drawn. The fitting results before and after outlier elimination are compared in Figure 17, and the fitted ellipse parameters are shown in Table 1.
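One way to carry out an algebraic least squares ellipse fit is sketched below. It is an illustration under stated assumptions, not the procedure of this paper: the constraint used here is unit norm of the coefficient vector, solved via the singular value decomposition, whereas the constraint used in the experiment is not specified in this section. The function name `fit_ellipse_algebraic` is hypothetical.

```python
import numpy as np

def fit_ellipse_algebraic(x, y):
    """Fit A x^2 + B xy + C y^2 + D x + E y + F = 0 in the algebraic sense.

    Returns (center, semi_major, semi_minor). The coefficient vector is
    normalised to unit length (an assumed constraint). The result is only
    meaningful when the fitted conic is an ellipse.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    # The right singular vector of the smallest singular value minimises
    # the sum of squared algebraic distances subject to unit norm.
    _, _, vt = np.linalg.svd(design, full_matrices=False)
    A, B, C, D, E, F = vt[-1]
    M = np.array([[A, B / 2.0], [B / 2.0, C]])
    center = np.linalg.solve(2.0 * M, [-D, -E])
    # Value of the conic at the centre; (p - c)^T M (p - c) = -f0 on the curve.
    f0 = F + 0.5 * (D * center[0] + E * center[1])
    semi_axes = np.sqrt(-f0 / np.linalg.eigvalsh(M))
    return center, semi_axes.max(), semi_axes.min()

# Points on a rotated ellipse: centre (1, 2), semi-axes 3 and 2, tilt 0.3 rad.
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
phi = 0.3
xe = 1.0 + 3.0 * np.cos(t) * np.cos(phi) - 2.0 * np.sin(t) * np.sin(phi)
ye = 2.0 + 3.0 * np.cos(t) * np.sin(phi) + 2.0 * np.sin(t) * np.cos(phi)
center, a, b = fit_ellipse_algebraic(xe, ye)
print(center, a, b)
```

Because every data point enters the objective with equal weight, a handful of gross outliers can shift the fitted coefficients noticeably, which is why the outliers are removed before fitting.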
In Figure 17, the blue dots are the original data, the boxes are the outliers that are identified, the dotted line is the fitted ellipse of the original data, and the dash-dotted line is the fitting ellipse after removing the outliers.
Before removing the outliers, the center of the ellipse fitted to the original data is marked "*", the major axis diameter of the ellipse is 11.7197 mm, the eccentricity is 0.9263, and the ellipticity is 0.6233. The ellipticity is defined as
$$\text{ellipticity} = \frac{a - b}{a},$$
where $a$ and $b$ are the major and minor axis lengths of the fitted ellipse. After eliminating the outliers, the center of the fitted ellipse is marked "+", the major axis diameter is 15.8586 mm, the eccentricity is 0.6739, and the ellipticity is 0.2611.
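As a quick consistency check on the reported parameters, the ellipticity can be recomputed from the eccentricity. The sketch assumes that ellipticity means $(a - b)/a$ for major and minor axes $a$ and $b$, which together with $e = \sqrt{1 - (b/a)^2}$ gives $1 - \sqrt{1 - e^2}$. This definition is an assumption, chosen because it reproduces the reported values to about four decimal places.

```python
import math

def ellipticity_from_eccentricity(e):
    """Ellipticity (a - b) / a expressed through the eccentricity e."""
    return 1.0 - math.sqrt(1.0 - e**2)

# Reported eccentricities before and after outlier removal.
print(round(ellipticity_from_eccentricity(0.9263), 4))
print(round(ellipticity_from_eccentricity(0.6739), 4))
```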
At the same time, the profile was measured by a coordinate measuring machine (CMM).
The measurement data are shown in Table 1. The comparison shows that the ellipse fitted after outlier removal is close to the CMM measurement results.
From the above figure and table, it can be found that the fitting result of the original profile has deviated from the actual situation. After removing the outliers by using the RLWRP identifier, the fitted ellipse conforms to the measurement reality.
In addition, the fitting error before and after removing the outliers is analyzed, and the results are shown in Table 2. For the original profile, the sum of squares due to error (SSE) is 7.1938e-06 and the root mean square error (RMSE) is 1.5750e-04. After eliminating the outliers, the SSE and the RMSE of the fitted ellipse are reduced to 2.1157e-06 and 8.6617e-05, respectively. This shows that eliminating the outliers greatly reduces the fitting error and improves the fitting accuracy, so that the measurement results are more accurate.
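The relation between the two error measures can be illustrated with the reported numbers. The sketch assumes the usual definition RMSE = sqrt(SSE / ν), where ν is the number of points or the residual degrees of freedom; which of the two was used is not stated here, and the helper names are hypothetical.

```python
import math

def rmse_from_sse(sse, dof):
    """Root mean square error for a given SSE and divisor dof."""
    return math.sqrt(sse / dof)

def implied_dof(sse, rmse):
    """Divisor implied by a reported (SSE, RMSE) pair."""
    return sse / rmse**2

before = implied_dof(7.1938e-06, 1.5750e-04)
after = implied_dof(2.1157e-06, 8.6617e-05)
print(round(before), round(after))
```

Under this assumption the implied divisors differ by eight between the two fits, which would be consistent with eight data points having been removed as outliers, although the number of removed points is not stated explicitly.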

Conclusions
In this paper, the profile of magnesium alloy tubes with large curvature is measured by a laser vision system based on a line-structured laser. There are two typical kinds of outliers in the measurement, namely, point outliers and spot outliers. The moving mean method, the Hampel method, and the RLWR method were each used to identify the outliers in the profile data, and their ability to identify outliers was discussed for different window lengths. The experiment found that all of the above methods could identify the isolated point outliers, but neither the moving mean method nor the Hampel method could identify the isolated spot outliers. The RLWR method was studied in detail for the isolated outliers in combination with the PauTa criterion, the median criterion, and the quartile criterion. The research shows that the RLWR smoothing algorithm combined with the PauTa criterion, i.e., the RLWRP identifier, identifies the different types of outliers more accurately and with a lower misidentification rate.
Based on the outlier identification result of the RLWRP identifier, the profile fitting was carried out by the algebraic least squares method. The main parameters of the fitted ellipse were then obtained, and the fitting errors were calculated. Comparison of the fitting results before and after outlier processing shows that data contaminated by outliers lead to a large deviation of the fitted profile and wrong profile shape parameters. The RLWRP identifier is a robust, accurate, and efficient outlier identification method that can effectively deal with outliers in profile data, especially spot outliers. The algorithm is suitable for data cleaning in line-structured light measurement and can effectively improve the precision and accuracy of online profile measurement during the hot bending of magnesium alloy tubes.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.