Research on Feature Extraction of Indicator Card Data for Sucker-Rod Pump Working Condition Diagnosis

Three feature extraction methods for sucker-rod pump indicator card data have been studied, simulated, and compared in this paper; they are based on Fourier Descriptors (FD), Geometric Moment Vector (GMV), and Gray Level Matrix Statistics (GLMS), respectively. Numerical experiments show that the Fourier Descriptors algorithm requires less running time and less memory space, with possible loss of information due to a nonoptimal number of Fourier Descriptors; the Geometric Moment Vector algorithm is more time-consuming and requires more memory space; while the Gray Level Matrix Statistics algorithm provides low-dimension feature vectors with more time consumption and more memory space. Furthermore, the characteristic of rotational invariance, in both the Fourier Descriptors algorithm and the Geometric Moment Vector algorithm, may result in improper pattern recognition of indicator card data when used for sucker-rod pump working condition diagnosis.


Introduction
The sucker-rod pump system is the most widely used form of artificial lift for onshore oil well production [1][2][3]. Approximately 80% of the oil wells in the world, and 90% of those in China, are produced by sucker-rod pumps [4,5]. The maintenance and optimization of a sucker-rod pump system is a costly and time-consuming operation. The indicator card is the curve relating the load to the displacement of a sucker-rod pump over one complete pumping cycle, in which the x-axis represents displacement and the y-axis represents load [6]. The indicator card helps to analyze the down-hole working condition of sucker-rod pump wells [7]; it can be used to judge the operating condition of the well and provides reliable evidence for efficient and reasonable exploitation in oil well production. While the system is operating, the shape of the card indicates either normal operation or a fault. Given different kinds of real-time indicator card data, pattern recognition and fault diagnosis techniques are used to identify the different curve shapes, determine which kind of abnormal situation is present, and interpret why the fault occurs [8]. Therefore, correct and quick identification of the sucker-rod pump indicator card is essential to fault diagnosis of the down-hole working condition. The automatic fault diagnosis of sucker-rod pump working condition is a visual interpretation process [9]. The traditional methods of interpretation are not suitable for automatic fault diagnosis of down-hole conditions. Several signal processing methods, such as the artificial neural network (ANN) [10] and the fuzzy support vector machine (FSVM) [11], have been studied and applied to pattern recognition of indicator cards to improve the accuracy and efficiency of sucker-rod pump system fault diagnosis. Because sucker-rod pump working conditions involve many fault patterns but few fault samples, each of these approaches has its limits. In recent years, a method called biomimetic pattern recognition (BPR) [12] has been proposed, which is based on the principle of homology continuity and is suitable for recognizing and classifying objects with many pattern types and few samples.
This paper studies, compares, and selects suitable feature extraction methods for analyzing sucker-rod pump indicator card data before pattern classification. An optimal feature extraction algorithm for indicator card data, followed by a matching pattern recognition step, helps to locate the exact fault type of a working sucker-rod pump, which is of great significance in improving crude oil production as well as in preventing possible safety accidents. The existing feature extraction algorithms for indicator cards are based on Fourier Descriptors, Geometric Moment Vector, Gray Level Matrix Statistics, Area, and Difference Curve, respectively [13].
The area-based method can only determine a limited set of fault types with narrow-band indicator cards, such as the pump spraying fault [14]. The difference curve-based method cannot determine some highly dangerous pump failures, such as the stuck pump fault [15]. In this paper, we focus on three feature extraction algorithms for the indicator card, which are based on Fourier Descriptors (FD), Geometric Moment Vector (GMV), and Gray Level Matrix Statistics (GLMS), respectively. Using numerical simulation and analysis, the three algorithms are compared in terms of running time and memory space.

Feature Extraction Methods of Indicator Card Data

Method I: Algorithm Based on Fourier Descriptors.
Fourier transformation is a popular method for the reconstruction and classification of images. It generates a complete set of complex numbers, the Fourier Descriptors (FD), which represent the object shape in the frequency domain [16].
To reduce the computational complexity, the polygonal approximation method is used to remove redundant indicator card points (data). The procedure is as follows.
Firstly, according to a given threshold value, we traverse all the digital pixels (data) of the indicator card curve in order to choose the feature pixels (data) of the polygon, which are the points of maximum curvature on a curve segment of a certain length.
Secondly, we store those feature data in an array. The threshold value is set to 0.008, chosen by comparing different computation procedures. Taking the pump-on-touch fault as an example, 34 feature pixels (data) are extracted from the original 702 pixels. Figure 1 shows the reconstructed curve of the pump-on-touch fault compared with the original one.
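As an illustration of polygonal approximation, the following Python sketch reduces a dense curve to a few feature points. It is not the curvature-based rule described above: it swaps in the Ramer-Douglas-Peucker algorithm, a different and widely used approximation. The function name `simplify_curve` and the use of the threshold as a point-to-chord distance tolerance are assumptions made for this example only.

```python
import numpy as np

def simplify_curve(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline.

    points: sequence of (displacement, load) pairs, assumed already normalized.
    tolerance: largest allowed distance between a dropped point and the chord
               (hypothetical stand-in for the threshold used in the text).
    """
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    chord = end - start
    chord_len = np.hypot(*chord)
    rel = points[1:-1] - start
    if chord_len == 0.0:
        # Degenerate chord (closed piece): use distance to the start point.
        dist = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance of each interior point to the chord.
        dist = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / chord_len
    far = int(np.argmax(dist)) + 1  # index of the farthest interior point
    if dist[far - 1] > tolerance:
        left = simplify_curve(points[:far + 1], tolerance)
        right = simplify_curve(points[far:], tolerance)
        return np.vstack([left[:-1], right])  # drop the duplicated split point
    return np.vstack([start, end])
```

For a closed indicator card, one simple option is to cut the curve at its starting point and pass the resulting open polyline to `simplify_curve`; the number of retained points then depends on the tolerance in the same qualitative way as in the procedure above.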
It is reasonable to employ the Discrete Fourier Transform (DFT) to obtain the FD [17]. However, the sampling process may introduce errors.
After the feature extraction by polygonal approximation, the Fourier Transform (FT) of each polyline is employed to avoid the possible error caused by sampling and to improve the speed and accuracy of calculation.
The Fourier Descriptors are generated by formula (1) [18], in which L denotes the polygon's perimeter, (x_k, y_k) denotes the coordinates of vertex P_k, and l_k denotes the accumulated length of the short polylines between P_0 and P_k. The amplitude spectrum of the indicator card obtained after the Discrete Fourier Transform (DFT) is shown in Figure 2. The lower-frequency elements of the Fourier Descriptors contain the most important information of the indicator card, while the higher-frequency ones contain less information. Therefore, a subset of the Fourier Descriptors can be used to discriminate different curve shapes. In this paper, the first 10 FDs are used.
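To make the polygon-based computation concrete, here is a minimal Python sketch. It assumes the closed-form expression of Persoon and Fu for a piecewise-linear closed contour, a_n = L / (2 pi n)^2 * sum_k (b_{k-1} - b_k) exp(-j 2 pi n l_k / L), where b_k is the unit direction of the edge leaving vertex P_k; this may differ in form or scaling from the formula used in this paper. The function name and the parameter `n_max` are hypothetical.

```python
import numpy as np

def polygon_fourier_descriptors(vertices, n_max=10):
    """Fourier coefficients a_1..a_n_max of a closed polygon.

    vertices: (m, 2) array of distinct consecutive vertices P_0..P_{m-1};
              the polygon is closed implicitly from P_{m-1} back to P_0.
    Assumes the Persoon-Fu closed form (an assumption, see the lead-in).
    """
    v = np.asarray(vertices, dtype=float)
    z = v[:, 0] + 1j * v[:, 1]           # vertices as complex numbers
    edges = np.roll(z, -1) - z           # edge k runs from P_k to P_{k+1}
    lengths = np.abs(edges)
    perimeter = lengths.sum()            # L
    b = edges / lengths                  # unit direction of edge k
    # l_k: accumulated length from P_0 to P_k, with l_0 = 0
    l = np.concatenate(([0.0], np.cumsum(lengths)[:-1]))
    db = np.roll(b, 1) - b               # b_{k-1} - b_k (cyclic)
    n = np.arange(1, n_max + 1)
    phase = np.exp(-2j * np.pi * np.outer(n, l) / perimeter)
    return perimeter / (2 * np.pi * n) ** 2 * (phase @ db)
```

Because the coefficients are computed directly from the vertices, no resampling of the curve is needed, which is the motivation given above for transforming each polyline instead of sampling the curve.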
The Fourier Descriptors are affected by the location of the starting vertex and by the scale and direction of the curve. To eliminate these effects, normalization is employed. Suppose the given curve is first magnified by a scale factor, its starting position is then shifted, it is then rotated by some angle, and finally it is translated by the displacement (x_0, y_0). The Fourier Descriptors of the transformed curve differ from the original ones only in modulus and phase, and this effect is eliminated by the identically equal ratio shown in formula (2).
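The effect of such a ratio can be checked numerically. The sketch below is an assumption-laden illustration rather than formula (2) itself: it uses the DFT of uniformly sampled contour points, drops the zero-frequency term to remove translation, and divides the magnitudes by the magnitude of the first harmonic to remove scale, while taking magnitudes removes rotation and starting-point shift. The function name `normalized_fd` and its parameters are hypothetical.

```python
import numpy as np

def normalized_fd(x, y, count=10):
    """Magnitude ratios |F[2]|/|F[1]| ... |F[count+1]|/|F[1]| of a closed contour.

    x, y: coordinates of uniformly sampled contour points.
    Assumes |F[1]| is nonzero, which holds for typical closed curves.
    Only the positive low frequencies are used in this sketch.
    """
    z = np.asarray(x, dtype=float) + 1j * np.asarray(y, dtype=float)
    spectrum = np.fft.fft(z) / len(z)
    # spectrum[0] is the centroid; skipping it removes translation.
    mags = np.abs(spectrum[1:count + 2])
    # Dividing by the first harmonic removes the scale factor.
    return mags[1:] / mags[0]

# Usage: the same shape after scaling, rotation, start shift, and translation.
t = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
shape = np.cos(t) + 0.5j * np.sin(t) + 0.1 * np.exp(3j * t)
moved = 2.5 * np.exp(0.7j) * np.roll(shape, 17) + (3.0 - 4.0j)
same = np.allclose(normalized_fd(shape.real, shape.imag),
                   normalized_fd(moved.real, moved.imag), atol=1e-12)
```

Note that the descriptors of a rotated card are indistinguishable from those of the original card under this normalization, which is the rotational invariance property discussed in the Conclusion.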

Method II: Algorithm Based on Geometric Moment Vector.
In the image processing field, the Geometric Moment Vector (GMV) can be used as an important feature to represent objects because of its translation invariance, rotation invariance, and scale invariance.
Since the scanned image of an indicator card is grayscale and its outline is not smooth, it is necessary to preprocess the image by binarization and refinement. Figure 3 is a single-pixel binary image obtained after the binarization and refinement processing.
The seven two-dimensional moment invariants, none of which is sensitive to translation, scaling, mirroring, or rotation, are given by formula (3) [18].
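A minimal Python sketch of seven moment invariants follows. It assumes the standard Hu definitions built from normalized central moments eta_pq = mu_pq / mu_00^(1 + (p + q)/2); whether these match formula (3) term by term is an assumption, and the function name `hu_moments` is hypothetical. In the standard Hu set, the seventh invariant changes sign under mirroring, so only its magnitude is mirror-invariant.

```python
import numpy as np

def hu_moments(image):
    """Seven Hu moment invariants of a 2-D (binary or grayscale) image array."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc = (xs * img).sum() / m00          # centroid, removes translation
    yc = (ys * img).sum() / m00

    def eta(p, q):
        # Normalized central moment, removes scale.
        mu = ((xs - xc) ** p * (ys - yc) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

# Usage: an L-shaped binary test image, then a shifted and a rotated copy.
card = np.zeros((9, 12))
card[2:5, 3:9] = 1
card[5:7, 3:5] = 1
shifted = np.roll(card, (1, 2), axis=(0, 1))
rotated = np.rot90(card)
```

Calling `hu_moments` on `card`, `shifted`, and `rotated` yields the same seven values up to rounding, which illustrates the translation and rotation invariance stated above.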

Method III: Algorithm Based on Gray Level Matrix Statistics.
The grid method is one of the traditional image feature extraction methods. The processing steps of the grid method are as follows.
Firstly, we divide the image of the indicator card into a number of small grids of the same size and shape, in the horizontal and vertical directions, respectively. Then, we mark the grids which are traversed by the curve of the indicator card. Finally, we can obtain the feature parameters of the indicator card image.
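The marking step of the grid method can be sketched in Python as follows. The function name `mark_grid`, the default grid size, and the convention that row 0 holds the smallest load are assumptions made for illustration. The sketch marks only the cells that contain a sampled point, so it assumes the card is sampled densely enough that the curve does not skip a cell.

```python
import numpy as np

def mark_grid(displacement, load, n_rows=8, n_cols=8):
    """Boolean grid that is True where the indicator card curve passes.

    displacement, load: sampled points of one pumping cycle.
    Row 0 corresponds to the smallest load, column 0 to the smallest
    displacement (an arbitrary convention chosen for this sketch).
    """
    d = np.asarray(displacement, dtype=float)
    f = np.asarray(load, dtype=float)
    # Scale both axes to [0, 1] so every card fills the same grid.
    u = (d - d.min()) / (d.max() - d.min())
    v = (f - f.min()) / (f.max() - f.min())
    # Map to cell indices; the maximum value falls into the last cell.
    cols = np.minimum((u * n_cols).astype(int), n_cols - 1)
    rows = np.minimum((v * n_rows).astype(int), n_rows - 1)
    mask = np.zeros((n_rows, n_cols), dtype=bool)
    mask[rows, cols] = True
    return mask

# Usage: an idealized rectangular card sampled along its four sides.
s = np.linspace(0.0, 1.0, 50)
disp = np.concatenate([s, np.ones(50), s[::-1], np.zeros(50)])
force = np.concatenate([np.zeros(50), s, np.ones(50), s[::-1]])
grid = mark_grid(disp, force, n_rows=4, n_cols=4)
```

For this rectangular card, the marked cells form the outer ring of the 4 by 4 grid and the four inner cells stay unmarked.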
In this paper, we use a Gray Level Matrix Statistics (GLMS) feature extraction method which is based on the grid method.
Before the GLMS feature extraction, the indicator card curve should be converted to a grayscale matrix. The steps of the GLMS feature extraction algorithm are as follows.
(1) The mesh of the grayscale matrix is initialized: if a grid is traversed by the indicator card curve, then its gray value is assigned as "1".
(2) According to the gray contour principle, the other grids are assigned gray values as follows: if a grid lies inside the curve at a distance of n grids from it, its gray value is the initial value plus n; if a grid lies outside the curve at a distance of n grids from it, its gray value is the initial value minus n.
(3) Finally, we can get 6 statistical parameters of the gray level matrix [19].
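The three steps above can be sketched in Python as follows. Several details are assumptions, since the text does not fix them: the distance to the curve is counted in 4-neighbour grid steps, the outside region is found by flood filling from the image border, and the six statistics are taken to be the mean, variance, skewness, kurtosis, energy, and entropy of the gray-value distribution. The function names are hypothetical, and the curve is assumed to be closed and nonempty.

```python
from collections import deque
import numpy as np

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def _spread(seeds, blocked, shape):
    """Breadth-first grid-step distance from the seed cells (-1 if unreachable)."""
    dist = np.full(shape, -1, dtype=int)
    queue = deque()
    for r, c in seeds:
        dist[r, c] = 0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < shape[0] and 0 <= nc < shape[1]
                    and dist[nr, nc] < 0 and not blocked[nr, nc]):
                dist[nr, nc] = dist[r, c] + 1
                queue.append((nr, nc))
    return dist

def gray_level_matrix(curve):
    """Steps (1) and (2): curve cells get 1, inside 1 + n, outside 1 - n."""
    curve = np.asarray(curve, dtype=bool)
    # Distance n of every cell to the nearest curve cell.
    dist = _spread(np.argwhere(curve), np.zeros_like(curve), curve.shape)
    # Cells reachable from the border without crossing the curve are outside.
    border = np.ones_like(curve)
    border[1:-1, 1:-1] = False
    reach = _spread(np.argwhere(border & ~curve), curve, curve.shape)
    return np.where(reach >= 0, 1 - dist, 1 + dist)

def glms_features(gray):
    """Step (3): six statistics of the gray-value distribution (assumed set)."""
    values, counts = np.unique(gray, return_counts=True)
    p = counts / counts.sum()            # probability of each gray value
    mean = (values * p).sum()
    dev = values - mean
    var = (dev ** 2 * p).sum()           # must be nonzero for the next two
    skew = (dev ** 3 * p).sum() / var ** 1.5
    kurt = (dev ** 4 * p).sum() / var ** 2
    energy = (p ** 2).sum()
    entropy = -(p * np.log2(p)).sum()
    return np.array([mean, var, skew, kurt, energy, entropy])
```

Whatever the grid size, the resulting feature vector has six components, which is consistent with the low-dimension property attributed to GLMS in this paper.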

Numerical Experiments and Results
Three typical fault indicator cards are shown in Figures 4, 5, and 6.
According to Method II (Geometric Moment Vector), we compute 12 sets of Geometric Moment Vectors (GMV) for 12 different typical fault indicator cards. Each set includes 7 GMVs, which together form a seven-component feature vector. Table 2 shows the numerical results. According to Method III (Gray Level Matrix Statistics), we compute 12 sets of Gray Level Matrix Statistics (GLMS) for 12 different typical fault indicator cards. Each set includes 6 GLMSs, which together form a six-component feature vector. Table 3 shows the numerical results.

Conclusion
In this paper, three different feature extraction methods, which are based on Fourier Descriptors, Geometric Moment Vector, and Gray Level Matrix Statistics, respectively, have been analyzed and simulated. The computing speed and memory consumption of these three algorithms are compared in Table 4.
Numerical experiments show that the FD algorithm offers high computing speed and uses more memory space, but may lose information depending on the number of FDs retained; the GMV algorithm is more time-consuming and uses less memory; while the GLMS algorithm provides low-dimension vectors with good performance in speed and space.
The characteristic of rotational invariance, in both the FD algorithm and the GMV algorithm, may cause improper pattern recognition of indicator card data when used for sucker-rod pump working condition diagnosis. Further research on feature extraction of indicator card data should continue for better performance.

Table 1: Normalized Fourier Descriptors of 12 typical fault indicator cards.

Table 2: Geometric Moment Vectors of typical fault indicator cards.

Table 3: Gray Level Matrix Statistics of typical fault indicator cards.

Table 4: Comparison of three types of algorithms.