Classification of Error-Diffused Halftone Images Based on Spectral Regression Kernel Discriminant Analysis



Introduction
As a popular image processing technology, digital halftoning [1] has found wide application in converting continuous-tone images into binary halftone images for better display on binary devices such as printers and computer screens. Usually, binary halftone images are obtained in the course of printing, image scanning, and fax, from which the original continuous-tone images need to be reconstructed [2,3] using an inverse halftoning algorithm [4] for subsequent processing, for example, image classification, image compression, image enhancement, and image zooming. However, it is difficult for inverse halftoning algorithms to achieve optimal reconstruction quality, because the halftoning pattern is unknown in practical applications. Furthermore, a basic drawback of the existing inverse halftoning algorithms is that they do not distinguish the types of halftone images, or can only coarsely divide halftone images into two major categories: error-diffused halftone images and ordered dithered halftone images. This inability to exploit prior knowledge of the halftone images largely weakens the flexibility, adaptability, and effectiveness of inverse halftoning techniques, making the study of halftone image classification imperative, not only for optimizing the existing inverse halftoning schemes, but also for guiding the design of adaptive schemes for halftone image compression, halftone image watermarking, and so forth.
Motivated by the significance of classifying halftone images, several halftone image classification methods have been proposed. In 1998, Chang and Yu [5] classified halftone images into four types using an enhanced one-dimensional correlation function and a backpropagation (BP) neural network; the data sets in their experiments were limited to halftone images produced by clustered-dot ordered dithering, dispersed-dot ordered dithering, constrained average, and error diffusion. Kong et al. [6,7] used an enhanced one-dimensional correlation function and a gray-level co-occurrence matrix to extract features from halftone images, based on which the halftone images were divided into nine categories using a decision tree algorithm. Liu et al. [8] combined support regions and the least mean square (LMS) algorithm to divide halftone images into four categories. Subsequently, they [9] used LMS to extract features from the Fourier spectra of nine categories of halftone images and classified these halftone images using naive Bayes. Although these methods work well in classifying some specific halftone images, their performance degrades considerably when classifying error-diffused halftone images produced by the Floyd-Steinberg, Stucki, Sierra, Burkes, Jarvis, and Stevenson filters, respectively. These filters are described as follows.
Different Error Diffusion Filters. The six error diffusion filters, (a) Floyd-Steinberg, (b) Sierra, (c) Burkes, (d) Stucki, (e) Jarvis, and (f) Stevenson, are based on different error diffusion kernels, as summarized in [10][11][12]. Moreover, the existing literature does not consider all types of error-diffused halftone images: only three error diffusion filters are included in [6,7,9], and only one is involved in [5,8]. The halftoning idea behind the six error diffusion filters is quite similar; the only difference lies in the templates used (shown at the right-hand side of each filter equation, as in the error diffusion of Stucki described above). It is difficult to classify the error-diffused halftone images produced by these six filters because the differences among the various halftone features extracted from them are almost inconspicuous. However, as a scalable algorithm, error diffusion has gradually become one of the most popular halftoning techniques, due to its ability to provide a solution of good quality at a reasonable cost [13]. There is thus an urgent need to study the classification mechanism for the various error diffusion algorithms, with the hope of promoting the inverse halftoning techniques widely used in different application fields of graphics processing. This paper proposes a new algorithm to classify error-diffused halftone images. We first extract the feature matrices of pixel pairs from the error-diffused halftone image patches, according to the statistical characteristics of these patches. The class feature matrices are then obtained from the pixel-pair feature matrices using a gradient descent method [14]. After applying spectral regression kernel discriminant analysis to reduce the dimension of the class feature matrices, we finally classify the error-diffused halftone images using an idea similar to the nearest centroids classifier [15,16].
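For reference, five of the six diffusion templates can be written down directly, with coefficients as commonly published in the halftoning literature (the Stevenson-Arce filter is omitted here because it is defined on a hexagonal grid). This sketch uses Python/NumPy rather than the paper's MATLAB environment; the dictionary name is illustrative:

```python
import numpy as np

# Error-diffusion templates as commonly published in the halftoning
# literature. Row 0 holds the current scanline; the pixel being
# thresholded sits just left of the first nonzero weight in row 0.
# The Stevenson-Arce filter is omitted (it uses a hexagonal grid).
FILTERS = {
    "floyd-steinberg": np.array([[0, 0, 7],
                                 [3, 5, 1]]) / 16,
    "stucki":          np.array([[0, 0, 0, 8, 4],
                                 [2, 4, 8, 4, 2],
                                 [1, 2, 4, 2, 1]]) / 42,
    "jarvis":          np.array([[0, 0, 0, 7, 5],
                                 [3, 5, 7, 5, 3],
                                 [1, 3, 5, 3, 1]]) / 48,
    "burkes":          np.array([[0, 0, 0, 8, 4],
                                 [2, 4, 8, 4, 2]]) / 32,
    "sierra":          np.array([[0, 0, 0, 5, 3],
                                 [2, 4, 5, 4, 2],
                                 [0, 2, 3, 2, 0]]) / 32,
}

# Every template's weights sum to 1, so the whole quantization error
# is redistributed -- the property all six filters share.
for name, f in FILTERS.items():
    assert abs(f.sum() - 1.0) < 1e-12, name
```

The near-identical structure of these kernels is exactly why features extracted from the resulting halftones are so similar, which motivates the classification method developed below.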
The structure of this paper is as follows. Section 2 presents the method of kernel discriminant analysis. Section 3 describes how to extract pixel-pair features from the error-diffused halftone images. Section 4 describes the proposed classification method for error-diffused halftone images based on spectral regression kernel discriminant analysis. Section 5 shows the experimental results. Concluding remarks and possible future research directions are given in Section 6.

An Efficient Kernel Discriminant Analysis Method
It is well known that linear discriminant analysis (LDA) [17,18] is effective in solving classification problems, but it fails for nonlinear ones. To deal with this limitation, kernel discriminant analysis (KDA) [19] has been proposed.

Overview of Kernel Discriminant Analysis
Here $m_k$ is the number of samples in the $k$th class, $\mu_k^{\phi} = (1/m_k)\sum_{i=1}^{m_k}\phi(x_i^{(k)})$ is the centroid of the $k$th class in the feature space induced by the nonlinear map $\phi$, and $\mu^{\phi} = (1/m)\sum_{i=1}^{m}\phi(x_i)$ is the global centroid. In the feature space, the aim of the discriminant analysis is to seek the best projection direction, namely, the projective function $v$ maximizing the objective function

$$ v^{*} = \arg\max_{v} \frac{v^{T} S_b^{\phi} v}{v^{T} S_t^{\phi} v}, \tag{10} $$

where $S_b^{\phi}$ and $S_t^{\phi}$ are the between-class and total scatter matrices in the feature space. Equation (10) can be solved by the eigenproblem $S_b^{\phi} v = \lambda S_t^{\phi} v$. According to the theory of reproducing kernel Hilbert spaces, the eigenvectors are linear combinations of the $\phi(x_i)$ in the feature space $\mathcal{F}$: there exist weight coefficients $\alpha_i$ ($i = 1, 2, \ldots, m$) such that $v = \sum_{i=1}^{m} \alpha_i \phi(x_i)$. Let $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]^{T}$; then it can be proved that (10) can be rewritten as

$$ \alpha^{*} = \arg\max_{\alpha} \frac{\alpha^{T} K W K \alpha}{\alpha^{T} K K \alpha}. \tag{11} $$

The optimization problem of (11) is equivalent to the eigenproblem

$$ K W K \alpha = \lambda K K \alpha, \tag{12} $$

where $K$ is the kernel matrix with $K_{ij} = \mathcal{K}(x_i, x_j)$ and $W$ is the weight matrix defined as

$$ W_{ij} = \begin{cases} 1/m_k, & \text{if } x_i \text{ and } x_j \text{ both belong to the } k\text{th class}, \\ 0, & \text{otherwise}. \end{cases} \tag{13} $$

For a sample $x$, the projective function in the feature space $\mathcal{F}$ can be described as

$$ f(x) = \langle v, \phi(x) \rangle = \sum_{i=1}^{m} \alpha_i \mathcal{K}(x_i, x). \tag{14} $$

Kernel Discriminant Analysis via Spectral Regression.
To efficiently solve the eigenproblem of the kernel discriminant analysis in (12), the following theorem will be used.

Theorem 1.
Let $\bar{y}$ be an eigenvector of the eigenproblem $W\bar{y} = \lambda\bar{y}$ with eigenvalue $\lambda$. If $K\alpha = \bar{y}$, then $\alpha$ is an eigenvector of eigenproblem (12) with the same eigenvalue $\lambda$.
According to Theorem 1, the projective function of the kernel discriminant analysis can be obtained in the following two steps.
Step 1. Solve the eigenproblem $W\bar{y} = \lambda\bar{y}$ to obtain the eigenvectors $\bar{y}$.
Step 2. Find the vector $\alpha$ which satisfies $K\alpha = \bar{y}$, where $K$ is the positive semidefinite kernel matrix.
As we know, if $K$ is nonsingular, then for any given $\bar{y}$ there exists a unique $\alpha = K^{-1}\bar{y}$ satisfying the linear system described in Step 2. If $K$ is singular, the linear system may have infinitely many solutions or none. In this case, we can approximate $\alpha$ by solving

$$ (K + \delta I)\,\alpha = \bar{y}, \tag{15} $$

where $\delta \ge 0$ is a regularization parameter and $I$ is the identity matrix. Combined with the projective function described in (14), we can easily verify that the solution $\alpha^{*} = (K + \delta I)^{-1}\bar{y}$ given by (15) is the optimal solution of the regularized regression problem

$$ f^{*} = \arg\min_{f \in \mathcal{H}} \sum_{i=1}^{m} \left( f(x_i) - \bar{y}_i \right)^2 + \delta \|f\|_{\mathcal{H}}^2, \tag{16} $$

where $\bar{y}_i$ is the $i$th element of $\bar{y}$ and $\mathcal{H}$ is the reproducing kernel Hilbert space induced by the Mercer kernel $\mathcal{K}$, with $\|\cdot\|_{\mathcal{H}}$ being the corresponding norm. Owing to this essential combination of spectral analysis and regression in the two-step approach, the method is named spectral regression (SR) kernel discriminant analysis.
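The two-step SR procedure can be sketched in a few lines. This is a minimal illustration, assuming an RBF kernel and centered class-indicator vectors as the responses $\bar{y}$; the function name, parameters, and toy data are illustrative, not from the paper:

```python
import numpy as np

def srkda_projections(X, labels, gamma=1.0, delta=0.01):
    """SR-KDA sketch: solve (K + delta*I) a = y_bar for each
    response vector y_bar (Step 2 of the two-step procedure)."""
    n = X.shape[0]
    # RBF kernel matrix: K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    classes = np.unique(labels)
    alphas = []
    # Centered class indicators serve as responses (c - 1 directions);
    # using them in place of the exact eigenvectors of W is a sketch.
    for c in classes[:-1]:
        y_bar = (labels == c).astype(float)
        y_bar -= y_bar.mean()
        # Regularized solve, equation (15): (K + delta*I) alpha = y_bar
        a = np.linalg.solve(K + delta * np.eye(n), y_bar)
        alphas.append(a)
    return K, np.array(alphas).T

# Toy usage: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(4, 1, (10, 3))])
labels = np.array([0] * 10 + [1] * 10)
K, A = srkda_projections(X, labels)
Z = K @ A  # projected (1-D) embedding of the training samples
```

Because the linear solve replaces a dense eigendecomposition, this is the source of the efficiency claimed for SR over ordinary KDA.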

Feature Extraction of the Error-Diffused Halftone Images
Since its introduction in 1976, the error diffusion algorithm has attracted widespread attention in the field of printing. It processes the pixels of halftone images with neighborhood processing algorithms instead of point processing algorithms. We now extract the features of the error-diffused halftone images produced by the six popular error diffusion filters mentioned in Section 1. The quantization error $e_0$ is diffused ahead to subsequent pixels that have not yet been processed; for such a pixel, the comparison is therefore carried out between the threshold and the sum of its value $f(u, v)$ and the diffused error $e$. A template matrix can be built from the error diffusion modes and the error diffusion coefficients, as shown in the error diffusion of Stucki described above: (a) the error diffusion filter and (b) the error diffusion coefficients, which represent the proportions of the diffused errors. If a coefficient is zero, the corresponding pixel does not receive any diffused error. According to the Stucki template described above, the neighbor immediately to the right of the current pixel receives more diffusion error than the more distant neighbors; that is to say, the pixel pair it forms with the current pixel has a larger probability of becoming a 1-0 pixel pair. The reasons are as follows.

Statistical Characteristics of the Error-Diffused Halftone Images
Suppose that the pixel value of $x$ satisfies $0 \le f_x \le 1$ and that pixel $x$ has been processed by the thresholding method according to

$$ h_x = \begin{cases} 1, & f_x \ge t_0, \\ 0, & \text{otherwise}. \end{cases} \tag{17} $$

In general, the threshold $t_0$ is set to 0. Since the value of each pixel in the error-diffused halftone image can only be 0 or 1, there are four kinds of pixel pairs in the halftone image: 0-1, 0-0, 1-0, and 1-1. Pixel pairs 0-1 and 1-0 are collectively called 1-0 pixel pairs because of their exchangeability, so there are essentially only three kinds of pixel pairs: 0-0, 1-0, and 1-1. In this paper, three statistical matrices of size $n \times n$, referred to as $P_{00}$, $P_{10}$, and $P_{11}$, are used to store the numbers of the different pixel pairs at different neighboring distances and in different directions ($n$ is an odd number satisfying $n = 2d + 1$, where $d$ is the maximum neighboring distance). Suppose that the center entry of the statistical matrix template covers pixel $x$ of an error-diffused halftone image of size $M \times N$ and that the other entries overlap the neighborhood pixels $(u, v)$. We can then accumulate the three statistics of the 1-0, 1-1, and 0-0 pixel pairs within the scope of this template. As the position $(i, j)$ of pixel $x$ moves over the image, the matrices $P_{00}$, $P_{10}$, and $P_{11}$, initialized to zero, are updated according to (18), where $u = i - d + p - 1$, $v = j - d + q - 1$, $1 \le p, q \le n$, $1 \le i \le M$, and $1 \le j \le N$. After normalization, the three statistical matrices are ultimately obtained as the statistical feature descriptor of the error-diffused halftone images.
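Since update rule (18) did not survive extraction, the following is a plausible reconstruction of the pixel-pair statistics: for every offset inside the $(2d+1)\times(2d+1)$ template, count the 0-0, 1-0, and 1-1 pairs over the whole image and normalize per offset. All names are illustrative:

```python
import numpy as np

def pixel_pair_statistics(H, d=5):
    """Count 0-0, 1-0, and 1-1 pixel pairs between each pixel and its
    neighbours at every offset within a (2d+1)x(2d+1) template.
    H is a binary halftone image; returns normalized P00, P10, P11."""
    n = 2 * d + 1
    P00, P10, P11 = np.zeros((n, n)), np.zeros((n, n)), np.zeros((n, n))
    rows, cols = H.shape
    for du in range(-d, d + 1):
        for dv in range(-d, d + 1):
            if du == 0 and dv == 0:
                continue  # a pixel forms no pair with itself
            # Overlapping windows: a and b hold the two members of every
            # valid pair (H[i, j], H[i+du, j+dv]) inside the image.
            a = H[max(0, du):rows + min(0, du), max(0, dv):cols + min(0, dv)]
            b = H[max(0, -du):rows + min(0, -du), max(0, -dv):cols + min(0, -dv)]
            P00[du + d, dv + d] = np.sum((a == 0) & (b == 0))
            P11[du + d, dv + d] = np.sum((a == 1) & (b == 1))
            P10[du + d, dv + d] = np.sum(a != b)  # 0-1 and 1-0 merged
    total = P00 + P10 + P11
    total[total == 0] = 1  # avoid dividing by zero at the center entry
    return P00 / total, P10 / total, P11 / total
```

On a 2x2 checkerboard, every horizontal and vertical neighbour pair is mixed (1-0) while the diagonal pairs are equal, which gives a quick sanity check of the counting.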

Process of Statistical Feature Extraction of Halftone Images.
According to the analysis described above, the process of statistical feature extraction of the error-diffused halftone images can be represented as follows.
Step 3. Obtain the statistical matrix $P_i$ of block $B_i$ according to (18), and update $P$ using the equation $P = P + P_i$.
According to the process described above, the statistical features of an error-diffused halftone image $H$ are extracted by dividing $H$ into image patches, which is significantly different from other patch-based feature extraction methods. For example, in [20], the brightness and contrast of the image patches are normalized by a z-score transformation, and whitening (also called "sphering") is used to rescale the normalized data to remove the correlations between nearby pixels (i.e., low-frequency variations in the images), because these correlations tend to be very strong even after brightness and contrast normalization. In this paper, however, the features of the patches are extracted by counting the statistics of the different pixel pairs (0-0, 1-0, and 1-1) within a moving statistical matrix template, and they are optimized using the method described in Section 3.3.

Extraction of the Class Feature Matrix.
The statistical matrices $B^i_{00}$, $B^i_{10}$, $B^i_{11}$ ($i = 1, 2, \ldots, S$), once extracted, can be used as the input of other algorithms, such as support vector machines and neural networks. However, the curse of dimensionality could occur, due to the high dimension of $P_{00}$, $P_{10}$, $P_{11}$, which may make the classification ineffective. Therefore, six class feature matrices $C_1, C_2, \ldots, C_6$ are designed in this paper for the error-diffused halftone images produced by the six error diffusion filters mentioned above, and a gradient descent method is then used to optimize these class feature matrices. $S = 6 \times s$ error-diffused halftone images can be derived from $s$ original images using the six error diffusion filters, respectively. Then $S$ statistical matrices $B_i$ ($B^i_{00}$, $B^i_{10}$, $B^i_{11}$) ($i = 1, 2, \ldots, S$) can be extracted as samples from the $S$ error-diffused halftone images using the algorithm described in Section 3.2. Subsequently, we label these matrices as label($B_1$), label($B_2$), ..., label($B_S$) to denote the types of the error diffusion filters used to produce the error-diffused halftone images. Given the $i$th sample $B_i$ as input, the target output vector $T_i = [t_1, \ldots, t_6]$ ($i = 1, \ldots, S$), and the class feature matrices $C_1, C_2, \ldots, C_6$, the squared error $E_i$ between the actual output and the target output can be derived according to

$$ E_i = \sum_{k=1}^{6} (o_k - t_k)^2, \tag{19} $$

where

$$ o_k = C_k \bullet B_i. \tag{20} $$

The derivatives of $E_i$ with respect to the entries $C_k(p, q)$ in (19) can be explicitly calculated as

$$ \frac{\partial E_i}{\partial C_k(p, q)} = 2\,(o_k - t_k)\, B_i(p, q), \tag{21} $$

where $1 \le p \le n$, $1 \le q \le n$, and $\bullet$ is the dot product of matrices, defined for any matrices $A$ and $B$ of the same size $m \times n$ as

$$ A \bullet B = \sum_{p=1}^{m} \sum_{q=1}^{n} A(p, q)\, B(p, q). \tag{22} $$

The dot product of matrices satisfies the commutative law; that is, $A \bullet B = B \bullet A$. Then the iteration equation (23) can be obtained using the gradient descent method:

$$ C_k^{(l+1)} = C_k^{(l)} - \eta \frac{\partial E_i}{\partial C_k^{(l)}}, \tag{23} $$

where $\eta$ is the learning factor and $l$ denotes the $l$th iteration.
The purpose of learning is to seek the optimal matrices $C_k$ ($k = 1, 2, \ldots, 6$) by minimizing the total squared error $E = \sum_{i=1}^{S} E_i$; the process of seeking the optimal matrices $C_k$ can be described as follows.
Step 1. Initialize the parameters: the numbers of iterations inner and outer, the iteration variables $l = 0$ and $i = 1$, the nonnegative thresholds $\varepsilon_1$ and $\varepsilon_2$ used to indicate the end of the iterations, the learning factor $\eta$, the total number of samples $S$, and the class feature matrices $C_k$ ($k = 1, 2, \ldots, 6$).
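The remaining steps of this learning loop iterate the gradient update over all samples until the error thresholds are met. A compact sketch, assuming the model output for class $k$ is the matrix dot product $o_k = C_k \bullet B_i$ with 0/1 targets; the function name and hyperparameter values are illustrative:

```python
import numpy as np

def train_class_features(samples, labels, n_classes=6, eta=0.05, epochs=200):
    """Gradient-descent sketch for the class feature matrices C_k.
    Assumed model: output o_k = sum(C_k * B) (matrix dot product),
    target t_k = 1 for the true class and 0 otherwise; the squared
    error is minimized by iterating the update over all samples."""
    shape = samples[0].shape
    rng = np.random.default_rng(0)
    C = rng.normal(0.0, 0.01, (n_classes,) + shape)  # small random init
    for _ in range(epochs):                  # "outer" iterations
        for B, lab in zip(samples, labels):  # "inner" pass over samples
            t = np.zeros(n_classes)
            t[lab] = 1.0
            o = np.array([np.sum(C[k] * B) for k in range(n_classes)])
            for k in range(n_classes):
                # dE/dC_k = 2 (o_k - t_k) B, i.e. update (23)
                C[k] -= eta * 2.0 * (o[k] - t[k]) * B
    return C
```

A fixed epoch count stands in for the threshold tests on $\varepsilon_1$ and $\varepsilon_2$; in practice one would stop once the total squared error falls below the thresholds.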

Classification of Error-Diffused Halftone Images Using Nearest Centroids Classifier
This section describes the details of classifying error-diffused halftone images using spectral regression kernel discriminant analysis, as follows.
Step 2. Extract the feature matrices of the samples according to the steps described in Section 3.
Step 4. A label matrix of size $1 \times S$ is built to record the type to which each error-diffused halftone image belongs.
Step 5. The first $r$ features $P^1_{00}, P^2_{00}, \ldots, P^r_{00}$ of the sample feature matrices $P_{00}$ are taken as the training samples (the first $r$ features of $P_{10}$, $P_{11}$, or $P_{\mathrm{all}}$, which is the composition of $P_{00}$, $P_{10}$, and $P_{11}$, can also be used as the training samples). Reduce the dimension of these training samples using the spectral regression discriminant analysis. The process of dimension reduction can be described by three substeps as follows.
Step 7. The remaining $S - r$ samples are taken as the testing samples, and dimension reduction is applied to them using the method described in Step 5.
Step 8. Compute the squared distance $|P^i_{00} - \mathrm{aver}_j|^2$ ($i = r+1, r+2, \ldots, S$ and $j = 1, 2, \ldots, 6$) between each testing sample $P^i_{00}$ and each class centroid $\mathrm{aver}_j$; according to the nearest centroids rule, sample $P^i_{00}$ is assigned to class $j^{*}$ if $j^{*} = \arg\min_j |P^i_{00} - \mathrm{aver}_j|^2$. In Step 8, the weak classifier (i.e., the nearest centroid classifier) is used to classify the error-diffused halftone images because this classifier is simple and easy to implement. At the same time, in order to show that the class feature matrices, extracted by the method of Section 3 and processed by the spectral regression discriminant analysis, are well suited to the classification of error-diffused halftone images, this weak classifier is used in this paper instead of a strong classifier [20], such as a support vector machine or deep neural network classifier.
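Steps 7 and 8 amount to a nearest-centroid rule in the reduced feature space. A minimal sketch with illustrative names:

```python
import numpy as np

def nearest_centroid(train_X, train_y, test_X):
    """Compute each class centroid (aver_j) in the reduced feature
    space and assign every test sample to the closest one."""
    classes = np.unique(train_y)
    centroids = np.array([train_X[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance |x - aver_j|^2 to every centroid.
    d2 = ((test_X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]
```

The absence of any iterative training here is what keeps the classification stage cheap compared with classifiers that must be fitted.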

Experimental Analysis and Results
We implemented various experiments to verify the efficiency of our method in classifying error-diffused halftone images. The computer processor is an Intel(R) Pentium(R) CPU G2030 @ 3.00 GHz, the memory is 2.0 GB, the operating system is Windows 7, and the experimental simulation software is MATLAB R2012a. In our experiments, all the original images were downloaded from http://decsai.ugr.es/cvg/dbimagenes/ and http://msp.ee.ntust.edu.tw/. About 4000 original images were downloaded and converted into 24000 error-diffused halftone images produced by the six different error diffusion filters.

Classification Accuracy Rate of the Error-Diffused Halftone Images
Effect of the Number of Samples. This subsection analyzes the effect of the number of feature samples on classification. When $n = 11$ and the feature matrices $P_{00}$, $P_{10}$, $P_{11}$, and $P_{\mathrm{all}}$ are taken as the input data, respectively, the classification accuracy rates under different conditions are shown in Tables 1 and 2. Table 1 shows the classification accuracy rates under different numbers of training samples when the total number of samples is 12000; Table 2 shows them when the total number of samples is 24000. The digits in the first line of each table give the number of training samples. According to Tables 1 and 2, the classification accuracy rates with 12000 samples are higher than those with 24000 samples. Moreover, the classification accuracy rates improve as the proportion of training samples increases, as long as the number of training samples is below 80% of the sample size, and they peak when the number of training samples is about 80% of the sample size. In addition, Tables 1 and 2 show that $P_{00}$, $P_{10}$, and $P_{11}$ can each be used as the input data alone; they can also be combined into the input data $P_{\mathrm{all}}$, based on which the classification accuracy rates are high.

Comparison of Classification Accuracy Rate.
To analyze the effectiveness of our classification algorithm, the mean values of the classification accuracy rates of the four data sets on the right-hand side of each row in Table 2 are computed. The algorithm SR outperforms the other baselines, achieving higher classification accuracy rates than LMS + Bayes (the method combining least mean square and naive Bayes), ECF + BP (the method based on the enhanced correlation function and a BP neural network), and ML (the maximum likelihood method). According to Table 5, the baseline methods must iteratively optimize the associated nonconvex problems, which are well known to converge very slowly, whereas the classifier based on SR performs the classification task by directly computing the squared distance between each testing sample and the class centroids. Hence its time consumption is very low.

The Experiment of Noise Attack Resistance
In actual operation, error-diffused halftone images are often polluted by noise before the inverse transform. In order to test the ability of SR to resist noise attacks, Gaussian noise with mean 0 and different variances is embedded into the error-diffused halftone images. Classification experiments were then performed using the algorithm proposed in this paper, and the experimental results are listed in Table 6. According to Table 6, the accuracy rates decrease as the variance increases. Compared with the accuracy rates achieved by the other algorithms (ECF + BP, LMS + Bayes, and ML) listed in Table 7, our classification method has an obvious advantage in resisting noise.
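The noise attack itself can be sketched as follows; whether the noisy image is re-binarized before feature extraction is an assumption here, as is the fixed threshold of 0.5:

```python
import numpy as np

def add_gaussian_noise(h, var, rng=None):
    """Noise-attack sketch: add zero-mean Gaussian noise of the given
    variance to a binary halftone, then re-binarize at 0.5 (assumed)."""
    rng = rng or np.random.default_rng(0)
    noisy = h + rng.normal(0.0, np.sqrt(var), h.shape)
    return (noisy >= 0.5).astype(float)
```

With variance 0 the image is unchanged; as the variance grows, more pixels flip, which is what degrades the pixel-pair statistics and, in turn, the accuracy rates in Table 6.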

Conclusion
This paper proposes a novel algorithm to solve the challenging problem of classifying error-diffused halftone images. We first design the class feature matrices, after extracting the image patches according to their statistical characteristics, to classify the error-diffused halftone images. Then, spectral regression kernel discriminant analysis is used for feature dimension reduction. The error-diffused halftone images are finally classified using an idea similar to the nearest centroids classifier. As demonstrated by the experimental results, our method is fast and achieves a high classification accuracy rate, with the added benefit of robustness against noise. An interesting direction for future work is to handle the disturbances possibly introduced by other attacks, such as image scaling and rotation, in the process of error-diffused halftone image classification.
Here $x$ denotes the pixel being processed, and the remaining nonzero entries of the template indicate the four neighborhood pixels that receive portions of the diffused error:
Assume that $f(i, j)$ is the gray value of the pixel located at position $(i, j)$ in the original image and $h(i, j)$ is the value of the pixel located at position $(i, j)$ in the error-diffused halftone image. For the original image, all the pixels are first normalized to the range [0, 1]. Then the pixels of the normalized image are converted to the error-diffused image $H$ line by line; that is to say, if $f(i, j) \ge t_0$, then $h(i, j)$, the value of the pixel located at position $(i, j)$ in the error-diffused image $H$, is 1; otherwise $h(i, j)$ is 0, where $t_0$ is the threshold value. The error between $f(i, j)$ and $h(i, j)$ is then distributed to the unprocessed neighbors according to the template. For the Stucki template described above, the diffusion error is $e = f_x - h_x$, the new value of the right neighbor $b$ is $f_b' = f_b + 8e/42$, and the new value of the next neighbor $c$ is $f_c' = f_c + 4e/42$, where $f_b$ and $f_c$ are the original values of pixels $b$ and $c$, respectively.

Table 4:
The training and testing time under different sample sizes (in seconds).

According to Table 3, the mean values of the classification accuracy rates obtained using SR with the different features $P_{00}$, $P_{10}$, $P_{11}$, and $P_{\mathrm{all}}$, respectively, are higher than the mean values obtained by the other algorithms mentioned above.

Effect of the Size of the Statistical Feature Template on Classification Accuracies. Here, the different features $P_{00}$, $P_{10}$, and $P_{11}$ of the 24000 error-diffused halftone images are used to test the effect of the size of the statistical feature template. The features are constructed using the corresponding class feature matrices with different sizes $n \times n$ ($n = 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25$). Figure 1 shows that the classification accuracy rate reaches its highest value at $n = 11$, no matter which feature is selected for the experiments. It is well known that the time consumption of classification includes the training time and the testing time. From Table 4, we can see that the training time increases with the number of training samples; on the contrary, the testing time decreases as the number of training samples increases.

Table 5:

Time consumption of different algorithms (in seconds).

Table 7:
Classification accuracy rates using other algorithms under different variances.