Texture Classification Using Scattering Statistical and Cooccurrence Features

Texture classification is an important research topic in image processing. In 2012, the scattering transform, computed by iterating over successive wavelet transforms and modulus operators, was introduced. This paper presents new approaches to texture feature extraction using the scattering transform. Scattering statistical features and scattering cooccurrence features are derived from the subbands of the scattering decomposition and from the original images. These features are used for classification on four datasets containing 20, 30, 112, and 129 texture images, respectively. Experimental results show that our approaches give promising classification results.


Introduction
Texture is one of the main contents of an image. Texture segmentation, texture classification, and shape recovery from texture are three primary issues in texture analysis [1]. Among them, texture classification plays an important role in many tasks, ranging from remote sensing and medical imaging to query by content in large image databases [2]. Texture analysis is one of the most important techniques when images that consist of repetition or quasi repetition of some fundamental image elements are analyzed and interpreted (e.g., [3]). Various feature extraction and classification techniques have been suggested for texture analysis in the past. Since natural textures vary widely, different features should be chosen according to the characteristics of the texture images in order to achieve the best performance in texture analysis or retrieval. It is well recognized that these texture analysis methods capture different texture properties of the image.
There are four major stages in texture analysis, that is, feature extraction, texture discrimination, texture classification, and shape from texture [4]. The first stage of image texture analysis is feature extraction. Texture features obtained from this step are used to discriminate textures, classify image textures, or determine object shape. Feature extraction computes a characteristic that can describe the texture properties of a digital image. Texture discrimination is the process that partitions a textured image into regions, each corresponding to a perceptually homogeneous texture. In the texture classification stage, a rule is designed that assigns a given test image of unknown class to one of the known classes. Shape from texture reconstructs 3D surface geometry from texture information.

Feature extraction techniques mainly include first-order histogram based features, cooccurrence matrix based features, and multiscale features [4]. First-order histogram based features, derived from the shape of the histogram of intensity levels, provide a number of clues as to the character of the image. The second-order histogram is the cooccurrence matrix [5]. Cooccurrence matrix based features are estimates of the joint probability distributions of pairs of pixels. To calculate multiscale features, many time-frequency methods are adopted [6]. The common methods are Wigner distributions, Gabor functions, the wavelet transform, and the ridgelet transform. Wigner distributions can produce interference terms which lead to wrong signal interpretation. Gabor filters result in redundant features at different scales or channels [7]. The wavelet transform is a linear operation and is capable of time localisation of signal spectral features. For these reasons, the wavelet transform is attractive for texture analysis. The ridgelet transform can deal effectively with line singularities in 2D. Texture classification based on ridgelet statistical features (RSFs) and ridgelet cooccurrence features (RCFs) was carried out by Arivazhagan et al. [8].
In the last few decades, wavelet theory has been widely used for texture classification [9-11]. However, the wavelet transform is not translation invariant. In 2012, Mallat introduced the scattering transform, which is invariant to translations and Lipschitz continuous with respect to deformations [12]. The scattering transform therefore overcomes this weakness of the wavelet transform. It is computed by iterating over successive wavelet transforms and modulus operators. The scattering transform maps the high frequency information of images to low frequencies and can thus provide a stationary representation. The scattering transform has found applications in texture classification (e.g., [13,14]). These classification tasks are based on the original scattering vectors.
In this paper, the scattering transform is applied to a set of texture images. Statistical features and cooccurrence features are extracted from the original images and from each of the scattering subbands. These features are used for classification. For comparison, classification is also carried out using RSFs, RCFs, wavelet statistical features (WSFs), and wavelet cooccurrence features (WCFs), respectively. The experimental results show that the success rate of our feature extraction techniques is promising, although not yet fully satisfactory. We regard it as a proof of concept for scattering statistical features (SSFs) and scattering cooccurrence features (SCFs).
The rest of this paper is organized as follows. In Section 2, the theory of the scattering transform is briefly reviewed. Feature extraction and texture classification are explained in Section 3. In Section 4, the texture classification experimental results are discussed in detail. Finally, concluding remarks are given in Section 5.

Scattering Transform
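The wavelet transform applies a filter to the original signal [15]. Let $G$ denote a discrete, finite rotation group in $\mathbb{R}^2$. A wavelet function $\psi$ is a band-pass filter. Its rotated and dilated versions are

$$\psi_\lambda(x) = 2^{2j}\,\psi(2^{j} r^{-1} x), \tag{1}$$

with $\lambda = 2^j r \in \Lambda = G \times \mathbb{Z}$ and $|\lambda| = 2^j$, where $r \in G$ is the rotation parameter and $2^j$, $j \in \mathbb{Z}$, is the dilation parameter. A texture $X(x)$ is modeled as a realization of a stationary process, so the wavelet transform of $X(x)$ is written as

$$W_\lambda X(x) = X * \psi_\lambda(x). \tag{2}$$

$U[\lambda]X(x) = |X * \psi_\lambda(x)|$ is the wavelet modulus of $X(x)$. The high frequency coefficients of the wavelet transform are mapped to low frequencies by the modulus operator [16]. The convolution of the texture $X(x)$ with the scaling function $\Phi_J(x)$ gives the low frequency information; that is,

$$A_J X(x) = X * \Phi_J(x), \tag{3}$$

where

$$\Phi_J(x) = 2^{-2J}\,\Phi(2^{-J} x). \tag{4}$$

The high frequency information that is lost by the wavelet modulus operator can be recovered by the next application of $U$ [12], resulting in the scattering propagators $U[p]$ along different paths $p = (\lambda_1, \ldots, \lambda_m)$:

$$U[p]X(x) = U[\lambda_m] \cdots U[\lambda_2]\,U[\lambda_1]X(x) = \big|\,\cdots\big|\,|X * \psi_{\lambda_1}| * \psi_{\lambda_2}\big| \cdots * \psi_{\lambda_m}\big|. \tag{5}$$

In particular, when $m = 0$, $U[p]X(x) = X(x)$.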
The wavelet modulus operator is applied iteratively to progressively map the high frequency information to low frequencies. The scattering operator is thus defined from $A_J X(x) = X(x) * \Phi_J(x)$ and $U[\lambda]X(x) = |X * \psi_\lambda(x)|$ [12]. The information of the texture is scattered along different paths in the iterative process. The scattering operator $S_J$ implements a sequence of wavelet convolutions and modulus operations, followed by a convolution with $\Phi_J$:

$$S_J[p]X(x) = U[p]X * \Phi_J(x) = \big|\,\cdots\big|\,|X * \psi_{\lambda_1}| * \psi_{\lambda_2}\big| \cdots * \psi_{\lambda_m}\big| * \Phi_J(x). \tag{6}$$

The scattering transform is thus computed with a cascade of wavelet transforms and modulus operators. The process can be described by a deep network architecture (see, e.g., [17,18]), as shown in Figure 1. Mallat proved in [12] that the energy of the deepest layer converges quickly to zero as the path length increases. Bruna and Mallat [19] have illustrated that most of the energy is concentrated in paths of length $|p| \le 3$. Further details about the scattering transform are presented in [12].
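The cascade of wavelet convolution, modulus, and low-pass averaging described in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in the experiments: the Morlet-like filters, the parameters `xi` and `sigma`, the restriction to square images, and all function names are assumptions made for the example, and no subsampling is performed.

```python
import numpy as np

def _freq_grid(n):
    # Angular frequencies of an n x n FFT grid (square images assumed).
    w = 2 * np.pi * np.fft.fftfreq(n)
    return np.meshgrid(w, w, indexing="ij")

def lowpass(n, J, sigma=0.8):
    # Fourier transform of a Gaussian scaling function at scale 2^J (assumed form).
    wy, wx = _freq_grid(n)
    s = sigma * 2.0 ** J
    return np.exp(-(wx ** 2 + wy ** 2) * s ** 2 / 2)

def wavelet_bank(n, J, L, xi=3 * np.pi / 4, sigma=0.8):
    # Morlet-like band-pass filters in the Fourier domain, indexed by (scale j, angle l).
    wy, wx = _freq_grid(n)
    bank = {}
    for j in range(J):
        s = sigma * 2.0 ** j
        for l in range(L):
            theta = np.pi * l / L
            cx = xi * 2.0 ** (-j) * np.cos(theta)
            cy = xi * 2.0 ** (-j) * np.sin(theta)
            gab = np.exp(-((wx - cx) ** 2 + (wy - cy) ** 2) * s ** 2 / 2)
            low = np.exp(-(wx ** 2 + wy ** 2) * s ** 2 / 2)
            # Subtract a scaled Gaussian so the filter response at zero frequency is 0.
            bank[(j, l)] = gab - gab[0, 0] * low
    return bank

def conv(x, filt_hat):
    # Circular convolution computed as a product in the Fourier domain.
    return np.fft.ifft2(np.fft.fft2(x) * filt_hat)

def scattering(x, J=2, L=4):
    # Returns a dict mapping each path p (a tuple of (j, l) pairs) to S_J[p]x.
    n = x.shape[0]
    psi, phi = wavelet_bank(n, J, L), lowpass(n, J)
    out = {(): np.real(conv(x, phi))}                 # zeroth layer: x * Phi_J
    for lam1, p1 in psi.items():
        u1 = np.abs(conv(x, p1))                      # U[lam1]x = |x * psi_lam1|
        out[(lam1,)] = np.real(conv(u1, phi))         # first layer
        for lam2, p2 in psi.items():
            if lam2[0] > lam1[0]:                     # keep only coarser second scales
                u2 = np.abs(conv(u1, p2))             # U[lam2]U[lam1]x
                out[(lam1, lam2)] = np.real(conv(u2, phi))
    return out
```

With `J=2` and `L=4` the returned dictionary holds one zeroth-layer map, 8 first-layer maps, and 16 second-layer maps, each the same size as the input.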

Feature Extraction and Texture Classification
The steps involved in texture training and texture classification are shown in Figure 2.
Texture Training. At the texture training stage, the known texture images are decomposed using the scattering transform. Then, the mean and standard deviation of the original images and of the subbands of the two-layer decomposed images are calculated as features using the following formulas:

Mean (m)

$$m = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} p(i, j), \tag{7}$$

Standard deviation (sd)

$$\mathrm{sd} = \left[ \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \big(p(i, j) - m\big)^2 \right]^{1/2}, \tag{8}$$

where $p(i, j)$ is the transformed value at $(i, j)$ for any image of size $N \times N$ [20]. These features are stored in the features library as scattering statistical features (SSFs), which are further used in the texture classification phase.
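As a minimal sketch of the statistical features just described, the mean and standard deviation of each subband can be collected into one feature vector as follows. The function name and the convention of passing the original image together with its subbands as a list of 2-D arrays are assumptions for illustration.

```python
import numpy as np

def statistical_features(subbands):
    # Concatenate [mean, standard deviation] of every 2-D array in `subbands`.
    feats = []
    for p in subbands:
        p = np.asarray(p, dtype=float)
        m = p.mean()                           # mean over all pixels
        sd = np.sqrt(np.mean((p - m) ** 2))    # population standard deviation
        feats.extend([m, sd])
    return np.array(feats)
```

For 25 scattering subbands plus the original image, this would give a vector of 52 values.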
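In addition, in order to further verify the classification rate, a cooccurrence matrix $C(i, j)$ [21] is formed for each subband of the scattering transform and for each image. From the cooccurrence matrix, features such as cluster prominence, cluster shade, contrast, and local homogeneity are computed as given by Arivazhagan and Ganesan [9]. These features are obtained by (9)-(12) and are stored in the feature database as scattering cooccurrence features (SCFs):

Cluster prominence (cp)

$$\mathrm{cp} = \sum_{i} \sum_{j} (i - M_x + j - M_y)^4\, C(i, j), \tag{9}$$

Cluster shade (cs)

$$\mathrm{cs} = \sum_{i} \sum_{j} (i - M_x + j - M_y)^3\, C(i, j), \tag{10}$$

Contrast

$$\sum_{i} \sum_{j} (i - j)^2\, C(i, j), \tag{11}$$

Local homogeneity (lho)

$$\mathrm{lho} = \sum_{i} \sum_{j} \frac{1}{1 + (i - j)^2}\, C(i, j), \tag{12}$$

where $M_x = \sum_i \sum_j i\, C(i, j)$, $M_y = \sum_i \sum_j j\, C(i, j)$, and $C(i, j)$ is the $(i, j)$th element of the cooccurrence matrix $C$.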
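The four cooccurrence features can be sketched as below. The text does not state how subband values are quantized or which pixel offset is used, so the number of gray levels (`levels=8`), the horizontal offset `(0, 1)`, and the normalization of the matrix to sum to one are assumptions of this example; the feature formulas follow the standard definitions of cluster prominence, cluster shade, contrast, and local homogeneity.

```python
import numpy as np

def cooccurrence_matrix(img, levels=8, offset=(0, 1)):
    # Quantize to `levels` gray levels, then count pixel pairs separated by `offset`.
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi > lo:
        q = np.floor((img - lo) / (hi - lo) * levels).astype(int)
        q = np.minimum(q, levels - 1)
    else:
        q = np.zeros(img.shape, dtype=int)
    dy, dx = offset                              # non-negative offsets assumed
    a = q[: q.shape[0] - dy, : q.shape[1] - dx]  # reference pixels
    b = q[dy:, dx:]                              # neighbours at the given offset
    C = np.zeros((levels, levels))
    np.add.at(C, (a.ravel(), b.ravel()), 1)
    return C / C.sum()                           # normalize to joint probabilities

def cooccurrence_features(C):
    i, j = np.indices(C.shape)
    mx, my = (i * C).sum(), (j * C).sum()
    s = i - mx + j - my
    return {
        "cluster_prominence": (s ** 4 * C).sum(),
        "cluster_shade": (s ** 3 * C).sum(),
        "contrast": ((i - j) ** 2 * C).sum(),
        "local_homogeneity": (C / (1 + (i - j) ** 2)).sum(),
    }
```

Indices here start at 0 rather than 1; all four features depend only on differences of indices or on deviations from the index means, so the shift does not change their values.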
Texture Classification. Here, the unknown texture images are decomposed using the scattering transform. Then, SSFs and SCFs of the original images and of the subbands of the scattering decomposed images are extracted using (7)-(12). These features are compared with the corresponding feature values stored in the features library using the distance

$$D(i) = \sum_{j=1}^{n} \big| f_j(x) - f_j(i) \big|, \tag{13}$$

where $x$ is an unknown texture, $n$ is the number of features, $f_j(x)$ is the $j$th feature of $x$, $i$ indexes the known textures in the library, and $f_j(i)$ is the $j$th feature of the known $i$th texture. If the distance $D(i)$ is the minimum among all textures available in the library, then the unknown texture is classified as the $i$th texture. This classification approach is very simple, efficient, and effective in many fields [22]. The rule is widely used in object recognition [23], text categorization [24], pattern recognition [25], and so on. The performance of the feature sets is measured by the success rate. Let $N_c$ be the number of subimages correctly classified and let $M$ be the total number of subimages derived from each texture image. Then the classification success rate $S_r$ is calculated using

$$S_r = \frac{N_c}{M} \times 100\%. \tag{14}$$
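The minimum-distance rule and the success rate can be sketched as follows. The sketch assumes the distance is the sum of absolute differences between feature values and that the library stores one feature vector per known texture; the function names are hypothetical.

```python
import numpy as np

def classify(x_feats, library):
    # library: 2-D array whose row i is the feature vector of known texture i.
    library = np.asarray(library, dtype=float)
    x_feats = np.asarray(x_feats, dtype=float)
    d = np.abs(library - x_feats).sum(axis=1)   # distance to every known texture
    return int(np.argmin(d))                    # index of the closest texture

def success_rate(predicted, actual):
    # Percentage of subimages assigned to their true texture class.
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return 100.0 * np.mean(predicted == actual)
```

In practice the features may need to be scaled to comparable ranges before computing the distance, since an unscaled sum is dominated by the largest-valued features; whether this was done in the experiments is not stated.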

Experimental Results and Discussion
In this section, several experiments are carried out on texture databases from the Brodatz texture album [26] and the VisTex color image database [27]. Four experiments are conducted with a single objective: to investigate the texture classification performance of the proposed feature extraction methods. For comparison, the classification experiment is repeated with RSFs, RCFs, WSFs, and WCFs, respectively. In order to verify the performance of our feature extraction methods on both small and large amounts of data, the VisTex color image database is used three times, the first two times with a small number of images and the third time with a large number. Furthermore, the efficiency of the proposed feature extraction approaches is demonstrated with the average success rate and the number of image regions correctly classified.

Since Bruna and Mallat have illustrated that most of the scattering energy is concentrated in paths of length $|p| \le 3$, we mainly consider the first three layers in the current work. The computing cost of the first three layers is larger than that of the first two layers, whereas the classification performance of the first three layers is only slightly better than that of the first two layers. Therefore, the maximum number of scattering layers is 2 in our experiments. In addition, to find suitable values for the number of orientations and the maximum scale, we varied these parameters over a large number of experiments. Taking both computational complexity and classification performance into account, the number of scattering orientations is set to 4 and the maximum scale of the scattering transform is set to 2. The number of scattering matrices differs across the layers of the scattering transform: there is only one scattering matrix in the zeroth layer, there are 8 in the first layer, and there are 16 in the second layer.
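The subband counts quoted above (1, 8, and 16) can be reproduced with a short calculation. The sketch assumes the common convention that scales along a scattering path strictly increase, so a path of length m picks m of the J scales and one of L orientations at each step; the function name is hypothetical.

```python
from math import comb

def paths_per_layer(J, L, max_order=2):
    # Layer m holds C(J, m) * L**m paths when scales along a path strictly increase.
    return [comb(J, m) * L ** m for m in range(max_order + 1)]

counts = paths_per_layer(J=2, L=4)   # maximum scale 2, 4 orientations
total = sum(counts)
```

Under this assumption `counts` is `[1, 8, 16]` and `total` is 25, which matches the 25 subbands per image region reported for the experiments.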
Firstly, Dataset-1 contains 20 monochrome images obtained from the VisTex color image database, each of size 512 × 512. Texture image classification is done for Dataset-1 using SSFs and SCFs. Here, each texture image is subdivided into sixty-four 64 × 64, sixteen 128 × 128, and four 256 × 256 nonoverlapping image regions. So, there are a total of 1680 subimage regions in the database. Decomposing an image region with the scattering transform yields 25 subbands for each of the region sizes 64 × 64, 128 × 128, and 256 × 256. SSFs and SCFs are calculated over all the scattering decomposed subbands. Furthermore, SSFs and SCFs of the regions of size 64 × 64, 128 × 128, and 256 × 256 themselves are also obtained.
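The subdivision into nonoverlapping regions and the resulting region counts can be checked with a small sketch; the function name and the use of a zero-filled placeholder image are assumptions for illustration.

```python
import numpy as np

def nonoverlapping_regions(img, size):
    # Cut a 2-D image into size x size tiles, left to right and top to bottom.
    h, w = img.shape
    return [img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]

image = np.zeros((512, 512))                     # placeholder for one texture image
per_size = {s: len(nonoverlapping_regions(image, s)) for s in (64, 128, 256)}
regions_per_image = sum(per_size.values())       # 64 + 16 + 4
dataset1_regions = 20 * regions_per_image        # 20 images in Dataset-1
```

This gives 84 regions per image and 1680 regions for the 20 images of Dataset-1.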
The experimental results are summarized in Table 1. Table 1 shows that when classification is carried out with statistical features, that is, the mean and standard deviation of the original images and of the subbands of the transform decomposed images, the mean success rate obtained from SSFs, 96.07%, is higher than that of RSFs and WSFs. SCFs perform better than RCFs but worse than WCFs. Using the feature vectors that combine statistical features and cooccurrence features, the mean success rate for feature vectors F3, R3, and W3 is 95.60%, 98.69%, and 97.80%, respectively.
Next, Dataset-2, containing thirty monochrome images of size 512 × 512 obtained from the VisTex color image database, is used for analysis. In a similar manner, each texture image in Dataset-2 is subdivided into four 256 × 256, sixteen 128 × 128, and sixty-four 64 × 64 nonoverlapping image regions. Therefore, there are 2520 subimage regions in the database. SSFs and SCFs are extracted from the original images and from the subbands of the scattering transform decomposed images.
The classification results obtained for all 84 subimage regions derived from each texture image in Dataset-2 are given in Table 2. Table 2 shows the following: (i) using SSFs as feature vector F1, the mean success rate achieved is 94.96%; (ii) this is about 20.12 percentage points higher than the mean success rate using F2, which is only 74.84%; (iii) the mean success rate obtained using F3 is 94.01%, which is about 0.95 percentage points lower than that obtained using F1.
In addition, our proposed approaches are compared with R1, R2, R3, W1, W2, and W3 in terms of classification performance. Compared with RSFs and WSFs, the performance of SSFs is the best. For cooccurrence features, SCFs achieve better classification performance than WCFs, while their mean accuracy is slightly lower than that of RCFs. From Table 2, when classification is carried out with W3 and R3, the mean success rate is 96.47% and 96.79%, respectively. When classification is done with F3, the mean success rate is slightly lower, at 94.01%.

Then, Dataset-3, containing one hundred and twelve monochrome images obtained from the Brodatz texture album, is used for analysis. The size of each image in Dataset-3 is 512 × 512. Each texture image is subdivided into four 256 × 256 and sixteen 128 × 128 nonoverlapping image regions. Hence, the database includes a total of 2240 subimage regions. The feature vectors SSFs and SCFs for each image are calculated from the subbands of the scattering transform decomposed image and from the original image.
The classification results are summarized in Table 3. The mean success rate of feature vectors F1, F2, and F3 is 91.34%, 78.62%, and 90.18%, respectively. As shown in Table 3, among the statistical features, the highest mean success rate is obtained using SSFs. Among the cooccurrence features, SCFs perform better than WCFs and worse than RCFs. Likewise, the mean classification accuracy obtained using F3 is higher than that achieved using W3 and lower than that obtained using R3.
Finally, Dataset-4 is created from one hundred and twenty-nine monochrome images from the VisTex color image database. The database is constructed by dividing each 512 × 512 image into four 256 × 256 and sixteen 128 × 128 nonoverlapping image regions. There are 2580 image regions in the database. SSFs and SCFs are extracted from the subbands of the scattering transform decomposed image and from the original image. F1 contains the mean and standard deviation. F2 contains the SCFs, that is, cluster prominence, cluster shade, contrast, and local homogeneity. F3 is the combination of SSFs and SCFs. Classification is done using the three feature vectors (F1, F2, and F3), all of which are calculated from the scattering subbands and the original images.
The classification results are summarized in Table 4. Using feature vector F1, the mean success rate achieved is 85.81%. Using F2, the mean success rate is 73.99%. Using F3, the mean success rate is 85.23%. The mean success rate obtained using F1 is thus about 11.82 percentage points higher than that obtained using F2, and the mean success rate obtained using F3 is about 11.24 percentage points higher than that obtained using F2.
Compared with RSFs and WSFs, the average correct classification rate achieved using SSFs is the highest. Among the cooccurrence features, SCFs perform better than RCFs and worse than WCFs. Likewise, the mean classification rate obtained using F3 is higher than that achieved using W3 and lower than that of R3.
The experimental results of this section show a common pattern: F2 is much worse than F1, whilst F3 is slightly worse than F1. We speculate that this may be due to high variance in the estimation of the cooccurrence features. In comparison with the wavelet transform and the ridgelet transform, the classification performance based on scattering statistical features is the best on all four datasets. For cooccurrence features, the mean classification accuracy of SCFs is comparable with that of RCFs and WCFs in this study. When statistical features and cooccurrence features are combined, the average classification accuracy obtained by F3 is lower than that achieved by feature vectors R3 and W3 on the small datasets. On the large datasets, the results obtained by F3 are better than those achieved using W3 but worse than those of R3.

Conclusion
In the present work, the highest mean success rates achieved using scattering statistical and cooccurrence features are 96.07%, 94.96%, 91.34%, and 85.81% on Dataset-1, Dataset-2, Dataset-3, and Dataset-4, respectively. Our methods may not be competitive with state-of-the-art feature extraction methods that use significant image knowledge and heuristics. However, we find these results promising and view them as a proof of concept for SSFs and SCFs. From the experiments conducted on the texture image datasets, it is inferred that statistical features in the context of scattering representations provide a good compromise between discriminability and good feature properties, whereas cooccurrence features are less discriminative.
Our current work has so far focused on algorithmic development and experimental justification. A more thorough theoretical analysis of the proposed feature extraction methods is left for future work. Furthermore, this work can be extended toward the design of an efficient classification system with a higher classification success rate.

Figure 1: Scattering transform. It can be seen as a network that iterates over the wavelet transform and the modulus operator.