CLASSIFICATION OF REDUCTION INVARIANTS WITH IMPROVED BACKPROPAGATION

Data reduction is a process of feature extraction that transforms the data space into a feature space of much lower dimension than the original data space, yet retains most of the intrinsic information content of the data. This can be done by a number of methods, such as principal component analysis (PCA), factor analysis, and feature clustering. Principal components are extracted from a collection of multivariate cases so as to account for as much of the variation in that collection as possible with as few variables as possible. The backpropagation network, on the other hand, has been used extensively in classification problems such as the XOR problem, share price prediction, and pattern recognition. This paper proposes an improved error signal for the backpropagation network for classification of the reduction invariants obtained by principal component analysis, which extracts the bulk of the useful information present in the moment invariants of handwritten digits and leaves the redundant information behind. Higher order centralised scale invariants are used to extract features of handwritten digits before PCA, and the reduction invariants are sent to the improved backpropagation model for classification.


Introduction.
The curse of many or most neural network applications is that the number of potentially important variables can be overwhelming. Problems arise whenever we deal with a very large number of variables: the sheer computational burden can slow even the fastest computers to the point of uselessness, and there can be substantial correlation between variables. The method of principal components is primarily a data-analytic technique that obtains linear transformations of a group of correlated variables such that certain optimal conditions are achieved. The most important of these conditions is that the transformed variables are uncorrelated [7].
Moment invariants have been proposed as pattern-sensitive features in classification and recognition applications. Hu (1962) was the first to introduce the geometric moment invariants, which are invariant under change of size, translation, and orientation [2]. Moments and functions of moments can provide characteristics of an object that uniquely represent its shape, and they have been extensively employed as invariant global features of an image in pattern recognition and image classification since the 1960s.
This paper discusses the use of principal component analysis to reduce the dimensionality of the invariants of unconstrained handwritten digits, and an improved error function of the backpropagation model for classification. The rest of the paper is organised as follows: Section 2 reviews moment invariants and higher order centralised scale invariants, while Section 3 summarises principal component analysis and its methodology. Section 4 gives an overview of the backpropagation model and the proposed error function. Finally, Section 5 presents the experimental results, and Section 6 concludes.

Geometric moment invariants.
The geometric moments (see [2]) of order p + q of a digital image f(x, y) are defined as

m_{pq} = \sum_x \sum_y x^p y^q f(x,y), \quad p, q = 0, 1, 2, \ldots

The translation-invariant central moments are obtained by placing the origin at the centroid (\bar{x}, \bar{y}) of the image,

\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x,y),

where \bar{x} = m_{10}/m_{00} and \bar{y} = m_{01}/m_{00}. Under the scale transformation (the change of size), each coefficient of any algebraic form f(x, y) will be an algebraic invariant; by the definitions of invariants (see [6]),

a'_{pq} = \alpha^{p+q} a_{pq}. \quad (2.6)

Then, for the moment invariants, eliminating \alpha between the zeroth-order relation and the remaining ones generates the absolute scale moment invariants

\frac{\mu_{pq}}{\mu_{00}^{((p+q)/2)+1}}, \quad p + q = 2, 3, \ldots, \quad (2.9)

in which the image is assumed to have equal scaling in the x and y directions.
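The moments above can be sketched directly from the definitions. The following is a minimal illustration (function names are ours, not the paper's); it computes the raw moments m_pq, the translation-invariant central moments \mu_pq, and the absolute scale invariant of (2.9) for a binary image stored as a numpy array. Note that on small digital images the scale invariance of (2.9) is only approximate, since resampling is not a continuous scaling.

```python
import numpy as np

def geometric_moment(img, p, q):
    """Raw geometric moment m_pq = sum_x sum_y x^p y^q f(x, y)."""
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    return np.sum((x ** p) * (y ** q) * img)

def central_moment(img, p, q):
    """Translation-invariant central moment mu_pq, taken about the centroid."""
    m00 = geometric_moment(img, 0, 0)
    xbar = geometric_moment(img, 1, 0) / m00
    ybar = geometric_moment(img, 0, 1) / m00
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    return np.sum(((x - xbar) ** p) * ((y - ybar) ** q) * img)

def scale_invariant(img, p, q):
    """Absolute scale moment invariant mu_pq / mu_00^((p+q)/2 + 1), p + q >= 2."""
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** ((p + q) / 2 + 1)
```

Translating the image (padding with zeros) leaves the central moments unchanged, and uniformly rescaling it leaves the quantity of (2.9) approximately unchanged.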
Consider the following linear transformation, which performs different scalings in the x and y directions (see [1]),

x' = \alpha x, \quad y' = \beta y. \quad (2.11)

Improved scale invariants for unequal scaling can be derived from (2.9) using higher order centralised moments and algebraic invariants; the explicit form (2.12) is given in [5].

Principal component analysis.
Principal component analysis (PCA) is a multivariate technique in which a number of related variables are transformed into a set of uncorrelated variables. The starting point for PCA is the sample covariance matrix S. For a p-variable problem (see [3]),

S = \begin{pmatrix} s_1^2 & s_{12} & \cdots & s_{1p} \\ s_{21} & s_2^2 & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & \cdots & s_p^2 \end{pmatrix},

where s_i^2 is the variance of the ith variable x_i, and s_{ij} is the covariance between the ith and jth variables.
The principal axis transformation transforms the p correlated variables x_1, x_2, \ldots, x_p into p new uncorrelated variables z_1, z_2, \ldots, z_p. The coordinate axes of these new variables are described by the characteristic vectors u_i, which make up the matrix U of direction cosines used in the transformation

z = U^T (x - \bar{x}),

where x and \bar{x} are p × 1 vectors of observations on the original variables and their means.
The transformed variables are called the principal components of x. The ith principal component z_i = u_i^T (x - \bar{x}) has mean zero and variance l_i, the ith eigenvalue. In other words, the vectors that define the principal components are the eigenvectors of the covariance matrix, and the eigenvalues are the variances of the principal components. Figure 3.1 is an example of a scree plot for PCA and its eigenvalues. In this paper, we retain the principal components with eigenvalue greater than 1.0 as the extracted components.
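The steps above can be sketched as follows. This is an illustrative implementation (the function name is ours): it forms the sample covariance matrix of standardised variables, eigendecomposes it, and retains components with eigenvalue greater than 1.0 as the paper does. We standardise first on the assumption that the eigenvalue-greater-than-1 rule is applied to the correlation matrix, which is its usual setting.

```python
import numpy as np

def pca_kaiser(X):
    """PCA via eigendecomposition of the sample covariance matrix.
    X: (n_samples, p) data matrix. Variables are standardised so that
    the covariance matrix equals the correlation matrix, making the
    eigenvalue > 1 retention rule meaningful.
    Returns the retained component scores z = U^T (x - mean) and all eigenvalues."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardise each variable
    S = np.cov(Xs, rowvar=False)                        # sample covariance matrix
    eigvals, U = np.linalg.eigh(S)                      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                   # sort descending
    eigvals, U = eigvals[order], U[:, order]
    keep = eigvals > 1.0                                # retain eigenvalues > 1.0
    Z = Xs @ U[:, keep]                                 # scores on retained components
    return Z, eigvals
```

Since the trace of the correlation matrix equals p, the eigenvalues sum to the number of variables, and the proportion of variance explained by a component is its eigenvalue divided by p.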

Methodology of PCA on unconstrained handwritten digits.
The complexity of the data is due to the high correlation among its variables. As such, PCA is meant to choose the variables that truly represent the data. Thus, in our experiment, we pick variables whose correlation is greater than 0.7 on unconstrained handwritten digits. We group these digits based on individual handwriting, giving 200 samples of handwritten digits from five persons; each individual contributes 40 samples with different styles of writing. Table 3.1 shows the total variance explained on the unconstrained handwritten digits of group I. The component column counts the number of variables, which in this study are the higher order centralised scale invariants. The percentage of variance tells us that only 3 extracted components are sufficient to represent the whole information about the invariants of group I.

Backpropagation model.
The backpropagation network was introduced by Rumelhart and McClelland [4]. This network has served as a useful methodology for training multilayered neural networks in a wide variety of applications. The backpropagation model is a supervised learning algorithm for feedforward networks that makes use of target values. The backpropagation network is basically a gradient descent method whose objective is to minimise the mean squared error between the target values and the network outputs. Thus the mean squared error function (MSE) is defined as

E = \frac{1}{2} \sum_k \sum_j (t_{kj} - o_{kj})^2,

where t_{kj} is the target output and o_{kj} is the network output. The proposed error function for standard backpropagation is defined implicitly in terms of E_k = t_k - a_k, where E_k is the error at output unit k, t_k is the target value of output unit k, and a_k is the activation of unit k.
The weight update of the standard backpropagation model is

\Delta w_{kj} = -\eta \frac{\partial E}{\partial w_{kj}},

where \eta is the learning rate. Thus, in this case, (4.4) can be rewritten in terms of the error signal \delta_k. Knowing that a_k = f(\mathrm{Net}_k), the proposed method uses the sigmoid function f(x) = 1/(1 + e^{-2x}). Taking the partial derivative of a_k with respect to \mathrm{Net}_k and simplifying by substituting in terms of a_k gives

\frac{\partial a_k}{\partial \mathrm{Net}_k} = 2 a_k (1 - a_k). \quad (4.11)

Taking partial derivatives of the proposed error function with respect to the activation a_k gives (4.12); simplifying (4.12) yields (4.13). Substituting (4.13) into (4.10) gives the proposed error signal of backpropagation for the output layer, (4.14), while the error signal of the modified backpropagation for the hidden layer is the same as in standard backpropagation,

\delta_j = f'(\mathrm{Net}_j) \sum_k \delta_k w_{kj}. \quad (4.15)
The proposed backpropagation error function can be illustrated geometrically as in Figure 4.2; its error decreases more rapidly than the MSE, thus requiring fewer iterations for convergence.
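As a point of reference, the standard backpropagation pass that the proposed error signal modifies can be sketched as below. This is an illustrative one-hidden-layer implementation with the sigmoid f(x) = 1/(1 + e^{-2x}) of Section 4, whose derivative is 2a(1 - a) as in (4.11); it uses the standard MSE output delta, not the paper's proposed error signal (4.14), and all function names are ours.

```python
import numpy as np

def sigmoid(x):
    # Activation used in the paper: f(x) = 1 / (1 + e^(-2x)),
    # whose derivative simplifies to f'(x) = 2 f(x) (1 - f(x)).
    return 1.0 / (1.0 + np.exp(-2.0 * x))

def train_step(x, t, W1, W2, eta=0.9):
    """One gradient-descent step of standard backpropagation on the MSE.
    x: input vector, t: target vector, W1/W2: hidden and output weights.
    The paper's proposed error signal (4.14) would replace delta_out below."""
    h = sigmoid(W1 @ x)                                # hidden activations
    a = sigmoid(W2 @ h)                                # output activations
    delta_out = (t - a) * 2 * a * (1 - a)              # output error signal (standard)
    delta_hid = (W2.T @ delta_out) * 2 * h * (1 - h)   # hidden error signal, as in (4.15)
    W2 = W2 + eta * np.outer(delta_out, h)             # weight updates
    W1 = W1 + eta * np.outer(delta_hid, x)
    return W1, W2, 0.5 * np.sum((t - a) ** 2)
```

Iterating `train_step` on a sample drives the MSE down; the paper's claim is that its modified output delta reaches convergence in fewer such iterations.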

Experimental results on unconstrained handwritten digits.
We tested 200 samples of unconstrained handwritten digits from 0 through 9 with various shapes for classification. Due to computer space and memory limitations, we categorise these samples into five groups, group I through group V. Each group has 40 samples of unconstrained handwritten digits with various shapes and styles of writing. The learning rate and momentum parameter were set to 0.9 and 0.2, respectively, for the proposed backpropagation with the sigmoid as activation function. Each group has a different PCA; therefore, in this paper, we choose group I (see the appendix) as the sample for training with the proposed backpropagation. After the PCA process for group I, we choose the 3 extracted components with eigenvalues greater than 1, which are invariants of the third order accounting for almost 70% of the variation.
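The learning rate and momentum settings quoted above combine in the usual momentum update rule, which can be sketched as follows (an illustrative helper of our own, not code from the paper): the new weight change is the gradient step plus a fraction of the previous change.

```python
import numpy as np

def momentum_update(w, grad, velocity, eta=0.9, mu=0.2):
    """Gradient descent with momentum, using the paper's settings
    (learning rate eta = 0.9, momentum mu = 0.2):
        delta_w_new = -eta * grad + mu * delta_w_old."""
    velocity = -eta * grad + mu * velocity   # accumulate momentum term
    return w + velocity, velocity
```

The momentum term smooths successive weight changes, which helps the large learning rate of 0.9 avoid oscillation.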
Figure 5.1 shows the convergence rate of the proposed backpropagation using scale invariants for unconstrained handwritten digits before and after the reduction process for group I. The classification rates for the proposed backpropagation using scale invariants show that the digits are successfully recognised after PCA. Table 5.1 shows the convergence rate and the number of iterations of the proposed backpropagation for the unconstrained handwritten digits of group I, and Table 5.2 shows the variation for each group and the number of components extracted from the invariants of those handwritten digits.

Conclusion.
We presented the use of PCA as a means to reduce the complexity of the invariants of unconstrained handwritten digits, and the classification of those digits using the proposed backpropagation. Higher order centralised scale invariants are used to extract features of the digit images before the PCA technique is applied for dimensionality reduction. From the experiments, we find that PCA is able to reduce the number of invariants for these digits without losing their useful information. In other words, after PCA, we use the extracted components of the third-order moments before proceeding to classification with the proposed backpropagation model. Dimensionality reduction using PCA indirectly saves computation time and space by using a smaller number of invariant variables for the unconstrained handwritten digits. From Figure 5.1 and Table 5.1, we see that the convergence rate is faster using the extracted components of the invariants for these handwritten digits.


Figure 5.1. Convergence rate for handwritten digits before and after PCA of group I.

Figure A.1. Samples of handwritten digits of group I.
By the chain rule,

\delta_k = -\frac{\partial E}{\partial \mathrm{Net}_k}. \quad (4.5)

Knowing that \mathrm{Net}_k = \sum_j w_{kj} o_j + \theta_k, taking the partial derivative gives \partial \mathrm{Net}_k / \partial w_{kj} = o_j.

Table 5.1. Convergence rate for handwritten digits before and after PCA of group I.

Table 5.2. Total variations for each group.