Handwritten digit recognition plays a significant role in many user authentication applications in the modern world. Since handwritten digits are not of the same size, thickness, style, and orientation, these challenges must be addressed in order to solve the problem. A lot of work has been done for various non-Indic scripts, particularly Roman, but for Indic scripts the research is limited. This paper presents a script invariant handwritten digit recognition system for identifying digits written in five popular scripts of the Indian subcontinent, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu. A 130-element feature set, which is basically a combination of six different types of moments, namely, geometric moment, moment invariant, affine moment invariant, Legendre moment, Zernike moment, and complex moment, has been estimated for each digit sample. Finally, the technique is evaluated on the CMATER and MNIST databases using multiple classifiers and, after performing statistical significance tests, it is observed that the Multilayer Perceptron (MLP) classifier outperforms the others. Satisfactory recognition accuracies are attained for all five scripts.
1. Introduction
Optical Character Recognition (OCR) refers to the automated reading of printed or handwritten documents by electronic devices: the process of recognizing either printed or handwritten text from document images and converting it into electronic form. OCR systems can contribute tremendously to the advancement of the automation process and can improve the interaction between man and machine in many applications, including office automation, bank check verification, postal automation, and a large variety of business and data entry applications. Handwritten digit recognition is the task of recognizing and classifying handwritten digits from 0 to 9 without human interaction [1]. Although the recognition of handwritten numerals has been studied for more than three decades and many techniques with high accuracy rates have already been developed, research in this area continues with the aim of improving the recognition rates further.
Handwritten digit recognition is a complex problem because writing styles vary across writers, and even the same writer varies between instances. For this reason, building a generic recognizer capable of recognizing handwritten digits written by diverse writers is not always feasible [2]. The extraction of the most informative features, with high discriminatory ability, to improve the classification accuracy at reduced complexity remains one of the most important problems for this task. It is a task of great importance for which there are standard databases that allow different approaches to be compared and validated.
India is a multilingual country with 23 constitutionally recognized languages written in 12 major scripts [1]. Besides these, hundreds of other languages are used in India, each one with a number of dialects. The officially recognized languages are Hindi, Bengali, Punjabi, Marathi, Gujarati, Oriya, Sindhi, Assamese, Nepali, Urdu, Sanskrit, Tamil, Telugu, Kannada, Malayalam, Kashmiri, Manipuri, Konkani, Maithili, Santhali, Bodo, English, and Dogri. The 12 major scripts used to write these languages are Devanagari, Bangla, Oriya, Gujarati, Gurumukhi, Tamil, Telugu, Kannada, Malayalam, Manipuri, Roman, and Urdu. In a multilingual country like India, it is a common scenario that a document such as a job application form or a railway ticket reservation form is composed of text contents written in different languages/scripts in order to reach a larger cross section of people. The script may vary within a single document page, in the form of numerals or alphanumerals. But the techniques developed for text identification generally do not incorporate the recognition of digits, because the features required for text identification may not be applicable for identifying digits.
The paper is organized as follows: Section 2 presents a brief review of some of the previous approaches to handwritten digit recognition whereas, in Section 3, we introduce our script independent handwritten digit recognition system. Section 4 describes the performance of our system on realistic databases of handwritten digits and, finally, Section 5 concludes the paper.
2. Review of Related Works
Gorgevik and Cakmakov [3] developed a Support Vector Machine (SVM) based digit recognition system for handwritten Roman numerals. They extracted four types of features from each digit image: (1) projection histograms, (2) contour profiles, (3) ring-zones, and (4) Kirsch features. They reported 97.27% recognition accuracy on the National Institute of Standards and Technology (NIST) handwritten digits database [4]. In [5], Chen et al. proposed a max-min posterior pseudoprobabilities framework for Roman handwritten digit recognition. They extracted 256-dimensional directional features from the input image. Finally, these features were transformed into a set of 128 features using Principal Component Analysis (PCA). They reported a recognition accuracy of 98.76% on the NIST database [4]. Labusch et al. [6] described a sparse coding based feature extraction method with SVM as the classifier. They found a recognition accuracy of 99.41% on the MNIST (Modified NIST) handwritten digits database [7]. The work described in [8] combined three recognizers by majority vote, one of which is based on Kirsch gradients (four orientations), dimensionality reduction by PCA, and classification by SVM. They achieved an accuracy rate of 95.05% with 0.93% error on 10,000 test samples of the MNIST database [7]. Mane and Ragha [9] performed handwritten digit recognition using an elastic image matching technique based on eigendeformation, which is estimated by the PCA of actual deformations automatically selected by the elastic matching. They achieved an overall accuracy of 94.91% on their own database collected from individuals of various professions. Cruz et al. [10] presented a handwritten digit recognition system which uses multiple feature extraction methods and a classifier ensemble.
A total of six feature extraction algorithms, namely, Multizoning, Modified Edge Maps, Structural Characteristics, Projections, Concavities Measurements, and Gradient Directional, were evaluated in this paper. A scheme using neural networks as a combiner achieved a recognition rate of 99.68% on a training set of 60,000 images and a test set of 10,000 images of MNIST database.
Dhandra et al. [11] investigated a script independent automatic numeral recognition system for recognition of Kannada, Telugu, and Devanagari handwritten numerals. In the proposed method, 30 classes were reduced to 18 classes by extracting the global and local structural features like directional density estimation, water reservoirs, maximum profile distances, and fill-hole density. Finally, a probabilistic neural network (PNN) classifier was used for the recognition system which yielded an accuracy of 97.20% on a total of 2550 numeral images written in Kannada, Telugu, and Devanagari scripts. In [12], Yang et al. proposed supervised matrix factorization method used directly as multiclass classifier. They reported recognition accuracy of 98.71% with supervised learning approach on MNIST database [7]. In [13], a mixture of multiclass logistic regression models was described. They claimed recognition accuracy of 98% on the Indian digit database provided by CENPARMI [14]. Das et al. [15] described a technique for creating a pool of local regions and selection of an optimal set of local regions from that pool for extracting optimal discriminating information for handwritten Bangla digit recognition. Genetic algorithm (GA) was then applied on these local regions to sample the best discriminating features. The features extracted from these selected local regions were then classified with SVM and recognition accuracy of 97% was achieved. In [16], a wavelet analysis based technique for feature extraction was reported. For classification, SVM and k-Nearest Neighbor (k-NN) were used and an overall recognition accuracy of 97.04% was reported on MNIST digit database [7]. A comparative study in [17] was conducted by training the neural network using Backpropagation (BP) algorithm and further using PCA for feature extraction. Digit recognition was finally carried out using 13 algorithms, neural network algorithm, and the Fisher Discriminant Analysis (FDA) algorithm. 
The FDA algorithm proved less efficient with an overall accuracy of 77.67%, whereas the BP algorithm with PCA for its feature extraction gave an accuracy of 91.2%.
In [18], a set of structural features (namely, number of holes, water reservoirs in four directions, maximum profile distances in four directions, and fill-hole density) and a k-NN classifier were employed for classification and recognition of handwritten digits. They reported a recognition accuracy of 96.94% on 5000 samples of the MNIST digit database [7]. In [19], AlKhateeb and Alseid proposed an Arabic handwritten digit recognition system using a Dynamic Bayesian Network. They employed DCT coefficient based features for classification. The system was tested on the Indo-Arabic digits database (ADBase), which contains 70,000 Indo-Arabic digits [20], and an average recognition accuracy of 85.26% was achieved on 10,000 samples. Ebrahimzadeh and Jampour [21] proposed an appearance feature-based approach using Histogram of Oriented Gradients (HOG) for handwritten digit recognition. A linear SVM was then used for classification of the digits in the MNIST dataset and an overall accuracy of 97.25% was realized. Gil et al. [22] presented a novel approach using SVM binary classifiers and unbalanced decision trees. Two classifiers were proposed in this study, where one used the digit characteristics as input and the other used the whole image as such. It is observed that a handwritten digit recognition accuracy of 100% was achieved on the MNIST database using the whole image as input. El Qacimy et al. [23] investigated the effectiveness of four feature extraction approaches based on the Discrete Cosine Transform (DCT), namely, DCT upper left corner (ULC) coefficients, DCT zigzag coefficients, block based DCT ULC coefficients, and block based DCT zigzag coefficients. The coefficients of each DCT variant were used as input data for an SVM classifier and it was found that block based DCT zigzag feature extraction yielded a superior recognition accuracy of 98.76% on the MNIST database. AL-Mansoori [24] implemented an MLP classifier to recognize and predict handwritten digits.
A dataset of 5000 samples was obtained from the MNIST database and an overall accuracy of 99.32% was achieved.
From the above literature, it is clear that most of the work has been done for the Roman script, whereas relatively few works [11, 15, 19] have been reported on digit recognition in Indic scripts. The main reason for this slow progress could be attributed to the complexity of the shapes in Indic scripts as opposed to Roman script. Again, the discriminating power of the features exploited till now is not easily measurable; investigative experimentation will be necessary to identify new feature descriptors for effective classification of complex handwritten digits of different scripts. It is also revealed that the methods described in the literature suffer from large computational time, mainly due to feature extraction from large datasets. In addition, the above recognition systems fail to meet the desired accuracy when exposed to multiscript scenarios. Hence, it would be beneficial for a multilingual country like India to have a method which is independent of script and yields reasonable recognition accuracy. This has motivated us to introduce a script invariant handwritten digit recognition system for identifying digits written in five popular scripts, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu. The key modules of the proposed methodology are shown in Figure 1.
Schematic diagram illustrating the key modules of the proposed methodology.
3. Feature Extraction Methodology
One of the basic problems in the design of any pattern recognition system is the selection of a set of appropriate features to be extracted from the object of interest. Research on the utilization of moments for object characterization in both invariant and noninvariant tasks has received considerable attention in recent years. Describing digit images with moments instead of other more commonly used pattern recognition features (described in [21–23]) means that global properties of the digit image are used rather than local properties. So, for the present work, we considered a moment based approach which is described in the next subsection.
3.1. Moments
Moments are pure statistical measures of the pixel distribution around the center of gravity of the image and capture global shape information [25]. They describe numerical quantities at some distance from a reference point or axis. Moments are commonly used in statistics to characterize the distribution of random variables and, similarly, in mechanics to characterize bodies by their spatial distribution of mass.
A complete characterization of moment functional over a class of univariate functions was given by Hausdorff [26] in 1921.
Let {μ_n} be a real sequence of numbers and let us define

(1) \Delta^{m}\mu_{n} = \sum_{i=0}^{m} (-1)^{i} \binom{m}{i} \mu_{n+i}.

Note that \Delta^{m}\mu_{n} can be viewed as the mth-order finite difference of μ_n.
By the Hausdorff theorem, a necessary and sufficient condition that there exists a monotonic function F(x) satisfying the system

(2) \mu_{n} = \int_{0}^{1} x^{n}\,dF(x), \quad n = 0, 1, 2, \ldots

is that the system of linear inequalities

(3) \Delta^{k}\mu_{n} \ge 0, \quad k = 0, 1, 2, \ldots

should be satisfied; that is, if f(x) is a positive function (as in image processing), then the set of functionals

(4) \int_{0}^{1} x^{n} f(x)\,dx, \quad n = 0, 1, \ldots

completely characterizes the function.
A necessary and sufficient condition that there exists a function F(x) of bounded variation satisfying (2) is that the sequence

(5) \sum_{m=0}^{p} \binom{p}{m} \left|\Delta^{p-m}\mu_{m}\right|, \quad p = 0, 1, 2, \ldots

should be bounded. The use of moments for image analysis is straightforward if we consider a binary or gray level image segment as a two-dimensional density distribution function. It can be assumed that an image can be represented by a real-valued measurable function f(x,y). In this way, moments may be used to characterize an image segment and extract properties that have analogies in statistics and mechanics. In image processing and computer vision, an image moment is a certain particular weighted average (moment) of the image pixels' intensities, or a function of such moments, usually chosen to have some attractive property or interpretation. The first significant work considering moments for pattern recognition was performed by Hu [27]. He derived relative and absolute combinations of moment values that are invariant with respect to scale, position, and orientation, based on the theories of invariant algebra that deal with the properties of certain classes of algebraic expressions which remain invariant under general linear transformations. Size invariant moments are derived from algebraic invariants but can be shown to be the result of simple size normalization. Translation invariance is achieved by computing moments that have been translated by the negative distance to the centroid and thus normalized so that the center of mass of the distribution is at the origin (central moments).
3.2. Geometric Moments
Geometric moments are defined as the projection of the image intensity function f(x,y) onto the monomial x^p y^q [25]. The (p+q)th-order geometric moment M_pq of a gray level image f(x,y) is defined as

(6) M_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^{p} y^{q} f(x,y)\,dx\,dy,

where p, q = 0, 1, 2, \ldots, \infty. Note that the monomial product x^p y^q is the basis function for this moment definition. A set of moments up to order n consists of all M_pq with p+q \le n; that is, the set contains (1/2)(n+1)(n+2) elements. If f(x,y) is piecewise continuous and contains nonzero values only in a finite region of the xy-plane, then the moment sequence {M_pq} is uniquely determined by f(x,y) and, conversely, f(x,y) is uniquely determined by {M_pq}. Considering the fact that an image segment has finite area, or in the worst case is piecewise continuous, moments of all orders exist and a complete moment set can be computed and used uniquely to describe the information contained in the image. However, obtaining all the information contained in the image requires an infinite number of moment values. Therefore, selecting a meaningful subset of the moment values that contains sufficient information to characterize the image uniquely for a specific application becomes very important. In case of a digital image of size M × N, the double integral in (6) is replaced by a summation, which gives the simplified form

(7) m_{pq} = \sum_{x=1}^{M}\sum_{y=1}^{N} x^{p} y^{q} f(x,y),

where p, q = 0, 1, 2, \ldots are integers.
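As an illustration, the discrete form (7) can be evaluated directly. The following Python sketch (our own illustration; the paper's experiments were implemented in MATLAB) computes m_pq for a small intensity array, with pixel coordinates running from 1 to M and 1 to N as in the summation limits above:

```python
import numpy as np

def geometric_moment(img, p, q):
    """Raw geometric moment m_pq of Eq. (7).

    Pixel coordinates run from 1 to M (rows, x) and 1 to N (columns, y),
    matching the paper's summation limits."""
    M, N = img.shape
    x = np.arange(1, M + 1).reshape(-1, 1)   # row coordinates x
    y = np.arange(1, N + 1).reshape(1, -1)   # column coordinates y
    return float(np.sum((x ** p) * (y ** q) * img))

# m_00 is simply the total "mass" (sum of intensities) of the image.
img = np.array([[1.0, 0.0],
                [0.0, 1.0]])
m00 = geometric_moment(img, 0, 0)   # 2.0
m10 = geometric_moment(img, 1, 0)   # 1*1 + 2*1 = 3.0
```

The first-order moments m_10 and m_01 divided by m_00 give the centroid used by the central moments below.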
When f(x,y) changes by translation, rotation, or scaling, the image may be positioned such that its center of mass (COM) coincides with the origin of the field of view, that is, \bar{x} = 0 and \bar{y} = 0; the moments computed for that object are then referred to as central moments [25] and designated μ_pq. The simplified form of the central moment of order (p+q) is defined as follows:

(8) \mu_{pq} = \sum_{x=1}^{M}\sum_{y=1}^{N} (x-\bar{x})^{p} (y-\bar{y})^{q} f(x,y),

where \bar{x} = m_{10}/m_{00} and \bar{y} = m_{01}/m_{00}.
The pixel point (\bar{x}, \bar{y}) is the COM of the image. The central moments μ_pq computed using the centroid of the image are equivalent to m_pq whose center has been shifted to the centroid of the image. Therefore, the central moments are invariant to image translations. Scale invariance can be obtained by normalization. The normalized central moments, denoted by η_pq, are defined as

(9) \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}},

where \gamma = (p+q)/2 + 1 for p+q = 2, 3, \ldots.
The second-order moments η_02, η_11, η_20, known as the moments of inertia, may be used to determine an important image feature called orientation [25]. Here, the feature values F1–F3 have been computed from the moments of inertia of the digit images. In general, the orientation of an image describes how the image lies in the field of view, or the directions of the principal axes. In terms of moments, the orientation of the principal axis, θ, taken as feature value F4, is given by

(10) \theta = \frac{1}{2}\tan^{-1}\!\left(\frac{2\mu_{11}}{\mu_{20}-\mu_{02}}\right),

where θ is the angle of the principal axis nearest to the x-axis and is in the range -\pi/4 \le \theta \le \pi/4. The minimum and maximum distances (r_min and r_max) between the centroid and the boundary of an image are also feature descriptors. The ratio r_max/r_min is called elongation or eccentricity (F5) and can be defined in terms of central moments as follows:

(11) e = \frac{(\mu_{20}-\mu_{02})^{2} + 4\mu_{11}^{2}}{\mu_{00}}.
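The central moments (8) and the derived features (10) and (11) can be sketched in a few lines of Python (an illustrative reimplementation, not the authors' code; `arctan2` is used so the angle stays defined when μ_20 = μ_02):

```python
import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of Eq. (8), computed about the centroid."""
    M, N = img.shape
    x = np.arange(1, M + 1).reshape(-1, 1)
    y = np.arange(1, N + 1).reshape(1, -1)
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    return float((((x - xbar) ** p) * ((y - ybar) ** q) * img).sum())

def orientation(img):
    """Principal-axis angle theta of Eq. (10)."""
    return 0.5 * np.arctan2(2.0 * central_moment(img, 1, 1),
                            central_moment(img, 2, 0) - central_moment(img, 0, 2))

def eccentricity(img):
    """Eccentricity e of Eq. (11)."""
    mu20 = central_moment(img, 2, 0)
    mu02 = central_moment(img, 0, 2)
    mu11 = central_moment(img, 1, 1)
    return ((mu20 - mu02) ** 2 + 4.0 * mu11 ** 2) / central_moment(img, 0, 0)
```

For a uniform square image both μ_11 and μ_20 − μ_02 vanish, so the orientation and eccentricity are zero, as expected for a shape with no preferred axis.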
3.3. Moment Invariants
Based on the theory of algebraic invariants, Hu [27] derived relative and absolute combinations of moments that are invariant with respect to scale, position, and orientation. The method of moment invariants is derived from algebraic invariants applied to the moment generating function under a rotation transformation. The set of absolute moment invariants consists of a set of nonlinear combinations of central moment values that remain invariant under rotation. A set of seven invariant moments can be derived based on the normalized central moments up to order three that are invariant with respect to image scale, translation, and rotation. Consider

(12)
\phi_{1} = \eta_{20} + \eta_{02},
\phi_{2} = (\eta_{20}-\eta_{02})^{2} + 4\eta_{11}^{2},
\phi_{3} = (\eta_{30}-3\eta_{12})^{2} + (3\eta_{21}-\eta_{03})^{2},
\phi_{4} = (\eta_{30}+\eta_{12})^{2} + (\eta_{21}+\eta_{03})^{2},
\phi_{5} = (\eta_{30}-3\eta_{12})(\eta_{30}+\eta_{12})\left[(\eta_{30}+\eta_{12})^{2} - 3(\eta_{21}+\eta_{03})^{2}\right] + (3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})\left[3(\eta_{30}+\eta_{12})^{2} - (\eta_{21}+\eta_{03})^{2}\right],
\phi_{6} = (\eta_{20}-\eta_{02})\left[(\eta_{30}+\eta_{12})^{2} - (\eta_{21}+\eta_{03})^{2}\right] + 4\eta_{11}(\eta_{30}+\eta_{12})(\eta_{21}+\eta_{03}),
\phi_{7} = (3\eta_{21}-\eta_{03})(\eta_{30}+\eta_{12})\left[(\eta_{30}+\eta_{12})^{2} - 3(\eta_{21}+\eta_{03})^{2}\right] + (3\eta_{12}-\eta_{30})(\eta_{21}+\eta_{03})\left[3(\eta_{30}+\eta_{12})^{2} - (\eta_{21}+\eta_{03})^{2}\right].

This set of moments is invariant to translation, scale change, mirroring (to within a minus sign), and rotation. The 2D moment invariants give seven features (F6–F12) which have been used for the current work.
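The rotation invariance of (12) is easy to verify numerically. The sketch below (our own Python illustration, assuming the normalized central moments of Eq. (9)) computes the first two Hu invariants and checks that they are unchanged by a 90-degree rotation of the image:

```python
import numpy as np

def eta(img, p, q):
    """Normalized central moment eta_pq of Eq. (9)."""
    M, N = img.shape
    x = np.arange(M).reshape(-1, 1)
    y = np.arange(N).reshape(1, -1)
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    mu = (((x - xbar) ** p) * ((y - ybar) ** q) * img).sum()
    gamma = (p + q) / 2.0 + 1.0
    return mu / m00 ** gamma

def hu12(img):
    """First two Hu invariants (phi_1, phi_2) of Eq. (12)."""
    phi1 = eta(img, 2, 0) + eta(img, 0, 2)
    phi2 = (eta(img, 2, 0) - eta(img, 0, 2)) ** 2 + 4.0 * eta(img, 1, 1) ** 2
    return phi1, phi2

# phi_1 and phi_2 are unchanged by a 90-degree rotation:
# the rotation swaps mu_20 and mu_02 and flips the sign of mu_11.
img = np.zeros((8, 8))
img[1:7, 2:5] = 1.0
p1, p2 = hu12(img)
r1, r2 = hu12(np.rot90(img))
```

A 90-degree rotation maps the pixel grid onto itself, so the invariance here is exact up to floating-point error; for arbitrary angles it holds up to resampling error.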
3.4. Affine Moment Invariants
The affine moment invariants are derived to be invariant to translation, rotation, and scaling of shapes, and more generally under 2D affine transformations. The six affine moment invariants [28] used for the present work are defined as follows:

(13)
I_{1} = \frac{1}{\mu_{00}^{4}}\left(\mu_{20}\mu_{02} - \mu_{11}^{2}\right),
I_{2} = \frac{1}{\mu_{00}^{10}}\left(\mu_{30}^{2}\mu_{03}^{2} - 6\mu_{30}\mu_{21}\mu_{12}\mu_{03} + 4\mu_{30}\mu_{12}^{3} + 4\mu_{03}\mu_{21}^{3} - 3\mu_{21}^{2}\mu_{12}^{2}\right),
I_{3} = \frac{1}{\mu_{00}^{7}}\left[\mu_{20}\left(\mu_{21}\mu_{03} - \mu_{12}^{2}\right) - \mu_{11}\left(\mu_{30}\mu_{03} - \mu_{21}\mu_{12}\right) + \mu_{02}\left(\mu_{30}\mu_{12} - \mu_{21}^{2}\right)\right],
I_{4} = \frac{1}{\mu_{00}^{11}}\left(\mu_{20}^{3}\mu_{03}^{2} - 6\mu_{20}^{2}\mu_{11}\mu_{12}\mu_{03} - 6\mu_{20}^{2}\mu_{02}\mu_{21}\mu_{03} + 9\mu_{20}^{2}\mu_{02}\mu_{12}^{2} + 12\mu_{20}\mu_{11}^{2}\mu_{21}\mu_{03} + 6\mu_{20}\mu_{11}\mu_{02}\mu_{30}\mu_{03} - 18\mu_{20}\mu_{11}\mu_{02}\mu_{21}\mu_{12} - 8\mu_{11}^{3}\mu_{30}\mu_{03} - 6\mu_{20}\mu_{02}^{2}\mu_{30}\mu_{12} + 9\mu_{20}\mu_{02}^{2}\mu_{21}^{2} + 12\mu_{11}^{2}\mu_{02}\mu_{30}\mu_{12} - 6\mu_{11}\mu_{02}^{2}\mu_{30}\mu_{21} + \mu_{02}^{3}\mu_{30}^{2}\right),
I_{5} = \frac{1}{\mu_{00}^{6}}\left(\mu_{40}\mu_{04} - 4\mu_{31}\mu_{13} + 3\mu_{22}^{2}\right),
I_{6} = \frac{1}{\mu_{00}^{9}}\left(\mu_{40}\mu_{04}\mu_{22} + 2\mu_{31}\mu_{22}\mu_{13} - \mu_{40}\mu_{13}^{2} - \mu_{04}\mu_{31}^{2} - \mu_{22}^{3}\right).

A total of 6 features (F13–F18) are extracted from each of the handwritten digit images for the present work.
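The first invariant I_1 in (13) can be sketched as follows (an illustrative Python snippet of our own; the invariance check uses transposition and a 90-degree rotation, which are special affine maps that leave the pixel grid intact, so the equality is exact; invariance under a general affine map holds in the continuous domain):

```python
import numpy as np

def mu(img, p, q):
    """Central moment mu_pq about the image centroid."""
    M, N = img.shape
    x = np.arange(M, dtype=float).reshape(-1, 1)
    y = np.arange(N, dtype=float).reshape(1, -1)
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    return (((x - xbar) ** p) * ((y - ybar) ** q) * img).sum()

def affine_I1(img):
    """First affine moment invariant I_1 of Eq. (13)."""
    return (mu(img, 2, 0) * mu(img, 0, 2) - mu(img, 1, 1) ** 2) / mu(img, 0, 0) ** 4

# Transposition swaps mu_20 and mu_02 and keeps mu_11;
# a 90-degree rotation additionally flips the sign of mu_11.
# Either way I_1 is unchanged.
img = np.zeros((9, 9))
img[2:7, 1:5] = 1.0
img[3, 3] = 2.0
```

The normalization by μ_00 powers is what removes the determinant of the affine transformation from the invariant.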
3.5. The Legendre Moment
The 2D Legendre moment [29] of order (p+q) of an object with intensity function f(x,y) is defined as follows:

(14) L_{pq} = \frac{(2p+1)(2q+1)}{4}\int_{-1}^{+1}\int_{-1}^{+1} P_{p}(x) P_{q}(y) f(x,y)\,dx\,dy,

where the kernel function P_p(x) denotes the pth-order Legendre polynomial and is given by

(15) P_{p}(x) = \sum_{k=0}^{p} C_{pk}\left[(1-x)^{k} + (-1)^{p}(1+x)^{k}\right],

where

(16) C_{pk} = \frac{(-1)^{k}}{2^{k+1}}\,\frac{(p+k)!}{(p-k)!\,(k!)^{2}}.

Since the Legendre polynomials are orthogonal over the interval [-1, 1], a square image of N × N pixels with intensity function f(i,j), 1 \le i, j \le N, must be scaled to lie within the region -1 \le x, y \le 1. The graphical plots of the first 10 Legendre polynomials are shown in Figures 2(a) and 2(b). When an analog image is digitized to its discrete form, the 2D Legendre moment L_pq defined by (14) is usually approximated by the formula

(17) L_{pq} = \frac{(2p+1)(2q+1)}{(N-1)^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N} P_{p}(x_{i}) P_{q}(y_{j}) f(x_{i},y_{j}),

where x_{i} = (2i-N-1)/(N-1) and y_{j} = (2j-N-1)/(N-1), and, for a binary image, f(x_i,y_j) is given as

(18) f(x_{i},y_{j}) = 1 if (i,j) is in the original object, and 0 otherwise.

As indicated by Liao and Pawlak [30], (17) is not a very accurate approximation of (14). To achieve better accuracy, they proposed to use the following approximated form:

(19) \tilde{L}_{pq} = \frac{(2p+1)(2q+1)}{4}\sum_{i=1}^{N}\sum_{j=1}^{N} h_{pq}(x_{i},y_{j}) f(x_{i},y_{j}),

where

(20) h_{pq}(x_{i},y_{j}) = \int_{x_{i}-\Delta x/2}^{x_{i}+\Delta x/2}\int_{y_{j}-\Delta y/2}^{y_{j}+\Delta y/2} P_{p}(x) P_{q}(y)\,dx\,dy,

with \Delta x = x_{i} - x_{i-1} = 2/(N-1) and \Delta y = y_{j} - y_{j-1} = 2/(N-1).
Graph showing the plots of the Legendre polynomials P_n(x): (a) P_1(x) to P_5(x) and (b) P_6(x) to P_10(x).
To evaluate the double integral hpq(xi,yj) defined by (20), an alternative extended Simpson rule was proposed by Liao and Pawlak. These values were then used to calculate the 2D Legendre moments L~pq defined by (19). Therefore, this method requires a large number of computing operations. As one can see, L~pq can be expressed with the help of a useful formula that will be given below as a linear combination of Lmn, with 0≤m≤p, 0≤n≤q.
A set of 10 Legendre moments (F19–F28) can also be derived based on the set of invariant moments found in the previous subsection:

(21)
L_{00} = \phi_{00},
L_{10} = \frac{3}{4}\phi_{10},
L_{01} = \frac{3}{4}\phi_{01},
L_{20} = \frac{5}{4}\left(\frac{3}{2}\phi_{20} - \frac{1}{2}\phi_{00}\right),
L_{02} = \frac{5}{4}\left(\frac{3}{2}\phi_{02} - \frac{1}{2}\phi_{00}\right),
L_{11} = \frac{9}{4}\phi_{11},
L_{30} = \frac{7}{4}\left(\frac{5}{2}\phi_{30} - \frac{3}{2}\phi_{10}\right),
L_{03} = \frac{7}{4}\left(\frac{5}{2}\phi_{03} - \frac{3}{2}\phi_{01}\right),
L_{21} = \frac{15}{4}\left(\frac{3}{2}\phi_{21} - \frac{1}{2}\phi_{01}\right),
L_{12} = \frac{15}{4}\left(\frac{3}{2}\phi_{12} - \frac{1}{2}\phi_{10}\right).
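The direct discrete approximation (17) can be sketched compactly with NumPy's Legendre utilities (our own Python illustration, assuming a square N × N image as the text requires):

```python
import numpy as np
from numpy.polynomial.legendre import legval

def legendre_moment(img, p, q):
    """Discrete approximation of L_pq, Eq. (17), for a square N x N image."""
    N = img.shape[0]
    # Sample points x_i = (2i - N - 1)/(N - 1), i = 1..N, spanning [-1, 1].
    coords = (2.0 * np.arange(1, N + 1) - N - 1) / (N - 1)
    cp = np.zeros(p + 1); cp[p] = 1.0   # coefficient vector selecting P_p
    cq = np.zeros(q + 1); cq[q] = 1.0   # coefficient vector selecting P_q
    Px = legval(coords, cp)             # P_p evaluated at every x_i
    Py = legval(coords, cq)             # P_q evaluated at every y_j
    scale = (2 * p + 1) * (2 * q + 1) / (N - 1) ** 2
    return scale * np.einsum('i,j,ij->', Px, Py, img)

# For a uniform image, L_00 = (sum of intensities) / (N-1)^2,
# and L_10 vanishes because the sample points are symmetric about 0.
L00 = legendre_moment(np.ones((5, 5)), 0, 0)   # = 25/16
```

Note that this is the coarse approximation the text warns about; the more accurate form (19) would replace the point values P_p(x_i)P_q(y_j) by the cell integrals h_pq of (20).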
3.6. Zernike Moments
Zernike polynomials are orthogonal series of basis functions normalized over a unit circle. The complexity of these polynomials increases with increasing polynomial order [31]. To calculate the Zernike moments, the image (or region of interest) is first mapped to the unit disc using polar coordinates, where the center of the image is the origin of the unit disc. Pixels falling outside the unit disc are not considered. The coordinates are then described by the length of the vector from the origin to the coordinate point. The mapping from Cartesian to polar coordinates is defined as follows:

(22) x = r\cos\theta, \quad y = r\sin\theta,

where

(23) r = \sqrt{x^{2}+y^{2}}, \quad \theta = \tan^{-1}\left(\frac{y}{x}\right).

An important attribute of the geometric representations of Zernike polynomials is that lower order polynomials approximate the global features of the shape/surface, while the higher order polynomials capture local shape/surface features. Zernike moments are a class of orthogonal moments and have been shown to be effective in terms of image representation.
Zernike introduced a set of complex polynomials which forms a complete orthogonal set over the interior of the unit circle, x^2 + y^2 \le 1. Let the set of these polynomials be denoted by V_{nm}(x,y). The form of these polynomials is as follows:

(24) V_{nm}(x,y) = V_{nm}(\rho,\theta) = R_{nm}(\rho)\exp(jm\theta),

where
n: positive integer or zero,
m: positive and negative integers subject to the constraints that n - |m| is even and |m| \le n,
ρ: length of the vector from the origin to the pixel (x,y),
θ: angle between vector ρ and x-axis in counterclockwise direction.
As mentioned above, the complex Zernike moment of order n with repetition m for a continuous image function f(x,y) is defined as follows:

(25) V_{nm} = \frac{n+1}{\pi}\iint f(x,y)\,V_{nm}^{*}(\rho,\theta)\,dx\,dy

in the xy image plane, where x^2 + y^2 \le 1 and * indicates the complex conjugate. Note that, for the moments to be orthogonal, the image must be scaled within a unit circle centered at the origin; in polar coordinates,

(26) V_{nm} = \frac{n+1}{\pi}\int_{0}^{2\pi}\int_{0}^{1} f(\rho,\theta)\,R_{nm}(\rho)\exp(-jm\theta)\,\rho\,d\rho\,d\theta.

The Zernike moment of the rotated image in the same coordinates is given by

(27) V_{nm}^{r} = \frac{n+1}{\pi}\int_{0}^{2\pi}\int_{0}^{1} f(\rho,\theta-\alpha)\,R_{nm}(\rho)\exp(-jm\theta)\,\rho\,d\rho\,d\theta.

By the change of variable \theta_{1} = \theta - \alpha,

(28) V_{nm}^{r} = \frac{n+1}{\pi}\int_{0}^{2\pi}\int_{0}^{1} f(\rho,\theta_{1})\,R_{nm}(\rho)\exp\left(-jm(\theta_{1}+\alpha)\right)\rho\,d\rho\,d\theta_{1} = \left[\frac{n+1}{\pi}\int_{0}^{2\pi}\int_{0}^{1} f(\rho,\theta_{1})\,R_{nm}(\rho)\exp(-jm\theta_{1})\,\rho\,d\rho\,d\theta_{1}\right]\exp(-jm\alpha) = V_{nm}\exp(-jm\alpha).

Equation (28) shows that Zernike moments have simple rotational transformation properties; each Zernike moment merely acquires a phase shift on rotation. This simple property leads to the conclusion that the magnitudes of the Zernike moments of a rotated image function remain identical to those before rotation. Thus, the magnitude of the Zernike moment, |V_{nm}|, can be taken as a rotation invariant feature of the underlying image function. The real-valued radial polynomial R_{nm}(\rho) is defined as follows:

(29) R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^{s}\,\frac{(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\,\rho^{n-2s},

where n - |m| is even and |m| \le n.
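The radial polynomial (29) and a discrete form of (25) can be sketched as follows (our own Python illustration; the unit-disc mapping, pixel-discarding rule, and phase-shift property all follow the definitions above, while the pixel-area factor approximating dx dy is our assumption):

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Radial polynomial R_nm(rho) of Eq. (29); requires n - |m| even."""
    m = abs(m)
    R = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s)
                * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        R = R + c * rho ** (n - 2 * s)
    return R

def zernike_moment(img, n, m):
    """Discrete version of Eq. (25): pixels outside the unit disc are
    discarded and dx dy is approximated by the pixel area."""
    N = img.shape[0]
    c = np.linspace(-1.0, 1.0, N)
    x, y = np.meshgrid(c, c)
    rho = np.hypot(x, y)
    mask = rho <= 1.0                       # keep only pixels inside the disc
    kernel = radial_poly(n, m, rho) * np.exp(-1j * m * np.arctan2(y, x))
    return (n + 1) / np.pi * np.sum(img * kernel * mask) * (2.0 / (N - 1)) ** 2
```

Per (28), |V_nm| is unchanged under rotation; a 90-degree rotation maps the sampling grid onto itself, so the magnitude match is exact up to floating-point error.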
Zernike moments may also be derived from conventional moments μ_pq as follows:

(30) Z_{nl} = \frac{n+1}{\pi}\sum_{k=l}^{n}\sum_{j=0}^{q}\sum_{m=0}^{l} (-i)^{m}\binom{q}{j}\binom{l}{m} B_{nlk}\,\mu_{k-2j-l+m,\;2j+l-m},

where q = (k-l)/2. Zernike moments may be more easily derived from rotational moments, D_{nk}, by

(31) Z_{nl} = \sum_{k=l}^{n} B_{nlk} D_{nk}.

When computing the Zernike moments, if the center of a pixel falls inside the border of the unit disc x^2 + y^2 \le 1, this pixel will be used in the computation; otherwise, the pixel will be discarded. Therefore, the area covered by the moment computation is not exactly the area of the unit disc. Advantages of Zernike moments can be summarized as follows:
The magnitude of Zernike moment has rotational invariant property.
They are robust to noise and shape variations to some extent.
Since the basis is orthogonal, they have minimum redundant information.
An image can be better described by a small set of its Zernike moments than by any other type of moments, such as geometric moments.
A relatively small set of Zernike moments can characterize the global shape of pattern. Lower order moments represent the global shape of pattern whereas the higher order moments represent the details.
Therefore, we choose Zernike moments as our shape descriptor in digit recognition process. Table 1 lists the rotation invariant Zernike moment features (F29–F64) and their corresponding numbers from order 0 to order 10 used for the present work.
Table 1: List of Zernike moments and their corresponding numbers of features from order 0 to order 10.

Order (n)    Zernike moments of order n with repetition m (V_nm)
0            V_00
1            V_11
2            V_20, V_22
3            V_31, V_33
4            V_40, V_42, V_44
5            V_51, V_53, V_55
6            V_60, V_62, V_64, V_66
7            V_71, V_73, V_75, V_77
8            V_80, V_82, V_84, V_86, V_88
9            V_91, V_93, V_95, V_97, V_99
10           V_10,0, V_10,2, V_10,4, V_10,6, V_10,8, V_10,10

Total number of moments up to order 10: 36
The defined features on the Zernike moments are only rotation invariant. To obtain scale and translation invariance, the digit image is first subjected to a normalization process using its regular moments. The rotation invariant Zernike features are then extracted from the scale and translation normalized image.
3.7. Complex Moments
The notion of complex moments was introduced in [32] as a simple and straightforward technique to derive a set of invariant moments. The two-dimensional complex moments of order (p,q) for the image function f(x,y) are defined by

(32) C_{pq} = \int_{a_{1}}^{a_{2}}\int_{b_{1}}^{b_{2}} (x+jy)^{p}(x-jy)^{q} f(x,y)\,dx\,dy,

where p and q are nonnegative integers and j = \sqrt{-1}. Some advantages of the complex moments can be described as follows:
When the central complex moments are taken as the features, the effects of the image’s lateral displacement can be eliminated.
A set of complex moment invariants can also be derived which are invariant to the rotation of the object.
Since the complex moment is an intermediate step between ordinary moments and moment invariants, it is relatively simpler to compute and more powerful than other moment features in many pattern classification problems.
The complex moments of order (p,q) are a linear combination, with complex coefficients, of all the geometric moments M_nm satisfying p+q = n+m. In polar coordinates, the complex moment of order (p+q) can be written as follows:

(33) C_{nl} = C_{pq} = \int_{0}^{2\pi}\int_{0}^{+\infty} \rho^{p+q}\,e^{j(p-q)\theta} f(\rho\cos\theta, \rho\sin\theta)\,\rho\,d\rho\,d\theta,

where p+q = n and p-q = l denote the order and repetition of the complex moment, respectively. If the complex moments of the original image and of the rotated image in the same polar coordinates are denoted by C_pq and C_pq^r, the relationship [33] between them is given as follows:

(34) C_{pq}^{r} = C_{pq}\,e^{-j(p-q)\theta},

where θ is the angle by which the original image is rotated. The complex moment features represent the invariant properties to lateral displacement and rotation. Based on the definition of moment invariants, we know that as the image is rotated, each complex moment goes through all possible phases of a complex number while its magnitude |C_pq| remains unchanged. If the exponential factor of the complex moment is canceled out, we will obtain its absolute invariant value, which is invariant to the rotation of the images. The rotation invariant complex moment features (F65–F130) and their corresponding numbers from order 0 to order 10 used for the present work are listed in Table 2.
Table 2: List of complex moments and their corresponding numbers of features from order 0 to order 10.

Order (n)    Complex moments (C_pq)    Complex moments of order n with repetition l (C_nl)
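A discrete analogue of (32) is straightforward to sketch (our own Python illustration; computing the moments about the centroid, as suggested above for eliminating lateral displacement, is our choice):

```python
import numpy as np

def complex_moment(img, p, q):
    """Discrete analogue of Eq. (32), taken about the image centroid so
    the effect of lateral displacement is removed."""
    M, N = img.shape
    x = np.arange(M, dtype=float).reshape(-1, 1)
    y = np.arange(N, dtype=float).reshape(1, -1)
    m00 = img.sum()
    # z = (x - xbar) + j(y - ybar); then C_pq = sum z^p conj(z)^q f(x,y)
    z = (x - (x * img).sum() / m00) + 1j * (y - (y * img).sum() / m00)
    return np.sum(z ** p * np.conj(z) ** q * img)

# |C_pq| is the rotation-invariant feature actually used (cf. Eq. (34)):
# rotation by angle theta only multiplies C_pq by exp(-j(p-q)theta).
img = np.zeros((10, 10))
img[2:7, 3:6] = 1.0
img[4, 4] = 3.0
c11 = complex_moment(img, 1, 1)   # real and nonnegative, since z * conj(z) = |z|^2
```

For p = q the kernel reduces to |z|^{2p} f(x,y), which is why C_pp is real; for p ≠ q only the magnitude survives rotation, matching Eq. (34).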
Finally, a feature vector consisting of 130 moment based features is calculated from each of the handwritten numeral images belonging to the five different scripts. The overall moment based feature set used in the present work is summarized in Table 3.
Table 3: Description of feature vector.

Serial number    List of moments                      Number of features
1                Geometric moment (F1–F5)             5
2                Moment invariant (F6–F12)            7
3                Affine moment invariant (F13–F18)    6
4                Legendre moment (F19–F28)            10
5                Zernike moment (F29–F64)             36
6                Complex moment (F65–F130)            66
                 Total                                130
4. Experimental Study and Analysis
In this section, we present detailed experimental results to illustrate the suitability of the moment based approach to handwritten digit recognition. All the experiments are implemented in MATLAB 2010 under a Windows XP environment on an Intel Core2 Duo 2.4 GHz processor with 1 GB of RAM and performed on gray-scale digit images. The accuracy, used as the assessment criterion for measuring the performance of the proposed system, is expressed as follows:

(35) \text{Accuracy Rate}\,(\%) = \frac{\text{Number of correctly classified digits}}{\text{Total number of digits}} \times 100.
4.1. Detailed Dataset Description
Handwritten numerals from five popular scripts, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu, are used in the experiments for investigating the effectiveness of the moment based feature sets as compared to conventional features. Indo-Arabic or Eastern-Arabic is widely used in the Middle East and also in the Indian subcontinent. Devanagari and Bangla are ranked as the top two most popular (in terms of the number of native speakers) scripts in the Indian subcontinent [34]. Roman, originally evolved from the Greek alphabet, is used all over the world. Also, Telugu, one of the oldest and most popular South Indian languages of India, is spoken by more than 74 million people [34]. It ranks third by the number of native speakers in India.
The present approach is tested on the database named as CMATERdb3, where CMATER stands for Center for Microprocessor Application for Training Education and Research, a research laboratory at Computer Science and Engineering Department of Jadavpur University, India, where the current research activity took place. db stands for database, and the numeric value 3 represents handwritten digit recognition database stored in the said database repository. The testing is currently done on four versions of CMATERdb3, namely, CMATERdb3.1.1, CMATERdb3.2.1, CMATERdb3.3.1, and CMATERdb3.4.1 representing the databases created for handwritten digit recognition system for four major scripts, namely, Bangla, Devanagari, Indo-Arabic, and Telugu, respectively.
Each of the digit images is first preprocessed using basic operations of skew correction and morphological filtering [25] and then binarized using an adaptive global threshold computed as the average of the minimum and maximum intensities in that image. Noisy pixels remaining in the binarized digit images are removed using a Gaussian filter [25]. The well-known Canny edge detection algorithm [25] is then applied for smoothing the edges of the binarized digit images. Finally, the bounding rectangular box of each digit image is separately normalized to 32 × 32 pixels. The database is made freely available on the CMATER website (http://www.cmaterju.org/cmaterdb.htm) and at http://code.google.com/p/cmaterdb/.
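The thresholding step of this pipeline can be sketched in pure Python as below (the function name is ours); the remaining steps of the pipeline (skew correction, Gaussian denoising, Canny edge smoothing, and 32 × 32 normalization) would normally rely on an image processing library and are not reproduced here:

```python
def binarize_adaptive_global(img):
    """Binarize a gray-scale image (list of rows of intensities) using the
    average of the minimum and maximum intensities as a global threshold."""
    flat = [p for row in img for p in row]
    threshold = (min(flat) + max(flat)) / 2.0
    return [[1 if p >= threshold else 0 for p in row] for row in img]

# Toy 3x3 gray-scale patch: min = 10, max = 250, so threshold = 130
img = [[10, 200, 250],
       [30, 130, 240],
       [20, 100, 220]]
print(binarize_adaptive_global(img))  # -> [[0, 1, 1], [0, 1, 1], [0, 0, 1]]
```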
A dataset of 3,000 digit samples is considered for each of the Devanagari, Indo-Arabic, and Telugu scripts. For each of these datasets, 2,000 samples are used for training and the remaining samples are used for testing. For handwritten Bangla digits, a dataset of 6,000 samples is formed by selecting 600 samples for each of the 10 digit classes; a training set of 4,000 samples and a test set of 2,000 samples are then chosen by taking an equal number of digit samples from each class. For Roman numerals, a dataset of 6,000 training samples is formed by random selection from the standard handwritten MNIST [7] training set of 60,000 samples; in the same way, 4,000 digit samples are selected from the MNIST test set of 10,000 samples. These digit samples are enclosed in a minimum bounding square and normalized to 32 × 32 pixels. Typical handwritten digit samples taken from the abovementioned databases are shown in Figure 3.
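The random selection of the Roman training and test subsets can be sketched as follows; this is an assumed sampling scheme with an arbitrary seed, since the paper does not specify how the random selection was performed:

```python
import random

random.seed(0)  # arbitrary seed for reproducibility; not specified in the paper

# Draw 6,000 of the 60,000 MNIST training indices and 4,000 of the
# 10,000 MNIST test indices, without replacement.
train_sample = random.sample(range(60000), 6000)
test_sample = random.sample(range(10000), 4000)

print(len(train_sample), len(test_sample))  # -> 6000 4000
```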
Samples of digit images taken from CMATER and MNIST databases written in five different scripts: (a) Indo-Arabic, (b) Bangla, (c) Devanagari, (d) Roman, and (e) Telugu.
4.2. Recognition Process
To realize the effectiveness of the proposed approach, comprehensive experimental tests are conducted on the five aforementioned datasets. For the Devanagari, Indo-Arabic, and Telugu scripts, a total of 6,000 numerals (2,000 from each script) have been used for training, whereas the remaining 3,000 numerals (1,000 from each script) have been used for testing. For the Bangla and Roman scripts, a total of 8,000 numerals (4,000 from each script) have been used for training, whereas the remaining 4,000 numerals (2,000 from each script) have been used for testing. The designed feature set has been individually applied to eight well-known classifiers, namely, Naïve Bayes, Bayes Net, MLP, SVM, Random Forest, Bagging, Multiclass Classifier, and Logistic. For the present work, the classifiers are configured with the following parameters:
Naïve Bayes: Naïve Bayes classifier: for details, refer to [35].
MLP: Learning Rate = 0.3, Momentum = 0.2, Number of Epochs = 1000, minerror = 0.02.
SVM: Support Vector Machine using radial basis kernel with (p=1): for details, refer to [36].
Random Forest: Ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by individual trees: for details, refer to [37].
Bagging: Bagging classifier: for details, refer to [38].
Logistic: LogitBoost is used with simple regression functions as base learner: for details, refer to [39].
The design parameters of classifiers are chosen as typical values used in the literature or by experience. The classifiers are not specifically tuned for the dataset at hand even though they may achieve a better performance with another parameter set, since the goal is to design an automated handwritten digit recognition system based on the chosen set of classifiers.
The digit recognition performances of the present technique using each of these classifiers and the corresponding success rates achieved at a 95% confidence level are shown in Figures 4(a)-4(b), respectively. It can be seen from Figure 4 that the highest digit recognition accuracies have been achieved by the MLP classifier, found to be 99.3%, 99.5%, 98.92%, 99.77%, and 98.8% on the Indo-Arabic, Bangla, Devanagari, Roman, and Telugu scripts, respectively. The performance analysis involves two parameters, namely, Model Building Time (MBT) and Recognition Time (RT). MBT is the time required to train the system on the given training samples, whereas RT is the time required to recognize the given test samples. The MBT and RT required by the abovementioned classifiers on all five databases are shown in Figures 5(a)-5(b).
Graph showing (a) recognition accuracies and (b) 95% confidence scores of the proposed handwritten digit recognition technique using eight well-known classifiers on digits of five different scripts.
Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digit recognition.
4.3. Statistical Significance Tests
Statistical significance testing is an essential way of validating the performance of multiple classifiers over multiple datasets. To do so, we have performed the safe and robust nonparametric Friedman test [40] with the corresponding post hoc tests on the Indo-Arabic script database. For the present experimental setup, the number of datasets (N) and the number of classifiers (k) are set to 12 and 8, respectively. These datasets are chosen randomly from the test set. The performances of the classifiers on the different datasets are shown in Table 4. On the basis of these performances, the classifiers are then ranked for each dataset separately: the best performing algorithm gets rank 1, the second best gets rank 2, and so on (see Table 4). In case of ties, average ranks are assigned to the tied classifiers.
Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses are used for performing the Friedman test).
Dataset    Naïve Bayes   Bayes Net   MLP         SVM         Random Forest   Bagging     Multiclass Classifier   Logistic
#1         86 (8)        92 (7)      100 (1)     99 (2.5)    97 (6)          99 (2.5)    98 (4.5)                98 (4.5)
#2         99 (3)        98 (5.5)    100 (1)     99 (3)      96 (7.5)        96 (7.5)    97 (5.5)                99 (3)
#3         98 (5)        94 (7)      100 (1)     93 (8)      99 (2.5)        98 (5)      99 (2.5)                98 (5)
#4         93 (8)        94 (7)      99 (1.5)    98 (3.5)    98 (3.5)        97 (5)      96 (6)                  99 (1.5)
#5         91 (8)        92 (7)      100 (1.5)   99 (3.5)    99 (3.5)        97 (6)      98 (5)                  100 (1.5)
#6         99 (2.5)      98 (4.5)    100 (1)     96 (7)      97 (6)          98 (4.5)    99 (2.5)                95 (8)
#7         91 (8)        94 (7)      100 (1)     99 (3)      99 (3)          99 (3)      98 (5.5)                98 (5.5)
#8         91 (8)        96 (7)      100 (1)     98 (4.5)    98 (4.5)        97 (6)      99 (2.5)                99 (2.5)
#9         92 (8)        94 (7)      100 (1.5)   100 (1.5)   97 (6)          99 (4)      99 (4)                  99 (4)
#10        99 (2.5)      97 (6.5)    100 (1)     99 (2.5)    97 (6.5)        97 (6.5)    98 (4)                  97 (6.5)
#11        94 (8)        96 (7)      100 (1)     98 (5)      99 (3)          98 (5)      99 (3)                  99 (3)
#12        98 (4)        98 (4)      100 (1)     97 (7)      98 (4)          99 (2)      97 (7)                  97 (7)
Mean rank  R1 = 6.08     R2 = 6.37   R3 = 1.125  R4 = 4.25   R5 = 4.67       R6 = 4.75   R7 = 4.33               R8 = 4.33
Let r_j^i be the rank of the jth classifier on the ith dataset. Then the mean rank of the jth classifier over all the N datasets is computed as follows:

(36) R_j = (1/N) ∑_{i=1}^{N} r_j^i.

The null hypothesis states that all the classifiers are equivalent, so their mean ranks R_j should be equal. To test it, the Friedman statistic [40] is computed as follows:

(37) χ²_F = [12N / (k(k + 1))] [∑_j R_j² − k(k + 1)²/4].

Under the null hypothesis, this statistic is distributed according to χ² with k − 1 (= 7) degrees of freedom. Using (37), the value of χ²_F is calculated as 30.46. From the table of critical values (see any standard statistics book), the critical value of χ² with 7 degrees of freedom at α = 0.05 (where α is the level of significance) is 14.0671. Since the computed χ²_F exceeds this critical value, the null hypothesis is rejected.
Iman and Davenport derived a less conservative statistic [40] using the following formula:

(38) F_F = (N − 1) χ²_F / (N(k − 1) − χ²_F).

F_F is distributed according to the F-distribution with k − 1 (= 7) and (k − 1)(N − 1) (= 77) degrees of freedom. Using (38), the value of F_F is calculated as 8.0659. The critical value of F(7, 77) for α = 0.05 is 2.147 (see any standard statistics book), which the computed F_F clearly exceeds. Thus, both the Friedman and the Iman-Davenport statistics reject the null hypothesis.
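The ranking step and the two test statistics can be sketched in a few lines of pure Python (the function names and structure are ours); the first row of Table 4 reproduces the published ranks:

```python
def average_ranks(row):
    """Rank one dataset's accuracies: best gets rank 1; tied values share
    the average of the ranks they span."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0        # mean of 1-based positions i..j
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks

def friedman_chi2(mean_ranks, n):
    """Friedman statistic of (37), from the mean ranks R_j over n datasets."""
    k = len(mean_ranks)
    return (12.0 * n / (k * (k + 1))) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0)

def iman_davenport_f(chi2, n, k):
    """Iman-Davenport correction of (38)."""
    return (n - 1) * chi2 / (n * (k - 1) - chi2)

# Dataset #1 of Table 4 reproduces the published ranks:
print(average_ranks([86, 92, 100, 99, 97, 99, 98, 98]))
# -> [8.0, 7.0, 1.0, 2.5, 6.0, 2.5, 4.5, 4.5]
```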
As the null hypothesis is rejected, a post hoc test known as the Nemenyi test [40] is carried out for pairwise comparison of the best and worst performing classifiers. The performances of two classifiers are significantly different if their mean ranks differ by at least the critical difference (CD), which is expressed as follows:

(39) CD = q_α √(k(k + 1)/(6N)).

For the Nemenyi test, the value of q_0.05 for eight classifiers is 3.031 (see Table 5(a) of [41]). So, using (39), the CD is calculated as 3.031 × √((8 × 9)/(6 × 12)), that is, 3.031. Since the difference between the mean ranks of the best and worst classifiers is much greater than the CD (see Table 4), we can conclude that there is a significant difference between the performances of the classifiers. For comparing all classifiers with a control classifier (here, MLP), we have applied the Bonferroni-Dunn test [40]. For this test, CD is calculated using the same expression (39), but the value of q_0.05 for eight classifiers is now 2.690 (see Table 5(b) of [41]), so the CD for the Bonferroni-Dunn test is 2.690 × √((8 × 9)/(6 × 12)), that is, 2.690. As the difference between the mean ranks of any classifier and MLP is always greater than this CD (see Table 4), the chosen control classifier performs significantly better than the other classifiers on the Indo-Arabic database. A graphical representation of the abovementioned post hoc tests for the comparison of the eight classifiers on Dataset #1 is shown in Figure 6. Similarly, it can be shown for the Bangla, Devanagari, Roman, and Telugu databases that the chosen classifier (MLP) performs significantly better than the other seven classifiers.
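Equation (39) can be evaluated directly; note that for k = 8 and N = 12 the square-root factor is exactly 1, which is why the CD coincides with q_α here (the function name is ours):

```python
import math

def critical_difference(q_alpha, k, n):
    """CD = q_alpha * sqrt(k(k + 1) / (6N)), as in (39)."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))

# k = 8 classifiers, N = 12 datasets: sqrt(8 * 9 / 72) = 1
print(critical_difference(3.031, 8, 12))  # Nemenyi          -> 3.031
print(critical_difference(2.690, 8, 12))  # Bonferroni-Dunn  -> 2.690
```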
Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test.
4.4. Comparison among Moment Based Features
For the justification of the feature set used in the present work, the diverse combinations of six different types of moments, namely, geometric moment (F1–F5), moment invariant (F6–F12), affine moment invariant (F13–F18), Legendre moment (F19–F28), Zernike moment (F29–F64), and complex moment (F65–F130), are compared by considering all the possible combinations. This is done for measuring the discriminating strength of the individual moment features and their combinations based on their complementary information. These can be listed as follows:
Geometric moment + moment invariant + affine moment invariant (F1–F18).
Legendre moment (F19–F28).
Geometric moment + moment invariant + affine moment invariant + Legendre moment (F1–F28).
Zernike moment (F29–F64).
Geometric moment + moment invariant + affine moment invariant + Legendre moment + Zernike moment (F1–F64).
Legendre moment + Zernike moment (F19–F64).
Complex moment (F65–F130).
Zernike moment + complex moment (F29–F130).
Geometric moment + moment invariant + affine moment invariant + Legendre moment + Zernike moment + complex moment (F1–F130).
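The feature-range bookkeeping behind these combinations can be sketched as follows; the dictionary and helper are hypothetical, introduced only to illustrate how the F1–F130 slices are concatenated:

```python
# Index ranges (1-based, inclusive) of the six moment families inside the
# 130-element feature vector, as enumerated above (names are ours):
FEATURE_RANGES = {
    "geometric":        (1, 5),     # F1-F5
    "moment_invariant": (6, 12),    # F6-F12
    "affine_invariant": (13, 18),   # F13-F18
    "legendre":         (19, 28),   # F19-F28
    "zernike":          (29, 64),   # F29-F64
    "complex":          (65, 130),  # F65-F130
}

def combine(feature_vector, families):
    """Concatenate the chosen moment families (hypothetical helper)."""
    out = []
    for name in families:
        lo, hi = FEATURE_RANGES[name]
        out.extend(feature_vector[lo - 1:hi])  # 1-based range -> 0-based slice
    return out

full = list(range(1, 131))  # stand-in for a 130-element feature vector

print(len(combine(full, ["legendre", "zernike"])))  # F19-F64  -> 46
print(len(combine(full, list(FEATURE_RANGES))))     # F1-F130  -> 130
```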
The graphical comparison of the corresponding numeral recognition accuracies achieved by MLP classifier over the same test set is shown in Figure 7. It can be observed from Figure 7 that the present combination of moment feature set outperforms all the other possible combinations.
Graphical comparison showing the recognition accuracies of all the possible combinations of moment based features achieved by MLP classifier.
4.5. Detailed Evaluation of MLP Classifier
In the present work, a detailed error analysis with respect to different parameters, namely, Kappa statistics, mean absolute error (MAE), root mean square error (RMSE), True Positive rate (TPR), False Positive rate (FPR), precision, recall, F-measure, Matthews Correlation Coefficient (MCC), and Area under ROC (AUC), is carried out. Tables 5–9 provide the said statistical measurements for handwritten numerals written in the Indo-Arabic, Bangla, Devanagari, Roman, and Telugu scripts, respectively.
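The per-class measures in Tables 5–9 can be derived from a confusion matrix in the usual one-vs-rest way; the sketch below (function name ours) computes TPR, FPR, precision, F-measure, and MCC for a toy two-class matrix:

```python
import math

def per_class_metrics(cm, c):
    """One-vs-rest TPR (= recall), FPR, precision, F-measure, and MCC for
    class c, given a confusion matrix cm[actual][predicted]."""
    n = len(cm)
    tp = cm[c][c]
    fn = sum(cm[c][j] for j in range(n)) - tp      # class-c samples missed
    fp = sum(cm[i][c] for i in range(n)) - tp      # others predicted as c
    tn = sum(sum(row) for row in cm) - tp - fn - fp
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return tpr, fpr, precision, f_measure, mcc

# Toy 2-class matrix: 9/10 of class 0 and 8/10 of class 1 are correct,
# giving TPR = 0.9 and FPR = 0.2 for class 0.
cm = [[9, 1],
      [2, 8]]
print(per_class_metrics(cm, 0))
```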
Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique on handwritten Indo-Arabic numerals (here, MAE means mean absolute error, RMSE means root mean square error, TPR means True Positive rate, FPR means False Positive rate, MCC means Matthews Correlation Coefficient, and AUC means Area under ROC).
Kappa statistics = 0.9922; MAE = 0.1758; RMSE = 0.293 (overall).

Class   TPR     FPR      Precision   Recall   F-measure   MCC      AUC
“0”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“1”     0.990   0.001    0.990       0.990    0.990       0.989    1.000
“2”     0.960   0.002    0.980       0.960    0.970       0.966    0.990
“3”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“4”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“5”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“6”     0.990   0.004    0.961       0.990    0.975       0.973    0.999
“7”     0.990   0.000    1.000       0.990    0.995       0.994    1.000
“8”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“9”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
Mean    0.993   0.0007   0.9931      0.993    0.993       0.9922   0.9989
Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique on handwritten Bangla numerals.
Kappa statistics = 0.9944; MAE = 0.0535; RMSE = 0.115 (overall).

Class   TPR     FPR      Precision   Recall   F-measure   MCC      AUC
“0”     1.000   0.001    0.990       1.000    0.995       0.994    1.000
“1”     1.000   0.002    0.980       1.000    0.990       0.989    1.000
“2”     1.000   0.001    0.990       1.000    0.995       0.994    1.000
“3”     0.990   0.000    1.000       0.990    0.995       0.994    1.000
“4”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“5”     0.990   0.001    0.990       0.990    0.990       0.989    1.000
“6”     0.970   0.000    1.000       0.970    0.985       0.983    1.000
“7”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“8”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“9”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
Mean    0.995   0.0005   0.995       0.995    0.995       0.9943   1.000
Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique on handwritten Devanagari numerals.
Kappa statistics = 0.988; MAE = 0.0342; RMSE = 0.0847 (overall).

Class   TPR      FPR      Precision   Recall   F-measure   MCC      AUC
“0”     0.969    0.002    0.984       0.969    0.977       0.974    0.999
“1”     0.969    0.000    1.000       0.969    0.984       0.983    1.000
“2”     1.000    0.005    0.956       1.000    0.977       0.975    1.000
“3”     0.985    0.003    0.970       0.985    0.977       0.975    0.999
“4”     1.000    0.002    0.985       1.000    0.992       0.992    1.000
“5”     0.985    0.000    1.000       0.985    0.992       0.991    1.000
“6”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“7”     0.985    0.000    1.000       0.985    0.992       0.991    1.000
“8”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“9”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
Mean    0.9893   0.0012   0.9895      0.9893   0.9891      0.9881   0.9998
Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique on handwritten Roman numerals.
Kappa statistics = 0.9975; MAE = 0.16; RMSE = 0.2716 (overall).

Class   TPR      FPR      Precision   Recall   F-measure   MCC      AUC
“0”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“1”     0.995    0.001    0.995       0.995    0.995       0.994    0.999
“2”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“3”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“4”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“5”     0.995    0.000    1.000       0.995    0.997       0.997    1.000
“6”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“7”     1.000    0.000    1.000       1.000    1.000       1.000    1.000
“8”     0.994    0.001    0.989       0.994    0.991       0.990    0.999
“9”     0.994    0.001    0.994       0.994    0.994       0.994    0.999
Mean    0.9978   0.0003   0.9978      0.9978   0.9977      0.9975   0.9997
Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique on handwritten Telugu numerals.
Kappa statistics = 0.9867; MAE = 0.0051; RMSE = 0.0449 (overall).

Class   TPR     FPR      Precision   Recall   F-measure   MCC      AUC
“0”     0.980   0.001    0.990       0.980    0.985       0.983    0.988
“1”     0.970   0.002    0.980       0.970    0.975       0.972    0.991
“2”     1.000   0.002    0.980       1.000    0.990       0.989    1.000
“3”     0.990   0.000    1.000       0.990    0.995       0.994    0.999
“4”     0.990   0.002    0.980       0.990    0.985       0.983    0.998
“5”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“6”     0.990   0.003    0.971       0.990    0.980       0.978    0.995
“7”     0.990   0.002    0.980       0.990    0.985       0.983    1.000
“8”     1.000   0.000    1.000       1.000    1.000       1.000    1.000
“9”     0.970   0.000    1.000       0.970    0.985       0.983    0.994
Mean    0.988   0.0012   0.9881      0.988    0.988       0.9865   0.9965
5. Conclusion
India is a multilingual and multiscript country comprising 12 different scripts. However, comparatively little work has been done on handwritten numeral recognition for Indic scripts. The following issues are observed in existing handwritten digit recognition systems: (1) most works have been carried out on limited datasets; (2) training and testing times are not reported in most of the works; (3) most of the work has been done for the Roman script because of the availability of larger datasets like MNIST; (4) recognition systems for Indic scripts are mainly focused on a single script; (5) some feature extraction methods are limited in scope, that is, they are local to a particular script/language rather than having global applicability. In this work, we have verified the effectiveness of a moment based approach to the handwritten digit recognition problem that includes geometric moments, moment invariants, affine moment invariants, Legendre moments, Zernike moments, and complex moments. The present scheme has been tested on five popular scripts, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu, and evaluated on the CMATER and MNIST databases using multiple classifiers. The MLP classifier is found to produce the highest recognition accuracies of 99.3%, 99.5%, 98.92%, 99.77%, and 98.8% on the Indo-Arabic, Bangla, Devanagari, Roman, and Telugu scripts, respectively. The results demonstrate that the moment based approach leads to higher accuracy than its counterparts. An important advantage of this feature extraction algorithm is that it is computationally less expensive than most of the published approaches, and the features are also very simple to implement. To improve the performance of the proposed system further, we need to investigate the sources of errors in more detail; potential moment features other than the presented ones may also exist.
To further improve the performance, possible future works are as follows: (1) although the moment based features perform superbly on the whole, complementary features like concavity analysis may help in discriminating between confusing numerals; for example, Indo-Arabic numerals “2” and “3” can be better separated by considering their original sizes before normalization. (2) For classifier design, it would be better to select model parameters (classifier structures) by cross validation rather than empirically, as done in our experiments. (3) Combining multiple classifiers may further improve the recognition accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors are thankful to the Center for Microprocessor Application for Training Education and Research (CMATER) and the Project on Storage Retrieval and Understanding of Video for Multimedia (SRUVM) of the Computer Science and Engineering Department, Jadavpur University, for providing infrastructure facilities during the progress of this work. The current work, reported here, has been partially funded by University with Potential for Excellence (UPE), Phase-II, UGC, Government of India.
References
[1] Singh P. K., Sarkar R., Nasipuri M., “Offline script identification from multilingual Indic-script documents: a state-of-the-art.”
[2] Liu C.-L., Nakashima K., Sako H., Fujisawa H., “Handwritten digit recognition: benchmarking of state-of-the-art techniques.”
[3] Gorgevik D., Cakmakov D., “Handwritten digit recognition by combining SVM classifiers,” Proceedings of the International Conference on Computer as a Tool (EUROCON '05), vol. 2, November 2005, Belgrade, Serbia, pp. 1393–1396.
[4] Garris M. D., Blue J. L., Candela G. T.
[5] Chen X., Liu X., Jia Y., “Learning handwritten digit recognition by the max-min posterior pseudo-probabilities method,” Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR '07), September 2007, Parana, Brazil, pp. 342–346.
[6] Labusch K., Barth E., Martinetz T., “Simple method for high-performance digit recognition based on sparse coding.”
[7] LeCun Y., Bottou L., Bengio Y., Haffner P., “Gradient-based learning applied to document recognition.”
[8] Wen Y., Lu Y., Shi P., “Handwritten Bangla numeral recognition system and its application to postal automation.”
[9] Mane V., Ragha L., “Handwritten character recognition using elastic matching and PCA,” Proceedings of the International Conference on Advances in Computing, Communication and Control, January 2009, Mumbai, India, ACM, pp. 410–415.
[10] Cruz R. M. O., Cavalcanti G. D. C., Ren T. I., “Handwritten digit recognition using multiple feature extraction techniques and classifier ensemble,” Proceedings of the 17th International Conference on Systems, Signals and Image Processing, June 2010, Rio de Janeiro, Brazil, pp. 215–218.
[11] Dhandra B. V., Benne R. G., Hangarge M., “Kannada, Telugu and Devanagari handwritten numeral recognition with probabilistic neural network: a script independent approach.”
[12] Yang J., Wang J., Huang T., “Learning the sparse representation for classification,” Proceedings of the 12th IEEE International Conference on Multimedia and Expo (ICME '11), July 2011, Barcelona, Spain, IEEE, pp. 1–6.
[13] Giménez A., Andrés-Ferrer J., Juan A., Serrano N., “Discriminative Bernoulli mixture models for handwritten digit recognition,” Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR '11), September 2011, Beijing, China, IEEE, pp. 558–562.
[14] Al-Ohali Y., Cheriet M., Suen C., “Databases for recognition of handwritten Arabic cheques.”
[15] Das N., Sarkar R., Basu S., Kundu M., Nasipuri M., Basu D. K., “A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application.”
[16] Akhtar M. S., Qureshi H. A., “Handwritten digit recognition through wavelet decomposition and wavelet packet decomposition,” Proceedings of the 8th International Conference on Digital Information Management (ICDIM '13), September 2013, Islamabad, Pakistan, IEEE, pp. 143–148.
[17] Dan Z., Xu C., “The recognition of handwritten digits based on BP neural network and the implementation on Android,” Proceedings of the 3rd International Conference on Intelligent System Design and Engineering Applications (ISDEA '13), January 2013, Hong Kong, IEEE, pp. 1498–1501.
[18] Babu U. R., Venkateswarlu Y., Chintha A. K., “Handwritten digit recognition using K-nearest neighbour classifier,” Proceedings of the World Congress on Computing and Communication Technologies (WCCCT '14), March 2014, Trichirappalli, India, pp. 60–65.
[19] AlKhateeb J. H., Alseid M., “DBN-based learning for Arabic handwritten digit recognition using DCT features,” Proceedings of the 6th International Conference on Computer Science and Information Technology (CSIT '14), March 2014, Amman, Jordan, pp. 222–226.
[20] Abdleazeem S., El-Sherif E., “Arabic handwritten digit recognition.”
[21] Ebrahimzadeh R., Jampour M., “Efficient handwritten digit recognition based on histogram of oriented gradients and SVM.”
[22] Gil A. M., Filho C. F. F. C., Costa M. G. F., “Handwritten digit recognition using SVM binary classifiers and unbalanced decision trees.”
[23] El Qacimy B., Kerroum M. A., Hammouch A., “Feature extraction based on DCT for handwritten digit recognition.”
[24] AL-Mansoori S., “Intelligent handwritten digit recognition using artificial neural network.”
[25] Gonzalez R. C., Woods R. E., Digital Image Processing.
[26] Hausdorff F., “Summationsmethoden und Momentfolgen. I” [Summation methods and moment sequences. I].
[27] Hu M.-K., “Visual pattern recognition by moment invariants.”
[28] Petrou M., Kadyrov A., “Affine invariant features from the trace transform.”
[29] Yap P.-T., Paramesran R., “An efficient method for the computation of Legendre moments.”
[30] Liao S. X., Pawlak M., “On image analysis by moments.”
[31] Khotanzad A., Hong Y. H., “Invariant image recognition by Zernike moments.”
[32] Abu-Mostafa Y. S., Psaltis D., “Recognitive aspects of moment invariants.”
[33] Abu-Mostafa Y. S., Psaltis D., “Image normalization by complex moments.”
[34] Languages of India, http://en.wikipedia.org/wiki/Languages_of_India, accessed August 2015.
[35] John G. H., Langley P., “Estimating continuous distributions in Bayesian classifiers,” Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence (UAI '95), August 1995, San Mateo, Calif, USA, pp. 338–345.
[36] Keerthi S. S., Shevade S. K., Bhattacharyya C., Murthy K. R. K., “Improvements to Platt's SMO algorithm for SVM classifier design.”
[37] Breiman L., “Random forests.”
[38] Breiman L., “Bagging predictors.”
[39] le Cessie S., van Houwelingen J. C., “Ridge estimators in logistic regression.”
[40] Singh P. K., Sarkar R., Das N., Basu S., Nasipuri M., “Statistical comparison of classifiers for script identification from multi-script handwritten documents.”
[41] Demšar J., “Statistical comparisons of classifiers over multiple data sets.”