An Efficient Primitive-Based Method to Recognize Online Sketched Symbols with Autocompletion

We present a new structural method for sketched symbol recognition that aims to recognize a hand-drawn symbol before it is fully completed. The method is invariant to scale, stroke number, and stroke order. We also present two novel descriptors that represent the spatial distribution between two primitives; one is invariant to rotation and the other is not. A symbol is then represented as a set of descriptors. The distance between the input symbol and a template is calculated by solving an assignment problem. Moreover, a fast nearest neighbor (NN) search algorithm is proposed for recognition. The method achieves a satisfactory recognition rate in real time.


Introduction
Sketch recognition is widely used in pen-based interaction, especially with the increasing popularity of devices with touch screens. It is a natural and efficient means of capturing information by automatically interpreting hand-drawn sketches, and it can be an important part of the early design process, where it helps people explore rough ideas and solutions in an informal environment. Sketch recognition has been successfully applied in education [1,2], engineering [2,3], design [4], and so on.
Sketch recognition refers to the recognition of predefined symbols or free-form drawings (e.g., an unconstrained circuit drawing); in the latter case, the recognition task is generally preceded by segmentation in order to locate individual symbols [5]. This paper focuses on the recognition of hand-drawn isolated symbols. This is a difficult problem because of the inherent imprecision and ambiguity of a freehand drawing [6]. Many challenges remain in terms of intraclass compactness and interclass separation because sketching is highly variable: different people have different drawing styles, which differ in stroke order, stroke number, nonuniform scaling, and complex local shifts. Moreover, the style of the same individual may differ from one time to another.
A practical application system should place few drawing constraints on users, so invariance to scale, stroke number, and stroke order is desirable. In many applications, a graphical symbol can be drawn in different orientations; hence, the recognition algorithm should also be rotation invariant when necessary. A related research area is handwriting recognition, such as handwritten digit and Chinese character recognition, for which many effective algorithms exist.
The term autocompletion refers to predicting sketched symbols before the drawing is completed [5]. It can be advantageous for users in several applications [7]. Autocompletion during sketching is desirable because it facilitates sketching by offering possible user-intended symbol classes and reduces user-originated errors by providing feedback immediately after each new input stroke.
However, autocompletion is a more difficult problem, because classification must rely on the partial information available before the drawing is completed. Firstly, a hand-drawn symbol is ambiguous if it appears as a subsymbol of more than one symbol class. Secondly, the partially drawn symbol and the fully completed one differ in visual appearance. Finally, the similarity between them changes as sketching proceeds.
Mathematical Problems in Engineering
This paper focuses on the recognition of hand-drawn isolated symbols before they are fully completed and presents a structural framework to recognize online sketched symbols. The key contributions of our method are as follows.
(1) We present a framework to recognize sketched symbols with autocompletion. It has inherent invariance to stroke number and order. It can work with a single (or possibly more) representative template for each symbol class, so it extends easily to new shapes. It obtains high recognition accuracy in real time even when the hand-drawn symbols are highly incomplete.
(2) We present two novel descriptors that represent the spatial distribution between two primitives. One is called DAR (directional adaptive region descriptor), which is not invariant to rotation. The other is called DZM (directional Zernike moment descriptor), which inherits rotation invariance from the traditional Zernike moments (ZM) descriptor. Both are statistical descriptors rather than topological relations (e.g., intersections, parallelism, etc.), so there is no need to recognize primitives, and the approach is independent of the primitive types.
(3) A simple and fast NN search algorithm is proposed for recognition. It reduces the runtime of structural matching and requires neither burdensome mathematical procedures nor complex data structures.
The rest of the paper is organized as follows. Section 2 contains a brief survey of the main approaches to sketched symbol recognition. In Section 3 we describe the proposed method for the recognition of partially drawn symbols. Section 4 evaluates the performance of our method. Lastly, we conclude the paper with some final remarks and a brief discussion of future work.

Related Work
According to a widely accepted taxonomy, sketched symbol recognition methods are classified into two main categories: structural and statistical [6].
Structural methods focus on building structural shape descriptions. The basic step is stroke segmentation using temporal and spatial features. A sketched symbol can then be represented as a tree or graph, and the similarity between two sketched symbols can be calculated by structural matching. Hammond and Davis [8] developed a hierarchical language to describe how diagrams are drawn, displayed, and edited, and then used the language to perform automatic symbol recognition. The attributed relational graph (ARG) is an excellent structural model for describing both the geometry and the topology of a symbol [9], and it is insensitive to orientation, scaling, and stroke order. The advantage of structural methods is their ability to distinguish similar shapes; the disadvantage is their sensitivity to the results of stroke segmentation. Furthermore, many approaches require the identification of the primitive type (e.g., line, arc, ellipse, etc.) and of the spatial (topological) relations between two primitives (e.g., intersections, parallelism, etc.). Because of the high computational complexity, approximate algorithms for structural matching are often used, such as the approximate graph matching algorithms presented in [9].
Statistical approaches look at the visual appearance of symbols, without stroke segmentation and primitive recognition. Typically, a number of features are extracted from the pixels of the unknown symbol and passed to a statistical classifier. Some shape descriptors, such as Zernike moments [10,11] and shape context [12], can be used to represent a sketched symbol. Kara and Stahovich [13] proposed an image-based recognizer using four template classifiers; in their method, polar coordinates were used to achieve rotation invariance. Ouyang and Davis [14] proposed a visual approach to sketched symbol recognition that uses a set of visual features capturing online stroke properties such as orientation and endpoint location. Almazán et al. [15] described a framework to learn a model of shape variability based on the Active Appearance Model (AAM) and proposed two types of BSM (Blurred Shape Model [16]) descriptors. Willems et al. [17] explored a large number of online features, which were sorted into three feature sets according to their level of detail. Delaye and Anquetil [18] presented a set of 49 features, called HBF49, for the representation of hand-drawn symbols; HBF49 can be used as a reference to evaluate a symbol recognition system. The advantage of statistical methods is their high robustness to noise and to different drawing styles, such as stroke order and direction. They also avoid the complex phase of primitive extraction.
Furthermore, most current research, such as the methods cited above [8-18], concerns fully completed symbols. To date, only a few systems supporting autocompletion have been introduced [5,7,19-21]. In these works, a symbol is usually represented as a spatial relation graph (SRG) [20] or a spatial division tree (SDT) [19], and the similarity or distance between the input symbol (probably incomplete) and the template is then calculated based on these representations. In [20] a syntactic approach is presented. Costagliola et al. proposed a graph-based method [6,7], which uses the ARG to represent a symbol and bases recognition on subgraph isomorphism; however, it is not invariant to rotation. Unlike these structural methods, Tirkaz et al. [5] proposed an image-based method whose framework is fully probabilistic. It also has no inherent invariance to rotation. Moreover, the approach relies on the observation that people tend to prefer certain stroke orderings over others. Hence, it is not completely invariant to stroke order but relies on the users' preferred order in the training data.

Our Approach
Our approach is based primarily on primitive-based matching. An overview of the recognition process is shown in Figure 1. In particular, we mainly show how to represent the symbols and how to calculate the distance between the hand-drawn symbol and a template. The method is organized as follows. Firstly, all strokes of a symbol are preprocessed (Section 3.1) and segmented into primitives (Section 3.2). Then a descriptor is extracted for every biprimitive, that is, every pair of primitives (Section 3.3); in this step, we propose two novel descriptors. Next, the distance (equivalently, dissimilarity) between the unknown symbol and a template is calculated by a bipartite graph matching (equivalently, optimal assignment) procedure (Section 3.4). Finally, a fast NN search algorithm is proposed for symbol recognition (Section 3.5).

Preprocessing.
The preprocessing of the input sketch directly facilitates pattern description and affects the quality of the description. Its tasks include noise elimination, shape scaling, and resampling. These operations are simple to perform and make the extracted features more stable for any type of input sketch.
The noise in input trajectories is due to erratic hand motions and the inaccuracy of digitization. Noise reduction techniques include smoothing, filtering, and wild point correction. As the quality of input devices steadily advances, trajectory noise becomes less influential and simple smoothing operations suffice.
To achieve invariance under scaling and translation, the coordinates of the stroke points are simply shifted and linearly scaled such that all points are enclosed in a standard box. In our experiments we set $x, y \in [0, 100]$; that is, the larger dimension of a symbol is scaled to 100 with the aspect ratio preserved [22].
Since online strokes are typically sampled at a constant temporal frequency, the distance between neighboring points varies with the pen speed, which produces more sampled points where the pen moves more slowly. In order to make feature extraction more reliable, we resample each stroke at a constant spatial distance. In our experiments the resampling interval is set to 1.0.
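The scaling and resampling steps can be illustrated with a minimal Python sketch. It assumes that a stroke is a list of (x, y) tuples and a symbol is a list of strokes; the function names `normalize` and `resample` are hypothetical, and noise reduction is omitted.

```python
import math

def normalize(strokes, size=100.0):
    """Shift and uniformly scale a symbol so its larger dimension spans [0, size]."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    min_x, min_y = min(xs), min(ys)
    extent = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # guard against a single point
    k = size / extent  # one factor for both axes keeps the aspect ratio
    return [[((x - min_x) * k, (y - min_y) * k) for x, y in s] for s in strokes]

def resample(stroke, step=1.0):
    """Resample one stroke so that consecutive points are `step` apart along the path."""
    if not stroke:
        return []
    out = [stroke[0]]
    prev = stroke[0]
    need = step  # path length still to travel before the next sample
    for cur in stroke[1:]:
        seg = math.dist(prev, cur)
        while seg >= need and seg > 0:
            t = need / seg
            prev = (prev[0] + t * (cur[0] - prev[0]), prev[1] + t * (cur[1] - prev[1]))
            out.append(prev)
            seg -= need
            need = step
        need -= seg
        prev = cur
    return out
```

Normalizing before resampling means that the interval of 1.0 is expressed in the units of the standard box.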

Corner Finding.
Our method works with primitives and not directly with strokes, so, as in most structural methods, corner finding is an essential step for extracting the primitives.
Primitives are regarded as simple graphical components, such as lines, arcs, and ellipses. The objective of corner finding is to decompose a stroke into primitives. There are many existing methods for corner finding, for example, IStraw [23], MergeCF [24], ClassSeg [25], SpeedSeg [26], QPBDP [27], DPFrag [28], and RankFrag [6,7]; it is another well-studied problem in sketch-based interfaces. Our main work, however, is to represent the symbols and calculate the distance between two symbols after corner finding, so we use the existing corner finding algorithm in [7], a revisited version of Ouyang and Davis's work in [29] that has been reported to perform satisfactorily. Instead of immediately trying to decide which points are corners, it repeatedly removes the point that is least likely to be a corner. The details of the method are available in [7]. In brief, the method works as follows.
(1) Initially, a number of equally spaced points are extracted from the stroke and are all added to a list of possible corners.
(2) Then the points which are least likely to be corners are iteratively removed from the list. The likelihood is evaluated through a cost function. For each point $p_i$ in the list, a cost value is computed as
$$\mathrm{cost}(p_i) = \sqrt{\mathrm{mse}(S_i; p_{i-1}, p_{i+1})} \times \mathrm{dist}(p_i; p_{i-1}, p_{i+1}), \tag{1}$$
where $S_i$ is the subset of points in the resampled stroke between point $p_{i-1}$ and point $p_{i+1}$ and $\mathrm{mse}(S_i; p_{i-1}, p_{i+1})$ is the mean squared error between the set $S_i$ and the line segment formed by $(p_{i-1}, p_{i+1})$. The term $\mathrm{dist}(p_i; p_{i-1}, p_{i+1})$ is the minimum distance between $p_i$ and the line segment formed by $(p_{i-1}, p_{i+1})$.
The point $p_i$ with the smallest cost is the candidate for removal; whenever a point is removed from the list, the costs in (1) are updated. At each iteration, the decision to remove a point is made by a binary classifier, which is trained beforehand. The classifier uses ten features, extracted from the strokes, points, or stroke fitting errors. Six of the features are described in [29], while the rest are introduced in [7]. The features are shown in Table 1.
(3) During classification, if the classifier decides that $p_i$ is not a corner, it removes the vertex and continues to the next elimination candidate. Otherwise, if it decides that $p_i$ is a corner, the process stops and all remaining vertices are returned as corners.
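The elimination loop in steps (1) to (3) can be sketched in Python as follows. This is an illustration under stated assumptions and not the implementation of [7]: the cost is assumed to be the square root of the mean squared error multiplied by the point-to-segment distance, and the trained ten-feature classifier is replaced by a hypothetical callable `is_corner` that sees only the cost.

```python
import math

def point_segment_distance(p, a, b):
    """Minimum distance from point p to the line segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    t = 0.0 if denom == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def removal_cost(points, prev_i, i, next_i):
    """Assumed cost of removing candidate i, given its neighbouring candidates."""
    a, b = points[prev_i], points[next_i]
    between = points[prev_i:next_i + 1]
    mse = sum(point_segment_distance(q, a, b) ** 2 for q in between) / len(between)
    return math.sqrt(mse) * point_segment_distance(points[i], a, b)

def find_corners(points, spacing, is_corner):
    """Drop the cheapest candidate repeatedly until is_corner accepts it."""
    cand = list(range(0, len(points), spacing))  # equally spaced initial candidates
    if cand[-1] != len(points) - 1:
        cand.append(len(points) - 1)  # always keep the last stroke point
    while len(cand) > 2:
        costs = [removal_cost(points, cand[k - 1], cand[k], cand[k + 1])
                 for k in range(1, len(cand) - 1)]
        k = min(range(len(costs)), key=costs.__getitem__) + 1
        if is_corner(costs[k - 1]):  # stand-in for the trained binary classifier
            break
        del cand[k]
    return cand  # indices of the stroke endpoints and the detected corners
```

For an L-shaped stroke resampled at unit spacing, a simple threshold such as `lambda cost: cost > 1.0` keeps only the two endpoints and the bend.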

Feature Extraction.
After corner finding, a sketched symbol is represented by a set of biprimitives. The main task of this step is to calculate the biprimitive descriptor, which describes how two primitives are spatially related within a symbol.
We propose two descriptors. One is called DAR (directional adaptive region descriptor). It is inspired by directional features, whose effectiveness in representing a character or symbol has been demonstrated in both handwritten character recognition and sketch recognition [30]. However, it is not invariant to rotation. The other is called DZM (directional Zernike moment descriptor), which incorporates local direction information into the ZMs. It inherits rotation invariance from the traditional ZMs.
After feature extraction, each symbol is represented by a set of the proposed descriptors, where each element is associated with a pair of primitives.

Table 1
Feature | Description | Reference
Cost | The cost of removing the vertex, from (1). | [29]
Diagonal | The diagonal length of the stroke's bounding box. | [29]
Ink density | The length of the stroke divided by the diagonal length. | [29]
Max distance | The distance to the farther of its two neighbors ($p_{i-1}$ or $p_{i+1}$), normalized by the distance between the two neighbors. | [29]
Min distance | The distance to the nearer of its two neighbors, normalized by the distance between the two. | [29]
Sum distance | The sum of the distances to the two neighbors, normalized by the distance between the two. | [29]
EllipseFit | A function calculated on the whole stroke, returning a real value between 0 and 1. The higher this value is, the more the stroke resembles an ellipse (or a circle). | [7]
PolyFit | A function calculated for the candidate corner point $p_i$ on the substroke whose endpoints are the previous and the next candidate corner ($p_{i-1}$ and $p_{i+1}$, resp.), returning a real value between 0 and 1. The higher this value is, the more the stroke resembles a polyline composed of the two segments $p_{i-1}p_i$ and $p_i p_{i+1}$. | [7]
Angle | The magnitude of the angle with vertex in the candidate corner point, calculated with respect to the previous and the next sampled points of the stroke. | [7]
SeqNumber | The sequence number of the iteration of the removal process. | [7]

DAR Descriptor.
Firstly, for each resampled point $p_i$, there is a local line defined by two consecutive points $(p_i, p_{i+1})$. Each local line is decomposed into components in standard directions. We employ four chain code directions. The major advantage is the independence of local stroke direction; for example, the decomposition of $(p_i, p_{i+1})$ is the same as that of $(p_{i+1}, p_i)$. If a local line lies between two neighboring standard directions, it is decomposed into two components in the two standard directions, as shown in Figure 2. Thus the local line is assigned to four directional planes, corresponding to the four chain code directions. The length of the local line component is assigned to the corresponding pixel in the plane. An example of the whole process is shown in Figure 3.
Secondly, each directional plane is partitioned into several uniform zones. To improve accuracy, our method uses two kinds of partition, as shown in Figure 3. In partition 1, the biprimitive region is partitioned into four subregions by two dashed lines, both of which pass through the centroid (marked as a red point in the figure) of the two primitives; the directions of the dashed lines are 0 and 90 degrees. In partition 2, the directions of the two dashed lines are 45 and 135 degrees, respectively.
Lastly, in each subregion, we accumulate the pixels. So for a biprimitive, we get two $4 \times 4 = 16$-dimensional vectors, named $V_1$ and $V_2$, corresponding to the two kinds of partition, respectively.
Given two biprimitives, $a$ and $b$, denote their DARs as $(V_1^a, V_2^a)$ and $(V_1^b, V_2^b)$, respectively. Then the DAR distance $\mathrm{dist}_{\mathrm{DAR}}(a, b)$ is defined in (2), where $\|\cdot\|$ means Euclidean distance.
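A compact way to see how the two 16-dimensional vectors arise is the following Python sketch. It rests on several assumptions that are not taken from the text: the component lengths are accumulated at segment midpoints instead of image pixels, the two-direction split uses the parallelogram rule, and the names `decompose`, `sector`, and `dar` are hypothetical. The DAR distance itself is not reproduced.

```python
import math

STEP = math.pi / 4  # angle between neighbouring chain-code directions (0, 45, 90, 135 degrees)

def decompose(dx, dy):
    """Split an undirected local line into components along its two neighbouring directions."""
    length = math.hypot(dx, dy)
    comps = [0.0] * 4
    if length == 0.0:
        return comps
    theta = math.atan2(dy, dx) % math.pi   # opposite stroke directions coincide
    q = int(theta // STEP)                 # lower neighbouring standard direction
    a = min(max(theta - q * STEP, 0.0), STEP)
    # Parallelogram rule (an assumption): v = c1 * d1 + c2 * d2 for unit directions d1, d2.
    comps[q % 4] += length * math.sin(STEP - a) / math.sin(STEP)
    comps[(q + 1) % 4] += length * math.sin(a) / math.sin(STEP)
    return comps

def sector(px, py, cx, cy, offset):
    """Index 0..3 of the 90-degree sector around the centroid; `offset` rotates the boundaries."""
    ang = (math.atan2(py - cy, px - cx) - offset) % (2 * math.pi)
    return min(int(ang // (math.pi / 2)), 3)

def dar(primitive_a, primitive_b):
    """Return (V1, V2), two 16-dimensional vectors for a pair of primitives (lists of points)."""
    pts = primitive_a + primitive_b
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    v1, v2 = [0.0] * 16, [0.0] * 16
    for prim in (primitive_a, primitive_b):
        for (x0, y0), (x1, y1) in zip(prim, prim[1:]):
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            r1 = sector(mx, my, cx, cy, 0.0)          # partition 1: lines at 0 and 90 degrees
            r2 = sector(mx, my, cx, cy, math.pi / 4)  # partition 2: lines at 45 and 135 degrees
            for d, c in enumerate(decompose(x1 - x0, y1 - y0)):
                v1[4 * r1 + d] += c
                v2[4 * r2 + d] += c
    return v1, v2
```

Because the decomposition folds opposite directions together, reversing the drawing direction of a primitive leaves both vectors unchanged.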

DZM Descriptor.
DZM incorporates local direction information into the ZMs, which by themselves represent only the spatial distribution of sample points. A shape is decomposed into several component channels, and the DZM descriptor consists of the ZMs from all channels. Figure 4 shows an example. Firstly, similar to DAR, for each resampled point $p_i$, there is a local line defined by its two neighboring points $(p_{i-1}, p_{i+1})$. The local directional angle $\theta_i$ for $p_i$ is defined as the angle between lines $(p_i, o)$ and $(p_{i-1}, p_{i+1})$, where $o$ is the centroid (marked as a red point in Figure 4). Invariance to rotation is intrinsic to the angle $\theta_i$, because rotating the symbol rotates both lines by the same amount.
Secondly, each local angle is decomposed into components in $N$ uniformly spaced standard angles, such as $\{0, (1/N)\pi, (2/N)\pi, \ldots, \pi\}$; each of them is also referred to as a channel later. If a local angle value lies between two standard angles, it is decomposed into two components in the two standard angles. This is similar to the process of directional decomposition in DAR, although the standard directions (angles) are different. Thus the biprimitive is decomposed into $N$ directional planes (subimages), corresponding to the $N$ standard angles. The membership degree (component length) of $\theta_i$ is assigned to the corresponding pixel in the plane. The planes are therefore invariant to rotation.
Finally, a set of $n$ ZMs is extracted on each plane ($n$ is the order of the ZMs). Eventually, the DZM descriptor consists of the ZMs from all $N$ planes.
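The rotation invariance of the local angle and its split into channels can be checked with a short Python sketch. The interpolation rule is an assumption (linear membership between the two neighbouring standard angles, with the angle folded into [0, pi) so that 0 and pi share a channel), the function names are hypothetical, and the Zernike moment computation on each plane is omitted.

```python
import math

def local_angle(prev_pt, pt, next_pt, centroid):
    """Angle in [0, pi) between the line (pt, centroid) and the local line (prev_pt, next_pt)."""
    a1 = math.atan2(centroid[1] - pt[1], centroid[0] - pt[0])
    a2 = math.atan2(next_pt[1] - prev_pt[1], next_pt[0] - prev_pt[0])
    return (a1 - a2) % math.pi  # a common rotation cancels in the difference

def channel_memberships(theta, n_channels):
    """Split an angle between its two neighbouring standard angles k * pi / n_channels."""
    step = math.pi / n_channels
    pos = theta / step
    k = int(math.floor(pos))
    frac = pos - k
    weights = [0.0] * n_channels
    weights[k % n_channels] += 1.0 - frac   # share of the lower standard angle
    weights[(k + 1) % n_channels] += frac   # share of the upper one (wraps around at pi)
    return weights
```

With a single channel every point receives membership 1, which matches the remark that the plain ZM descriptor is the one-channel special case.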

Symbol Matching.
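To facilitate the presentation, the input sketched symbol is denoted as $U$ (meaning unknown, probably incomplete), and the template symbol is denoted as $T$. The matching cost between the $i$th biprimitive $b_i^U$ of $U$ and the $j$th biprimitive $b_j^T$ of $T$ is the element $c_{i,j}$ of the matching cost matrix $\mathbf{C}$. (If the biprimitive numbers of $U$ and $T$, $|B_U|$ and $|B_T|$, are not equal, the cost matrix can be made square by adding "dummy" biprimitives to the smaller set.) In order to recognize incompletely drawn symbols, the matching cost is defined in (6), where the term $\mathrm{length}(\cdot)$ is the total length of the biprimitive and the variable $\lambda$ is an empirical parameter. The term $\mathrm{dist}(\cdot, \cdot)$ is the DAR distance in (2) or the DZM distance in (3), depending on whether the user requires rotation invariance.
In our experiments $\lambda$ can be set to 0.7. The matching cost of $(b_i^U, b_0^T)$, where $b_0^T$ is a dummy biprimitive of $T$, is set to $\lambda\,\mathrm{length}(b_i^U)$ as a penalty, because if $|B_U| > |B_T|$, it is likely that $U$ and $T$ belong to different symbol classes; the longer the biprimitive is, the larger the penalty is. Given the matching cost matrix $\mathbf{C}$ between $U$ and $T$, we want to minimize the total cost of matching
$$H(\pi) = \sum_i c_{i,\pi(i)} \tag{7}$$
subject to the constraint that the matching is one-to-one; that is, $\pi$ is a permutation. In this case, a biprimitive will be matched to a dummy whenever there is no real match. This is an instance of the square assignment (or weighted bipartite matching) problem, which can be solved in $O(n^3)$ time using the Hungarian algorithm. The minimum of $H$ in (7) is the distance between $U$ and $T$.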
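The padding with dummy biprimitives and the assignment step can be sketched in Python as below. Two points are assumptions rather than statements of the paper: a dummy added on the input side (the symbol is still incomplete) is given cost 0, and the descriptor distances are taken as a precomputed matrix `dist`. The function `hungarian` is a standard $O(n^3)$ potential-based implementation written for this illustration.

```python
def build_cost_matrix(dist, lengths_u, lam=0.7, dummy_u_cost=0.0):
    """dist[i][j]: descriptor distance between input biprimitive i and template biprimitive j.
    lengths_u[i]: total length of input biprimitive i. Returns a square, padded cost matrix."""
    n_u, n_t = len(dist), len(dist[0])
    n = max(n_u, n_t)
    cost = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < n_u and j < n_t:
                cost[i][j] = dist[i][j]
            elif i < n_u:
                cost[i][j] = lam * lengths_u[i]  # input biprimitive matched to a template dummy
            else:
                cost[i][j] = dummy_u_cost        # ASSUMPTION: input-side dummies are free
    return cost

def hungarian(cost):
    """Minimum-cost one-to-one assignment for a square matrix; returns (total, assignment)."""
    n = len(cost)
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)  # row and column potentials
    p, way = [0] * (n + 1), [0] * (n + 1)    # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return sum(cost[i][assignment[i]] for i in range(n)), assignment
```

Under the zero-cost assumption, an input with a single biprimitive that matches one template biprimitive well obtains a small distance even if the template has many more biprimitives, which is the behaviour autocompletion needs.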

Fast NN Search Algorithm for Symbol Recognition.
By calculating the distances between $U$ and each of the templates, NN techniques can be used for symbol recognition. However, because the pattern is not described as a vector, traditional speed-up strategies such as KD-trees and M-trees cannot be used, so the expensive cost of computation is a key issue that needs to be addressed. Inspired by the work in [31], we propose a simple and fast NN search algorithm for our framework of sketch recognition. The main idea is to efficiently reject many candidates based on a lower bound of the distance. Denote $L$ and $D$ as
$$L = \sum_i \min_j c_{i,j}, \qquad D = \min_\pi \sum_i c_{i,\pi(i)}.$$
That is, $L$ is the sum of the minimums in each row of the matching cost matrix $\mathbf{C}$, and $D$ is the real distance between $U$ and $T$ computed with the Hungarian algorithm. The calculation of $L$ is simpler and faster than that of $D$, and $L \le D$ holds, because each term $c_{i,\pi(i)}$ of an assignment is at least the minimum of its row. So $L$ can be seen as a lower bound of $D$. The fast NN search algorithm consists of the following two steps.
Step 1. Scanning all the templates sequentially, the lower bound of the distance between $U$ and the $k$th template ($T_k$, $k = 1, 2, \ldots$) is calculated (denoted as $L_k$). Meanwhile, the template with the minimum $L_k$ is recorded as $T_n$, where $n$ is the subscript of the template. $T_n$ is regarded as the initial candidate for the nearest neighbor. Then the real distance between $U$ and $T_n$ is computed using the Hungarian algorithm in (7) and regarded as the initial tentative minimum distance over all templates (denoted as $D_{\min}$).
Step 2. Scan each template again sequentially and compare its $L_k$ with the tentative minimum distance $D_{\min}$. If $L_k > D_{\min}$ holds, then $D_k \ge L_k > D_{\min}$, which means that $T_k$ is not the nearest neighbor and can be rejected immediately. Otherwise, the real distance $D_k$ between $T_k$ and $U$ is computed using the Hungarian algorithm. Then, if $D_k < D_{\min}$ holds, the candidate for the nearest neighbor is updated as $n = k$ and $D_{\min} = D_k$; otherwise $T_k$ is rejected.
After scanning the templates twice in the above two steps, the final $T_n$ is the nearest neighbor of $U$. The advantage of the search algorithm is that it reports the exact nearest neighbor, not an approximate one, and that it is very simple to implement, requiring no sophisticated data structures.
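The two-step search can be written in a few lines of Python. In this sketch each template is represented directly by its (already padded) cost matrix against the input, and the exact distance is found by brute force over permutations as a stand-in for the Hungarian algorithm, which is only practical for very small matrices; the function names are hypothetical.

```python
from itertools import permutations

def lower_bound(cost):
    """Sum of row minima: never larger than the optimal assignment cost."""
    return sum(min(row) for row in cost)

def exact_distance(cost):
    """Optimal assignment cost by brute force (stand-in for the Hungarian algorithm)."""
    n = len(cost)
    return min(sum(cost[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def fast_nn(cost_matrices, exact=exact_distance):
    """Return (index of nearest template, its distance, number of exact evaluations)."""
    # Step 1: cheap lower bounds for all templates; exact distance only for the most promising.
    bounds = [lower_bound(c) for c in cost_matrices]
    initial = min(range(len(bounds)), key=bounds.__getitem__)
    best, d_min = initial, exact(cost_matrices[initial])
    evaluations = 1
    # Step 2: a template whose lower bound already exceeds d_min cannot be the nearest.
    for k, c in enumerate(cost_matrices):
        if k == initial or bounds[k] > d_min:
            continue
        d_k = exact(c)
        evaluations += 1
        if d_k < d_min:
            best, d_min = k, d_k
    return best, d_min, evaluations
```

The third return value makes the saving visible: it counts how many templates needed the expensive exact matching.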

Evaluation and Discussion
The proposed method is tested on two datasets which have already been introduced in the literature. The symbols in the COAD dataset [5] and the COAD2 dataset [7] are two different subsets of the symbols used in the domain of Military Course of Action Diagrams. In total, the COAD dataset contains 620 samples from 20 classes of symbols drawn by eight users, while the COAD2 dataset contains 4520 sketched symbols drawn by eight users, belonging to 113 classes. Some sketched samples of COAD are shown in Figure 5, and the template symbols of COAD2 are shown in Figure 6.

Accuracy of Recognition with Autocompletion.
The accuracy of corner finding was reported in [7] as 99.65% for correct-corner accuracy and 99.20% for all-or-nothing accuracy. Thus we mainly test the accuracy of sketch recognition with autocompletion.
Firstly, five perfect samples per symbol class were chosen as the templates. Then the strokes of each original test symbol were reordered randomly to guarantee that the results were independent of stroke order. Additionally, when DZMs were used as biprimitive descriptors, the test symbols were rotated randomly to test rotation invariance. Next, for each symbol composed of $n$ primitives, the recognizer was launched $n$ times, each time with the first $1, \ldots, n$ primitives, representing the symbol at different completion stages. The top-$N$ recognition rate reports the percentage of times that the correctly matching template is in the top $N$ positions of the candidate list. The results are shown in Figure 7, where the recognition rate is plotted as a function of the number of primitives already drawn.
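The evaluation protocol over partially drawn symbols can be summarized by the following Python sketch. The recognizer is abstracted as a hypothetical callable `rank_classes` that returns candidate classes ordered from best to worst for a given prefix of primitives; the random stroke reordering and rotation are left out.

```python
from collections import defaultdict

def top_n_rates(test_symbols, rank_classes, n=3):
    """test_symbols: list of (primitives, true_class) pairs.
    Returns {number of primitives drawn: fraction of prefixes with the true class in the top n}."""
    hits, totals = defaultdict(int), defaultdict(int)
    for primitives, true_class in test_symbols:
        for m in range(1, len(primitives) + 1):  # one recognizer run per completion stage
            ranked = rank_classes(primitives[:m])
            totals[m] += 1
            hits[m] += true_class in ranked[:n]
    return {m: hits[m] / totals[m] for m in sorted(totals)}
```

Symbols with fewer primitives simply stop contributing at larger prefix lengths, so each rate is computed only over the symbols that reach that stage.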

Evaluation of the Proposed Descriptors.
In the proposed method, a symbol is represented as a set of descriptors, where each element is associated with a biprimitive. Figure 8 shows an example, in which a sketched symbol is represented as six biprimitives. Each sketched symbol (probably incomplete) belonging to the symbol class in Figure 8 will share one or more biprimitives with those in the figure. This idea is similar to the bag-of-features representation used in image retrieval research [32].
In order to evaluate the two proposed descriptors quantitatively, we compared them with two other descriptors. One is called PSP (primitive spatial relation), presented in [7]. It is an adaptation of the shape context descriptor defined in [33]. Like the proposed DAR, PSP has no rotation invariance. The other is the ZM descriptor, which is one of the best shape descriptors [34]. Like the proposed DZM, ZM is invariant to rotation.
Firstly, because the descriptors are calculated on biprimitives, we built two subdatasets. We randomly chose 20 and 100 classes of biprimitives from the COAD and COAD2 datasets, respectively. They were used to evaluate the descriptors on datasets of different sizes. Figure 9 shows a set or subset of biprimitive samples.
Then the recognition rates were calculated based on 5-fold cross validation under the nearest neighbor rule. Figure 10 shows the results. The proposed descriptors achieve better recognition performance.
Besides, the PSP descriptor is an adaptation of shape context, so it is computed in $O(n^2)$ time, where $n$ is the number of sample points. In contrast, the proposed DAR captures the distribution of every point by its directional features and can be computed in $O(n)$ time. The DZM incorporates local directional information into the ZM. The computational cost of DZM is $N$ times that of ZM, where $N$ is the number of channels; in particular, ZM is the special case of DZM with $N = 1$. Although this leads to additional computational cost, the proposed DZM is more expressive and discriminative [11].

Comparison with Other Methods.
Firstly, the related methods are briefly summarized. Then we compare our method with the state of the art in both recognition accuracy and response time.
(1) A Summary of the Methods for Autocompletion in the References. Our method is free from the identification of primitive types, unlike many other structural methods [19-21]. Table 2 briefly reviews the properties of the existing methods for recognizing partially drawn symbols. The table also compares the methods in terms of the knowledge required from users. In the proposed method, symbol matching requires only a low-cost heuristic algorithm, which is simpler than those of the other methods.
(2) Comparison with the ARG-Based Method in [7]. The ARG-based method presented in [7] relies on NP-complete subgraph isomorphism and gives an approximate solution. It does not support rotation invariance, so we compared it only with our method using the DAR descriptor. The results are shown in Figure 11. The performance of our recognizer (the solid lines) is better than that of the ARG-based method (the dashed lines), especially when the symbols are highly incomplete.
In addition, we compared the response time of our approach with that of the ARG-based method. The programming language was MATLAB and the CPU was an Intel Core at 3.10 GHz. The average running time for extracting the DAR of a biprimitive was 4.6 ms. Feature extraction can proceed incrementally during sketching, so most of the processing time is spent in the NN search procedure (including the procedure of symbol matching). The response time on the COAD2 dataset is shown in Figure 12 as a function of the total number of templates. The proposed fast NN search algorithm makes our method nearly twice as fast as the ARG-based method, and it is efficient enough to give real-time response for a dataset consisting of hundreds of symbols.
(3) Comparison with the Image-Based Method in [5]. Reference [5] proposed an image-based method to recognize sketched symbols with autocompletion. It adds partially drawn symbols to the training data and extracts global statistical features of the symbols, so it does not need to segment strokes into primitives. The main advantage of the image-based method is its robustness to different drawing styles and noise. However, the image-based method has some problems. Firstly, it clusters the partial and full symbols together based on their features, and it has two important parameters, the cluster number and the confidence threshold. The optimal parameters change with the number of symbol classes, so many experiments are needed to find them. Moreover, the accuracy of autocompletion relies on the number of partially drawn symbols in the training data; the autocompletion performance falls when there are not enough partially drawn symbols. However, the training data grow exponentially as the number of symbol classes grows [5]. So the image-based method is tested only on two small datasets in [5], which contain 20 and 14 classes of symbols, respectively. And if a new symbol class is added, the method must be trained again.
The recognition rates of the two methods are shown in Table 3. Although the accuracy of the proposed method for fully completed symbols is lower than that of [5], our accuracy for partially drawn ones is better. The main reason for the lower accuracy on completed symbols is that, in the COAD dataset, many symbols are subsymbols of other symbol classes. For instance, one symbol in Figure 5 is a subsymbol of another, so the former is easily misrecognized as an incomplete instance of the latter. In [5], by contrast, the clustering procedure benefits the recognition of fully completed symbols. So the proposed method and the image-based method are suitable for different applications. When the number of symbol classes is small and high accuracy for fully completed symbols is required, the image-based method is better because of its high robustness. But when the recognition algorithm is used to support immediate feedback while users are sketching, the proposed method is better because of its simple structural matching.

Conclusions
We have presented a new framework to recognize multistroke symbols with autocompletion. Firstly, strokes are segmented into primitives. Secondly, a symbol is represented as a set of biprimitives, each of which is represented by a shape descriptor. We propose two new descriptors, named DAR and DZM, respectively. Finally, the distance between an unknown symbol and a template is calculated by biprimitive matching. Moreover, a fast NN search algorithm is proposed, which significantly improves the search speed.
Our method is independent of stroke number and order, and there is no need to recognize primitives. Furthermore, invariance to rotation is achieved by using the DZM descriptor. The method can work with few templates for each symbol class and extends easily to new symbols.
However, a limitation of our method is that a primitive cannot be drawn using more than one stroke. Future work will address this shortcoming, inspired by the work in [35].

Figure 1: The flowchart of our approach.

Figure 2: Directional decomposition of a local line segment.

Figure 4: An example of DZM descriptor for a biprimitive (two channels).

The purpose of symbol matching is to compute the distance between $U$ and $T$. After feature extraction, a symbol is represented as a set of descriptors where each element is associated with a biprimitive. So $U$ and $T$ can be denoted as
$$B_U = \{b_i^U\},\ i = 1, 2, \ldots, |B_U|; \qquad B_T = \{b_j^T\},\ j = 1, 2, \ldots, |B_T|, \tag{4}$$
where $b_i^U$ is the $i$th biprimitive of $U$ and $|B_U|$ is the biprimitive number of $U$, and $b_j^T$ is the $j$th biprimitive of $T$ and $|B_T|$ is the biprimitive number of $T$. Let $\mathbf{C}$ denote the matching cost matrix between $U$ and $T$. Each element $c_{i,j}$ means the matching cost between $b_i^U$ and $b_j^T$. Consider
$$\mathbf{C} = [c_{i,j}] = [\mathrm{cost}(b_i^U, b_j^T)]. \tag{5}$$

Figure 5: A sample symbol from each class in the COAD dataset [5].

Figure 7: Recognition rate by the number of primitives drawn in two datasets (one panel is titled "Recognition using DZM in COAD2 dataset").

Figure 8: A sketched symbol is represented as a bag of biprimitives.

Figure 11: Comparing our method using DAR with the ARG-based method in COAD2 dataset.

Table 1: List of features for corner finding.

Table 2: The summary of methods for autocompletion.

Table 3: The comparison with the image-based method in COAD dataset.
The accuracy of the image-based method comes from [5] when the confidence threshold is set to 0.