Quantum Neural Network Based Machine Translator for Hindi to English

This paper presents a machine-learning-based machine translation system for Hindi to English, which learns from a semantically correct corpus. A quantum-neural pattern recognizer is used to recognize and learn the patterns of the corpus, using the part-of-speech information of each word, much as a human does. The system performs machine translation using the knowledge gained while learning from pairs of Devanagari-Hindi and English sentences. To analyze the effectiveness of the proposed approach, 2600 sentences were evaluated during simulation. The system achieves a BLEU score of 0.7502, a NIST score of 6.5773, a ROUGE-L score of 0.9233, and a METEOR score of 0.5456, which is significantly higher than Google Translation and Bing Translation for Hindi-to-English machine translation.


Introduction
Machine translation is one of the major fields of NLP and has attracted researchers' interest since computers were invented. Many machine translation systems, each with its pros and cons, are available for many languages. Researchers have also presented different approaches for computers to understand and generate languages both semantically and syntactically. Still, many languages present translation difficulties due to ambiguity in their words and grammatical complexity. A machine translator should address the key characteristic properties necessary to raise machine translation performance to the level of human translation. Most machine translators work on the alignment of words within a chunk (sentence).
This paper presents quantum-neural-based machine translation for Hindi to English. The quantum neural network (QNN) based approach increases accuracy during knowledge acquisition. Our main focus in this work is to show the significant increase in machine translation accuracy obtained during our research with pairs of Hindi and English sentences. The translation is performed using a new QNN-based approach that learns the patterns of a language from pairs of Hindi and English sentences.
Some researchers have performed machine translation (MT) using statistical machine translation (SMT). SMT applies pattern recognition to build automatic translation systems from available parallel corpora. Statistical machine translation needs an alignment mapping of words between the source and target sentences. On one hand, alignments are used to train the statistical models; on the other hand, during decoding they link the words of the source sentence to the words of the target sentence [1][2][3][4]. However, SMT methods suffer from the problem of word ordering. To overcome this problem and to increase accuracy, some researchers introduced syntax-based reordering for Chinese-to-English and Arabic-to-English translation [5].
Recently, several researchers have worked on Hindi using different methods of machine translation, such as example-based systems [5,6], rule-based systems [7], statistical machine translation [8], and parallel machine translation systems [9]. A. Chandola and Mahalanobis described the use of corpus patterns for alignment and reordering of words in English-to-Hindi machine translation using a neural network [10], but there is still considerable scope to develop an MT system for Hindi with higher accuracy. Some of the important works on Hindi are discussed in Section 2.
The main motivation behind the study of QNNs is the possibility of addressing unrealistic as well as realistic situations, which is not possible with a traditional neural network. A QNN learns and predicts more accurately and needs less computational power and training time than an artificial neural network. Researchers have introduced a novel neural network model based on the superposition of quantum states, having a multilevel transfer function [11][12][13].
The most important difference between a classical neural network and a QNN lies in their respective activation functions. In a QNN, a multilevel activation function is used as a substitute for the normal activation function. Each multilevel function consists of a summation of sigmoid functions shifted by a quantum difference [14].
In the QNN, the multilevel sigmoid function is employed as the activation function and is expressed as \( f(x) = \frac{1}{n_s}\sum_{r=1}^{n_s} \operatorname{sgm}\!\left(x - \theta^{r}\right) \), where \( n_s \) denotes the total number of multilevel positions in the sigmoid functions and \( \theta^{r} \) denotes the quantum interval of quantum level \( r \) [2].
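A minimal numeric sketch of this multilevel activation follows, assuming \( n_s \) sigmoid levels separated by a fixed quantum interval \( \theta^{r} = r\theta \) and a unit slope; these parameter choices are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilevel_sigmoid(x, n_s=3, theta=1.0):
    """Multilevel activation: the average of n_s sigmoids, each shifted
    by one more quantum interval theta (a sketch of the QNN transfer
    function; slope and spacing are assumed, not taken from the paper)."""
    return sum(sigmoid(x - r * theta) for r in range(1, n_s + 1)) / n_s
```

Note that with `theta=0` all levels coincide and the function reduces to an ordinary (shifted) sigmoid, which matches the paper's observation that the classical network is the special case of quantum interval zero.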

ANGLABHARTI-II Machine Translation
System. ANGLABHARTI-II, proposed in 2004, is a hybrid system based on a generalized example base (GEB) with a raw example base (REB). At the time of development, the authors established that altering the rule base is hard and the outcome is potentially random. The system includes an error-analysis component and a statistical language component for postediting. A preediting component can transform the entered sentence into a structure that can be translated without difficulty [5].

MATRA Machine Translation
System. MaTra, introduced in 2004, is based on the transfer approach using a frame-like structured representation. Rule-based and heuristic approaches are used to resolve ambiguities. A text classification module decides the category of a news item before the entered sentence is processed, and the system selects the appropriate dictionary based on the news domain. It requires human assistance in analyzing the input. The system also breaks complex English sentences into simple sentences and, after examining their structure, produces Hindi sentences. It was developed to work in the domains of news, annual reports, and technical phrases [7].

Proposed Machine Translation System for Hindi to English
The proposed machine translation (MT) system combines two approaches: a rule-based MT system and a quantum-neural-based MT system. The source language first enters the rule-based MT system, which recognizes and classifies the sentence category, and then passes through the QNN-based MT system, which refines the translation produced by the rule-based module. 2600 English sentences and their corresponding Devanagari-Hindi sentences are used. Each sentence, in both languages, consists of words such as a question word, noun, helping verb, negative word, verb, preposition, article, adjective, postnoun, adverb, and so forth. The training data is produced by an algorithm based on a simple deterministic grammar. The entire architecture of the proposed MT system is given in Figure 1.

Quantum Neural Architecture
As shown in Figure 2, the three-layer architecture of the QNN consists of an input layer, one layer of multilevel hidden units, and an output layer. In the QNN, a multilevel activation function is used as a substitute for the normal activation function. Each multilevel function consists of a summation of sigmoid functions shifted by a quantum difference, \( f(x) = \frac{1}{n_s}\sum_{r=1}^{n_s} \operatorname{sgm}\!\left(x - \theta^{r}\right) \), where \( n_s \) denotes the total number of multilevel positions in the sigmoid functions and \( \theta^{r} \) denotes the quantum interval of quantum level \( r \).
Here every node of the network represents three substates in itself, separated by the quantum interval.

Quantum Neural Implementation of Translation Rules
The strategy is first to identify and tag the parts of speech using Table 1 and then translate the English (source language) sentences literally into Devanagari-Hindi (target language) with no rearrangement of words. After this syntactic translation, the words are rearranged for accurate translation while retaining the sense of the translated sentence. The rules are based on parts of speech, not on meaning. To facilitate the procedure, distinctive three-digit codes based on parts of speech are assigned, as shown in Table 1. For the special case in which the input sentence and the resulting sentence have unequal numbers of words, the dummy numeric code .000 is used to obtain a matching word alignment.
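The coding scheme can be sketched as follows. Table 1 is not reproduced here, so the particular code values below are hypothetical placeholders; only the idea of three-digit part-of-speech codes plus the dummy code .000 for alignment comes from the text.

```python
# Hypothetical three-digit part-of-speech codes (the actual values in
# Table 1 are not reproduced here; these are placeholders).
POS_CODE = {
    "question_word": 0.100,
    "noun": 0.200,
    "helping_verb": 0.300,
    "verb": 0.400,
    "preposition": 0.500,
    "article": 0.600,
    "adjective": 0.700,
}
PAD = 0.000  # dummy code used when source and target lengths differ

def encode(tags, length):
    """Map a part-of-speech tag sequence to a fixed-length vector of
    numeric codes, padding with the dummy code .000 for alignment."""
    codes = [POS_CODE[t] for t in tags]
    return codes + [PAD] * (length - len(codes))
```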

The outcome of the neural network might not be a perfect integer; it should be rounded off, and a few basic error adjustments might be needed to obtain the output numeric codes. The network is even likely to rearrange the locations of the three-digit codes. In this way it learns the target-language knowledge needed for semantic rearrangement; it also helps in part-of-speech tagging by pattern matching and learns the grammar rules up to a level. For handling complex sentences, an algorithm is used. The algorithm first removes the interrogative and negative words; then, on the basis of conjunctions, it breaks the complex sentence into two or more simple sentences. After translating each simple sentence, the system rejoins all the subsentences and also restores the removed interrogative and negative words. The whole process is explained in Algorithm 1 in the next section.

Algorithm for Proposed QNN Based MT System for
Complex Sentences: QNNMTS (SENTENCE, TOKEN, N, LOC). Here SENTENCE is an array whose elements contain Hindi words. The parameter TOKEN contains the token of each word, and LOC keeps track of position. ICOUNT contains the number of interrogative words encountered in the sentence, NCOUNT the number of negative words, and CCOUNT the number of conjunctions (see Algorithm 1).
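The split-translate-rejoin steps of Algorithm 1 can be sketched as follows. The word lists and the `translate_simple` callable are hypothetical stand-ins (the actual QNN-based simple-sentence translator and the full word lists are not reproduced here); only the control flow follows the description in the text.

```python
# Assumed (transliterated) word lists; the real system uses Devanagari
# tokens and fuller lists, which are not reproduced here.
INTERROGATIVES = {"kya", "kaun", "kab"}
NEGATIVES = {"nahi", "na"}
CONJUNCTIONS = {"aur", "lekin"}

def qnnmts(sentence, translate_simple):
    """Sketch of Algorithm 1: remove interrogative and negative words,
    split the complex sentence at conjunctions into simple
    sub-sentences, translate each part, and return the translated
    parts together with the removed words (to be restored on rejoin)."""
    words = sentence.split()
    special = INTERROGATIVES | NEGATIVES
    removed = [w for w in words if w in special]
    kept = [w for w in words if w not in special]

    # Split at conjunctions into simple sub-sentences.
    parts, current = [], []
    for w in kept:
        if w in CONJUNCTIONS:
            parts.append(current)
            current = []
        else:
            current.append(w)
    parts.append(current)

    translated = [translate_simple(" ".join(p)) for p in parts if p]
    return translated, removed
```

A caller would then rejoin `translated` and reinsert the `removed` interrogative/negative words, as described in Steps 9-10 of the algorithm.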

Experiment and Results
All words in each language are assigned a unique numeric code on the basis of their part of speech. Experiments show that memorization of the training data occurs. The results in this section were obtained after training with 2600 Devanagari-Hindi sentences and their English translations. For each value of the quantum interval \( \theta \), 500 tests were performed with random data sets selected from the 2600 sentences; the dataset is divided in the ratio 4 : 3 : 3 into training, validation, and test sets. In Table 2, the values are the averages of the 500 tests performed for each value of \( \theta \). The best performance is obtained for \( \theta \) equal to one with respect to all parameters: the epochs (iterations) needed to train the network and the training, validation, and test performance in terms of mean squared error (MSE). This clearly shows that the QNN at \( \theta = 1 \) is much more efficient than the classical artificial neural network, which corresponds to \( \theta = 0 \). Table 2 shows the comparison between the QNN and the ANN with respect to the above performance parameters, from which we conclude that the QNN is better than the ANN for machine translation.
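The 4 : 3 : 3 split described above can be sketched as follows; the shuffling seed is an arbitrary assumption, not a detail from the paper.

```python
import random

def split_433(sentences, seed=0):
    """Shuffle the corpus and split it in the 4:3:3 ratio used in the
    experiments (training : validation : test)."""
    data = list(sentences)
    random.Random(seed).shuffle(data)
    n = len(data)
    a = (4 * n) // 10          # end of the training portion
    b = a + (3 * n) // 10      # end of the validation portion
    return data[:a], data[a:b], data[b:]
```

For the 2600-sentence corpus this yields 1040 training, 780 validation, and 780 test sentences.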

Evaluations and Comparison
This paper proposes a new machine translation method that exploits the advantages of the quantum neural network. 2600 sentences are used to analyze the effectiveness of the proposed MT system. The performance of the proposed system is comparatively analyzed against Google Translation (http://translate.google.com/) and Microsoft's Bing Translation (http://www.bing.com/translator) using various MT evaluation metrics: BLEU, NIST, ROUGE-L, and METEOR. For the evaluation, we translate the same set of input sentences with the proposed system, Google Translation, and Bing Translation and then evaluate the output obtained from each of the systems. The fluency check is done by n-gram analysis using the reference translations.

BLEU.
We use BLEU (bilingual evaluation understudy) to calculate the score of the system output. BLEU is a metric developed at IBM that uses modified n-gram precision to compare the candidate translation against reference translations [16].
A comparative bar diagram of the proposed system, Google, and Bing on the BLEU scale is shown in Figure 3. The bar diagram clearly shows that the proposed system has a remarkably high accuracy of 0.7502 on the BLEU scale, higher than both Bing and Google. The modified n-gram precision uses clipped counts, \( \text{Count}_{\text{clip}} = \min(\text{Count}, \text{Max\_Ref\_Count}) \); in other words, each word's count is truncated at its maximum count in the references. Let \( c \) denote the length of the candidate translation and \( r \) the reference length; the brevity penalty is \( \text{BP} = 1 \) if \( c > r \) and \( \text{BP} = e^{1 - r/c} \) if \( c \le r \). The BLEU score is then BP times the geometric mean of the n-gram co-occurrence precisions [17].
Figure 4 shows the comparative bar diagram of the proposed system, Google, and Bing on the NIST scale. The bar diagram clearly shows that the proposed system has a remarkably high accuracy of 6.5773 on the NIST scale, Bing shows an accuracy of 4.1744, and Google shows 4.955. The score is \( \text{NIST score} = \text{BP}_{\text{NIST}} \times \text{PRECISION}_{\text{NIST}} \), where the information weight Info gives more weight to words that are difficult to predict and is computed over the full set of references; theoretically, the precision has no upper limit. \( \text{BP}_{\text{NIST}} \) depends on LenHypo, the total length of the hypothesis, and LenRef, the average length of all references, which does not depend on the hypothesis.
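A sentence-level BLEU computation against a single reference can be sketched as follows; the tokenization (whitespace split) and the absence of smoothing are simplifying assumptions, not the exact evaluation setup used in the paper.

```python
import math
from collections import Counter

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram count is
    truncated at its count in the reference (Count_clip)."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = [tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)]
    ref_counts = Counter(ref)
    clipped = sum(min(c, ref_counts[g]) for g, c in Counter(cand).items())
    return clipped / max(len(cand), 1)

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch with one reference: brevity penalty
    times the geometric mean of clipped n-gram precisions."""
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # unsmoothed: any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return bp * math.exp(log_mean)
```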
7.3. ROUGE-L. ROUGE-L (recall-oriented understudy for gisting evaluation - longest common subsequence) calculates sentence-to-sentence resemblance using the longest common subsequence between the candidate translation and the reference translation; the longest common subsequence represents the similarity of the two translations. Let \( X \) denote the reference translation of length \( m \) and \( Y \) the candidate translation of length \( n \) [18]. The LCS-based recall and precision are \( R_{\text{lcs}} = \text{LCS}(X, Y)/m \) and \( P_{\text{lcs}} = \text{LCS}(X, Y)/n \), and the score is their harmonic mean: \( \text{Rouge-L} = \frac{2\, P_{\text{lcs}} R_{\text{lcs}}}{P_{\text{lcs}} + R_{\text{lcs}}} \). A comparative bar diagram of the proposed system, Google, and Bing on the ROUGE-L scale is shown in Figure 5; the proposed system achieves a remarkably high 0.9233 on this scale.
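The LCS-based score above can be sketched directly from its definition; whitespace tokenization is an assumption.

```python
def lcs_len(x, y):
    """Length of the longest common subsequence, by dynamic programming."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, 1):
        for j, yj in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if xi == yj \
                else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(x)][len(y)]

def rouge_l(candidate, reference):
    """ROUGE-L as the harmonic mean of LCS-based precision and recall."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)   # P_lcs
    r = lcs / len(reference)   # R_lcs
    return 2 * p * r / (p + r)
```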
7.4. METEOR. METEOR (metric for evaluation of translation with explicit ordering) was developed at Carnegie Mellon University. Figure 6 shows the comparative bar diagram of the proposed system, Google, and Bing on the METEOR scale. The bar diagram clearly shows that the proposed system has a remarkably high accuracy of 0.5456 on the METEOR scale, Bing shows an accuracy of 0.1384, and Google shows 0.2021. METEOR uses a harmonic mean of unigram precision \( P = m/c \) and unigram recall \( R = m/r \), where \( m \) denotes the number of unigram matches, \( c \) the number of unigrams in the candidate translation, and \( r \) the number of unigrams in the reference translation. \( F_{\text{mean}} \) is calculated by combining recall and precision via a harmonic mean that places equal weight on both: \( F_{\text{mean}} = 2PR/(P + R) \).
This measure accounts for congruity with respect to single words; to account for longer n-gram matches, a penalty is calculated from the alignment as \( p = 0.5\,(ch/m)^{3} \), where \( ch \) denotes the number of chunks and \( m \) the number of unigrams that have been mapped [19]. The final METEOR score (M-score) is then \( \text{Meteor-score} = F_{\text{mean}}\,(1 - p) \). (11) Experiments confirm that the accuracy achieved for machine translation based on the quantum neural network is better than that of the other bilingual translation methods.
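A simplified sketch of this score follows, using exact unigram matching only and approximating the chunk count by maximal runs of matched words in the candidate (the full metric also uses stemming/synonymy and requires chunk contiguity in the reference; those refinements are omitted here).

```python
def meteor_score(candidate, reference):
    """METEOR sketch: harmonic mean of unigram precision and recall
    (equal weights, as in the text), reduced by the chunk penalty
    p = 0.5 * (ch / m) ** 3."""
    ref_set = set(reference)
    m = sum(1 for w in candidate if w in ref_set)  # unigram matches
    if m == 0:
        return 0.0
    p = m / len(candidate)                         # precision P = m/c
    r = m / len(reference)                         # recall    R = m/r
    f_mean = 2 * p * r / (p + r)
    # Approximate chunk count: maximal contiguous runs of matched
    # words in the candidate (a simplification of true chunking).
    ch, prev = 0, False
    for w in candidate:
        cur = w in ref_set
        if cur and not prev:
            ch += 1
        prev = cur
    penalty = 0.5 * (ch / m) ** 3
    return f_mean * (1 - penalty)
```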

Conclusion
In this work we have presented a quantum neural network approach to the problem of machine translation. It demonstrates reasonable accuracy on various scores: a BLEU score of 0.7502, a NIST score of 6.5773, a ROUGE-L score of 0.9233, and a METEOR score of 0.5456. The accuracy of the proposed system is significantly higher than that of Google Translation, Bing Translation, and other existing approaches for Hindi-to-English machine translation. The accuracy has been improved significantly by incorporating QNN-based techniques for handling unknown words. It is also shown above that the system requires less training time than neural-network-based MT systems.