
The known-plaintext attack is a common method of cryptographic attack. Given a ciphertext for which part of the plaintext is known but the key is not, restoring the rest of the plaintext is an important part of the known-plaintext attack. This paper uses backpropagation neural networks to perform cryptanalysis on AES in an attempt to restore plaintext. The results show that the neural network can restore an entire byte with a probability of more than 40%, restore a byte with at most one bit in error with a probability of more than 63%, and restore more than half of the bits of a byte with a probability of more than 89%.

With the development of machine learning, research in cryptanalysis is no longer limited to traditional manual deciphering. Intelligent deciphering based on machine learning, and especially on neural networks, provides a new direction for cryptanalysis.

Neural-network-based cryptanalysis can address the shortcomings of traditional methods in terms of attack difficulty and the amount of data an attack requires. First, a neural network is an ideal black-box modeling tool: it can analyze and simulate a black-box problem (here, the cryptanalysis problem), approximate it arbitrarily closely, and finally yield an algorithm equivalent to the encryption and decryption algorithm, thereby achieving cryptanalysis. In this setting the neural network does not need to recover the key or the specific parameters of the algorithm; it only needs the ciphertext as input, and after training it can produce the corresponding plaintext with a certain probability. Second, as a machine learning method, a neural network can achieve the corresponding cryptanalysis results with a relatively small training set (plaintext-ciphertext pairs). Neural networks are therefore rapidly being recognized by the community as a method of cryptanalysis and are gradually becoming a new research direction in the field.

The application of neural network in cryptanalysis is mainly used for the global deduction of cryptographic algorithms [

In 2008, Bafghi et al. [

Inspired by [

Section

The cryptanalysis method based on neural networks uses the network's learning ability to train on known plaintext-ciphertext pairs. After training, the neural network can restore the plaintext from ciphertexts that do not belong to the training set. The corresponding system structure is shown in Figure

System structure of neural network cryptanalysis.

The ciphertexts are input into the neural network, and the outputs are compared with the known plaintexts to obtain an error function. The weights are corrected continually according to this error until the neural network is trained successfully, after which the plaintext can be restored with a certain probability. This attack is a global deduction: it yields a function equivalent to the original decryption algorithm without knowledge of the key. The analysis is similar to the global approximation method for multilayer feedforward neural networks in [

Training a neural network to an acceptable error rate requires expanding the network size, which in turn increases the time of each training cycle. A neural network has many parameters to set, such as the number of neurons, the number of hidden layers, and the training function; these parameters are specified in Section

The experiments were carried out in MATLAB R2016a with the Neural Network Toolbox. The equipment used was a MacBook; the relevant data are shown in Table

Experimental equipment.

Item | Value
---|---
Version | macOS High Sierra 10.13.1
Processor | 1.2 GHz Intel Core m5
RAM | 8 GB 1867 MHz LPDDR3
Graphics Card | Intel HD Graphics 515 1536 MB

We divide the ciphertext data into 64-bit blocks, giving 64 input channels and 64 output channels, and represent them in matrix form in MATLAB.
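As a minimal sketch (in Python rather than the paper's MATLAB code, and with a bit ordering chosen for illustration), the 64-bit blocking can be expressed as a 0/1 matrix with one row per bit channel and one column per block:

```python
# Split ciphertext bytes into 8-byte (64-bit) blocks and lay them out as a
# 0/1 matrix: 64 rows (bit channels) by one column per block, matching the
# 64-input / 64-output layout described above. Bit ordering is an assumption.

def bytes_to_bit_blocks(data: bytes) -> list[list[int]]:
    """Return matrix[bit][block]: value of bit `bit` in block `block`."""
    assert len(data) % 8 == 0, "pad the ciphertext to a multiple of 8 bytes"
    n_blocks = len(data) // 8
    matrix = [[0] * n_blocks for _ in range(64)]
    for b in range(n_blocks):
        block = data[8 * b : 8 * (b + 1)]
        for i, byte in enumerate(block):
            for j in range(8):  # most significant bit first
                matrix[8 * i + j][b] = (byte >> (7 - j)) & 1
    return matrix

blocks = bytes_to_bit_blocks(bytes(range(16)))  # two 64-bit blocks
print(len(blocks), len(blocks[0]))              # 64 rows, 2 columns
```

Each column of this matrix is then one training sample for the 64-channel network.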

In the experiment we use a feedforward backpropagation (BP) neural network and a cascaded feedforward BP neural network. In the feedforward BP network, each hidden layer and the output layer are fed only by the immediately preceding layer; in the cascaded feedforward BP network, each hidden layer and the output layer are connected to all preceding layers. The specific experimental structures are shown in Figures

System structure based on feedforward BP neural network.

System structure based on cascaded feedforward BP neural network.
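The connectivity difference between the two architectures can be sketched in a toy example (illustrative only, not the MATLAB toolbox implementation; the constant-weight `dense` layer is a stand-in so the sketch stays deterministic):

```python
# Contrast the two connectivity patterns: a plain feedforward network feeds
# each layer only the previous layer's output, while a cascaded feedforward
# network feeds each layer the outputs of ALL earlier layers plus the input.

def dense(x, out_dim, w=0.1):
    """Stand-in fully connected layer: each output is a weighted sum of all
    inputs, with a constant weight `w` for determinism."""
    return [w * sum(x) for _ in range(out_dim)]

def feedforward(x, layer_dims):
    for d in layer_dims:
        x = dense(x, d)            # input is the previous layer only
    return x

def cascaded_feedforward(x, layer_dims):
    history = list(x)              # running concatenation: input + all layers
    for d in layer_dims:
        out = dense(history, d)    # input is everything produced so far
        history += out
    return out

x = [1.0] * 4
print(feedforward(x, [3, 2]))
print(cascaded_feedforward(x, [3, 2]))
```

With identical layer sizes, the cascaded network's later layers see strictly more inputs, which is why its parameter count and its ability to reuse early features differ from the plain feedforward case.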

The relevant parameters of the neural network are set as follows:

We use four hidden layers with 128, 256, 256, and 128 neurons, respectively. Since the experiment in [

The training function is the scaled conjugate gradient method 'trainscg'. As a conjugate gradient method it occupies little storage space and is therefore suitable for large networks, and 'trainscg' is faster than the other conjugate gradient methods.

The error function used to correct the weights during training is the mean squared error (MSE). The error function is also called the loss function; the two most commonly used loss functions are cross entropy and mean squared error. Cross entropy characterizes the distance between two probability distributions and is mostly used in classification problems. Regression, by contrast, predicts specific numerical values, and restoring plaintext from known ciphertext is a regression problem; for regression the most commonly used loss function is the mean squared error.
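Concretely, for the bit channels used here the MSE is just the mean of squared differences between the network's outputs and the known plaintext bits (a one-line illustration, not the toolbox's internal code):

```python
# Mean squared error between predicted bit values and the known plaintext bits.
def mse(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Per-term squared errors are 0.01, 0.01, 0.04, 0.04, averaged over 4 terms.
print(mse([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0]))
```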

The maximum number of training cycles is set to 500, and there are three training stop conditions: (1) reaching the final training cycle number of 500; (2) reaching the acceptable mean squared error limit of 0.05; (3) reaching the maximum number of consecutive validation failures, 20.

(1) The training process first creates a neural network and selects the network layout, that is, determines the corresponding neurons in each layer, and then sets the training stop conditions.

(2) At the beginning of training, part of the data set is used as the training set and the rest as the test set. The ciphertexts in the training set are input, the corresponding results are output and XORed with the original plaintexts, and the weights are corrected continually according to the error function.

(3) When training ends, it is judged whether training succeeded. If the number of consecutive validation failures reaches 10 and the mean squared error is higher than 0.1, training fails; if the mean squared error is less than 0.1, training succeeds.

(4) If the training fails, reinitialize the network and return to step

(5) If the training is successful, enter the test process.
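The control flow of the training procedure above, with the three stop conditions (maximum 500 epochs, MSE goal 0.05, 20 consecutive validation failures), can be sketched as follows. The `0.9` decay stands in for a real weight-update epoch so the loop is runnable; the actual MSE trajectory depends on the network and data:

```python
# Training loop skeleton with the paper's three stop conditions. The epoch
# update and validation MSE are mock stand-ins for the real network training.

def train(max_epochs=500, mse_goal=0.05, max_val_fail=20):
    mse = 1.0
    best_val, val_fail = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        mse *= 0.9                      # stand-in for one weight-update epoch
        val_mse = mse * 1.1             # stand-in for validation-set MSE
        if val_mse < best_val:
            best_val, val_fail = val_mse, 0
        else:
            val_fail += 1               # validation stopped improving
        if mse <= mse_goal:             # stop condition (2): error goal met
            return epoch, mse, "goal reached"
        if val_fail >= max_val_fail:    # stop condition (3): early stopping
            return epoch, mse, "validation failures"
    return max_epochs, mse, "max epochs"  # stop condition (1)

print(train())
```

On failure (step 4 above) the network would be reinitialized and this loop rerun; on success, the test process begins.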

(1) After the training is completed, the ciphertext matrix

(2) The ciphertext

(3) The ciphertext

(4) Compare the output result
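The comparison in the test process (XOR each restored byte with the true plaintext byte, then count differing bits) can be sketched as a small histogram routine; this is how rows of the result tables below, which bin bytes by their number of error bits (0 through 8), could be produced:

```python
# XOR restored bytes against the true plaintext and histogram the bytes by
# how many of their 8 bits are wrong (0 = byte fully correct, 8 = all wrong).

def bit_error_histogram(restored: bytes, plaintext: bytes) -> list[int]:
    hist = [0] * 9                       # hist[k] = bytes with k wrong bits
    for r, p in zip(restored, plaintext):
        hist[bin(r ^ p).count("1")] += 1
    return hist

# One fully correct byte, one 1-bit error, one 2-bit error, one 8-bit error.
print(bit_error_histogram(b"\x00\x01\x03\xff", b"\x00\x00\x00\x00"))
```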

The experiments target the ECB and CBC modes of the AES-128 and AES-256 algorithms. According to the experimental results in [

In the first part of the experiment, we select the first 85% of the data set as the training set and the last 15% as the test set, and count the output results for each size of

Algorithms | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total
---|---|---|---|---|---|---|---|---|---|---
AES-128_ECB | 118 | 46 | 24 | 23 | 22 | 15 | 7 | 1 | 0 | 256
AES-128_CBC | 126 | 45 | 20 | 21 | 21 | 13 | 7 | 2 | 0 | 256
AES-256_ECB | 127 | 44 | 18 | 21 | 20 | 16 | 8 | 2 | 0 | 256
AES-256_CBC | 117 | 48 | 24 | 21 | 21 | 16 | 7 | 2 | 0 | 256

Algorithms | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total
---|---|---|---|---|---|---|---|---|---|---
AES-128_ECB | 128 | 31 | 22 | 24 | 25 | 16 | 8 | 2 | 0 | 256
AES-128_CBC | 104 | 48 | 29 | 23 | 24 | 17 | 9 | 2 | 0 | 256
AES-256_ECB | 151 | 24 | 15 | 18 | 22 | 15 | 9 | 2 | 0 | 256
AES-256_CBC | 104 | 35 | 32 | 33 | 24 | 18 | 7 | 3 | 0 | 256

Accuracy comparison of AES, DES, and 3DES.

Algorithms | AES-128_ECB | AES-256_ECB | AES-128_CBC | AES-256_CBC | DES_ECB | 3DES_ECB
---|---|---|---|---|---|---
Average epoch | 44 | 37 | 46 | 45 | 352 | 239
Average MSE | 0.0501 | 0.0536 | 0.0569 | 0.0611 | 0.0308 | 0.0372
Average error | 0.1768 | 0.2095 | 0.1699 | 0.1909 | 0.083 | 0.114
Experiment data source and size | [ | unexplained | | | |
Average size of data required for successful experiments | | | | | |
Results source | this paper | [ | [ | | |

Number of bytes all correct of feedforward BP neural networks.

Number of bytes all correct of cascaded feedforward BP neural networks.

Number of two consecutive bytes all correct of cascaded feedforward BP neural networks.

Number of four consecutive bytes all correct of cascaded feedforward BP neural networks.

Number of eight consecutive bytes all correct of cascaded feedforward BP neural networks.

It can be seen from Tables

It can be seen from Table

From Figure

Since the feedforward BP neural network is not good at restoring consecutive bytes, this paper only shows the restoration results of the cascaded feedforward BP neural network. It can be seen from Figures

This paper discusses the global deduction of the AES-128 and AES-256 algorithms. To this end, we use the feedforward BP neural network and the cascaded feedforward BP neural network to restore the ciphertexts of AES-128 and AES-256 in ECB and CBC modes. As a new method of cryptanalysis, the neural network can restore the corresponding plaintext from the ciphertext. In the restored results, the proportion of bytes restored entirely correctly is above 40%, and the proportion of bytes with more than half of their bits correct is above 89%. For the cascaded feedforward BP neural network, the error rate decreases and the number of entirely correct bytes increases as the training set grows. In this study of global deduction we found that different neural networks give different results and that different data types lead to different error rates. We will therefore continue to study different neural networks and consider the impact of more data types on the error rate.

As an emerging cryptanalysis method, plaintext restoration based on neural networks is still at the experimental stage, and applying it to AES is a new attempt. To get better results, we need to know more plaintext in advance and set more restrictions. Future research may need to exploit structural features specific to the AES algorithm to perform cryptanalysis more efficiently.

The data used to support the findings of this study are available from the corresponding author upon request.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work is supported by the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing; National Key Research and Development Project 2016-2018 (2016YFE0100600); Open Fund Project of Information Assurance Technology (KJ-15-008); State Key Laboratory of Cryptography and Science.