Most traditional cryptanalytic techniques require a great amount of time, known plaintexts, and memory. This paper proposes a generic cryptanalysis model based on deep learning (DL), in which the model tries to find the key of a block cipher from known plaintext-ciphertext pairs. We show the feasibility of DL-based cryptanalysis by attacking lightweight block ciphers such as simplified DES, Simon, and Speck. The results show that DL-based cryptanalysis can successfully recover the key bits when the keyspace is restricted to 64 ASCII characters. Traditional cryptanalysis is generally performed without the keyspace restriction, but it has succeeded only against reduced-round variants of Simon and Speck. With a text-based key, the proposed DL-based cryptanalysis can successfully break the full rounds of Simon32/64 and Speck32/64. The results indicate that DL technology can be a useful tool for the cryptanalysis of block ciphers when the keyspace is restricted.

Cryptanalysis of block ciphers has persistently received great attention, and many cryptanalytic techniques have emerged recently. Cryptanalysis based on the algebraic structure of the algorithm can be categorized as follows: differential cryptanalysis, linear cryptanalysis, differential-linear cryptanalysis, the meet-in-the-middle (MITM) attack, and the related-key attack [

However, conventional cryptanalysis might be impractical or difficult to generalize. First, most conventional cryptanalytic techniques require a great amount of time, known plaintexts, and memory. Second, although traditional cryptanalysis is generally performed without the keyspace restriction, only reduced-round variants of recent block ciphers have been successfully attacked. For example, no successful attack on full-round Simon or full-round Speck, which are families of lightweight block ciphers, is known [

This paper proposes a generic deep learning- (DL-) based cryptanalysis model that finds the key from known plaintext-ciphertext pairs and shows the feasibility of DL-based cryptanalysis by applying it to lightweight block ciphers. Specifically, we use deep neural networks (DNNs) to find the key from known plaintexts. The contribution of this paper is twofold. First, we develop a generic and automated cryptanalysis model based on DL. The proposed DL-based cryptanalysis is a promising step towards a more efficient and automated test for checking the safety of emerging lightweight block ciphers. Second, we perform DL-based attacks on lightweight block ciphers, such as S-DES, Simon, and Speck. To our knowledge, this is the first attempt to successfully break the full rounds of Simon32/64 and Speck32/64, although we restrict the block ciphers to a text-based key.

The remainder of this paper is organized as follows: Section

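Let p = (p_{0}, p_{1}, …, p_{n−1}) and c = (c_{0}, c_{1}, …, c_{n−1}) denote the plaintext and the ciphertext, where p_{i} is the ith bit of the plaintext and c_{i} is the ith bit of the ciphertext. Let k = (k_{0}, k_{1}, …, k_{m−1}) denote the key, where k_{i} is the ith bit of the key.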

ML has been successfully applied in a wide range of areas with significant performance improvement, including computer vision, natural language processing, speech, and game [

The studies on the ML-based cryptanalysis can be classified as follows: first, some studies focused on finding the characteristics of block ciphers by using ML technologies. The authors in [

Second, some studies used ML technologies to classify encrypted traffic or to identify the cryptographic algorithm from ciphertexts. In [

Third, other researchers have endeavoured to find out the mapping relationship between plaintexts, ciphertexts, and the key, but there are few scientific publications. The work in [

We consider (

The modern term “DL” is considered as a better principle of learning multiple levels of composition, which uses multiple layers to progressively extract higher level features from the raw input [_{r} pairs of _{t} pairs randomly generated with different keys. Finally, given

A schematic diagram of the DL-based cryptanalysis.

The structure of a DNN model for the cryptanalysis is shown in Figure _{i}, and the (_{j}, where _{i}, where

A DNN model.

The ML algorithm learns from data. Hence, we need to generate a data set for training and testing the DNN. Because the algorithms of modern block ciphers are publicly released, we can generate _{r} + _{s}, _{r} is used for training the DNN, and _{s} is used for testing the DNN. Let the

Data set.
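The data-generation step can be sketched as follows. This is a minimal illustration and not the simulator used in this work: `make_dataset`, `toy_encrypt`, and the sample counts are hypothetical, the choice of plaintext bits followed by ciphertext bits as the DNN input is an assumption, and `toy_encrypt` is a deliberately insecure XOR stand-in for a real block cipher.

```python
import random

def random_bits(n, rng):
    """Return a list of n independent uniform random bits."""
    return [rng.getrandbits(1) for _ in range(n)]

def toy_encrypt(plaintext, key):
    """Insecure stand-in cipher (for illustration only):
    XOR each plaintext bit with a key bit, cycling through the key."""
    return [p ^ key[i % len(key)] for i, p in enumerate(plaintext)]

def make_dataset(num_samples, block_bits, key_bits, encrypt, rng):
    """Generate samples, drawing a fresh random key for every sample.

    Each sample is (features, label): the features are the plaintext bits
    followed by the ciphertext bits, and the label is the key bits.
    """
    samples = []
    for _ in range(num_samples):
        key = random_bits(key_bits, rng)
        plaintext = random_bits(block_bits, rng)
        ciphertext = encrypt(plaintext, key)
        samples.append((plaintext + ciphertext, key))
    return samples

rng = random.Random(0)            # fixed seed so the sketch is reproducible
num_train, num_test = 500, 100    # small hypothetical sizes for the demo
data = make_dataset(num_train + num_test, 8, 10, toy_encrypt, rng)
train_set, test_set = data[:num_train], data[num_train:]
```

Any publicly specified cipher can be substituted for `toy_encrypt`, provided it accepts and returns bit lists in the same way.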

The goal of our model is to minimize the difference between the output of the DNN and the key. Let

The DNN learns the value of the parameter

After training, the performance of the DNN is evaluated in terms of the bit accuracy probability (BAP) of each key bit. Here, the BAP of the

Because the output of the DNN is a real number,

Then, the BAP of the
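As a minimal sketch of this evaluation, the code below quantizes the real-valued DNN outputs to bits and estimates the BAP of each key bit as the fraction of test samples for which the quantized output matches the true key bit. The threshold of 0.5 and the function names are assumptions made for illustration.

```python
def quantize(outputs, threshold=0.5):
    """Map real-valued DNN outputs to bits (the threshold is an assumption)."""
    return [1 if v >= threshold else 0 for v in outputs]

def bit_accuracy_probability(predictions, keys, threshold=0.5):
    """Estimate the BAP of each key bit over a test set.

    predictions: list of real-valued output vectors, one per test sample
    keys:        list of true key-bit vectors of the same shape
    Returns a list whose i-th entry is the fraction of samples for which
    the quantized i-th output equals the i-th key bit.
    """
    num_samples = len(keys)
    num_bits = len(keys[0])
    correct = [0] * num_bits
    for out, key in zip(predictions, keys):
        for i, (bit_hat, bit) in enumerate(zip(quantize(out, threshold), key)):
            correct[i] += bit_hat == bit
    return [c / num_samples for c in correct]

# Tiny hand-made example: 4 test samples, 3 key bits.
preds = [[0.9, 0.2, 0.6], [0.8, 0.7, 0.4], [0.1, 0.3, 0.7], [0.6, 0.4, 0.2]]
keys = [[1, 0, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0]]
bap = bit_accuracy_probability(preds, keys)
print(bap)  # [0.75, 0.5, 1.0]
```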

Assume that we have

By using the de Moivre–Laplace theorem, as
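The normal approximation can be illustrated with the following sketch. It rests on assumptions that are not necessarily the exact derivation used here: each known plaintext-ciphertext pair is taken to yield an independent guess of a key bit that is correct with probability `rho` (the BAP), and the bit is decided by majority vote. Under the de Moivre–Laplace theorem, the number of correct guesses among `num_pairs` is then approximately normal with mean `num_pairs * rho` and variance `num_pairs * rho * (1 - rho)`.

```python
import math
from statistics import NormalDist

def majority_success_probability(rho, num_pairs):
    """Normal approximation to the probability that more than half of
    num_pairs independent guesses (each correct w.p. rho) are correct."""
    mean = num_pairs * rho
    std = math.sqrt(num_pairs * rho * (1.0 - rho))
    return 1.0 - NormalDist(mean, std).cdf(num_pairs / 2.0)

def required_pairs(rho, target):
    """Smallest number of pairs for which the approximation above
    reaches the target success probability (requires 0.5 < rho < 1)."""
    if not 0.5 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0.5 and 1")
    z = NormalDist().inv_cdf(target)
    return math.ceil(z * z * rho * (1.0 - rho) / (rho - 0.5) ** 2)

# Example with a BAP only slightly above one half.
num_pairs = required_pairs(0.51603, 0.99)
print(num_pairs, math.log2(num_pairs))
```

Even a BAP only slightly above 0.5 is sufficient under this model, because the required number of pairs grows only as the inverse square of the margin `rho - 0.5`; in the example it is on the order of a few thousand pairs.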

For the data set, we generate the plaintext as any combination of a random binary digit, that is,

Characters used in the text key generation.

Occurrence probability in the text key generation.
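The text key generation and the resulting per-bit occurrence probability can be sketched as follows. The 64-character alphabet below (letters, digits, and two symbols) is a hypothetical choice made for illustration, as is the MSB-first 8-bit encoding; the actual character table may differ.

```python
import random
import string

# Hypothetical 64-character alphabet (an assumption for illustration).
ALPHABET = string.ascii_letters + string.digits + "+/"
assert len(ALPHABET) == 64

def text_key_bits(text):
    """Encode an ASCII string as bits, 8 bits per character, MSB first."""
    bits = []
    for ch in text:
        code = ord(ch)
        bits.extend((code >> shift) & 1 for shift in range(7, -1, -1))
    return bits

def random_text_key(num_chars, rng):
    """Draw num_chars characters uniformly from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(num_chars))

def occurrence_probability():
    """Probability that each of the 8 bit positions of a character is 1
    when characters are drawn uniformly from ALPHABET."""
    counts = [0] * 8
    for ch in ALPHABET:
        for i, b in enumerate(text_key_bits(ch)):
            counts[i] += b
    return [c / len(ALPHABET) for c in counts]

rng = random.Random(1)
key_text = random_text_key(8, rng)   # 8 characters give a 64-bit key
key_bits = text_key_bits(key_text)
probs = occurrence_probability()
print(key_text, probs)
```

Two properties follow directly from this construction. The most significant bit of every ASCII character is 0, so that bit position has an occurrence probability of 0 and can be predicted without any learning, which is why the accuracy of a key bit has to be judged relative to its occurrence probability. In addition, a 64-bit key built from 8 such characters has only 64^8 = 2^{48} possible values.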

Taking the occurrence probability of each key bit into consideration, the performance of finding the

The performance of the DL-based cryptanalysis is evaluated for the lightweight block ciphers: S-DES, Simon32/64, and Speck32/64, as shown in Table

Block ciphers used in case studies.

Item | S-DES | Simon | Speck |
---|---|---|---|
Block size (bits) | 8 | 32 | 32 |
Key size (bits) | 10 | 64 | 64 |
Rounds | 2 | 32 | 22 |
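For concreteness, the Speck column of the table (32-bit block, 64-bit key, 22 rounds) corresponds to the following sketch of Speck32/64 encryption, written from the public specification of the cipher. The function names are our own, and the rotation constants 7 and 2 are those of the 16-bit-word variant. The example input is the test vector published by the designers of the cipher.

```python
MASK = 0xFFFF        # 16-bit words: a 32-bit block is two words
ROUNDS = 22
ALPHA, BETA = 7, 2   # rotation amounts for the 16-bit-word variant

def ror(x, r):
    """Rotate a 16-bit word right by r bits."""
    return ((x >> r) | (x << (16 - r))) & MASK

def rol(x, r):
    """Rotate a 16-bit word left by r bits."""
    return ((x << r) | (x >> (16 - r))) & MASK

def speck_round(x, y, k):
    """One round: modular addition, round-key XOR, rotation, word XOR."""
    x = ((ror(x, ALPHA) + y) & MASK) ^ k
    y = rol(y, BETA) ^ x
    return x, y

def expand_key(key_words):
    """Derive the 22 round keys from key_words = (l2, l1, l0, k0),
    the word order used in the published test vector."""
    round_keys = [key_words[3]]
    l = [key_words[2], key_words[1], key_words[0]]
    for i in range(ROUNDS - 1):
        # The key schedule reuses the round function with the counter i.
        l_next, k_next = speck_round(l[i], round_keys[i], i)
        l.append(l_next)
        round_keys.append(k_next)
    return round_keys

def encrypt(plaintext, key_words):
    """Encrypt a block given as a pair of 16-bit words (x, y)."""
    x, y = plaintext
    for k in expand_key(key_words):
        x, y = speck_round(x, y, k)
    return x, y

ciphertext = encrypt((0x6574, 0x694C), (0x1918, 0x1110, 0x0908, 0x0100))
print([hex(w) for w in ciphertext])  # ['0xa868', '0x42f2']
```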

In order to train the DNN with an acceptable loss rate, it is necessary to expand the network size. Hyperparameters, such as the number of hidden layers, the number of neurons per hidden layer, and the number of epochs, should be tuned to minimize a predefined loss function. The traditional approaches to hyperparameter optimization are grid search and random search. Other approaches include Bayesian optimization, gradient-based optimization, evolutionary optimization, and population-based training [^{−5}. If the number of epochs is greater than 3000, the error becomes small, and when it reaches 5000, the error is sufficiently minimized, so we fix the number of epochs at 5000. Consequently, the parameters used for training the DNN models are as follows: the number of hidden layers is 5, the number of neurons at each hidden layer is 512, and the number of epochs is 5000. We use the adaptive moment (Adam) algorithm for the learning rate optimization of the DNN.
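To give a sense of the size of such a network, the sketch below builds a fully connected model with 5 hidden layers of 512 neurons and counts its parameters. Several details are assumptions made for illustration: the input is taken to be the plaintext bits concatenated with the ciphertext bits, the hidden activation is ReLU, the output layer is linear, and the loss is the mean squared error against the key bits. Training with Adam is omitted; in practice it would be delegated to a DL framework.

```python
import random

def layer_sizes(input_bits, output_bits, hidden_layers=5, hidden_units=512):
    """Layer widths of the network, from the input to the output."""
    return [input_bits] + [hidden_units] * hidden_layers + [output_bits]

def count_parameters(sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def init_network(sizes, rng):
    """Random Gaussian weights (scaled by fan-in) and zero biases."""
    net = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        net.append((weights, [0.0] * n_out))
    return net

def forward(net, x):
    """Forward pass: ReLU on hidden layers, linear output (assumptions)."""
    for idx, (weights, biases) in enumerate(net):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if idx < len(net) - 1:
            x = [max(0.0, v) for v in x]
    return x

def mse(output, key_bits):
    """Mean squared error between the DNN output and the key bits."""
    return sum((o - k) ** 2 for o, k in zip(output, key_bits)) / len(key_bits)

# 32 plaintext bits + 32 ciphertext bits in, 64 key bits out.
full_sizes = layer_sizes(64, 64)
print(full_sizes, count_parameters(full_sizes))  # ... 1116736

# A much smaller network keeps the pure-Python forward pass fast.
small_net = init_network(layer_sizes(16, 10, hidden_layers=2, hidden_units=8),
                         random.Random(0))
output = forward(small_net, [1.0] * 16)
```

Under these assumptions, the network for a 32-bit block and a 64-bit key has about 1.1 million trainable parameters.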

The powerful “

Implemented DL-based cryptanalysis simulator.

S-DES, designed for educational purposes in 1996, has properties and a structure similar to those of DES but has been simplified to make it easier to perform encryption and decryption [_{K}, which involves both permutation and substitution operations and depends on a key input; a simple permutation function that switches the two halves of the data; the function _{K} again; and finally a permutation function that is the inverse of the initial permutation (_{K}.

Because the length of the key is limited, a brute-force attack, also known as an exhaustive key search, is feasible. Some previous work presented an approach for breaking the key using a genetic algorithm and particle swarm optimization [

For training and testing the DNN, we generate _{r} = 50000 and _{s} = 10000. The plaintext is any combination of a random binary digit, that is,

Figure _{1}, _{5}, and _{8}, are quite vulnerable to the attack and the key bit of _{6} is the safest. Because the minimum value of the BAP is

Bit accuracy in the S-DES with a random key.

Figure _{2}, _{5}, _{8}) in the text key and (_{1}, _{5}, _{8}) in the random key. The key bit of _{6} is the safest both in the text key and in the random key.

Deviation in the S-DES.

Lightweight cryptography is a rapidly evolving and active area, driven by the need to provide security or cryptographic measures for resource-constrained devices such as mobile phones, smart cards, RFID tags, and sensor networks. Simon and Speck are families of lightweight block ciphers publicly released in 2013 [

As of 2018, no successful attack on full-round Simon or full-round Speck of any variant is known. The authors in [^{63} and the data complexity of 2^{32}. The work in [^{63} and the data complexity of 2^{31}.

For training and testing the DNN, we generate ^{64}, the actual keyspace is reduced to 2^{48}. For training, we use

Figure

Bit accuracy probability of the Simon32/64 with a random key.

Figure _{2} with a probability of 0.99. The minimum value of BAPs is 0.51603 at _{3}, which is greater than

Bit accuracy probability and deviation of the Simon32/64 with a text key.

Figure

Bit accuracy probability of the Speck32/64 with a random key.

Figure _{3}, which is greater than

Bit accuracy probability and deviation of the Speck32/64 with a text key.

We developed a DL-based cryptanalysis model and evaluated the performance of the DL-based attack on the S-DES, Simon32/64, and Speck32/64 ciphers. The DL-based cryptanalysis can successfully find the text-based encryption key of these block ciphers. When a text key is applied, the DL-based attack broke the S-DES cipher with a success probability of 0.9 given 2^{8.08} known plaintexts. That is, the DL-based cryptanalysis reduces the search space by nearly a factor of 8. Moreover, when a text key is applied to the block ciphers, the DL-based cryptanalysis finds the linear approximations between the plaintext-ciphertext pairs and the key, and therefore, it successfully broke the full rounds of Simon32/64 and Speck32/64. With a text key and a success probability of 0.99, the DL-based cryptanalysis finds 56 bits of the Simon32/64 key with 2^{12.34} known plaintexts and 56 bits of the Speck32/64 key with 2^{12.33} known plaintexts. Because the developed DL-based cryptanalysis framework is generic, it can be applied to attacks on other block ciphers without change.

The drawback of our proposed DL-based cryptanalysis is that the keyspace is restricted to text-based keys. Although uncommon, a text-based key can be used for encryption. For example, a login password entered with the keyboard is text based if the input data are not hashed. Modern cryptographic functions are designed to look random and to be very complex, and therefore, it can be difficult for ML to find meaningful relationships between the inputs and the outputs if the keyspace is not restricted. Hence, our approach limited the keyspace to text-based keys, and the proposed DL-based cryptanalysis could successfully break the 32-bit variants of the Simon and Speck ciphers. When the keyspace was not limited, the DL-based cryptanalysis failed to attack the block ciphers. In the future, the accuracy of ML is expected to improve with the development of algorithms and hardware. Moreover, advanced data transformation that efficiently maps cryptographic data onto ML data will help the DL-based cryptanalysis to be performed without the keyspace restriction.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (nos. 2019R1F1A1058716 and 2020R1F1A1065109).