A Secure Implementation of a Symmetric Encryption Algorithm in White-Box Attack Contexts

In a white-box context, an adversary has total visibility of the implementation of the cryptosystem and full control over its execution platform. As a countermeasure against the threat of key compromise in this context, a new secure implementation of the symmetric encryption algorithm SHARK is proposed. The general approach is to merge several steps of the round function of SHARK into table lookups, blended by randomly generated mixing bijections. We prove the soundness of the implementation of the algorithm and analyze its security and efficiency. The implementation can be used in web hosts, digital rights management devices, and mobile devices such as tablets and smartphones. We explain how the design approach can be adapted to other symmetric encryption algorithms with a slight modification.


Introduction
There are three main models of the capability of an adversary to attack a cryptosystem [1]. The first is the black-box model. It is the traditional attack model, in which an adversary only has access to the input and corresponding output of a cryptosystem. The limited information available means that an attack is usually difficult and time-consuming. The second is the grey-box model, where a leakage function is present. In such an attack model, the adversary can deploy side-channel cryptanalysis techniques. Several grey-box models can be defined because of the large variety of leakage functions. The third is the white-box model, where the adversary has total visibility of the cryptographic software implementation and full control over its execution. One could refer to the white-box model as the worst-case model. The white-box model is used to analyze algorithms that are running in an untrustworthy environment, that is, an environment in which applications are subject to attacks from the execution platform.
Typical white-box attack contexts include (1) a server or PC on which an attacker has obtained "root" or "admin" privileges, (2) a mobile agent running on a malicious host, (3) an outdoor wireless sensor network node under an attacker's control, and (4) digital rights management (DRM) components in cable television applications.
Secure computing in a white-box attack context (WBAC) is a challenge because, as discussed in [2,3], (1) fully-privileged attack software shares a host with cryptographic software and has complete access to the implementation of algorithms, (2) dynamic execution (with instantiated cryptographic keys) can be observed, and (3) internal details of cryptographic algorithms are both completely visible and alterable.
Standard designs and implementations of symmetric encryption algorithms were not intended to operate in a white-box attack context where their execution could be observed. In fact, cryptographic models usually assume that endpoints, hosts, and hardware protection tokens are to be trusted. This is not the case in a white-box attack. By actively monitoring standard cryptographic functions or memory dumps, an attacker can even extract the cryptographic keys. This is extremely dangerous when using a symmetric encryption scheme because the decryption algorithm uses the same key as the encryption algorithm. In response to this security challenge, we propose a new, secure white-box implementation of a symmetric encryption algorithm that reduces the risk of keys being compromised. Note that the terms "white-box encryption algorithm" and "white-box implementation of an encryption algorithm" are used interchangeably throughout the paper.
The remainder of this article is organized as follows. Section 2 describes recent advances in white-box cryptography. A new white-box symmetric encryption algorithm is proposed in Section 3, followed by a security analysis in Section 4. Section 5 analyzes the complexity and performance of the new algorithm and includes a suggested implementation approach and some experimental results. In Section 6 we conclude with a discussion of our findings and ideas for future research.

Recent Advances in White-Box Cryptography
White-box cryptography provides protection to software implementations of encryption algorithms that may be executed on an untrustworthy host or other white-box attack contexts.
The main constraint is that the result must be directly executable. Chow et al. proposed white-box implementations of DES [2] and AES [3]. Their original proposal is that these two algorithms could be used in digital rights management (DRM) applications to satisfy the need to protect digital information content from unauthorized access, use, and dissemination. In [4], Jacob et al. proposed that a fault injection attack, where an attacker injects errors into the program environment during execution, could defeat some obfuscation methods. They presented a cryptanalysis of a variant of the algorithm in [2] that does not have external encodings. Link and Neumann implemented white-box DES and triple-DES algorithms along the lines of Chow et al., with alterations that improved the security of the key [5]. Their system is secure against the previously published attacks on the implementation of Chow et al. and their own adaptation of a statistical bucketing attack. In 2007, Wyseur et al. [6] and Goubin et al. [7] independently cryptanalyzed all existing obfuscation methods of DES. Both attacks were based on truncated differential cryptanalysis. Goubin et al. presented an attack that analyzed the first rounds of the white-box DES implementations, while Wyseur et al. presented an attack that works on the internal information.
In [8], Billet et al. presented an efficient and practical attack against the obfuscated AES implementation proposed by Chow et al. in [3]. It used negligible memory and had a worst-case time complexity of $2^{30}$. In 2009, Michiels et al. improved the attack so that it could be deployed on a generic class of white-box implementations [9]. In 2011, Karroumi proposed a new white-box implementation that uses dual representations of AES [10]. In [12], Xiao and Lai proposed a secure implementation of white-box AES after a detailed analysis of the attack technique in [8] on the AES implementation proposed in [3]. In their scheme, the obfuscation works on at least two cells of an AES state, so the attacker cannot split it into smaller parts and remove them using the attack technique proposed in [8]. The time complexity of Xiao and Lai's white-box AES implementation is $2^{24}$. It is slower than Chow et al.'s implementation, which has a time complexity of $2^{20}$ [3]. Furthermore, the size of Xiao and Lai's white-box AES implementation is 20502 KB. In 2012, Mulder et al. [13] presented a cryptanalysis of a white-box AES implementation based on Xiao and Lai's idea. They applied the linear equivalence algorithm presented by Biryukov et al. in [14] as a building block. The cryptanalysis efficiently extracts the AES key with a work factor of approximately $2^{32}$. Furthermore, the size of Xiao and Lai's implementation could still be reduced.

A Novel White-Box Symmetric Encryption Algorithm
In this section, we propose a new white-box symmetric encryption algorithm based on SHARK [15]. Our general approach is to merge several steps of each round function of SHARK into table lookups, blended by randomly generated mixing bijections. We use techniques from [10,12] to obtain the obfuscated implementation.

The Symmetric Encryption Algorithm, SHARK.
SHARK is a six-round substitution-permutation network that alternates a key mixing stage with linear and nonlinear transformation layers. We can split each round of the SHARK algorithm into three distinct layers: a nonlinear layer of substitution boxes, a diffusion layer, and a key addition layer. An interpolation attack can break five rounds of a modified version of SHARK [16], but the security of the six-round SHARK cipher is acceptable for many applications. Let $S : GF(2^8) \to GF(2^8)$ be the substitution box. Then the nonlinear layer can be defined as $\gamma : GF(2^8)^8 \to GF(2^8)^8$, which applies $S$ to each of the eight bytes of the state. Let $\theta : GF(2^8)^8 \to GF(2^8)^8$ be the linear transformation corresponding to the diffusion layer. Then there exists a matrix $M$ such that $\theta(a) = b \Leftrightarrow b = M \cdot a$.
Furthermore, let $k^r$ be the round key of the $r$th round and let $\sigma[k^r] : GF(2^8)^8 \to GF(2^8)^8$ be the key addition mapping. Now, the symmetric encryption algorithm SHARK with encryption key $K$ is defined as follows:
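The three layers described above can be illustrated with a small sketch. This is a toy round only: the S-box, the diffusion matrix, the reduction polynomial, and the layer order are all placeholders assumed for illustration, not the actual SHARK components, and the names `TOY_SBOX`, `TOY_MATRIX`, and `toy_round` are hypothetical.

```python
# Toy SHARK-style round on a 64-bit state held as 8 bytes.
# All constants below are illustrative assumptions, NOT the real SHARK values.

REDUCTION_POLY = 0x11B  # assumption: the AES polynomial, used only as an example


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8) modulo REDUCTION_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= REDUCTION_POLY
        b >>= 1
    return result


# Placeholder S-box: x -> 7x + 3 mod 256 is a permutation (7 is odd),
# but it is cryptographically weak and stands in for the real S-box.
TOY_SBOX = [(x * 7 + 3) % 256 for x in range(256)]

# Placeholder 8 x 8 matrix over GF(2^8); it is not claimed to be MDS.
TOY_MATRIX = [[((i + j) % 8) + 1 for j in range(8)] for i in range(8)]


def nonlinear_layer(state):
    """Apply the S-box to each byte independently."""
    return [TOY_SBOX[b] for b in state]


def diffusion_layer(state):
    """Multiply the state vector by the matrix over GF(2^8)."""
    out = []
    for row in TOY_MATRIX:
        acc = 0
        for coeff, byte in zip(row, state):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out


def key_addition(state, round_key):
    """XOR the round key into the state."""
    return [s ^ k for s, k in zip(state, round_key)]


def toy_round(state, round_key):
    # Assumed layer order: key addition, then substitution, then diffusion.
    return diffusion_layer(nonlinear_layer(key_addition(state, round_key)))


example_state = [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF]
example_key = [0x10, 0x32, 0x54, 0x76, 0x98, 0xBA, 0xDC, 0xFE]
example_output = toy_round(example_state, example_key)
```

The diffusion layer is linear over XOR and the key addition is its own inverse. These are the two properties that make it possible to fold the layers into encoded lookup tables.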

Components of the White-Box Encryption Algorithm.
To hide the encryption key, we must merge several steps of each round function of SHARK into table lookups blended by randomly generated mixing bijections. In this section, we investigate how to design such tables and how randomly generated mixing bijections can be counteracted.
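The idea of merging a keyed step into a table and blending it with a mixing bijection can be sketched on a single byte. This is a minimal illustration under stated assumptions: the S-box is a random placeholder, the key byte is hypothetical, and an 8 × 8 matrix over GF(2) stands in for the larger mixing bijections used in the paper.

```python
import random


def invert_gf2(rows, n):
    """Gauss-Jordan inverse of an n x n matrix over GF(2).

    Each row is an n-bit int; bit j of rows[i] is entry (i, j).
    Returns None when the matrix is singular.
    """
    a = list(rows)
    inv = [1 << i for i in range(n)]  # identity matrix
    for col in range(n):
        pivot = next((r for r in range(col, n) if (a[r] >> col) & 1), None)
        if pivot is None:
            return None
        a[col], a[pivot] = a[pivot], a[col]
        inv[col], inv[pivot] = inv[pivot], inv[col]
        for r in range(n):
            if r != col and (a[r] >> col) & 1:
                a[r] ^= a[col]
                inv[r] ^= inv[col]
    return inv


def random_invertible_gf2(n, rng):
    """Rejection-sample a nonsingular n x n matrix over GF(2)."""
    while True:
        rows = [rng.getrandbits(n) for _ in range(n)]
        if invert_gf2(rows, n) is not None:
            return rows


def apply_gf2(rows, x):
    """Matrix-vector product over GF(2): output bit i is parity(rows[i] & x)."""
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(row & x).count("1") & 1) << i
    return y


rng = random.Random(2013)      # fixed seed so the sketch is reproducible
toy_sbox = list(range(256))
rng.shuffle(toy_sbox)          # placeholder S-box, not SHARK's
round_key_byte = 0x3C          # hypothetical key byte

mixing = random_invertible_gf2(8, rng)
mixing_inv = invert_gf2(mixing, 8)

# The stored table folds key addition, substitution, and the mixing bijection
# together, so the key byte never appears on its own in the stored data.
encoded_table = [apply_gf2(mixing, toy_sbox[x ^ round_key_byte]) for x in range(256)]


def lookup_and_decode(x):
    """Undo the mixing bijection after the lookup (done in the clear only for this demo)."""
    return apply_gf2(mixing_inv, encoded_table[x])
```

In an actual white-box design the inverse bijection would not be applied in the clear as `lookup_and_decode` does here. It would be absorbed into the tables or matrices of the following step so that only encoded values are ever visible.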
We can also define the algorithm as in (2), where the round function of the $r$th round, with its round key, is defined as in (3). The flow of SHARK depicted in (2) and (3) is shown in Figure 1.
For each round $r = 1, \ldots, 6$, a 64 × 64 nonsingular matrix over GF(2) is defined in terms of randomly generated 64 × 64 nonsingular matrices over GF(2) indexed by $r = 0, \ldots, 6$.
The external input encoding is a 64 × 64 nonsingular matrix over GF(2), defined in terms of randomly generated 16 × 16 nonsingular matrices over GF(2) indexed by $r = 1, \ldots, 6$ and $j = 0, \ldots, 3$. The external output encoding, defined as the inverse of a matrix with index 6, is also a 64 × 64 nonsingular matrix over GF(2).
In a white-box encryption algorithm, round functions should be obfuscated to protect the round keys against attacks from an adversary. Using the definitions above, we can define the obfuscated round functions, which we will implement using a set of lookup tables. For each round $r$, an obfuscated subround function is defined. The number of possible different representations of $GF(2^{16})$ is 8160. The isomorphic transformation $\Delta$ that takes the description of the cipher under the standard irreducible polynomial to another description with a different irreducible polynomial is linear. For each round $r$, $\Delta$ is chosen randomly from these isomorphic transformations. Preround mixing bijections and postround diffusion-mixing bijections, the latter indexed by $r = 0, \ldots, 5$, are also defined.

The Complete White-Box Encryption Algorithm.
Using the components described in the previous section, the encryption process is shown in Algorithm 1.
We will now prove the soundness of our algorithm.
The following corollary shows how to decrypt the output of the white-box algorithm by modifying the decryption process of SHARK, that is, $\mathrm{SHARK}^{-1}[K]$.

Against Billet et al.'s and Michiels et al.'s Attacks.
Billet et al. [8] described a very efficient attack against the white-box AES implementation proposed in [3]. Recovering information about the key by a local inspection of the lookup tables seems difficult, as the tables are designed to satisfy diversity and ambiguity criteria. In the Billet et al. attack, the authors take advantage of the fact that it is easier to recover information by analyzing compositions of lookup tables corresponding to one encoded AES round.
In the proposed implementation, some attack techniques that exploit the simplicity of the AES S-boxes are not valid. Furthermore, we have also used isomorphic transformations to increase the white-box diversity. For these reasons, the Billet et al. attack will not work.
The ideas presented in [3] can be used to derive a white-box implementation for any substitution linear-transformation network cipher [17]. Michiels et al. [9] presented an algorithm for extracting the round keys of such a cipher when all block rows of the diffusion matrices have disjoint spanning block sets. This condition on the diffusion matrices is, for example, satisfied by all maximum distance separable matrices [18,19]. In our algorithm, we have implemented the reverse operations of the linear mixing bijections in a different way. This ensures that our technique is immune to the attack of Michiels et al.

Against Mulder et al.'s Attack.
Mulder et al. [13] presented a cryptanalysis of the Xiao-Lai white-box AES implementation by using Biryukov et al.'s highly efficient linear equivalence algorithm [14]. The linear equivalence algorithm checks linear equivalence between two permutations (S-boxes), $S_1$ and $S_2$, and finds two invertible linear mappings, $L_1$ and $L_2$, such that $L_2 \circ S_1 \circ L_1 = S_2$. This is an important problem in symmetric cryptography.
Biryukov et al.'s linear equivalence algorithm exploits the following two ideas. The first is that we can guess portions of the linear mapping $L_1$, which will provide us with knowledge of the values of the linear mapping $L_2$. These new values from $L_2$ allow the algorithm to extract new information about $L_1$. The linear (affine) structure of the mappings causes another process, which they refer to as the exponential amplification of guesses. Their second idea is that if we know $n$ vectors from the mapping $L_1$, we also know $2^n$ linear combinations of these vectors.
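To make the linear equivalence problem concrete, the following sketch searches exhaustively for the two linear mappings on 4-bit permutations. It is a naive brute-force search, not Biryukov et al.'s algorithm, and the 4-bit permutation, the hidden mappings, and all function names are assumptions chosen for illustration.

```python
from itertools import product


def linear_map_table(images):
    """Lookup table of the GF(2)-linear map on 4 bits sending basis vector e_j to images[j]."""
    table = []
    for x in range(16):
        y = 0
        for j in range(4):
            if (x >> j) & 1:
                y ^= images[j]
        table.append(y)
    return table


def is_linear(table):
    """Check additivity over XOR for a 16-entry table."""
    return table[0] == 0 and all(
        table[u ^ v] == table[u] ^ table[v] for u in range(16) for v in range(16)
    )


def find_linear_equivalence(s1, s2):
    """Search for invertible linear l1, l2 with l2[s1[l1[x]]] == s2[x] for all x."""
    for images in product(range(1, 16), repeat=4):
        l1 = linear_map_table(images)
        if len(set(l1)) != 16:       # skip singular candidates
            continue
        # Once l1 is fixed, l2 is forced by the relation l2[s1[l1[x]]] = s2[x].
        l2 = [0] * 16
        for x in range(16):
            l2[s1[l1[x]]] = s2[x]
        if is_linear(l2):
            return l1, l2
    return None


# An arbitrary 4-bit permutation used as the first S-box.
toy_s1 = [0xC, 5, 6, 0xB, 9, 0, 0xA, 0xD, 3, 0xE, 0xF, 8, 4, 7, 1, 2]

# Hidden invertible linear maps used to build a linearly equivalent second S-box.
secret_l1 = linear_map_table([3, 6, 12, 8])
secret_l2 = linear_map_table([1, 3, 7, 15])
toy_s2 = [secret_l2[toy_s1[secret_l1[x]]] for x in range(16)]

found = find_linear_equivalence(toy_s1, toy_s2)
```

The search may return a pair different from the hidden one, since linear equivalences are not necessarily unique. The cost of this exhaustive approach grows far too quickly for 16-bit building blocks, which is why the guess-and-amplify ideas described above matter.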
Mulder et al. proposed a modified version of the linear equivalence algorithm in [13]. The time complexity of solving the linear equivalence problem of a building block decreases from $2^{44}$ to $2^{29}$. It follows that the attack efficiently extracts the AES key from the Xiao-Lai white-box AES implementation with a time complexity of approximately $2^{32}$. In the case of our white-box SHARK implementation, we have not found any technique that can reduce the time complexity in the same manner, for the following reasons.
(2) We use a different approach to compute .
Furthermore, the $\Delta$ transformation that we use in this paper can provide a higher work factor. The overall work factor of Mulder, Roelse, and Preneel's attack against our white-box SHARK implementation is the product of the following three factors: (1) $2^{44}$ ($= n^3 2^{2n}$, $n = 16$) to solve the linear equivalence problem of a building block, (2) $2^{13}$ ($\approx 8160$) to guess all the dual components, (3) $2^2$ because there are four building blocks in each round.
Thus, our white-box SHARK implementation retains a security level of approximately $2^{44+13+2} = 2^{59}$ against Mulder et al.'s attack.
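The work factor arithmetic above can be checked directly. The short sketch below only multiplies the three factors quoted in the text; the variable names are hypothetical.

```python
import math

n = 16                                      # building-block width in bits, as stated in the text
linear_equivalence = n**3 * 2 ** (2 * n)    # n^3 * 2^(2n)
dual_components = 8160                      # number of field representations quoted in the text
building_blocks = 4                         # building blocks per round

total_work = linear_equivalence * dual_components * building_blocks

log2_linear_equivalence = math.log2(linear_equivalence)
log2_dual_components = math.log2(dual_components)
log2_total = math.log2(total_work)

print(f"linear equivalence: 2^{log2_linear_equivalence:.2f}")
print(f"dual components:    2^{log2_dual_components:.2f}")
print(f"total work factor:  2^{log2_total:.2f}")
```

Because 8160 is slightly below $2^{13} = 8192$, the exact product is marginally below $2^{59}$, which is why the total is best read as an approximation.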

Size and Performance
In this section, we first analyze the size of static data that the algorithm requires. We then make some suggestions regarding the implementation and provide some experimental results. Finally, we discuss a highly efficient work mode for encrypting data.
Each round of our algorithm requires four lookup tables. As the size of each table is $2^{16} \times 64$ bits $= 2^{19}$ bytes, the size of the 28 tables is 14 MB. The size of each matrix is $64 \times 64$ bits $= 2^9$ bytes. Thus, the size of these matrices is 3 KB. Combining these values, we determine that the size of all lookup tables and matrices is 14339 KB.
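The storage figures can be reproduced with a few lines of arithmetic. In the sketch below, the count of six matrices is an assumption inferred from the stated 3 KB total and the 512-byte size of each matrix; the other values are taken from the text.

```python
table_entries = 2**16        # one entry per 16-bit input
entry_bits = 64              # each entry is a 64-bit value
num_tables = 28              # as stated in the text

matrix_bits = 64 * 64        # one 64 x 64 matrix over GF(2)
num_matrices = 6             # assumption: inferred from the 3 KB total

table_bytes = table_entries * entry_bits // 8
matrix_bytes = matrix_bits // 8

tables_kb = num_tables * table_bytes // 1024
matrices_kb = num_matrices * matrix_bytes // 1024
total_kb = tables_kb + matrices_kb

print(f"each table:  {table_bytes} bytes")
print(f"all tables:  {tables_kb} KB ({tables_kb // 1024} MB)")
print(f"all matrices: {matrices_kb} KB")
print(f"total:       {total_kb} KB")
```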
Three operations are needed to run the white-box SHARK algorithm: bit multiplication, bit addition, and table lookup. We list the number of required operations in Table 1.
Of course, this is a "naïve" implementation, as we can speed up the algorithm by using a memory-speed tradeoff technique. A multiplication table can map two input bytes ($a_0, \ldots, a_7$ and $b_0, \ldots, b_7$) into a single bit $(a_0 \times b_0) \oplus (a_1 \times b_1) \oplus \cdots \oplus (a_7 \times b_7)$. With the help of such a multiplication table, we can reduce the cost of matrix multiplications and obtain a fast software implementation. The extra cost of memory is only 8 KB. This implementation requires three operations: multiplication table lookup, bit addition, and table lookup. Table 2 lists the required number of each operation.
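The multiplication table described above can be sketched as a packed bit array. This is one possible layout assumed for illustration, not necessarily the authors' implementation: $2^{16}$ one-bit entries packed into $2^{13}$ bytes, with hypothetical names `dot_table`, `dot8`, and `matvec_fast`.

```python
import random

# 2^16 one-bit entries packed into 2^13 bytes (8 KB).
dot_table = bytearray(2**13)
for a in range(256):
    for b in range(256):
        if bin(a & b).count("1") & 1:          # parity of the bitwise AND
            index = (a << 8) | b
            dot_table[index >> 3] |= 1 << (index & 7)


def dot8(a, b):
    """Look up the GF(2) inner product of two bytes."""
    index = (a << 8) | b
    return (dot_table[index >> 3] >> (index & 7)) & 1


def matvec_fast(rows, x):
    """64 x 64 GF(2) matrix times a 64-bit vector, using 8 table lookups per output bit."""
    y = 0
    for i, row in enumerate(rows):
        bit = 0
        for k in range(8):
            bit ^= dot8((row >> (8 * k)) & 0xFF, (x >> (8 * k)) & 0xFF)
        y |= bit << i
    return y


def matvec_naive(rows, x):
    """Reference product computed bit by bit, without the table."""
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(row & x).count("1") & 1) << i
    return y


rng = random.Random(7)                          # fixed seed for reproducibility
sample_rows = [rng.getrandbits(64) for _ in range(64)]
sample_vector = rng.getrandbits(64)
fast_result = matvec_fast(sample_rows, sample_vector)
naive_result = matvec_naive(sample_rows, sample_vector)
```

Each output bit then costs eight lookups and seven bit additions instead of 64 bit multiplications and 63 bit additions, which is where the speedup in the fast variant comes from.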
We have investigated the time taken to encrypt 1 MB of data in the electronic codebook (ECB) mode on a ThinkPad notebook. The average time of the naïve implementation is 23.3 seconds and the average time of the fast implementation is only 1.2 seconds. Table 3 shows the details of the testing environment.
Clearly, the proposed algorithm is much slower than the standard algorithm because of the additional time taken when multiplying by the 64 × 64 matrices over GF(2) for rounds $r = 1, \ldots, 6$. This is true even when using the fast implementation. However, the proposed algorithm runs much faster in the composite propagating cipher-block chaining (PCBC) mode, as suggested by [20], than in ECB mode. In the composite PCBC mode, the speed of encryption is almost the same as that of the standard implementation. Figure 5 shows the flow chart of the white-box SHARK algorithm running in the composite PCBC mode.
Figure 5: Flow of the white-box SHARK algorithm in composite PCBC mode.

Conclusions and Discussion
In this paper, we propose a new white-box encryption algorithm that obfuscates the cipher SHARK. Our general approach is to merge several steps of the round function of SHARK into table lookups blended by randomly generated mixing bijections. Techniques from [10,12] are used to obtain the obfuscated cipher. Hence, this algorithm is secure against the attacks of Billet et al. [8], Michiels et al. [9], and Mulder et al. [13]. Thus, the algorithm is a countermeasure against the threat of key compromise in a white-box attack context. This design of white-box SHARK can also be used to obtain a white-box AES with a slight modification. The outcome of adapting our design to use AES will be a white-box AES implementation with the size of lookup tables and matrices being 20502 MB and with a security level of $2^{92}$. We have chosen SHARK because it results in smaller tables and matrices and has a simpler description.
Future work should focus on reducing the size of the implementation. If we can significantly decrease the size, white-box encryption algorithms may be applied to lightweight applications such as the Internet of Things or wireless sensor networks.