Searching for Cryptographically Significant Rotation Symmetric Boolean Functions by Designing Heuristic Algorithms

It has been proved that the set of rotation symmetric Boolean functions (RSBFs) is rich in functions that are cryptographically strong with respect to multiple criteria. In this study, we design two genetic algorithms and apply them to search for balanced RSBFs with high nonlinearity. The experimental results show that our methods can generate cryptographically strong Boolean functions with high nonlinearity, 1-resiliency, and optimal algebraic immunity. From the viewpoint of practical application in cryptosystems, these functions compare favorably with known ones obtained by other heuristics.


Introduction
As the basic nonlinear components, Boolean functions provide confusion and diffusion for ciphers (pp. 398-399 of [1]). When applied to cryptosystems, Boolean functions should possess excellent cryptographic properties, such as balancedness, correlation immunity, high nonlinearity, high algebraic degree, and high algebraic immunity. However, these characteristics cannot all be optimal at the same time, and trade-offs must be considered. Therefore, constructing Boolean functions that achieve a good compromise among these criteria remains a challenging open problem [2,3].
A metaheuristic is designed to generate a near-optimal solution to an optimization problem by balancing local improvement against higher-level strategies. In the literature, many papers have used heuristic algorithms to search for cryptographically important Boolean functions, and several long-standing open problems have been solved in this way [4][5][6]. Hill climbing (HC) and the genetic algorithm (GA) were first applied to search for highly nonlinear Boolean functions in 1996 [7,8] by modifying the truth table of a Boolean function. Because HC is a local search procedure and cannot generally achieve global optimization, global metaheuristics, such as the genetic algorithm, simulated annealing, ant colony optimization, and their hybridizations with HC, have been presented to improve the solutions under conflicting cryptographic criteria [9][10][11][12]. For example, in 1998, the authors of [9] introduced a genetic algorithm combined with HC and found balanced Boolean functions in 6, 8, and 10 variables with nonlinearity greater than $2^{n-1} - 2^{n/2}$. However, their method does not seem to be valid in all situations.
Rotation is a useful operator that can speed up ciphers while preserving their security [13][14][15]. RSBFs have therefore recently attracted research attention because of their advantages in cryptographic algorithms: simple structure, fast speed, high resource utilization, and richness in cryptographically significant Boolean functions [16][17][18]. We call a Boolean function rotation symmetric (RS) if its output is invariant under cyclic shifts of its input. Using a steepest-descent-like algorithm, Kavut et al. [4] found Boolean functions in 9 variables with nonlinearity 241. This settled an almost three-decade-old open problem: whether there exist Boolean functions of 9 variables whose nonlinearity exceeds the bent concatenation bound 240.
However, this algorithm cannot guarantee the balancedness of Boolean functions. By applying simulated annealing (SA) to 9-variable RSBFs together with some algebraic techniques, Liu and Youssef [6] constructed 10-variable Boolean functions with algebraic degree 7, resiliency degree 2, and nonlinearity 488. This result answered the open problem in [3] about the existence of such functions. Motivated by this previous work, we further investigate the search for RSBFs with multiple cryptographic properties.
In this study, we generalize the traditional genetic algorithm and apply it to search for balanced functions with high nonlinearity in the class of RSBFs. The experimental results demonstrate that our method can generate excellent Boolean functions with high nonlinearity, 1-resiliency, and optimal algebraic immunity. We can also obtain bent functions, which are widely applied in cryptography, spread spectrum, coding theory, and combinatorial design. From the viewpoint of practical application in cryptosystems, these functions compare favorably with known ones obtained by other heuristics. We organize this paper as follows. In Section 2, we introduce some preliminary definitions and useful results. Section 3 describes the traditional genetic algorithm. Based on this algorithm, we propose a modification of GA named GA-reset. We also design an algorithm to generate balanced RSBFs. A generalization of GA-reset is presented for pursuing high nonlinearity of RSBFs. By combining the algorithms proposed in this paper, we have obtained excellent RSBFs in 8, 10, and 12 variables. We give a conclusion in Section 4.

Boolean Functions.
Let $GF(2)^n$ be the $n$-dimensional vector space over the finite field $GF(2) = \{0, 1\}$. Denote by $\oplus$ the addition operation over $GF(2)$. Let $\mathbf{0}$ and $\mathbf{1}$ be the all-zero vector and the all-one vector of $GF(2)^n$, respectively. An $n$-variable Boolean function $f(x)$, where $x = (x_0, x_1, \ldots, x_{n-1}) \in GF(2)^n$, is a mapping from $GF(2)^n$ to $GF(2)$; we denote by $B_n$ the set of all such functions. Each $f \in B_n$ can be represented uniquely as an $n$-variable polynomial, called its algebraic normal form (ANF):
$$f(x) = \bigoplus_{u \in GF(2)^n} a_u \prod_{i=0}^{n-1} x_i^{u_i}, \qquad a_u \in GF(2).$$
The algebraic degree is defined as the number of variables in the highest order product term with nonzero coefficient. A Boolean function is said to be affine if its degree does not exceed 1. The set of all $n$-variable affine functions is denoted by $A_n$. We call a function nonlinear if it is not in $A_n$. The Hamming weight $w_H(x)$ of a binary vector $x \in GF(2)^n$ is the number of its nonzero coordinates, and the Hamming weight $w_H(f)$ of a Boolean function $f$ is the size of its support $\{x \in GF(2)^n : f(x) \neq 0\}$. For $x$ and $w = (w_0, w_1, \ldots, w_{n-1})$ in $GF(2)^n$, let $w \cdot x$ be an inner product in $GF(2)^n$, for instance, the usual inner product $w_0 x_0 \oplus w_1 x_1 \oplus \cdots \oplus w_{n-1} x_{n-1}$. Then, the Walsh coefficients of a Boolean function $f(x) \in B_n$ are the values of the real-valued function over $GF(2)^n$ defined by
$$W_f(w) = \sum_{x \in GF(2)^n} (-1)^{f(x) \oplus w \cdot x}.$$
The Walsh spectrum of the Boolean function $f$ is the set of all the Walsh coefficients $W_f(w)$.
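The Walsh coefficients can be computed for all $w$ at once with a fast Walsh-Hadamard transform. The following is a minimal Python sketch, assuming the truth table is stored as a list of 0/1 values indexed by the integer whose binary digits are $(x_0, \ldots, x_{n-1})$. The names `walsh_spectrum` and `nonlinearity` are illustrative and do not come from the paper; the nonlinearity uses the standard relation $N_f = 2^{n-1} - \frac{1}{2}\max_w |W_f(w)|$.

```python
def walsh_spectrum(tt):
    """Return [W_f(w) for w in range(2**n)] for a 0/1 truth table of length 2**n."""
    w = [1 - 2 * b for b in tt]  # (-1)^f(x)
    h = 1
    while h < len(w):
        # In-place butterfly of the fast Walsh-Hadamard transform.
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b
        h *= 2
    return w


def nonlinearity(tt):
    """N_f = 2^(n-1) - max|W_f(w)| / 2 (Walsh coefficients are always even)."""
    return len(tt) // 2 - max(abs(v) for v in walsh_spectrum(tt)) // 2


# Hypothetical usage: f(x0, x1) = x0 * x1 is bent on two variables.
spectrum = walsh_spectrum([0, 0, 0, 1])   # [2, 2, 2, -2]
nl = nonlinearity([0, 0, 0, 1])           # 1
```

Every coefficient of the two-variable example has absolute value $2^{n/2} = 2$, which is the flat spectrum that characterizes a bent function.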
For convenience, we use W f instead of W f (0). It is easy to derive the following elementary identity: and the well-known formula (see . 2.17, p.13, of [2])

Definition 2. A Boolean function $f$ on $GF(2)^n$ is called $t$-resilient if and only if its Walsh coefficients satisfy $W_f(w) = 0$ for all $w \in GF(2)^n$ with $0 \le w_H(w) \le t$.
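The Walsh characterization of resiliency can be checked directly for small $n$. The sketch below is an unoptimized illustration, assuming a 0/1 truth table indexed by the integer encoding of $x$; `walsh_coefficient` and `resiliency_order` are hypothetical helper names, and the return value -1 for an unbalanced function is a convention chosen here.

```python
def walsh_coefficient(tt, w):
    """W_f(w) computed straight from the definition: sum of (-1)^(f(x) XOR w.x)."""
    total = 0
    for x, fx in enumerate(tt):
        dot = bin(w & x).count("1") & 1  # inner product w.x over GF(2)
        total += -1 if fx ^ dot else 1
    return total


def resiliency_order(tt):
    """Largest t with W_f(w) = 0 for all w of Hamming weight <= t; -1 if unbalanced."""
    n = len(tt).bit_length() - 1
    t = -1
    for weight in range(n + 1):
        masks = [w for w in range(len(tt)) if bin(w).count("1") == weight]
        if any(walsh_coefficient(tt, w) != 0 for w in masks):
            break
        t = weight
    return t


# Hypothetical usage: the 3-variable parity function x0 + x1 + x2 is 2-resilient.
parity3 = [bin(x).count("1") & 1 for x in range(8)]
order = resiliency_order(parity3)  # 2
```

The direct sum costs $O(2^n)$ per coefficient, so a fast transform is preferable when the whole spectrum is needed.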
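From the Xiao-Massey theorem [19], the algebraic degree of a $t$-resilient function $f \in B_n$ is at most $n - t - 1$ (for $t \le n - 2$).
We can extend the definition of $\rho_n^k$ on tuples as follows: $\rho_n^k(x_0, x_1, \ldots, x_{n-1}) = (\rho_n^k(x_0), \rho_n^k(x_1), \ldots, \rho_n^k(x_{n-1}))$, where $\rho_n^k(x_i) = x_{(i+k) \bmod n}$. Let $G_n = \{\rho_n^k : 0 \le k \le n-1\}$ be the cyclic group of these permutations, and denote by
$$G_n(x_0, x_1, \ldots, x_{n-1}) = \{\rho_n^k(x_0, x_1, \ldots, x_{n-1}) : 0 \le k \le n-1\}$$
the orbit of $(x_0, x_1, \ldots, x_{n-1})$ under the action of $G_n$. It is obvious that the orbits $G_n(x_0, x_1, \ldots, x_{n-1})$ form a partition of the vector space $GF(2)^n$. It is shown in [20] that the number of orbits of $GF(2)^n$ is exactly
$$g_n = \frac{1}{n} \sum_{k \mid n} \phi(k)\, 2^{n/k},$$
where $\phi$ is Euler's function.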
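The orbit partition can be enumerated explicitly and compared with the closed-form count $\frac{1}{n}\sum_{k \mid n} \phi(k) 2^{n/k}$. The following Python sketch assumes inputs are encoded as $n$-bit integers; the direction of the rotation and the names `rotate`, `orbits`, and `orbit_count` are choices made for this illustration (the orbits are the same for either direction).

```python
from math import gcd


def rotate(x, n, k=1):
    """Cyclically shift the n-bit integer x by k positions."""
    k %= n
    mask = (1 << n) - 1
    return ((x << k) | (x >> (n - k))) & mask


def orbits(n):
    """Partition range(2**n) into orbits under cyclic rotation."""
    seen = [False] * (1 << n)
    result = []
    for x in range(1 << n):
        if seen[x]:
            continue
        orbit, y = [], x
        while not seen[y]:
            seen[y] = True
            orbit.append(y)
            y = rotate(y, n)
        result.append(orbit)
    return result


def euler_phi(m):
    """Euler's totient, by direct counting (adequate for small m)."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)


def orbit_count(n):
    """Closed-form number of orbits: (1/n) * sum over k | n of phi(k) * 2^(n/k)."""
    return sum(euler_phi(k) * 2 ** (n // k)
               for k in range(1, n + 1) if n % k == 0) // n


# Hypothetical usage for the 10-variable case.
count = orbit_count(10)        # 108
enumerated = len(orbits(10))   # 108
```

An RSBF is constant on each orbit, so for $n = 10$ a function is described by 108 bits instead of 1024.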

Related Work.
The term genetic algorithm (GA) was first used by John Holland in 1975, based on Darwinian evolution theory. Following Spillman [21] and Clark [22], GAs have been successfully applied in the cryptanalysis of classical and modern ciphers [8,9,23,24]. In the evolution of Boolean functions, Millan et al. [7,8] first applied GA to find Boolean functions with high nonlinearity. By introducing a resetting step, they combined GA with HC and obtained balanced Boolean functions with high nonlinearity [8]. However, most of the previous work applied several fitness functions to obtain Boolean functions with multiple cryptographic criteria. In this study, we show that one can obtain cryptographically strong Boolean functions by using only the fitness function defined in step (2) of Algorithm 1.
GAs are inspired by biological operators such as mutation, crossover, and selection. A GA usually starts from a randomly generated sample of individuals. The sample is then updated iteratively, and each iteration is called a generation. The sample of candidate solutions (i.e., individuals) is expected to evolve toward better solutions. For more details, see [25].

Searching for Cryptographically Strong RSBFs
We represent the individuals as truth tables of Boolean functions. However, when the search space is restricted to the class of RSBFs, each orbit acts as a gene, and the length of the string used for crossover equals the number of orbits. If one bit in the truth table of an RSBF is changed, then all outputs corresponding to the same orbit must be changed to obtain another RSBF. Take 10-variable RSBFs as an example. There are $g_{10} = 108$ orbits of $GF(2)^{10}$; we list them in Table 1. The genetic algorithm searching for RSBFs is designed as Algorithm 1.

Remark 1.
The function in step (2) was first proposed in [26] to measure the cryptographic stability of a Boolean function. Kavut et al. used it in their steepest-descent-like iterative algorithm and found RSBFs in 9 variables with nonlinearity 241 [4]. Because this function measures the squared distance, in terms of Walsh spectra, between a Boolean function with an even number of variables and the bent functions, we can expect a highly nonlinear RSBF when this distance is minimized. By experiments, we found that an initial population size of 30 gives the best trade-off between the efficiency of the algorithm and the quality of the solutions.

A Modification of GA-Reset.
In the previous algorithm, the "child" solution produced from the "parent" solutions by the genetic operators, crossover and mutation, is generally not a balanced RSBF. Therefore, we improve it as Algorithm 2. Let $p_1$ and $p_2$ be the parent $n$-variable balanced RSBFs and $d(p_1, p_2)$ be the Hamming distance between $p_1$ and $p_2$. Denote by $c$ the child bred by $p_1$ and $p_2$. Let $n_1$ be the number of 1s in the truth table of $c$ restricted to the indexes at which the parents' bits differ. The objective of the algorithm is to generate a balanced RSBF such that $n_1 = d(p_1, p_2)/2$, since two balanced parents each carry exactly half of the 1s on the positions where they differ. Note that all entries corresponding to an orbit should be changed to obtain another RSBF if one bit of the truth table of an RSBF is complemented.
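The balance argument can be illustrated on plain truth tables. The sketch below is not Algorithm 2: it ignores the orbit constraint, so the child is balanced but not necessarily rotation symmetric. It only shows why placing $d(p_1, p_2)/2$ ones on the positions where two balanced parents differ keeps the child balanced. The function name `balanced_merge` and the use of `random.Random` are assumptions made for this example.

```python
import random


def balanced_merge(p1, p2, rng):
    """Merge two balanced 0/1 truth tables into a balanced child.

    Positions where the parents agree are inherited; positions where they
    differ receive exactly d/2 ones, chosen at random.
    """
    diff = [i for i in range(len(p1)) if p1[i] != p2[i]]
    ones = set(rng.sample(diff, len(diff) // 2))
    child = list(p1)
    for i in diff:
        child[i] = 1 if i in ones else 0
    return child


# Hypothetical usage with two balanced 3-variable truth tables.
rng = random.Random(0)
p1 = [0, 0, 1, 1, 0, 1, 0, 1]
p2 = [1, 0, 0, 1, 1, 0, 0, 1]
child = balanced_merge(p1, p2, rng)
```

An orbit-aware version would have to choose whole orbits among the differing ones so that their sizes sum to $d(p_1, p_2)/2$, which is presumably why the algorithm needs explicit balance checks.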

Remark 2. Note that complementing the truth table of a Boolean function does not change its nonlinearity. The check in Step 2 of Algorithm 2 ensures that only parents that are close to each other are allowed to breed. The checks in Steps 9 and 12 force the child RSBF to be balanced. Experimental results show that these modifications are beneficial for obtaining better solutions.
Denote by $(n, m, d, N_f, AI)$ the profile of a Boolean function, that is, its number of input variables, resiliency order, algebraic degree, nonlinearity, and algebraic immunity. In particular, we denote by $(n, -1, d, N_f, AI)$ and by $(n, 0, d, N_f, AI)$ the unbalanced functions and the balanced functions, respectively. In this section, we run the traditional GA and GA-reset to search for RSBFs with 10 variables to determine which algorithm is better. With the traditional GA, the highest nonlinearity achieved for an RSBF is 484; this function is balanced but not resilient. Meanwhile, we have obtained many RSBFs with profile $(10, 1, -, 480, 5)$. We present its truth table in a hexadecimal format.

Results and Discussion
With GA-reset, we have found balanced RSBFs with nonlinearity 486, which is higher than the results generated by the traditional GA. We have also obtained RSBFs with profile $(10, 1, 8, 484, 5)$. The following is one of the examples: We collect the best results of the two algorithms in Table 2.
The results show that although GA-reset is less efficient than GA, the solutions it obtains are better. The efficiency of GA-reset and the convergence of its solutions appear to be in a reasonable trade-off.

Searching for RSBF with High Nonlinearity.
If we extend the "genetic operators" in Algorithm 1 to all pairs of the current generation, then the algorithm can converge quickly to bent functions. In searching the class of 10-variable RSBFs, it takes 1′45″34 to obtain bent functions. Most of them have optimal algebraic immunity. Constructing bent functions with optimal algebraic immunity remained an open problem until the method presented in [27].
This shows that the new algorithm converges to globally optimal solutions with respect to nonlinearity. Together with Algorithm 2, we can search for balanced RSBFs whose nonlinearity is the highest among those found by the known algorithms. We state it in Algorithm 3.

Remark 3.
Although the efficiency is lower than that of the previous algorithms, a bent RSBF can be generated within an acceptable time. Together with Algorithm 2, we can generate balanced RSBFs with strong cryptographic properties.

Security and Communication Networks
is randomly equal to 0 or 1.
(2) Fitness function. Let the fitness function be
$$\text{Fitness} = -\sum_{\omega \in GF(2)^n} \left( W_f^2(\omega) - 2^n \right)^2.$$
(3) Genetic operators
− Two-point crossover. The two crossover points are chosen randomly on the parents (potential solutions), which are represented by the truth tables of the rotation symmetric functions (RSTTs). All bits between these two points are swapped between the parents, rendering two child RSTTs.
− Mutation. The purpose of mutation in GAs is to introduce diversity. According to the mutation probability $p_m$, we choose some orbits of the RSTT and complement them. We checked the efficiency of the algorithm for mutation probabilities of 0.2, 0.1, and 0.05 and found that the output is best when $p_m = 0.05$.
− Selection. The fitness function is evaluated for each individual, and then the fitness values are normalized. For the $k$th individual with fitness value $f_k$, its probability of being selected is $p_k = f_k / \sum_{i=1}^{N} f_i$, where $N$ is the number of individuals in the population. Compute the cumulative probability distribution $F_k = \sum_{i=1}^{k} p_i$ and generate a uniform random number $\xi_k$ in $[0, 1)$; then, the $k$th individual is selected if $F_{k-1} \le \xi_k < F_k$.
(4) Resetting. As in [9], we add the resetting step to the traditional GAs. That is, if the fitness of the best solution cannot be improved after a number of iterations, then we retain the best solution and randomly generate $N - 1$ balanced RSBFs.
(5) Termination. Because of the randomness of GAs, there must be enough iterations for the solutions to converge. Thus, we set the number of iterations to 100,000.
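The fitness function and the three genetic operators above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `fitness` takes a full 0/1 truth table of length $2^n$, while the crossover and mutation operate on an RSTT stored as one bit per orbit (expanding an RSTT to a full truth table is omitted). Because the fitness values are non-positive, `roulette_select` shifts them to positive weights before normalizing; that shift is an assumption, since the exact normalization is not specified in the text. All function names are hypothetical.

```python
import random


def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of (-1)^f for a 0/1 truth table."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b
        h *= 2
    return w


def fitness(tt):
    """Fitness = -sum over w of (W_f(w)^2 - 2^n)^2; equals 0 exactly for bent functions."""
    size = len(tt)  # 2**n
    return -sum((v * v - size) ** 2 for v in walsh_spectrum(tt))


def two_point_crossover(p1, p2, rng):
    """Swap the segment between two random cut points of the parent RSTTs."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def mutate(rstt, p_m, rng):
    """Complement each orbit bit independently with probability p_m."""
    return [b ^ 1 if rng.random() < p_m else b for b in rstt]


def roulette_select(fitnesses, rng):
    """Return index k with F_{k-1} <= xi < F_k for a uniform xi in [0, 1)."""
    low = min(fitnesses)
    shifted = [f - low + 1 for f in fitnesses]  # assumption: shift to positive weights
    total = sum(shifted)
    xi = rng.random()
    cumulative = 0.0
    for k, s in enumerate(shifted):
        cumulative += s / total
        if xi < cumulative:
            return k
    return len(fitnesses) - 1  # guard against floating-point round-off


# Hypothetical usage.
rng = random.Random(1)
best = fitness([0, 0, 0, 1])  # 0, since x0 * x1 is bent
child_a, child_b = two_point_crossover([0] * 8, [1] * 8, rng)
```

Note that crossover and mutation act on orbit bits, so the children are automatically rotation symmetric but not necessarily balanced.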
In the remainder of this section, we apply Algorithm 3 to search for 8-, 10-, and 12-variable RSBFs, with the maximum number of iterations set to 100,000. We get RSBFs with profile $(12, -1, 10, 1998, 6)$. They cannot be linearly transformed to balanced functions since there is no zero in their Walsh spectra. However, we find RSBFs with profile $(12, 0, 10, 1996, 6)$ [8,9] whose Walsh spectra contain 76 zeros and which can be linearly transformed to 1-resilient functions [10]. We also find RSBFs with profile $(12, 1, 10, 1992, 6)$; one of the examples is given in the Appendix.
We collect and compare the known results obtained by heuristics in Table 3.

Conclusion
Rotation symmetric Boolean functions have an advantage in cryptosystems since they can be described compactly. In this study, we search for balanced RSBFs with excellent cryptographic properties by designing heuristic algorithms. The experimental results show that there is a reasonable trade-off between the efficiency of our algorithms and the convergence of the RSBFs. Bent functions can also be generated by the algorithm. With the algorithms presented in this study, we have obtained excellent RSBFs in 8, 10, and 12 variables. This strategy is shown to be significantly superior to some known algorithms.
Appendix
The truth table of (12, 1, 10, 1992, 5) is in the hexadecimal format.
Input: The maximum number of iterations MAXITER = 300,000.
Output: Cryptographically strong balanced RSBFs.
(1) Generate a set of $N$ balanced RSBFs (represented by RSTT) and calculate the fitness function of each individual. Call this set $S_0$.
(2) for $i = 1$ to MAXITER do
(3) for all $N(N-1)/2$ pairs of the set $S_{i-1}$ do
(4) Perform Algorithm 1 to produce $N(N-1)/2$ offspring and compute their values of the fitness function.
(5) Combine the offspring with the present set, $S_{i-1}$, and choose the best $N$ individuals as the new set, $S_i$.
(6) Delete duplicate solutions.
(7) if the fitness of the best solution does not decrease after a number of iterations then
(8) Retain the best solution and randomly generate $N - 1$ balanced RSBFs as the remainder of the set.
(9) end if
(10) end for
(11) end for
(12) return the best solution from the current set.
ALGORITHM 3: Generation of GA-reset.

Data Availability
The data used to support the findings of this study have been uploaded to GitHub (https://github.com/kistoday/cryptographically-significant-rotation-symmetric-booleanfunctions).

Conflicts of Interest
The authors declare that they have no conflicts of interest.