New Statistical Randomness Tests Based on Length of Runs

Random sequences and random numbers constitute a necessary part of cryptography. Many cryptographic protocols depend on random values. Randomness is measured by statistical tests, and hence the security evaluation of a cryptographic algorithm depends heavily on statistical randomness tests. In this work we focus on the statistical distributions of runs of lengths one, two, and three. Using these distributions we state three new statistical randomness tests. The new tests use the $\chi^2$ distribution and, therefore, exact values of the probabilities are needed. The probabilities associated with runs of lengths one, two, and three are stated. The corresponding distributions are divided into five subintervals of approximately equal probability. Accordingly, three new statistical tests are defined and pseudocodes for these tests are given. The new tests are designed to detect deviations of the number of runs of various lengths from that of a random sequence. Together with some other statistical tests, we analyse the results of our tests on outputs of well-known encryption algorithms and on the binary expansions of $e$, $\pi$, and $\sqrt{2}$. Experimental results show the performance and sensitivity of our tests.


Introduction
Random numbers and random sequences are extensively used in many areas such as game theory, numerical analysis, quantum mechanics, and cryptography. In cryptography, the need for random sequences emerges in many different applications such as challenge and response authentication systems, generation of digital signatures, and zero-knowledge protocols. Among those, the most important application is key generation, which depends heavily on random values. Use of weak random values in key generation can cause a leakage in the system, and hence an adversary can gain the ability to break the whole cryptosystem. Therefore, randomness testing is an essential part of the security evaluation of a cryptographic algorithm.
Random sequences and random numbers can be generated by true random sources, such as atmospheric noise and radioactive decay. However, using these sources in an algorithm is impractical. Since reproducing the outputs of these sources is nearly impossible, large numbers of random bits have to be transmitted and stored, which causes challenging problems. Therefore, sequences and numbers used as keys in cryptographic algorithms such as block ciphers and synchronous stream ciphers should be pseudorandom, that is, random looking sequences of a specific length which are produced by deterministic processes [1]. Since proving the randomness of these generators mathematically is nearly impossible, we use statistical randomness tests for this purpose. Using statistical tests we try to detect the weaknesses that a generator could have.
Moreover, outputs of encryption algorithms should be indistinguishable from random mappings; that is, they should be random looking. This is another place where pseudorandom sequences play an important role. Also, deciding the number of rounds of a block cipher, which is an essential part of the design, is closely associated with the concept of being random looking. Therefore, the security of the system depends heavily on the production or testing of pseudorandom sequences. For these reasons, statistical randomness tests are considered an important part of evaluating the security of cryptographic algorithms.
Statistical tests are designed to test the null hypothesis $H_0$, which states that the sequence is randomly generated. Testing a binary sequence means that its degree of randomness is evaluated by a statistical test. The conclusion as to whether the sequence is random or not is probabilistic; in other words, the hypothesis $H_0$ is either accepted or rejected. A statistical test considers a random variable whose distribution function is known. Depending on the distribution, a real number between 0 and 1, called the p value, is calculated. If the p value of the sequence is evaluated as one, we say that the sequence is completely random. On the other hand, the sequence is completely nonrandom if the p value is determined as zero. If the p value exceeds a predefined real number $\alpha \in [0, 1]$, then $H_0$ is accepted; otherwise, it is rejected.
Mathematical Problems in Engineering
Usually the result of one statistical test is not enough to decide the randomness of a sequence. Therefore, it is better to use a collection of statistical tests, called a statistical test suite, to measure different behaviours of the sequence under consideration. These suites should be well designed to give trustworthy results and should not be blindly populated.
In the literature, there exist various statistical test packages. Among those published so far, the most important ones are the tests given in Knuth's book [2], the test suite presented by Rukhin [3], DIEHARD [4], CRYPT-X [5], TestU01 [6], and the test suite published by NIST [7]. There are also works focusing on individual statistical tests, such as the universal statistical test stated by Maurer [8], a test based on the diffusion characteristic of a block cipher [9], and the topological binary test defined by Alcover et al. [10].
In this work, we propose three new statistical randomness tests which depend on the famous postulates of Golomb. These tests are named the runs of length one, runs of length two, and runs of length three tests. The rest of the paper is organized as follows. In Section 2, we explain Golomb's randomness postulates and discuss the run tests given in the literature. In Section 3, we give proofs of our fundamental theorems. In order to calculate the probabilities needed, we also state corollaries and algorithms for each theorem. In Section 4, we state the new run tests and give their pseudocodes. In Section 5, we apply the new tests to the binary expansions of $e$, $\pi$, and $\sqrt{2}$, which are obtained from the NIST package [7], and to the outputs of the five Advanced Encryption Standard competition finalists. In the last part of the implementation we generate some nonrandom data sets to emphasize the sensitivity of our tests. Finally, in Section 6, we summarize our results and state topics for further research.

Golomb's Randomness Postulates.
Deciding the pseudorandomness of a sequence is a difficult task. The basis for this task is provided by Golomb's postulates. These postulates are one of the most important attempts to establish necessary properties for a finite (or periodic) pseudorandom sequence to be random looking. Sequences satisfying the following three properties are called pseudonoise sequences [11].
Let $S = s_0, s_1, \ldots, s_{n-1}, \ldots$ be an infinite binary sequence periodic with $n$ (or a finite sequence of length $n$). A run is defined as an uninterrupted maximal sequence of identical bits. Runs of 0's are called gaps; runs of 1's are called blocks. R1, R2, and R3 are Golomb's randomness postulates, which are given as follows.
(R1) In a period of $n$, the number of 1's should differ from the number of 0's by at most 1. In other words, the sequence should be balanced.
(R2) In a period of $n$, at least half of the total number of runs of 0's or 1's should have length one, at least one-fourth should have length 2, at least one-eighth should have length 3, and the like. Moreover, for each of these lengths, there should be (almost) equally many gaps and blocks.
(R3) The autocorrelation function $C(\tau)$ should be two-valued. That is, for some integer $K$ and for all $\tau = 0, 1, 2, \ldots, n-1$,
$$C(\tau) = \sum_{i=0}^{n-1} (-1)^{s_i + s_{i+\tau}} = \begin{cases} n, & \tau = 0, \\ K, & 1 \le \tau \le n-1. \end{cases}$$
The first postulate states that, in an $n$-bit sequence, the difference between the numbers of ones and zeros should be 1 or 0. In other words, the number of ones in a sequence, that is, the weight of the sequence, should be approximately $n/2$. The frequency test, which measures the difference between the numbers of ones and zeros in an $n$-bit sequence, is defined to check the first postulate of Golomb. Balancedness is a fundamental feature of an algorithm's output. Therefore, the frequency test is used as an initial step in almost all test suites. If an algorithm fails the frequency test, then the other tests are not even applied.
The second postulate of Golomb is about the number of runs in sequences. Tests which deal with the number of runs are called run tests and, like the frequency test, they are included in many test suites. Since calculating the expected number of runs of a specified length in a random sequence is a difficult task (especially when the specified length becomes large), most test suites consider only the total number of runs and do not consider the number of runs of different lengths.
Lastly, the third postulate gives information about the amount of similarity between the sequence and a shifted version of it. If $S$ is a random looking sequence, the autocorrelation should be constant; that is, the correlation between the $i$th and $(i + \tau)$th bits should give no information about the sequence for $\tau = 1, 2, \ldots, n - 1$. In this paper, we mainly focus on the first and second postulates, and the last one is not a matter of concern.
These postulates are theoretical and difficult to check. Inspired by these postulates, we define new statistical randomness tests which are practical. In order to give the definitions, we calculate the exact probabilities. Before explaining these tests, we first give, in Section 3, the mathematical background needed to compute the probabilities that we use.

Run Test.
Run tests depend on Golomb's second postulate and investigate the number of runs in a sequence and their distribution. Run tests take place in most test suites. In almost all of these suites, the run tests consider only the total number of runs in a sequence. The most important of these are the suites given in [2,4,6,7].
The Knuth [2] and DIEHARD [4] test suites define the run test on random numbers. They define runs as runs up and runs down in a sequence. To illustrate their definition, consider the sequence of 12 digits 138742975349. Runs are indicated by putting a vertical line between $x_i$ and $x_{i+1}$ when $x_i > x_{i+1}$. Hence, the runs of the sequence 138742975349 can be seen as |138|7|4|29|7|5|349|. In other words, the run test examines the lengths of monotone subsequences. TestU01 [6] defines run and gap tests for testing the randomness of a long binary stream. This test collects runs of 1's and 0's until the total number of runs reaches a prescribed even number. Then, for each run length up to a chosen bound, the numbers of runs of 1's and of 0's of that length in this collection are counted and recorded, and a $\chi^2$ test is applied to these counts. A longest run of 1's test is also defined for the collection of fixed-length strings obtained from the original long binary string.
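The runs-up splitting used in the example above can be sketched in a few lines of Python. This is only an illustration of the stated convention (a bar is placed wherever $x_i > x_{i+1}$); the function name `ascending_runs` is hypothetical and is not taken from any of the cited test suites.

```python
def ascending_runs(values):
    """Split a sequence into maximal ascending runs.

    A new run starts whenever an element is smaller than its predecessor,
    which matches placing a bar between x_i and x_{i+1} when x_i > x_{i+1}.
    """
    runs = [[values[0]]]
    for previous, current in zip(values, values[1:]):
        if previous > current:
            runs.append([current])      # descent: close the run, start a new one
        else:
            runs[-1].append(current)    # still ascending: extend the current run
    return runs


digits = [int(c) for c in "138742975349"]
runs = ascending_runs(digits)
print("|" + "|".join("".join(str(d) for d in run) for run in runs) + "|")
print([len(run) for run in runs])
```

For the digit string of the example this prints `|138|7|4|29|7|5|349|`, so the run lengths examined by such a test are 3, 1, 1, 2, 1, 1, 3.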
The NIST [7] test suite consisted first of 16 and then of 15 statistical tests. After its first publication, some revisions were made. In 2004, it was discovered that the settings of the discrete Fourier transform test and the Lempel-Ziv test were wrong [12]; a new test, which can be used instead of the Lempel-Ziv test, was defined in [13], and a correction of the overlapping template matching test was stated in 2007 [14].
In the suite, 2 of the 15 tests are variations of run tests. They are called the run test and the longest run of ones in a block test. The first one deals with the total number of runs in a sequence. It calculates the total number of runs in a sequence and determines whether it is consistent with the expected number of runs, which is supposed to be close to $n/2$. The second one determines whether the longest run of ones in the sequence is consistent with the length of the longest run of ones expected in a random sequence. In the NIST test suite the reference distribution for the run tests is a $\chi^2$ distribution.
In the test suite, NIST assumed that the sequence length $n$ is of order $10^3$ to $10^7$. For this reason, asymptotic reference distributions were derived and used for the tests. However, an asymptotic reference distribution is misleading for smaller values of $n$; as stated in [7], "the asymptotic reference distributions would be inappropriate and would need to be replaced by exact distributions that would commonly be difficult to compute". In other words, asymptotic reference distributions can lead to errors in testing short sequences such as outputs of block ciphers or hash functions. In 1999, to overcome this problem, Soto and Bassham [15] proposed concatenating short sequences. This method was used for testing the randomness of the Advanced Encryption Standard candidates. Another method has been proposed by Sulak et al. [16], in which the distribution functions used in the NIST test suite are replaced by exact distributions and a similar method is used for producing the p values.
In this paper we use the method stated in [16]; thus we need the exact probabilities and the exact distributions of the test statistics. Finding the number of sequences having a specified number of runs of length $k$ is a hard problem. We find this number using combinatorial formulas. After that we calculate the desired probabilities by dividing the calculated number by the total number of sequences of length $n$. Calculating the exact probabilities of the number of runs of length $k$ in a sequence enables us to define the new run tests. We calculate the probabilities for the number of runs of lengths one, two, and three, and we give the detailed information in the following section. However, as the length grows, the calculations become complex and the time required for them grows exponentially. Therefore tests involving the number of runs of length $k$ ($k > 3$) are impractical for statistical test suites.

Computation of Probabilities
In this section, we give the theorems to find the number of sequences with specified properties and hence state the exact probabilities. The probabilities depend on the numbers of shorter runs. That is, the probability for the number of runs of length two depends on both the total number of runs and the number of runs of length one; similarly, the probability for the number of runs of length three depends on the total number of runs and the numbers of runs of lengths one and two, and so on. Because of these dependencies on other variables, these probabilities are not used directly in the tests. Therefore, after stating each theorem we give the corollaries and the algorithms to find the exact probabilities which are needed for describing the tests.
In the calculations of probabilities we frequently use the following combinatorial formulas.
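Fact 1 (number of nonnegative integer solutions of a linear equation [17]). The number of nonnegative integer solutions of
$$x_1 + x_2 + \cdots + x_r = n$$
is $\binom{n+r-1}{r-1}$.
Fact 2. The number of positive integer solutions of
$$x_1 + x_2 + \cdots + x_r = n$$
is $\binom{n-1}{r-1}$.
Proof. With the substitution $x_i = y_i + 1$ we get
$$y_1 + y_2 + \cdots + y_r = n - r.$$
From Fact 1 it follows that the number of solutions is $\binom{(n-r)+r-1}{r-1} = \binom{n-1}{r-1}$.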
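Both facts can be checked numerically for small parameters. The following brute-force sketch is not part of the original paper; the helper `count_solutions` is a hypothetical name, and the parameter ranges are chosen only to keep the enumeration small.

```python
from itertools import product
from math import comb


def count_solutions(n, r, minimum):
    """Count integer solutions of x_1 + ... + x_r = n with every x_i >= minimum."""
    return sum(
        1
        for xs in product(range(minimum, n + 1), repeat=r)
        if sum(xs) == n
    )


for n in range(1, 8):
    for r in range(1, 5):
        # Fact 1: nonnegative solutions.
        assert count_solutions(n, r, 0) == comb(n + r - 1, r - 1)
        # Fact 2: positive solutions (zero when r > n).
        assert count_solutions(n, r, 1) == comb(n - 1, r - 1)

print(count_solutions(5, 3, 0), count_solutions(5, 3, 1))  # 21 6
```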

Number of Runs.
In the rest of the paper we denote the total number of runs and the numbers of runs of lengths one, two, and three by $R$, $R_1$, $R_2$, and $R_3$, and we use $r$, $r_1$, $r_2$, and $r_3$, respectively, for samples of these variables. We denote the probability that a randomly chosen binary sequence has $r$ runs by $\Pr(R = r)$. In the same way, $\Pr(R_i = r_i)$ is the probability that a randomly chosen binary sequence has $r_i$ runs of length $i$. We also use subscripts, $S_1, S_2, \ldots, S_t$, to differentiate the blocks of a long sequence or the outputs of block ciphers and hash functions. Lastly, $\mathcal{R}_1$, $\mathcal{R}_2$, and $\mathcal{R}_3$ denote the sets of the numbers of runs of lengths one, two, and three in these sequences. That is, $\mathcal{R}_i = \{r_i^1, r_i^2, \ldots, r_i^t\}$, where $r_i^j$ is the number of runs of length $i$ in the $j$th sequence.
Moreover, in order to illustrate the runs of a sequence we use the equation $x_1 + x_2 + \cdots + x_r = n$ for a sequence of length $n$ having $r$ runs, where $x_i$ ($i = 1, 2, \ldots, r$) represents the number of bits in the $i$th run. An important property of this illustration is that it gives no information about the content of the $x_i$'s; that is, $x_i$ can be a run of 0's or of 1's. Thus, each positive integer solution of the equation $x_1 + x_2 + \cdots + x_r = n$ corresponds to two sequences: one starts with 1 and the other starts with 0. Hence, the number of sequences of length $n$ having exactly $r$ runs is $2\binom{n-1}{r-1}$ by Fact 2.
Example 1. Let $S = 01100010011111001100011101010000$ be a binary sequence of length 32 having 15 runs. Then $(x_1, x_2, \ldots, x_{15}) = (1, 2, 3, 1, 2, 5, 2, 2, 3, 3, 1, 1, 1, 1, 4)$ and $x_1 + x_2 + \cdots + x_{15} = 32$.
Probabilities are calculated in a similar way as in [16]. The main difference is that, in the previous approach, sequences are viewed in a circular form and the probabilities depend on the weight of the sequence and the parity of the number of runs. We calculate the probabilities with the above notation, which is not based on the circular form, and they depend on the number of runs and the numbers of shorter runs.
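The run decomposition of Example 1 can be reproduced with a short Python sketch. The helper name `run_lengths` is illustrative only; it relies on `itertools.groupby`, which groups consecutive equal items and therefore yields exactly the maximal runs.

```python
from itertools import groupby


def run_lengths(bits):
    """Return the lengths x_1, ..., x_r of the maximal runs of `bits`, in order."""
    return [sum(1 for _ in group) for _, group in groupby(bits)]


s = "01100010011111001100011101010000"
x = run_lengths(s)
print(x)       # [1, 2, 3, 1, 2, 5, 2, 2, 3, 3, 1, 1, 1, 1, 4]
print(len(x))  # 15
print(sum(x))  # 32
```

The representation deliberately forgets whether a run consists of 0's or 1's, which is why each solution of the equation corresponds to two sequences.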
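Theorem 2. Let $S = s_1, s_2, \ldots, s_n$ be a binary sequence of length $n$ having a total of $r$ runs; then
$$\Pr(R = r) = \frac{\binom{n-1}{r-1}}{2^{n-1}}.$$
Proof. We can illustrate the sequence of length $n$, having $r$ runs, as follows:
$$x_1 + x_2 + \cdots + x_r = n, \quad x_i \ge 1.$$
From Fact 2 the number of all binary sequences $S = s_1, s_2, \ldots, s_n$ of length $n$ having a total of $r$ runs is $2\binom{n-1}{r-1}$. Since there are $2^n$ sequences, the probability that a randomly chosen sequence has exactly $r$ runs is
$$\Pr(R = r) = \frac{2\binom{n-1}{r-1}}{2^n} = \frac{\binom{n-1}{r-1}}{2^{n-1}}.$$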
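As a sanity check of Theorem 2, the count $2\binom{n-1}{r-1}$ can be compared with an exhaustive enumeration for a small length. The sketch below is an illustration under that assumption (here $n = 10$); the function names are hypothetical.

```python
from itertools import groupby, product
from math import comb


def total_runs(bits):
    """Number of maximal runs in a bit sequence."""
    return sum(1 for _ in groupby(bits))


def pr_total_runs(n, r):
    """Pr(R = r) for a uniformly random n-bit sequence (Theorem 2)."""
    return comb(n - 1, r - 1) / 2 ** (n - 1)


def brute_force_run_counts(n):
    """counts[r] = number of n-bit sequences with exactly r runs."""
    counts = [0] * (n + 1)
    for bits in product((0, 1), repeat=n):
        counts[total_runs(bits)] += 1
    return counts


n = 10
observed = brute_force_run_counts(n)
for r in range(1, n + 1):
    assert observed[r] == 2 * comb(n - 1, r - 1)

# The probabilities over r = 1, ..., n form a distribution.
print(sum(pr_total_runs(n, r) for r in range(1, n + 1)))  # 1.0
```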

Number of Runs of Length One.
In this section, the probability that an $n$-bit sequence has $r_1$ runs of length one is derived by a combinatorial approach. We use the illustration defined in Section 3.1 to compute the number of sequences having a total of $r$ runs, $r_1$ of which are of length one, and hence we calculate the probabilities. Then, in the next section, we state the first new run test, which depends on the idea of Golomb's second postulate.
Theorem 3. The probability that a randomly chosen binary sequence $S = s_1, s_2, \ldots, s_n$ of length $n$ has a total of $r$ runs, $r_1$ of which are runs of length one, is
$$\Pr(R = r, R_1 = r_1) = \frac{\binom{r}{r_1}\binom{n-r-1}{r-r_1-1}}{2^{n-1}}.$$
Proof. As in the proof of Theorem 2, we illustrate the sequence as follows:
$$x_1 + x_2 + \cdots + x_r = n, \quad x_i \ge 1. \tag{9}$$
Let us first assume that the last $r_1$ runs are the runs of length one and the rest are of length at least two. That is,
$$x_1 + x_2 + \cdots + x_{r-r_1} = n - r_1, \quad x_i \ge 2. \tag{10}$$
Notice that, here, $x_i \ge 2$, so we use the change of variable $y_i = x_i - 2$ for $i = 1, 2, \ldots, r - r_1$ and obtain
$$y_1 + y_2 + \cdots + y_{r-r_1} = n - 2r + r_1, \quad y_i \ge 0. \tag{11}$$
The number of sequences satisfying the conditions stated above is equal to the number of nonnegative solutions of (11). Consequently, by Fact 1, the number of desired solutions is $\binom{n-r-1}{r-r_1-1}$. Selecting which $r_1$ of the $r$ runs have length one gives a factor of $\binom{r}{r_1}$. Since each positive integer solution of (9) corresponds to two sequences (one starts with 1; the other starts with 0), a factor of 2 also appears. Therefore, the number of all binary sequences of length $n$ having a total of $r$ runs, $r_1$ of which are of length one, is equal to $2\binom{r}{r_1}\binom{n-r-1}{r-r_1-1}$. Hence the probability that a randomly chosen sequence has exactly $r$ runs, $r_1$ of which are of length one, is
$$\Pr(R = r, R_1 = r_1) = \frac{2\binom{r}{r_1}\binom{n-r-1}{r-r_1-1}}{2^n} = \frac{\binom{r}{r_1}\binom{n-r-1}{r-r_1-1}}{2^{n-1}}.$$
The number of sequences having $r$ runs, $r_1$ of which are of length one, can be found using the formula above. Our aim is to compute the total number of sequences of length $n$ having $r_1$ runs of length one without depending on the total number of runs. In order to compute these probabilities we use Corollary 4.
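The count in Theorem 3 can be cross-checked against exhaustive enumeration for a small length. The sketch below is an illustration, not the authors' Algorithm 1; `count_r_r1` is a hypothetical name, and the explicit guards for the degenerate cases (every run of length one, or too few bits left for the longer runs) are an added assumption, since the binomial formula is not meaningful there.

```python
from collections import Counter
from itertools import groupby, product
from math import comb


def count_r_r1(n, r, r1):
    """Number of n-bit sequences with r runs, exactly r1 of them of length one."""
    long_runs = r - r1                      # runs of length >= 2
    if long_runs < 0:
        return 0
    if long_runs == 0:                      # only the two alternating sequences
        return 2 if n == r else 0
    if n - r1 < 2 * long_runs:              # not enough bits for the long runs
        return 0
    return 2 * comb(r, r1) * comb(n - r - 1, long_runs - 1)


def brute_force(n):
    """Tally (total runs, runs of length one) over all n-bit sequences."""
    tally = Counter()
    for bits in product((0, 1), repeat=n):
        lengths = [sum(1 for _ in g) for _, g in groupby(bits)]
        tally[(len(lengths), lengths.count(1))] += 1
    return tally


n = 8
tally = brute_force(n)
for r in range(1, n + 1):
    for r1 in range(r + 1):
        assert tally[(r, r1)] == count_r_r1(n, r, r1)

print(count_r_r1(8, 4, 2))  # 36
```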

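Corollary 4. Let $N_1(r_1)$ denote the number of sequences of length $n$ with exactly $r_1$ runs of length one. Then,
$$N_1(r_1) = \sum_{r} 2\binom{r}{r_1}\binom{n-r-1}{r-r_1-1},$$
where the sum runs over all admissible total numbers of runs $r$. Since the number of all sequences of length $n$ is $2^n$, the probabilities follow immediately:
$$\Pr(R_1 = r_1) = \frac{N_1(r_1)}{2^n}.$$
Moreover, using Algorithm 1 we calculate the probabilities for a sequence of length $n$ with $r_1$ runs of length one, so that we can investigate the number of runs of length one independently.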
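A minimal sketch of the marginal computation described in Corollary 4 follows. It is not the paper's Algorithm 1, whose listing is not reproduced here; it simply sums the Theorem 3 counts over all total run counts $r$, with the degenerate cases handled by explicit guards (an assumption of this sketch). Exact rational arithmetic is used so that the probabilities can be checked to sum to one.

```python
from fractions import Fraction
from math import comb


def count_r_r1(n, r, r1):
    """Theorem 3 count, with guards for the degenerate cases."""
    long_runs = r - r1
    if long_runs < 0:
        return 0
    if long_runs == 0:
        return 2 if n == r else 0
    if n - r1 < 2 * long_runs:
        return 0
    return 2 * comb(r, r1) * comb(n - r - 1, long_runs - 1)


def count_r1(n, r1):
    """N_1(r_1): number of n-bit sequences with exactly r1 runs of length one."""
    return sum(count_r_r1(n, r, r1) for r in range(1, n + 1))


def pr_r1(n, r1):
    """Pr(R_1 = r_1) as an exact fraction."""
    return Fraction(count_r1(n, r1), 2 ** n)


n = 128
distribution = [pr_r1(n, r1) for r1 in range(n + 1)]
mean = sum(r1 * p for r1, p in enumerate(distribution))
print(sum(distribution) == 1, float(mean))
```

For $n = 128$ the mean comes out as 32.5, which agrees with a direct linearity-of-expectation argument: each of the 126 interior bits is a run of length one with probability 1/4 and each of the two end bits with probability 1/2.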
After finding the exact probabilities we calculate the subinterval probabilities. The following example shows the calculation of the subinterval probabilities for 128-bit sequences.
In the same way we calculate the subinterval probabilities for different block lengths. All subinterval probabilities for the runs of length one test can be seen in Table 2.
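One possible way to form five subintervals of roughly equal probability is sketched below for 128-bit blocks. The greedy cut rule used here (close each subinterval where the cumulative probability is nearest to a multiple of 1/5) is an assumption of this sketch; the boundaries it produces are not claimed to coincide with those of Table 2, and the function names are hypothetical.

```python
from math import comb


def count_r_r1(n, r, r1):
    """Theorem 3 count, with guards for the degenerate cases."""
    long_runs = r - r1
    if long_runs < 0:
        return 0
    if long_runs == 0:
        return 2 if n == r else 0
    if n - r1 < 2 * long_runs:
        return 0
    return 2 * comb(r, r1) * comb(n - r - 1, long_runs - 1)


def pmf_runs_of_length_one(n):
    """pmf[r1] = Pr(R_1 = r1) for uniformly random n-bit sequences."""
    return [
        sum(count_r_r1(n, r, r1) for r in range(1, n + 1)) / 2 ** n
        for r1 in range(n + 1)
    ]


def five_subintervals(pmf, bins=5):
    """Return (low, high, probability) triples covering 0..len(pmf)-1."""
    cdf, total = [], 0.0
    for p in pmf:
        total += p
        cdf.append(total)
    uppers, start = [], 0
    for k in range(1, bins):
        stop = len(pmf) - (bins - k)        # leave room for the remaining bins
        cut = min(range(start, stop), key=lambda j: abs(cdf[j] - k / bins))
        uppers.append(cut)
        start = cut + 1
    uppers.append(len(pmf) - 1)
    lowers = [0] + [u + 1 for u in uppers[:-1]]
    return [(lo, hi, sum(pmf[lo:hi + 1])) for lo, hi in zip(lowers, uppers)]


subintervals = five_subintervals(pmf_runs_of_length_one(128))
for low, high, probability in subintervals:
    print(f"{low:3d} .. {high:3d}  {probability:.6f}")
```

Because the distribution is discrete, the five probabilities can only approximate 0.2; the exact subinterval probabilities, not the nominal value 0.2, are what should enter the $\chi^2$ statistic.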
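For a sequence of length 8 with 4 runs whose last two runs, $x_3$ and $x_4$, are the runs of length one, the above construction gives $2\binom{8-4-1}{4-2-1} = 6$ different sequences. Selecting which two of the four runs have length one, instead of fixing $x_3$ and $x_4$, gives a factor of $\binom{4}{2}$. Hence, the total number of sequences of length 8 with 4 runs, 2 of which are of length one, is $2 \cdot \binom{8-4-1}{4-2-1} \cdot \binom{4}{2} = 36$.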

Number of Runs of Length Two.
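In this section, we calculate the number of sequences having $r_2$ runs of length two by a combinatorial approach. As in the previous section, we use the notation and ideas of Section 3.1 to compute the number of sequences having a total of $r$ runs, $r_2$ of which are of length two, and hence we calculate the probabilities. After that, using these calculations, we state the second new run test.
Theorem 7. The probability that a randomly chosen binary sequence of length $n$ has a total of $r$ runs, $r_1$ of which are of length one and $r_2$ of which are of length two, is
$$\Pr(R = r, R_1 = r_1, R_2 = r_2) = \frac{\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{n-2r+r_1-1}{r-r_1-r_2-1}}{2^{n-1}}.$$
Proof. As in Theorems 2 and 3, we illustrate the sequence as follows:
$$x_1 + x_2 + \cdots + x_r = n, \quad x_i \ge 1. \tag{21}$$
Let us first assume that the last $r_1$ runs are of length one and the $r_2$ runs before them are of length two. The rest are of length at least three. That is,
$$x_1 + x_2 + \cdots + x_{r-r_1-r_2} = n - r_1 - 2r_2, \quad x_i \ge 3. \tag{22}$$
Notice that here $x_i \ge 3$. We use the change of variables $y_i = x_i - 3$ for $i = 1, 2, \ldots, r - (r_1 + r_2)$ and obtain
$$y_1 + y_2 + \cdots + y_{r-r_1-r_2} = n - 3r + 2r_1 + r_2, \quad y_i \ge 0. \tag{23}$$
The number of sequences satisfying the conditions stated above is equal to the number of nonnegative solutions of (23). Consequently, by Fact 1, the number of desired solutions is $\binom{n-2r+r_1-1}{r-r_1-r_2-1}$. Selecting the $r_1$ runs of length one and the $r_2$ runs of length two gives a factor of $\binom{r}{r_1}\binom{r-r_1}{r_2}$. Since each positive integer solution of (21) corresponds to two sequences (one starts with 1, the other starts with 0), a factor of 2 also appears. Therefore, the number of all binary sequences of length $n$ having a total of $r$ runs, $r_1$ and $r_2$ of which are of lengths one and two, respectively, is equal to
$$2\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{n-2r+r_1-1}{r-r_1-r_2-1}.$$
Hence the probability that a randomly chosen sequence satisfies the above conditions is
$$\Pr(R = r, R_1 = r_1, R_2 = r_2) = \frac{\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{n-2r+r_1-1}{r-r_1-r_2-1}}{2^{n-1}}.$$
We find the number of sequences having $r$ runs, $r_1$ and $r_2$ of which are of lengths one and two, respectively, using the formula above. In order to define the second new run test, we need the number of sequences of length $n$ having $r_2$ runs of length two, without depending on the other variables, namely, the total number of runs and the number of runs of length one. Corollary 8 enables us to compute the probabilities that are needed for defining the new statistical test.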
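Corollary 8. Let $N_2(r_2)$ denote the number of sequences of length $n$ with exactly $r_2$ runs of length two. Clearly, we have at most $\lfloor n/2 \rfloor$ runs of length two; otherwise the sequence length exceeds $n$. Then, for $r_2 = 0, 1, 2, \ldots, \lfloor n/2 \rfloor$,
$$N_2(r_2) = \sum_{r}\sum_{r_1} 2\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{n-2r+r_1-1}{r-r_1-r_2-1},$$
where the sums run over all admissible values of $r$ and $r_1$. Since the number of all sequences of length $n$ is $2^n$, the probabilities follow immediately:
$$\Pr(R_2 = r_2) = \frac{N_2(r_2)}{2^n}.$$
Algorithm 2 also enables the calculation of the number of sequences with the desired conditions. Furthermore, the subinterval probabilities can be stated in the same way as in Example 5. The subinterval probabilities can be seen in Table 3.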
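The joint count of Theorem 7 and the marginal of Corollary 8 can be sketched and checked against brute force as follows. This is not the paper's Algorithm 2; the function names are hypothetical, and the guards for the degenerate cases (no run of length three or more, or too few bits for such runs) are assumptions added so that the binomial coefficients are always well defined.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product
from math import comb


def count_r_r1_r2(n, r, r1, r2):
    """n-bit sequences with r runs, r1 of length one and r2 of length two."""
    k = r - r1 - r2                 # runs of length >= 3
    rest = n - r1 - 2 * r2          # bits left for those runs
    if k < 0 or rest < 3 * k:
        return 0
    ways = 2 * comb(r, r1) * comb(r - r1, r2)
    if k == 0:
        return ways if rest == 0 else 0
    return ways * comb(rest - 2 * k - 1, k - 1)   # equals C(n-2r+r1-1, k-1)


def count_r2(n, r2):
    """N_2(r_2): sum of the joint counts over r and r1."""
    return sum(
        count_r_r1_r2(n, r, r1, r2)
        for r in range(1, n + 1)
        for r1 in range(r + 1)
    )


# Brute-force check for a small length.
small_n = 10
tally = Counter()
for bits in product((0, 1), repeat=small_n):
    lengths = [sum(1 for _ in g) for _, g in groupby(bits)]
    tally[lengths.count(2)] += 1
for r2 in range(small_n // 2 + 1):
    assert tally[r2] == count_r2(small_n, r2)

# Exact distribution for 64-bit sequences.
n = 64
distribution = [Fraction(count_r2(n, r2), 2 ** n) for r2 in range(n // 2 + 1)]
print(sum(distribution) == 1)  # True
```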

Number of Runs of Length Three.
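In the last part of this section, we focus on the number of sequences having exactly $r_3$ runs of length three. We use the same constructions as in the previous sections.
Algorithm 2: Calculating $\Pr(R_2 = r_2)$ for $r_2 = 1, 2, \ldots, \lfloor n/2 \rfloor$.
Theorem 9. The probability that a randomly chosen binary sequence of length $n$ has a total of $r$ runs, of which $r_1$, $r_2$, and $r_3$ are of lengths one, two, and three, respectively, is
$$\Pr(R = r, R_1 = r_1, R_2 = r_2, R_3 = r_3) = \frac{\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{r-r_1-r_2}{r_3}\binom{n-3r+2r_1+r_2-1}{r-r_1-r_2-r_3-1}}{2^{n-1}}.$$
Proof. As in Theorems 2, 3, and 7, we illustrate the sequence as follows:
$$x_1 + x_2 + \cdots + x_r = n, \quad x_i \ge 1.$$
Let us first assume that the last $r_1$ runs are of length 1, the $r_2$ runs before them are of length 2, and the $r_3$ runs before those are of length 3. The rest are of length at least four. Consider
$$x_1 + x_2 + \cdots + x_{r-r_1-r_2-r_3} = n - r_1 - 2r_2 - 3r_3, \quad x_i \ge 4.$$
Notice that $x_i \ge 4$, and we use the change of variables $y_i = x_i - 4$ for $i = 1, 2, \ldots, r - (r_1 + r_2 + r_3)$ to obtain
$$y_1 + y_2 + \cdots + y_{r-r_1-r_2-r_3} = n - 4r + 3r_1 + 2r_2 + r_3, \quad y_i \ge 0. \tag{32}$$
The number of sequences satisfying the conditions stated above is equal to the number of nonnegative solutions of (32). Consequently, by Fact 1, the number of desired solutions is $\binom{n-3r+2r_1+r_2-1}{r-r_1-r_2-r_3-1}$. Selecting the runs of lengths one, two, and three gives a factor of $\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{r-r_1-r_2}{r_3}$, and the choice of the first bit gives a factor of 2. Therefore, the number of all binary sequences of length $n$ with the conditions stated above is
$$2\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{r-r_1-r_2}{r_3}\binom{n-3r+2r_1+r_2-1}{r-r_1-r_2-r_3-1}.$$
Hence, the probability that a randomly chosen sequence satisfies these conditions is
$$\Pr(R = r, R_1 = r_1, R_2 = r_2, R_3 = r_3) = \frac{\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{r-r_1-r_2}{r_3}\binom{n-3r+2r_1+r_2-1}{r-r_1-r_2-r_3-1}}{2^{n-1}}.$$
We find the number of sequences having $r$ runs, $r_1$, $r_2$, and $r_3$ of which are of lengths one, two, and three, using the formula above. In order to use the probabilities in tests we need the number of sequences of length $n$ with $r_3$ runs of length three, without depending on the other variables. Corollary 10 enables us to compute the probabilities that are needed for defining the new statistical test.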
Corollary 10. Let $N_3(r_3)$ denote the number of sequences of length $n$ with exactly $r_3$ runs of length three. Clearly, we have at most $\lfloor n/3 \rfloor$ runs of length three, since for $r_3 > \lfloor n/3 \rfloor$ the sequence length would exceed $n$. Then, for $r_3 = 0, 1, 2, \ldots, \lfloor n/3 \rfloor$,
$$N_3(r_3) = \sum_{r}\sum_{r_1}\sum_{r_2} 2\binom{r}{r_1}\binom{r-r_1}{r_2}\binom{r-r_1-r_2}{r_3}\binom{n-3r+2r_1+r_2-1}{r-r_1-r_2-r_3-1},$$
where the sums run over all admissible values of $r$, $r_1$, and $r_2$. Since the number of all sequences of length $n$ is $2^n$, the probabilities follow immediately: $\Pr(R_3 = r_3) = N_3(r_3)/2^n$. Algorithm 3 enables the calculation of the number of sequences of length $n$ with $r_3$ runs of length three, and hence the subinterval probabilities can be stated in the same way as in Example 5. The subinterval probabilities can be seen in Table 4.
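The same pattern extends to runs of length three. The sketch below is an illustration rather than the paper's Algorithm 3: it sums the Theorem 9 counts over $r$, $r_1$, and $r_2$, again with explicit guards for degenerate cases (an assumption of this sketch), and compares the result with exhaustive enumeration for 12-bit sequences. The extra nested sum for each additional run length is what makes the computation grow quickly.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product
from math import comb


def count_joint(n, r, r1, r2, r3):
    """n-bit sequences with r runs, of which r1, r2, r3 have lengths 1, 2, 3."""
    k = r - r1 - r2 - r3                    # runs of length >= 4
    rest = n - r1 - 2 * r2 - 3 * r3         # bits left for those runs
    if k < 0 or rest < 4 * k:
        return 0
    ways = 2 * comb(r, r1) * comb(r - r1, r2) * comb(r - r1 - r2, r3)
    if k == 0:
        return ways if rest == 0 else 0
    return ways * comb(rest - 3 * k - 1, k - 1)   # C(n-3r+2r1+r2-1, k-1)


def count_r3(n, r3):
    """N_3(r_3): sum of the joint counts over r, r1 and r2."""
    return sum(
        count_joint(n, r, r1, r2, r3)
        for r in range(1, n + 1)
        for r1 in range(r + 1)
        for r2 in range(r - r1 + 1)
    )


# Brute-force check for a small length.
small_n = 12
tally = Counter()
for bits in product((0, 1), repeat=small_n):
    lengths = [sum(1 for _ in g) for _, g in groupby(bits)]
    tally[lengths.count(3)] += 1
for r3 in range(small_n // 3 + 1):
    assert tally[r3] == count_r3(small_n, r3)

# Exact distribution for 32-bit sequences.
n = 32
distribution = [Fraction(count_r3(n, r3), 2 ** n) for r3 in range(n // 3 + 1)]
print(sum(distribution) == 1)  # True
```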
In this section we formulated the exact numbers of sequences with the given conditions and hence gave the corresponding probabilities. As we mentioned before, calculating the probabilities for the number of runs of length more than three is impractical. The probabilities can be stated theoretically in the same way. However, the time consumed by the algorithms that find the exact values grows exponentially. Therefore, it is inconvenient to use them in test suites.

Tests Descriptions
Golomb's first postulate is about the weight of a sequence, and in many test suites the postulate is implemented with a proper generalization. On the other hand, the second postulate, which is about the runs of a sequence, is mostly implemented according to the total number of runs regardless of their lengths. In this section, we define three new statistical tests as a proper generalization of Golomb's second postulate: the runs of length one test, the runs of length two test, and the runs of length three test. The subjects of the new run tests are $R_1$, $R_2$, and $R_3$, as their names state.
We test the null hypothesis ($H_0$), which states that the sequence is randomly produced. There are two types of errors, called type I and type II errors. A type I error occurs when the data is random and $H_0$ is rejected; a type II error occurs when the data is nonrandom and $H_0$ is accepted. The probability of a type I error is called the level of significance and is denoted by $\alpha$. A statistical test evaluates the sequence against this predefined number $\alpha$. If the p value produced by the statistical test is greater than $\alpha$, then $H_0$ is accepted. The level of significance is decided based on the application. We set $\alpha$ to 0.01, as in many test suites.
We use $\chi^2$ as the reference distribution. The measurements are compared with the expected values. In order to make the comparison we divide the possible numbers of runs of lengths one, two, and three into subintervals, as explained in Section 3. The new tests use five subintervals with boundaries $a_i$ chosen so that $\Pr_i = \Pr(a_i \le R_k < a_{i+1}) \approx 0.2$. For example, the probabilities of 128-bit sequences for the runs of length two test can be divided into 5 subintervals as given in Table 3. After calculating the subinterval probabilities, we count the number of runs of length $k$ in each of the $t$ different sequences and increment the corresponding subinterval counter by one according to the counted number of runs. We use $F_i$ to denote the number of sequences in the $i$th subinterval, and $t$ denotes the number of sequences. Before the last step we calculate $\chi^2$ using the following formula [16]:
$$\chi^2 = \sum_{i=1}^{5} \frac{(F_i - t \cdot \Pr_i)^2}{t \cdot \Pr_i}.$$
Lastly, the p value is calculated from $\chi^2$ with $5 - 1 = 4$ degrees of freedom:
$$p\ \text{value} = \operatorname{igamc}\left(\frac{5-1}{2}, \frac{\chi^2}{2}\right). \tag{40}$$
We test $H_0$ by comparing the produced p value with the level of significance $\alpha$ and accept or reject $H_0$. That is, if the p value is greater than $\alpha$, $H_0$ is accepted; otherwise it is rejected.
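The last two steps can be sketched as follows. The sketch assumes five subintervals, hence four degrees of freedom, for which the complemented incomplete gamma function has the closed form $\operatorname{igamc}(2, x) = e^{-x}(1 + x)$, so no special-function library is needed. The observed counts in the usage lines are made-up illustration values, not results from the paper.

```python
from math import exp


def chi_square(observed, probabilities):
    """Pearson statistic: sum of (F_i - t*Pr_i)^2 / (t*Pr_i) over the bins."""
    t = sum(observed)                       # number of tested sequences
    return sum(
        (f - t * p) ** 2 / (t * p)
        for f, p in zip(observed, probabilities)
    )


def p_value_five_bins(chi2):
    """igamc(2, chi2 / 2), the upper tail of chi-square with 4 degrees of freedom."""
    x = chi2 / 2
    return exp(-x) * (1 + x)


# Hypothetical counts for 1000 blocks and nominal probabilities of 0.2 each.
observed = [200, 210, 190, 205, 195]
probabilities = [0.2] * 5
chi2 = chi_square(observed, probabilities)
p = p_value_five_bins(chi2)
alpha = 0.01
print(chi2, p, "accept H0" if p > alpha else "reject H0")
```

With a different number of subintervals the closed form no longer applies and a general incomplete gamma routine would be required.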
The new tests can be implemented on sequences of length $n = m \cdot 25$ (where $m$ is the block size). This number is a direct consequence of creating subintervals: in order to get reliable results, we need at least 5 blocks in each of the 5 subintervals. In the NIST test suite it is suggested that the sequences should be about 20,000 bits long. Therefore, the new run tests can also be implemented on short sequences.
Definition 11. Let $S = s_0, s_1, \ldots, s_{n-1}$ be a binary sequence. The derivative $\Delta S = \Delta_0, \Delta_1, \ldots, \Delta_{n-1}$ of $S$ is defined, for $i = 0, 1, \ldots, n - 1$, by $\Delta_i = s_i \oplus s_{i+1}$ for $i < n - 1$ and $\Delta_{n-1} = 1$.
Counting the runs of a sequence directly from the definition of a run is impractical, so we use the derivative of the sequence to count the runs. By the definition, every 1 in the derivative of a sequence indicates the end of a run. So the number of runs of a sequence can be defined as the weight of its derivative.
We also use a variation of the derivative, $\Delta' S$, of length $n + 1$, obtained by adding a 1 at the beginning of the sequence $\Delta S$. The variation of the derivative is an important part of the newly defined run tests, since the number of runs of each length is determined from this sequence.
Remark 12. Let $S = s_0, s_1, \ldots, s_{n-1}$ be a binary sequence and let the derivative of $S$ be denoted by $\Delta S = \Delta_0, \Delta_1, \ldots, \Delta_{n-1}$. Then $\Delta' S = \Delta'_0, \Delta'_1, \ldots, \Delta'_n$ is defined as follows: $\Delta'_0 = 1$ and $\Delta'_i = \Delta_{i-1}$ for $i = 1, 2, \ldots, n$. In order to count the run at the beginning of the sequence, we use this variation instead of the original derivative. The number of runs of length one in a sequence is given by the number of overlapping occurrences of 11 in its variation of derivative. In the same way, the numbers of runs of lengths 2 and 3 in a sequence are given by the numbers of overlapping occurrences of 101 and 1001, respectively. More generally, the number of runs of length $k$ is given by the number of overlapping occurrences of $1\underbrace{00 \cdots 0}_{k-1}1$.
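The counting rule of Remark 12 can be sketched in Python as follows. The sketch assumes that the derivative is the XOR of adjacent bits with a final 1 appended to mark the end of the last run, which is the reading consistent with "the number of runs equals the weight of the derivative"; the function names are hypothetical.

```python
def derivative_variant(bits):
    """Variation of the derivative: a leading 1 followed by the derivative.

    Assumed derivative: XOR of adjacent bits, plus a final 1 that closes
    the last run, so that every 1 marks the end of a run.
    """
    derivative = [a ^ b for a, b in zip(bits, bits[1:])] + [1]
    return [1] + derivative


def count_runs_of_length(bits, k):
    """Count overlapping occurrences of 1 0^(k-1) 1 in the variation."""
    variant = derivative_variant(bits)
    pattern = [1] + [0] * (k - 1) + [1]
    return sum(
        1
        for i in range(len(variant) - k)
        if variant[i:i + k + 1] == pattern
    )


s = [int(c) for c in "01100010011111001100011101010000"]
print(sum(derivative_variant(s)) - 1)                    # total runs: 15
print([count_runs_of_length(s, k) for k in (1, 2, 3)])   # [6, 4, 3]
```

Occurrences must be counted with overlap because the 1 that closes one run is also the 1 that opens the pattern for the next run.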
Example 13. Let $S = 01100010011111001100011101010000$ be a binary sequence of length 32 having 15 runs, 6 runs of length one, 4 runs of length two, and 3 runs of length three. Then
$$\Delta S = 10100110100001010100100111110001, \quad \Delta' S = 110100110100001010100100111110001.$$
Before defining the new statistical tests, we give the general idea of the tests in the following example.
Example 14. Let $S$ be a binary sequence of length $2^{21}$. Let $F_i$ and $\Pr_i$ be the number of sequences in the $i$th subinterval and the probability of that subinterval, respectively.
Step 1. Choose a block size $m$. In our example we choose $m = 128$.
Step 2. Divide $S$ into $2^{21}/128 = 2^{14}$ blocks $S_1, S_2, \ldots, S_{2^{14}}$ of length 128.
Step 3. For each $S_j$ count the number of runs of lengths one, two, and three, and increment the corresponding boxes by 1.
Step 4. Then, we get Table 5. The count rows of each test correspond to the number of sequences whose number of runs of length one, two, or three is in the given interval.
Step 5. $\chi^2$ is calculated by the given formula and the p value is computed accordingly.
Step 6. Finally, we get the p value for each test.
With the three new run tests we implement the idea of Golomb's second postulate in statistical randomness tests. The new run tests, concerning runs of lengths one, two, and three, constitute a better generalization of Golomb's idea.
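The steps above can be put together in a compact sketch of the runs of length one test. It is an illustration under several assumptions that are not taken from the paper: the subinterval boundaries are produced by a greedy cut near cumulative probabilities 0.2, 0.4, 0.6, and 0.8 rather than read from Table 2, the input is generated with Python's `random` module, only 1024 blocks are used, and all function names are hypothetical.

```python
import random
from math import comb, exp


def count_r_r1(n, r, r1):
    """Theorem 3 count, with guards for the degenerate cases."""
    long_runs = r - r1
    if long_runs < 0:
        return 0
    if long_runs == 0:
        return 2 if n == r else 0
    if n - r1 < 2 * long_runs:
        return 0
    return 2 * comb(r, r1) * comb(n - r - 1, long_runs - 1)


def make_bins(pmf, bins=5):
    """Upper edges and exact probabilities of roughly equiprobable bins."""
    cdf, total = [], 0.0
    for p in pmf:
        total += p
        cdf.append(total)
    uppers, start = [], 0
    for k in range(1, bins):
        stop = len(pmf) - (bins - k)
        cut = min(range(start, stop), key=lambda j: abs(cdf[j] - k / bins))
        uppers.append(cut)
        start = cut + 1
    uppers.append(len(pmf) - 1)
    lowers = [0] + [u + 1 for u in uppers[:-1]]
    return uppers, [sum(pmf[lo:hi + 1]) for lo, hi in zip(lowers, uppers)]


def runs_of_length_one(block):
    """Bits that differ from every existing neighbour form runs of length one."""
    m = len(block)
    return sum(
        1
        for i in range(m)
        if (i == 0 or block[i - 1] != block[i])
        and (i == m - 1 or block[i + 1] != block[i])
    )


def runs_of_length_one_test(bits, m=128):
    """Return (bin counts, chi-square, p value) for consecutive m-bit blocks."""
    pmf = [sum(count_r_r1(m, r, r1) for r in range(1, m + 1)) / 2 ** m
           for r1 in range(m + 1)]
    uppers, probabilities = make_bins(pmf)
    blocks = [bits[i:i + m] for i in range(0, len(bits) - m + 1, m)]
    observed = [0] * len(probabilities)
    for block in blocks:
        r1 = runs_of_length_one(block)
        observed[next(i for i, u in enumerate(uppers) if r1 <= u)] += 1
    t = len(blocks)
    chi2 = sum((f - t * p) ** 2 / (t * p)
               for f, p in zip(observed, probabilities))
    return observed, chi2, exp(-chi2 / 2) * (1 + chi2 / 2)


rng = random.Random(1)
bits = [rng.getrandbits(1) for _ in range(128 * 1024)]
observed, chi2, p = runs_of_length_one_test(bits)
print(observed, chi2, p)
```

A strictly alternating input such as `[0, 1] * 65536` places every block in the last subinterval, so the p value collapses to zero, which is the kind of deviation the test is meant to expose.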

Implementations
In order to check the reliability of the tests stated in the previous section, we implement the new tests together with well-known statistical tests included in the NIST test suite.
In the first part of the experiments we select 5 encryption algorithms, the Advanced Encryption Standard finalists MARS [18], RC6 [19], Rijndael [20], Serpent [21], and Twofish [22]. $2^{16}$ pseudorandom sequences of length 128 are generated by encrypting noncorrelated data with these algorithms. In other words, in the first experiment we test the outputs of the AES finalists using our tests and the NIST test suite. The new run tests are implemented on $2^{14}$ pseudorandom sequences of length 128 as described in the previous section, and NIST's tests are implemented on a binary sequence of length $2^{21}$ obtained by concatenating the outputs of the algorithms. The results can be seen in Table 6.
In the second part of the experiments, we use the binary expansions of $e$, $\pi$, and $\sqrt{2}$. The binary expansions can be found within the NIST test suite. As in the first part, we also use well-known tests that are included in the NIST test suite. We collect the first $2^{19}$ bits of the binary expansions. In order to apply the new run tests, the collected long sequence is divided into 128-bit blocks; hence we get $2^{12}$ sequences of length 128. Using the second implementation we show the performance of the new run tests. The test results can be seen in Table 7.
In the last part of the experiments, we analyse the sensitivity of the new run tests. In order to do so, we first need to generate a nonrandom sequence.
Definition 15. Let $S$ be a binary sequence of length $n$ whose $i$th element is denoted by $s_i$; then the bias $\varepsilon$ measures the deviation of $\Pr(s_i = 1)$ from $1/2$. Clearly, in a true random sequence we expect the bias to be 0, that is, $\Pr(s_i = 1) = \Pr(s_i = 0) = 1/2$. Moreover, this is the main idea of Golomb's first postulate. To generate a nonrandom sequence we need to increase the bias. Finally, using Algorithm 7 we can generate a nonrandom sequence.
Example 16. Let $R = r_0, r_1, \ldots, r_{n-1}$ be a random sequence with $0 \le r_i \le 1$ for $i = 0, 1, \ldots, n - 1$; from this sequence we construct a binary sequence with bias 0.05.
In the last part of the experiments we generate nonrandom data with different biases using the above construction. We observe the behaviour of the new run tests with respect to the randomness of a sequence. The last results show the efficiency of the new tests. Moreover, the new run tests can detect deviations in the distributions of runs that other tests cannot. The test results can be seen in Table 8.
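One plausible reading of the construction in Example 16 is sketched below. It assumes that a bias of $\varepsilon$ means $\Pr(s_i = 1) = 1/2 + \varepsilon$ and that each real number $r_i$ is thresholded at $1/2 + \varepsilon$; this is an assumption of the sketch and not necessarily the paper's Algorithm 7, and the function name is hypothetical.

```python
import random


def biased_bits(n, bias, rng):
    """Draw n bits with Pr(bit = 1) = 1/2 + bias by thresholding uniform reals.

    Assumed convention: each uniform value r_i in [0, 1) is mapped to 1
    when r_i < 1/2 + bias and to 0 otherwise.
    """
    threshold = 0.5 + bias
    return [1 if rng.random() < threshold else 0 for _ in range(n)]


rng = random.Random(2024)
bits = biased_bits(100_000, 0.05, rng)
print(sum(bits) / len(bits))  # close to 0.55
```

Such a sequence fails balancedness by construction, and the same bias also shifts the distribution of short runs, which is what the new run tests react to.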

Conclusion
In cryptography, almost all applications use random-looking sequences. Randomness is therefore one of the most important issues for cryptographic algorithms; indeed, using weak random values can enable an adversary to break the whole system.
In all applications, the values used should be of sufficient size and be random, in the sense that the probability of any particular value being chosen should be small enough to prevent an adversary from gaining any specific information. Therefore, the sequences and numbers used as keys in cryptographic algorithms should be pseudorandom and should have good statistical properties. For these reasons statistical randomness is an important topic. Since giving a mathematical proof that a generator is a random bit generator is nearly impossible, statistical tests are defined to detect weaknesses that a generator could have. Hence, they are considered an important part of evaluating the security of cryptographic algorithms.
In this work, we propose three new statistical tests based on Golomb's second postulate. Finding the exact probabilities for the number of runs of lengths one, two, and three enables us to compare the observed values against them. The new run tests can be used in test suites to test the security of algorithms so that Golomb's second postulate is implemented properly. Moreover, these tests can be used as an evaluation tool for short sequences such as outputs of block ciphers and hash functions. These tests can detect deviations in the distribution of runs that cannot be detected by other tests.
We also experiment with some standard encryption algorithms that behave like pseudorandom number generators and with random sequences such as the binary expansions of e, π, and √2. The implementations show that the new statistical tests are consistent with other well-known statistical tests. They also show that, for detecting deviations from randomness in the sense of the distribution of runs, the new statistical tests are more efficient than the other statistical tests.
As future work, we plan to extend the statistical tests so that they approach Golomb's randomness postulates more closely. The correlations among the new statistical tests, and between them and other statistical tests, can also be examined.

(i) The weight of $\Delta S$ is 15, which corresponds to the number of runs.
(ii) The number of overlapping occurrences of 11 in $\Delta S$ is 6, which corresponds to the number of runs of length one. The number of overlapping occurrences of 101 is 4, which corresponds to the number of runs of length two. The number of overlapping occurrences of 1001 is 3, which corresponds to the number of runs of length three.
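The pattern-counting idea above can be checked with a short Python sketch. It assumes the derivative is the XOR of adjacent bits and pads it with a 1 at both ends so that the first and last runs are also delimited; this padding convention and the function names are assumptions for illustration and may differ from the paper's exact definition of the derivative. Under that convention, a run of length k appears as a 1, followed by k − 1 zeros, followed by a 1.

```python
import re

def derivative(bits):
    """XOR of adjacent bits: a 1 marks a position where the sequence changes."""
    return "".join("1" if a != b else "0" for a, b in zip(bits, bits[1:]))

def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    last_start = len(text) - len(pattern)
    return sum(1 for i in range(last_start + 1) if text.startswith(pattern, i))

def runs_of_length(bits, k):
    """Runs of length k in a non-empty bit string, via the padded derivative."""
    padded = "1" + derivative(bits) + "1"      # assumed padding convention
    return count_overlapping(padded, "1" + "0" * (k - 1) + "1")

def runs_of_length_direct(bits, k):
    """Reference count: split into maximal runs and compare their lengths."""
    return sum(1 for m in re.finditer(r"0+|1+", bits) if len(m.group()) == k)

# Small example: 00 11 0 1 0 has two runs of length two and three of length one.
example = "0011010"
print(runs_of_length(example, 1), runs_of_length(example, 2), runs_of_length(example, 3))
```

The direct run-splitting function is included only as a cross-check; the two counts agree on every non-empty bit string, which is why the overlapping patterns 11, 101, and 1001 can stand in for runs of lengths one, two, and three.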

(i) Number of runs of length one test: $p$-value = 0.357056.
(ii) Number of runs of length two test: $p$-value = 0.462207.
(iii) Number of runs of length three test: $p$-value = 0.627001.

Table 2: Interval and probability values for runs of length one for 64-, 128-, 256-, and 512-bit blocks.
The probability of a randomly chosen binary sequence $S = s_1, s_2, \ldots, s_n$ with length $n$, having $r$ runs, $r_1$ of which are of length one and $r_2$ of which are of length two, is
the previous sections to compute the number of sequences having a total of $r$ runs, $r_3$ of which are of length three, and hence we calculate the probabilities. Then, using these calculations, we state the last new statistical test in the next chapter.
Theorem 9. The probability of a randomly chosen binary sequence $S = s_1, s_2, \ldots, s_n$ with length $n$, having $r$ runs, $r_1$ runs of length one, $r_2$ runs of length two, and $r_3$ runs of length three, is

Table 4: Interval and probability values for the runs of length three test for 64-, 128-, and 256-bit blocks.

Table 5: Number of sequences in given intervals for runs of length one test, runs of length two test, and runs of length three test.
4.1. Runs of Length One Test. The subject of the first new run test is the runs of length one in the sequences. The test uses the probabilities calculated in the previous chapter. First, we collect the algorithm's output and generate the data set $\mathcal{S}$. If the given sequence of length $n$ is a long binary sequence, it is divided into $m$-bit blocks, which gives the set $\mathcal{S} = \{S_1, S_2, \ldots, S_t\}$ where $t = \lfloor n/m \rfloor$. In our test $m$ can be 64, 128, 256, or 512. After generating the data set, the set $R_1$ is formed by counting the number of runs of length one in each sequence. In order to find the number of runs of length one, we first find the derivative $\Delta S_i$ of the binary sequence and then count the overlapping occurrences of 11 in $\Delta S_i$ for $i = 1, 2, \ldots, t$. After that we apply the $\chi^2$ goodness of fit test to the values in $R_1$. We propose this new run test to implement the idea of Golomb's second postulate in statistical randomness testing. The pseudocode of the test is given in Algorithm 4.
Algorithm 4: Runs of length one test ($S_1, S_2, \ldots, S_t$). Apply the $\chi^2$ goodness of fit test to the values in $R_1$. return $p$-value.
4.2. Runs of Length Two Test. After giving the first new run test, we define the runs of length two test. The test uses the probabilities calculated in the previous chapter. As in the runs of length one test, we first generate the data set $\mathcal{S}$. In the second test the block size $m$ can also be 64, 128, 256, or 512. From the data set $\mathcal{S}$, the set $R_2$ is formed by counting the number of runs of length two in each sequence. As in the previous test, we take the derivative $\Delta S_i$ of the binary sequence. In order to find the number of runs of length two, we count the overlapping occurrences of 101 in $\Delta S_i$. Then we apply the $\chi^2$ goodness of fit test to the values in $R_2$. The second new run test constitutes another approach to Golomb's second postulate. The pseudocode of the test is given in Algorithm 5.
Algorithm 5: Runs of length two test ($S_1, S_2, \ldots, S_t$). Apply the $\chi^2$ goodness of fit test to the values in $R_2$. return $p$-value.
4.3. Runs of Length Three Test. The last new run test is the runs of length three test. This test also uses the probabilities calculated in the previous chapter. Data sets are created as in the previous run tests. In the last new run test the block size $m$ can be 64, 128, or 256. The set $R_3$ is formed by using $\mathcal{S}$. The counting phase of this test is done by finding the total number of overlapping occurrences of 1001 in $\Delta S_i$. Then we apply the $\chi^2$ goodness of fit test to the values in $R_3$. The pseudocode of the last new run test is given in Algorithm 6.
Algorithm 6: Runs of length three test ($S_1, S_2, \ldots, S_t$). Apply the $\chi^2$ goodness of fit test to the values in $R_3$. return $p$-value.
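The three tests share one pipeline: split the sequence into blocks, count the runs of a fixed length in each block, sort the counts into five intervals, and apply the χ2 goodness of fit test. The Python sketch below illustrates that pipeline under stated assumptions. All function names are hypothetical, the runs are counted directly rather than through the derivative, the interval bounds and probabilities in the usage lines are placeholders (not the values of Tables 2 to 4), and the p-value uses the closed-form upper tail of the chi-square distribution with four degrees of freedom, which is what five fully specified intervals would give.

```python
import math
import random
from bisect import bisect_left

def split_into_blocks(bits, block_len):
    """Consecutive non-overlapping blocks; an incomplete final block is dropped."""
    n_blocks = len(bits) // block_len
    return [bits[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]

def count_runs_of_length(block, k):
    """Number of maximal runs of exactly k equal bits in a non-empty block."""
    count, run = 0, 1
    for prev, cur in zip(block, block[1:]):
        if cur == prev:
            run += 1
        else:
            count += int(run == k)   # a run just ended
            run = 1
    return count + int(run == k)     # close the final run

def run_length_test(bits, block_len, k, upper_bounds, probs):
    """Chi-square goodness of fit on per-block counts of runs of length k.

    upper_bounds: inclusive upper ends of the first four intervals
                  (the fifth interval is open-ended).
    probs: the five interval probabilities (placeholders in this sketch).
    """
    if len(upper_bounds) != 4 or len(probs) != 5:
        raise ValueError("this sketch assumes exactly five intervals")
    blocks = split_into_blocks(bits, block_len)
    observed = [0] * 5
    for block in blocks:
        c = count_runs_of_length(block, k)
        observed[bisect_left(upper_bounds, c)] += 1
    expected = [p * len(blocks) for p in probs]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-square with 4 degrees of freedom: exp(-x/2) * (1 + x/2).
    p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)
    return chi2, p_value

# Usage with placeholder intervals; real values would come from the paper's tables.
rng = random.Random(0)
bits = "".join(rng.choice("01") for _ in range(2 ** 12))
chi2, p_value = run_length_test(bits, 128, 1, [28, 31, 33, 36], [0.2] * 5)
print(chi2, p_value)
```

Because the placeholder intervals are not the tabulated ones, the printed p-value is only a demonstration of the mechanics; a meaningful result requires the exact interval bounds and probabilities for the chosen block size.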

Table 6: Test results for the 128-bit outputs of AES finalists. Two different versions of the serial test in the NIST test suite.

Table 7: Test results for the binary expansions of e, π, and √2. Two different versions of the serial test in the NIST test suite.

Table 8: Test results for nonrandom data sets. Two different versions of the serial test in the NIST test suite.