This paper deals with true random number generators based on ring oscillators, specifically the one proposed by Sunar et al. in 2007 and enhanced by Wold and Tan in 2009. Our mathematical analysis shows that both architectures behave identically when composed of the same number of rings and ideal logic components. However, reducing the number of rings, as proposed by Wold and Tan, inevitably causes a loss of entropy. Unfortunately, this entropy insufficiency is masked by the pseudo-randomness caused by XOR-ing clock signals having different frequencies. Our simulation model shows that a generator using more than 18 ideal jitter-free rings with slightly different frequencies, and thus producing only pseudo-randomness, lets the statistical tests pass. We conclude that a smaller number of rings reduces the security if the entropy reduction is not taken into account in post-processing. Moreover, the designer cannot prevent some of the rings from having the same frequency, which causes a further loss of entropy. To confirm this, we show how an attacker can reach a state where over 25% of the rings are locked and thus completely dependent. This effect can have disastrous consequences for the system security.

True Random Number Generators (TRNGs) are used to generate confidential keys and other critical security parameters (CSPs) in cryptographic modules [

The quality of the generated bit-streams is evaluated using dedicated statistical tests such as FIPS 140-2 [

The TRNGs implemented in reconfigurable devices usually use metastability [

In order to increase the entropy of the generated binary raw signal and to make the generator “provably secure”, Sunar et al. employ a huge number of ROs [

the claimed security level based on a security proof,

easy (almost “push button”) implementation in FPGAs.

Without the security proof, the generator of Sunar et al. can be considered as just one of many existing TRNGs that passes the statistical tests. This security approach is essential for TRNG evaluation according to AIS31 [

the XOR gate is supposed to be infinitely fast in order to maintain the entropy generated in rings;

the rings are supposed to be independent.

Wold and Tan show in [

It is commonly accepted that contrary to the original design of Sunar et al., the modified architecture proposed by Wold and Tan maintains the entropy of the raw binary signal after the XOR gate if the number of rings is unchanged. However, we believe that several other questions are worthy of investigation. The aim of our paper is to find answers to the following questions and to discuss related problems:

Is the security proof of Sunar valid also for the generator of Wold and Tan?

What is the entropy of the generated bitstream after the reduction of the number of rings?

How does security enhancement proposed by Fischer et al. in [

How should the relationship between the rings be taken into account in entropy estimation?

The paper is organized as follows: Section

Ring oscillators are free-running oscillators using logic gates. They are easy to implement in logic devices, notably in Field Programmable Gate Arrays (FPGAs). The oscillator consists of a set of delay elements chained into a ring. The set can contain both inverting and noninverting elements, provided that the number of inverting elements is odd. The period of the signal generated in the RO using ideal components is given by the form

the delay

the delays of interconnections are ignored.
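For the ideal case described here (interconnection delays ignored), a rough Python sketch follows. It assumes the standard textbook relation in which an edge must travel around the ring twice per period, so the period is twice the sum of the element delays. The function name and the example delay of 278 ps are hypothetical and are not taken from the paper.

```python
def ideal_ro_period(delays_s):
    """Period of an ideal ring oscillator.

    Assumed textbook form: twice the sum of the element delays,
    with interconnection delays ignored.
    """
    if not delays_s:
        raise ValueError("a ring needs at least one delay element")
    return 2.0 * sum(delays_s)


# Hypothetical example: 9 identical elements of 278 ps each.
period = ideal_ro_period([278e-12] * 9)
frequency_mhz = 1.0 / period / 1e6  # close to 200 MHz for these values
```

With these illustrative values the ring oscillates at roughly 200 MHz, which is the order of magnitude of the ring frequencies used in the simulations.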

In physical devices, the delay

The delay

Besides being influenced locally, the delays of all logic gates in the device are modified both slowly and dynamically by global jitter sources. The slow changes of the gate delay

In real physical systems, the switching current of each gate modifies locally and/or globally the voltage level of the power supply, which in turn modifies (again locally and/or globally) the gate delay. This way, the delays of individual gates are not completely independent. We will discuss this phenomenon in Section

The aim of the first part of our work was to compare the behavior of two RO-based TRNGs: the original architecture depicted in Figure

the functional simulation results correspond to an ideal behavior of the generator, so the two underlying mathematical models can be compared;

in contrast with the real hardware, thanks to simulation we can modify the parameters of injected jitter and evaluate the impact of each type of jitter on the quality of the generated bit-stream.

Original TRNG architecture of Sunar et al. (a) and modified architecture of Wold and Tan (b).

The principle of our simulation platform and experimental platform is depicted in Figure

Simulation and experimental platforms.

We avoided the post-processing in the generator of Sunar et al. for two reasons:

the post-processing function can hide imperfections in the generated signal;

using the same structures, we wanted to compare the two generators more fairly.

At this first level of investigation, we used the statistical tests FIPS 140-2 [

In order to compare the two generators on the functional simulation level (i.e., using ideal components), the behavior of ring oscillators was modeled in VHDL by delay elements with dynamically varying delays.
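The idea of a delay element with a dynamically varying delay can be sketched in Python as follows. This is only an illustrative model, not the authors' VHDL or MatLab code: each half-period is taken as a mean value plus a zero-mean Gaussian term plus an optional sinusoidal deterministic term. The function name, the sinusoidal shape of the deterministic component, and the fixed seed are assumptions made for the example.

```python
import math
import random


def jittered_half_periods(freq_hz, sigma_s, n,
                          det_amp_s=0.0, det_freq_hz=0.0, seed=0):
    """Return n half-periods (seconds) of a ring with injected jitter.

    freq_hz     : nominal ring frequency
    sigma_s     : standard deviation of the Gaussian jitter per half-period
    det_amp_s   : amplitude of an assumed sinusoidal deterministic jitter
    det_freq_hz : frequency of that deterministic component
    """
    rng = random.Random(seed)          # seeded so runs are reproducible
    mean_half = 1.0 / (2.0 * freq_hz)  # nominal half-period
    t = 0.0                            # running time of the last edge
    half_periods = []
    for _ in range(n):
        det = det_amp_s * math.sin(2.0 * math.pi * det_freq_hz * t)
        half = mean_half + rng.gauss(0.0, sigma_s) + det
        half_periods.append(half)
        t += half
    return half_periods


# Example: 200 MHz ring with 30 ps of Gaussian jitter, no deterministic part.
example = jittered_half_periods(200e6, 30e-12, 1000)
```

Setting `sigma_s` to zero gives a jitter-free ring, which is the case used later to isolate pseudo-randomness.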

The jittered half-period, generated in MatLab Ver. R2008b, is based on (

Once the parameters

Implementation of ring oscillators in simulations and in hardware.

The output signals were sampled using a D flip-flop at the sampling frequency

Enhancements of the generator architecture brought by Wold and Tan were related to the behavior of the XOR gate. In order to compare generators' behavior in two different technologies, we employed one from Altera (the same that was used in [

The Altera module contained the Cyclone III EP3C25F256C8N device. The noninverting delay elements and one inverter were mapped to LUT-based logic cells (LCELL) from Altera library (see Figure

The Actel Fusion module featured the M7AFS600FGG256 FPGA device. The non-inverting delay elements were implemented using AND2 gates from Actel library with two inputs short-connected and one inverter (again from Actel library) was added to close the loop (see Figure

On both hardware platforms, an internal PLL was used to generate the 32 MHz sampling clock

First, we compared the behavior of both generators in VHDL simulations. The generators used 1 to 20 ROs consisting of

The simulation results for both evaluated generators are presented in Figure

Results of the FIPS 140-2 tests in simulation with 30 ps of Gaussian jitter for Sunar's and Wold's architectures (excluding the long runs test, which always passed). The tests pass if the results are in the gray region or, for the “Runs” test, are equal to zero. Panels: Monobit, Poker, Failed runs.
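The monobit and poker statistics reported in these panels can be computed along the following lines. This is a minimal Python sketch, not the authors' test bench. The function names are hypothetical, and the acceptance intervals are the values commonly quoted for FIPS 140-2 on a 20,000-bit block; they should be checked against the standard before use.

```python
BLOCK_BITS = 20_000  # block length used by the FIPS 140-2 statistical tests


def _check(bits):
    if len(bits) != BLOCK_BITS:
        raise ValueError("FIPS 140-2 tests operate on exactly 20,000 bits")


def monobit_statistic(bits):
    """Number of ones in the block."""
    _check(bits)
    return sum(bits)


def monobit_passes(bits):
    # Commonly quoted interval: 9725 < X < 10275 (assumed, verify).
    return 9725 < monobit_statistic(bits) < 10275


def poker_statistic(bits):
    """Chi-square-like statistic over the 5000 non-overlapping 4-bit words."""
    _check(bits)
    counts = [0] * 16
    for i in range(0, BLOCK_BITS, 4):
        word = (bits[i] << 3) | (bits[i + 1] << 2) | (bits[i + 2] << 1) | bits[i + 3]
        counts[word] += 1
    return (16.0 / 5000.0) * sum(c * c for c in counts) - 5000.0


def poker_passes(bits):
    # Commonly quoted interval: 2.16 < X < 46.17 (assumed, verify).
    return 2.16 < poker_statistic(bits) < 46.17
```

For instance, the alternating sequence 0101... passes the monobit test (exactly 10,000 ones) but fails the poker test badly, because only one 4-bit word ever occurs.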

It can be seen that in all configurations the two versions of the generator gave very similar (almost identical) results. Next, we will explain the reason for this behavior.

Let

Note that

Indeed, by definition of operation

Equation (

In both Sunar's and Wold's designs, outputs of

From the mathematical point of view, XOR-ing the outputs of

In Wold's architecture, each ring oscillator is sampled at time

The claim that both ideal generators behave according to the same mathematical model is very important, because it means that the security proof of Sunar can be applied (at least theoretically) to both of them. However, as we will see in the next section, their behavior in hardware is very different.
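The equivalence of the two ideal generators can be illustrated with a small Python sketch under strong idealizing assumptions: jitter-free square waves, an infinitely fast XOR, and a common sampling instant. The ring half-periods and phases below are hypothetical values chosen only for illustration. Under these assumptions, sampling the XOR of the ring outputs (Sunar) and XOR-ing the individually sampled outputs (Wold) necessarily give the same bit.

```python
from functools import reduce
from operator import xor


def ring_value(t_ps, half_period_ps, phase_ps=0):
    """Ideal square wave: toggles every half_period_ps picoseconds."""
    return ((t_ps + phase_ps) // half_period_ps) & 1


def sunar_bit(t_ps, rings):
    """XOR the ring outputs first, then sample the combined signal at t_ps."""
    def xored_signal(t):
        return reduce(xor, (ring_value(t, hp, ph) for hp, ph in rings), 0)
    return xored_signal(t_ps)


def wold_bit(t_ps, rings):
    """Sample every ring at t_ps first, then XOR the sampled bits."""
    sampled = [ring_value(t_ps, hp, ph) for hp, ph in rings]
    return reduce(xor, sampled, 0)


# Hypothetical rings: (half-period in ps, initial phase in ps).
rings = [(2500, 0), (2513, 700), (2531, 1300)]
sampling_period_ps = 31_250  # 32 MHz sampling clock

same = all(
    sunar_bit(k * sampling_period_ps, rings) == wold_bit(k * sampling_period_ps, rings)
    for k in range(1000)
)
```

The sketch deliberately leaves out what distinguishes the two designs in hardware, namely the finite speed of the XOR gate.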

We applied the FIPS 140-2 tests on the raw binary signals generated by the two generators, while incrementing the number of ROs. The results obtained for Actel FPGA are presented in Figure

Results of the FIPS 140-2 tests for the observed TRNG architectures with a varying number of ring oscillators in the Actel Fusion device. Panels: Monobit, Poker, Failed runs.

Results of the FIPS 140-2 tests for the observed TRNG architectures with a varying number of ring oscillators in the Altera Cyclone III device. Panels: Monobit, Poker, Failed runs.

It can be seen that for Sunar's architecture the tests passed for neither the Altera nor the Actel family, although the Altera Cyclone III device gave slightly better results. This was probably due to the different behavior of the XOR gate in the selected technologies. At the same time, for the architecture of Wold, the tests passed for both technologies if the number of ROs was higher than 8. Note that these results confirmed those obtained on standard evaluation boards from Altera and Actel published in [

The claims of Wold are thus confirmed in both technologies and various types of boards. However, it is not clear what kind of randomness lets the tests pass. Is it mostly a pseudo-randomness (coming from the sequential behavior of the generator characterized by the internal state evolution) that can theoretically be attacked or a true-randomness that should be employed? The tests are clearly not able to distinguish between them. Again, the simulation can give answers to these questions.

As the architectures of Sunar and Wold have the same ideal behavior, we will analyze only the architecture of Wold and Tan, because its behavior in hardware is closer to the idealized mathematical model. Next, we will study the impact of both Gaussian and deterministic components of the jitter on the generated raw signal.

Again, we simulated the behavior of Wold's architecture with 1 to 20 ROs composed of 9 elements. The half period of each RO was composed of a mean value (the frequency of the RO-generated clock signal varied between 197.5 MHz and 202 MHz in 250 kHz steps) and of an additional random value (normally distributed with mean 0 and

Results of the FIPS 140-2 tests in simulations of Wold's TRNG architecture with a varying size of injected Gaussian jitter. Panels: Monobit, Poker, Failed runs.

Two facts can be observed in these figures:

as expected, when the random jitter increases, the tests pass more easily (i.e., with a reduced number of ROs);

more surprisingly, the tests pass even if the random jitter is not injected at all (

Thanks to the mathematical model from (

The fact that the tests (FIPS and NIST) pass for a few ROs without Gaussian jitter means that both Sunar's and Wold's architectures produce a great amount of pseudo-randomness. We recall that pseudo-randomness in the generated sequence depends on the frequencies of ROs and can be manipulated from outside the chip (e.g., by modulating the power supply or by an electromagnetic interference as it was presented recently at the CHES conference [
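The purely deterministic origin of this pseudo-randomness can be made concrete with a short Python sketch. It XORs jitter-free square waves at the ring frequencies used in the simulations (197.5 MHz upward in 250 kHz steps) and samples the result at 32 MHz. It is not the authors' simulation model: the function name is hypothetical, and it assumes that all rings start in phase and have a 50% duty cycle. Integer arithmetic keeps the result exact.

```python
def jitter_free_bits(n_rings, n_bits, f_sample_hz=32_000_000):
    """Raw bits from n_rings ideal, jitter-free rings XOR-ed together.

    Ring i runs at 197.5 MHz + i * 250 kHz (at most 19 rings fit in the
    197.5 to 202 MHz range). All rings are assumed to start in phase.
    """
    if not 1 <= n_rings <= 19:
        raise ValueError("n_rings must be between 1 and 19")
    freqs_hz = [197_500_000 + 250_000 * i for i in range(n_rings)]
    bits = []
    for k in range(n_bits):  # k-th sample taken at time t = k / f_sample_hz
        bit = 0
        for f in freqs_hz:
            # Number of completed half-periods at time t is floor(2 * f * t);
            # its parity is the value of the ideal square wave.
            half_periods = (2 * f * k) // f_sample_hz
            bit ^= half_periods & 1
        bits.append(bit)
    return bits


first_run = jitter_free_bits(19, 20_000)
second_run = jitter_free_bits(19, 20_000)
reproducible = first_run == second_run  # identical: no true randomness involved
```

Because the sequence is fully determined by the ring frequencies and initial phases, anyone who knows or controls those parameters can regenerate it, whatever the outcome of the statistical tests.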

In the next experiments, the Gaussian jitter remained constant (

Results of the FIPS 140-2 tests in simulations of Wold's TRNG architecture while injecting a Gaussian jitter of 30 ps and a varying deterministic jitter. Panels: Monobit, Poker, Failed runs.

It can be seen that when the deterministic part increases, the tests pass more easily. But there are two problems concerning the deterministic component of the jitter:

the results are strongly dependent on the frequency of the injected signal: depending on the frequency, the output of the TRNG can vary in time in a predictable way;

the deterministic jitter can be manipulated (for example, an attacker can superimpose a chosen signal that seems to improve randomness so that the tests would pass more easily, but in fact, he can predict some trends in subsequences).

For the above-mentioned reasons, the designer should reduce the pseudo-randomness coming from the deterministic jitter component as much as possible.

As we pointed out in [

Results of the FIPS 140-2 tests in simulations of Wold's TRNG architecture with an internal reference clock and a varying size of injected Gaussian jitter. Panels: Monobit, Poker, Failed runs.

Results of the FIPS 140-2 test of the Wold's TRNG architecture simulations while injecting a Gaussian jitter with

Panels: Monobit, Poker, Failed runs.

As expected, when the standard deviation of the injected Gaussian jitter component increased from 0 to 50 ps, the tests passed more easily for both external and internal sampling clocks (see Figures

In the last experiment, the Gaussian jitter component was constant and the deterministic jitter component varied. The results for external and internal sampling clocks are presented in Figures

The aim of the following experiment was to validate the mutual independence of ring oscillators implemented in FPGA. Note that this mutual independence of rings is a necessary condition for the validity of the security proof from [

First, we implemented pairs of 9-element ROs with similar topology (similar routing) in both tested FPGA technologies. The generated clock periods were measured using the high-bandwidth (3.5 GHz) digital oscilloscope LeCroy WavePro 735Zi. The 1.2 V power supply of the FPGA core was replaced with an external variable power supply. The output clock signal periods were measured for power supply voltages ranging from 0.9 V to 1.3 V.

We can observe in Figure

Dependence of clock periods of two rings in Actel Fusion device on the power supply.

The difference between the two clock periods as the function of the power supply can be observed in Figure

Difference between clock periods of two rings depending on the power supply.

The locking of two rings was also observable on the oscilloscope, as it is depicted in Figure

when the rings' frequencies were sufficiently close, it was quite easy to lock the rings by modifying the power supply;

the locking could be observed for both technologies used;

although most of the time the phase of the two signals on the oscilloscope was perfectly stable, sometimes they became unlocked for a very short time—this explains why the period difference measured in long time intervals was not exactly zero in the locking zone;

we could quite easily obtain the state when about 25% of rings were locked—most of them were locked on a dominant frequency and other smaller groups of rings on other frequencies (this phenomenon was observed on all tested cards—five cards for each evaluated technology);

the state when two or more rings were locked on the nominal voltage level (without manipulating the power supply) could also be obtained.

Waveforms of two locked (top) and unlocked (bottom) rings in Actel Fusion device.

As it was shown in Section

Nevertheless, even the new architecture does not eliminate serious doubts about the entropy content of the raw signal. Unfortunately, this entropy cannot be measured. Applying the theory of Sunar et al., the entropy of the raw binary signal can be estimated from the sampling frequency, the size of the jitter, and the number of independent rings. Provided that the rings are independent, this theory remains valid for the new generator architecture as we showed in Section

Equation (

There is another source causing the pseudo-randomness in the raw binary signal. It comes from the global deterministic jitter sources. Both above-mentioned sources of pseudo-randomness are dangerous because they can mask the entropy insufficiency (the tests of randomness will pass) and at the same time they can be manipulated. However, the pseudo-randomness coming from the evolution of the state of the generator described by (

For example, by modulating the power supply and thus changing the periods

As shown above, the generator of Wold and Tan follows the same mathematical model as that of Sunar et al. The security proof of Sunar can thus (theoretically) be applied in this case as well. Because the generator of Wold and Tan gives a much better raw binary signal in hardware, it should be preferred. However, to ensure that the proof of Sunar holds, the number of rings should not be reduced as proposed in [

In an ideal case (i.e., when the rings are independent) and following the proof of Sunar, the number of rings is defined by the size of the Gaussian component of the jitter and by the reference clock frequency, both in relation to the post-processing resilient function. However, even if the number of rings remains high, some of them could be locked, and the effective number of exploitable rings could be significantly lower. In this case, which is easy to obtain in real FPGAs and which can affect 25% of the rings or more, the entropy of the generated raw signal would be much lower than expected and the generator would be predictable or manipulable.

The locking of rings depends on their topology (placement and routing) and on the technology used. The probability of locking could perhaps be reduced by careful placement and routing on a per-device basis or by powering each ring independently. With the first approach, the designer loses the main advantage of this class of TRNGs, namely device-independent design. The second approach is impossible to apply in FPGAs. Another strategy could consist of detecting the locking of rings in order to stop the generation of random numbers. However, the complexity of the detection circuitry would grow with the square of the number of rings, which would limit the practical use of a generator that is already penalized by its considerable number of rings.
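The quadratic growth mentioned here follows from the number of ring pairs that a pairwise detector would have to compare, which is N(N-1)/2 for N rings. The Python sketch below counts these pairs and flags pairs of rings whose measured periods nearly coincide. It is a hypothetical illustration only: the function names, the period values, and the tolerance are assumptions, and near-equal average periods are at best an indicator of possible locking, not a proof of it.

```python
from itertools import combinations


def pair_count(n_rings):
    """Number of distinct ring pairs a pairwise locking detector must cover."""
    return n_rings * (n_rings - 1) // 2


def near_equal_pairs(periods_ps, tol_ps):
    """Index pairs (i, j) whose periods differ by at most tol_ps."""
    return [
        (i, j)
        for (i, p_i), (j, p_j) in combinations(enumerate(periods_ps), 2)
        if abs(p_i - p_j) <= tol_ps
    ]


# Hypothetical measured periods in picoseconds for four rings.
suspects = near_equal_pairs([5000, 5001, 5100, 5000], tol_ps=1)
```

Doubling the number of rings roughly quadruples the number of pairs to monitor: 20 rings give 190 pairs, 40 rings give 780.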

Although the locking of rings can have disastrous consequences for the security of TRNGs based on ring oscillators, this phenomenon has not yet been reported in the literature. For this reason, it should be studied extensively in the future.

This paper is an extended version of the conference paper [