Stego on FPGA: An IWT Approach

A reconfigurable hardware architecture is proposed for implementing an adaptive random image steganography algorithm based on the integer wavelet transform (IWT). The Haar IWT separates each 8 × 8 pixel block into the LL, LH, HL, and HH subbands, and the encrypted secret data is hidden in the LH, HL, and HH subbands using Moore and Hilbert space-filling curve (SFC) scan patterns. For each block, either the Moore or the Hilbert SFC is chosen for hiding the encrypted data in the LH, HL, and HH coefficients, whichever produces the lower mean square error (MSE) and the higher peak signal-to-noise ratio (PSNR). The scan pattern chosen for each block is recorded, and this record serves as the secret key. The system took 1.6 µs to embed the data in the coefficient blocks and consumed 34% of the logic elements, 22% of the dedicated logic registers, and 2% of the embedded multipliers on a Cyclone II field programmable gate array (FPGA).


Introduction
Cryptography [1][2][3][4] and steganography [5][6][7][8][9] are considered the most prominent of the many techniques developed in the field of information security, particularly in systems that handle sensitive information, to defeat unauthorized attacks and protect secret information during transmission and storage. In cryptography, the data is scrambled into an unreadable format prior to transmission or storage in order to hide the contents from an attacker. The intended user accesses the data by unscrambling it with the secret key, which can be either a private or a public key [1][2][3][4]. The drawback of the technique is that it encrypts the data but does not hide its existence, and the visibly encrypted data can attract attackers [10]. Steganography, on the other hand, is the art of hiding the secret content in a host medium without noticeably altering the properties of the latter, so that the hidden message is imperceptible to everyone except the receiver [10]. Among the different kinds of host media, namely image, audio, video, and so forth, the image is considered the most promising cover medium because it is easy to obtain and offers reasonable hiding capacity and distortion tolerance [10]. In comparison with cryptography, steganography offers a higher level of privacy and security, as it makes the secret information altogether invisible.
Several works implement steganography on FPGAs [25][26][27][28][29], but they operate in the spatial domain [27][28][29]. This work proposes reconfigurable hardware for adaptive integer wavelet based data hiding, which embeds a large amount of data using a random scan technique to increase the complexity of the embedding while also giving high PSNR and good payload. This paper is organized as follows. The necessary introduction to the IWT is given in Section 2. Section 3 describes the proposed FPGA steganography methodology using SFC in the IWT domain, followed by the hardware implementation in Section 4. Section 5 explores hardware synthesis and performance analysis. The results and discussion are given in Section 6. Finally, the conclusion is given in Section 7.
The Scientific World Journal

Integer Wavelet Transform
This paper uses the Haar IWT to embed the secret bit stream in the cover image. The IWT decomposes the cover into high-frequency and low-frequency coefficients. The former capture the edge information between adjacent pixel pairs, whereas the latter are obtained by suppressing that edge information across the pixels.
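The first-stage IWT is as follows:

$$H = C_{odd} - C_{even}, \qquad L = C_{even} + \left\lfloor \frac{H}{2} \right\rfloor,$$

where $C_{odd}$ = pixels in odd columns and $C_{even}$ = pixels in even columns.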
The output of the first stage feeds the second stage, which applies high-pass and low-pass filter banks to obtain the IWT coefficients. This results in four subbands (LL, LH, HL, and HH), of which the LL subband carries the most sensitive information. The remaining bands carry the detail information of the cover.
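The second-stage IWT is as follows:

$$LH = L_{odd} - L_{even}, \qquad LL = L_{even} + \left\lfloor \frac{LH}{2} \right\rfloor,$$

$$HL = H_{odd} - H_{even}, \qquad HH = H_{even} + \left\lfloor \frac{HL}{2} \right\rfloor.$$

In the second stage, $H_{odd}$ = H band's odd rows, $L_{odd}$ = L band's odd rows, $H_{even}$ = H band's even rows, and $L_{even}$ = L band's even rows.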
The confidential message bits are embedded in the wavelet coefficients. The inverse IWT is then applied to the modified coefficients to obtain the stego output, which can be used for further communication. Since the IWT is a reversible transform, the secret bit stream is recovered at the receiving end with the help of the same secret key that was applied at the transmitter.
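The two lifting stages and their inverse can be sketched in Python as follows. This is a minimal illustration rather than the Verilog datapath of the paper. It assumes the common integer Haar lifting form, in which each pair yields a difference (odd minus even) and a floored average (even plus half the difference, rounded down), applied first to column pairs and then to row pairs. The function names `haar_iwt_2d` and `haar_iiwt_2d` are hypothetical.

```python
def _lift(odd, even):
    """One Haar lifting step: return (low, high) for a sample pair."""
    diff = odd - even
    return even + diff // 2, diff          # floored average, difference


def _unlift(low, high):
    """Exact inverse of _lift: return (odd, even)."""
    even = low - high // 2
    return high + even, even


def haar_iwt_2d(block):
    """One-level forward Haar IWT of a 2D list with even width and height."""
    # Stage 1: pair 1-based odd/even columns (indices 0/1, 2/3, ...).
    L, H = [], []
    for row in block:
        pairs = [_lift(row[j], row[j + 1]) for j in range(0, len(row), 2)]
        L.append([low for low, _ in pairs])
        H.append([high for _, high in pairs])

    # Stage 2: pair odd/even rows of each intermediate band.
    def split_rows(band):
        low_band, high_band = [], []
        for i in range(0, len(band), 2):
            pairs = [_lift(a, b) for a, b in zip(band[i], band[i + 1])]
            low_band.append([low for low, _ in pairs])
            high_band.append([high for _, high in pairs])
        return low_band, high_band

    LL, LH = split_rows(L)                 # LH = L_odd - L_even
    HH, HL = split_rows(H)                 # HL = H_odd - H_even
    return LL, LH, HL, HH


def haar_iiwt_2d(LL, LH, HL, HH):
    """Inverse of haar_iwt_2d: rebuild the pixel block exactly."""
    def merge_rows(low_band, high_band):
        band = []
        for low_row, high_row in zip(low_band, high_band):
            pairs = [_unlift(lo, hi) for lo, hi in zip(low_row, high_row)]
            band.append([odd for odd, _ in pairs])
            band.append([even for _, even in pairs])
        return band

    L = merge_rows(LL, LH)
    H = merge_rows(HH, HL)
    block = []
    for l_row, h_row in zip(L, H):
        row = []
        for lo, hi in zip(l_row, h_row):
            row.extend(_unlift(lo, hi))    # odd column first, then even
        block.append(row)
    return block
```

Because every step is an integer addition, subtraction, or floored halving, the inverse recovers the block exactly, which is the property that makes the transform reversible.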

Proposed Method
The schematic diagram of the proposed method is shown in Figure 1(a). The IWT is employed to obtain the wavelet coefficients in which the secret message is hidden. Key 1 takes a value from one to four and decides the number of bits to be embedded in the cover; increasing it increases the capacity. The encrypted secret bits are embedded in randomized order through SFC patterns [15], namely, Hilbert and Moore, which are shown in Figures 1(b) and 1(c), in the HH, HL, and LH bands of every 4 × 4 coefficient block. For each traversal path, the MSE and PSNR were computed, and the path giving the lowest MSE and highest PSNR was chosen for the final embedding. Two separate key values, 00 (Moore) and 01 (Hilbert), were assigned to the two paths; for every block, the key was set according to the best path.
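The two scan orders for a 4 × 4 coefficient block can be generated as in the sketch below. It assumes the widely used index-to-coordinate construction of the Hilbert curve and builds the Moore curve from four rotated half-size Hilbert curves so that the path closes on itself. The starting corner and orientation are assumptions and may differ from Figures 1(b) and 1(c). All function names are hypothetical.

```python
def hilbert_d2xy(n, d):
    """Map index d (0 .. n*n-1) to (x, y) on an n x n Hilbert curve.

    n must be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                        # rotate/reflect the sub-square
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Hilbert scan order of an n x n grid as a list of (x, y)."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def moore_order(n):
    """Closed Moore scan of an n x n grid from four rotated Hilbert curves."""
    half = n // 2
    sub = hilbert_order(half)
    left = [(half - 1 - hy, hx) for hx, hy in sub]    # runs toward larger y
    right = [(hy, half - 1 - hx) for hx, hy in sub]   # runs toward smaller y
    path = [(x, y) for x, y in left]                  # quadrant x < half, y < half
    path += [(x, y + half) for x, y in left]          # quadrant x < half, y >= half
    path += [(x + half, y + half) for x, y in right]  # quadrant x >= half, y >= half
    path += [(x + half, y) for x, y in right]         # quadrant x >= half, y < half
    return path
```

Both orders visit each of the 16 coefficients exactly once and move one cell at a time; the Moore order additionally ends next to where it starts.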

Algorithm
(1) Read the cover image of size 128 × 128 × 3 and the secret data.
(3) Divide them into red, blue, and green planes.
(4) Choose one block using a pseudorandom number generator.
(5) Apply the Haar wavelet transform to the randomly selected block to form subbands.
(6) Calculate the bit length to estimate the embedding capacity of each coefficient.
(7) Assign key 1 for k-bit embedding.
(8) Assign key values for the two scan patterns. Let it be key 2.
(9) For every 8 × 8 coefficient block, apply the two scan patterns and determine the MSE in each plane for every pattern.
(10) Select the pattern with the minimum MSE and, using that pattern, embed the secret data by LSB substitution with k-bit embedding.
(11) Take the inverse IWT to reproduce the stego block.
(12) Repeat the process until the last bit of the secret content is embedded.
(13) Store the result as the stego image.
(14) Communicate the two keys to the receiver.
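The pattern selection and k-bit LSB substitution of steps (9) and (10) can be sketched for a single 4 × 4 subband as follows. Several simplifications are assumptions made for brevity: the two scan orders are hard-coded coordinate lists whose orientation may differ from the figures, the MSE is measured on the coefficients of one band rather than on the reconstructed pixels of each plane, and the number of message bits is a multiple of k. The key-2 codes 00 (Moore) and 01 (Hilbert) follow the text; all function names are hypothetical.

```python
# (x, y) visiting orders for a 4 x 4 band; orientation is an assumption.
HILBERT_4X4 = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (0, 3), (1, 3), (1, 2),
               (2, 2), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1), (2, 0), (3, 0)]
MOORE_4X4 = [(1, 0), (0, 0), (0, 1), (1, 1), (1, 2), (0, 2), (0, 3), (1, 3),
             (2, 3), (3, 3), (3, 2), (2, 2), (2, 1), (3, 1), (3, 0), (2, 0)]
PATTERNS = {0b00: MOORE_4X4, 0b01: HILBERT_4X4}   # key-2 codes from the text


def embed_lsb(band, bits, order, k):
    """Return a copy of band with k message bits in the LSBs of each visited coefficient."""
    out = [row[:] for row in band]
    mask = (1 << k) - 1
    for i, (x, y) in enumerate(order):
        chunk = bits[i * k:(i + 1) * k]
        if len(chunk) < k:                 # message exhausted
            break
        value = int("".join(str(b) for b in chunk), 2)
        out[y][x] = (out[y][x] & ~mask) | value   # clear k LSBs, then set them
    return out


def extract_lsb(band, order, k, n_bits):
    """Read back n_bits message bits along the given scan order."""
    mask = (1 << k) - 1
    bits = []
    for x, y in order:
        value = band[y][x] & mask
        bits.extend((value >> s) & 1 for s in range(k - 1, -1, -1))
    return bits[:n_bits]


def mse(a, b):
    """Mean square error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def embed_adaptive(band, bits, k):
    """Embed with both patterns and keep the one with the lower MSE.

    Returns (stego_band, key2). Ties are resolved toward the smaller key code.
    """
    candidates = {key: embed_lsb(band, bits, order, k)
                  for key, order in PATTERNS.items()}
    key2 = min(candidates, key=lambda key: (mse(band, candidates[key]), key))
    return candidates[key2], key2
```

The receiver needs only k (key 1) and the per-block pattern code (key 2) to call `extract_lsb` with the matching scan order.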

IWT Hardware Implementation
The proposed IWT based data hiding architecture is shown in Figure 2. The design comprises the following major blocks in the FPGA architecture: a finite state machine based control unit, an address generation unit, an SRAM controller, on-chip memory, an IWT coefficient generation unit, an embedding unit, and a mean square error module.

Finite State Machine (FSM) Control Unit.
The state diagram of the FSM control unit is shown in Figure 3. This unit controls the address generator module, SRAM controller, IWT coefficient generation unit, data embedding unit, inverse IWT, and embedding block. The FSM consists of the following: status registers, which hold the current state and the next state of the process; a pixel counter, which counts the number of processed pixels; a message counter, which counts the embedded message bits; a row address counter, which counts the number of processed rows; a column address counter, which counts the number of processed columns; a block counter, which counts the processed M × N blocks; and column and row pointers, which hold the current column and row address. A memory pointer directs the address generator to the next memory location from which it is to receive pixel data and encrypted message bits.

Address Generator.
The hardware model of the address generator is shown in Figure 4. It generates addresses for the SRAM controller to read the pixel values and the encrypted message. It consists of an address counter, linear feedback shift registers (LFSRs), a pattern lookup table, and a BMP header lookup table. An LFSR combines a sequential shift register with feedback logic. The address counter is a simple counter that generates the memory address used to read a pixel value from the SRAM. The BMP header lookup table is used to read the header information (Table 1) from the BMP image file stored in external SRAM and copy it into internal cache memory. This header information is used to determine the image's dimensions. The LFSR generates a pseudorandom sequence from a user-supplied value to choose one M × N pixel block among the N blocks. The same sequence is generated at the recipient side.
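A software model of LFSR-based block selection might look like the sketch below. The register width, feedback taps, and seed are assumptions, since the text does not specify them: an 8-bit Fibonacci LFSR with feedback from bits 7, 5, 4, and 3 is used here because it cycles through all 255 nonzero states, and a 128 × 128 image split into 8 × 8 blocks (16 blocks per row) is assumed for the index mapping. The function names are hypothetical.

```python
from itertools import islice


def lfsr8(seed):
    """Yield the successive states of an 8-bit Fibonacci LFSR.

    Feedback is the XOR of bits 7, 5, 4 and 3 (an assumed tap choice).
    """
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be nonzero: the all-zero state never changes")
    while True:
        yield state
        fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | fb) & 0xFF


def block_sequence(seed, count, blocks_per_row=16):
    """Return the first `count` (block_row, block_col) pairs selected by the LFSR.

    State s (1..255) is mapped to block index s - 1, so in this simple
    mapping the block with index 255 is never selected.
    """
    return [divmod(s - 1, blocks_per_row) for s in islice(lfsr8(seed), count)]
```

Because the sequence depends only on the seed, a receiver that knows the seed regenerates the same block order, which matches the statement that the same sequence is generated at the recipient side.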

SRAM Controller.
The SRAM controller communicates with the 256 K × 16 asynchronous CMOS static RAM (SRAM) chip on the Altera DE2 board. The SRAM controller enables a master device (such as the FPGA) to read or write the SRAM as a normal memory operation. For 8-bit or 16-bit data, the latency is 2 clock cycles and 1 clock cycle for read and write operations, respectively. The controller has a 16-bit data bus, an 18-bit address bus, three control signals for read and write operations, and one control signal for word or byte mode selection. Timing diagrams of the SRAM are shown in Figures 5 and 6. The SRAM controller supports a clock frequency of 50 MHz.

On-Chip Embedded Memory.
The FPGA embedded memory presented in Figure 7 contains columns of M4K memory blocks, which are configured as on-chip memory for storing the M × N pixel values, the encrypted secret, and the M × N stego pixels in addition to the RGB plane values. Simple dual-port mode supports concurrent read and write operations. Here, the memory blocks have one write-enable and one read-enable signal. The waveform illustrated above is the result of the design's normal read conditions. The read occurs at the rising edge of the enabled clock cycle. To read in simple dual-port mode, the read-enable port must be asserted.

Figure 8. It is comprised of two libraries of parameterized

Embedding Block.
The embedding module's functional diagram is given in Figure 9. It contains function registers A and B along with cascaded AND-OR logic modules, each 24 bits wide. The registers store the secret message bits and the 2D IWT coefficients during the substitution process. After the data is embedded into the 2D IWT coefficient values, the coefficients are passed to the inverse IWT block. This block reconstructs the pixels from the 2D IWT coefficient values. The same functional diagram applies to the inverse process.

Mean Square Error Module.
The MSE hardware computes the cumulative squared error between the original and stego images. The hardware model of the MSE module is shown in Figure 10; it consists of the megafunctions LPM_ADD_SUB and LPM_MULT, a parallel adder, a latch, and a divider. The LPM_ADD_SUB unit outputs the difference of the input values, and the LPM_MULT unit squares that difference. The parallel adder carries out the summation by adding each squared difference to the previously accumulated sum; the latch stores the running sum, and its output is fed back into one of the inputs of the parallel adder. The divider unit divides the final sum by M × N and produces the MSE result.
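The datapath described above (subtract, square, accumulate, divide) can be mirrored step by step in software. The sketch below is only a behavioral model under stated assumptions: pixels arrive as two equal-length streams, and the divider is modeled as an integer divider that returns a quotient and a remainder, which the text does not specify. The function name `mse_stream` is hypothetical.

```python
def mse_stream(original, stego):
    """Behavioral model of the MSE datapath for two equal-length pixel streams.

    Returns (quotient, remainder) of the accumulated squared error divided
    by the number of pixels (M x N).
    """
    acc = 0                        # latch holding the running sum
    count = 0
    for p, q in zip(original, stego):
        diff = p - q               # subtractor stage
        squared = diff * diff      # multiplier stage, both inputs tied to diff
        acc = acc + squared        # parallel adder with latch feedback
        count += 1
    if count == 0:
        raise ValueError("at least one pixel pair is required")
    return divmod(acc, count)      # divider stage
```

For example, the streams `[10, 20, 30, 40]` and `[11, 20, 28, 40]` accumulate a squared error of 5 over 4 pixels, giving a quotient of 1 and a remainder of 1.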

Hardware Synthesis and Performance Analysis Results
The two-dimensional IWT reconfigurable stego processor architecture was developed using IEEE standard Verilog HDL and tested on a Cyclone II EP2C35F672C6 FPGA. Its compilation report is shown in Table 2. The design consumes 34% of the logic elements, 22% of the dedicated logic registers, and 2% of the embedded multipliers of the Cyclone II FPGA. The RTL view and the Chip Planner view are shown in Figures 11(a) and 11(b). The time taken for 2D IWT coefficient generation and for embedding the data in the coefficients was measured with a Zeroplus logic analyzer, and the results are shown in Figure 11(c). The implemented algorithm took 1.6 µs for IWT coefficient generation, embedding the data in the coefficients, and MSE calculation. It took 6.08 µs to read the 8 × 8 blocks and separate the RGB planes.

Results and Discussions
In this implementation, the color digital images Lena and Baboon, of dimension 128 × 128, were chosen as covers, as shown in Figures 12(a) and 13(a). This work was validated through MSE and PSNR. The MSE is

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(p_{i,j} - q_{i,j}\right)^{2}.$$

Here $M$ and $N$ stand for the number of pixels in the horizontal and vertical dimensions of the cover image; $p_{i,j}$ and $q_{i,j}$ give the pixel values in the original and stego image, respectively. The PSNR is

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{I_{\max}^{2}}{\mathrm{MSE}}\right)\ \mathrm{dB},$$

where $I_{\max}$ is the maximum pixel intensity (255 for 8-bit planes).

In this analysis, key 2 was used to select the scan pattern with the lower MSE for random embedding of the data in the coefficients. Table 3 compares the proposed system with other spatial techniques (Moore, Hilbert, and the adaptive random spatial data hiding technique), and the output stego images are shown in Figures 12(b)-12(e) and 13(b)-13(e). From the table it is clear that the adaptive IWT technique provides high PSNR and low MSE for k = 1-3 bit embedding.
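As a quick numerical illustration of how MSE maps to PSNR, the sketch below assumes the usual definition PSNR = 10 log10(peak² / MSE) with a peak value of 255 for 8-bit color planes. The function name `psnr_db` and the sample MSE values are illustrative only and are not results from Table 3.

```python
import math


def psnr_db(mse, peak=255):
    """Peak signal-to-noise ratio in dB for a given mean square error."""
    if mse == 0:
        return math.inf            # identical images: no distortion
    return 10.0 * math.log10(peak * peak / mse)
```

Under this definition an MSE of 1.0 corresponds to roughly 48.13 dB, and each fourfold increase in MSE lowers the PSNR by about 6.02 dB.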

Conclusion
This study presents an adaptive integer wavelet transform based data hiding scheme that offers a high payload while maintaining the visual quality of the stego image. Compared with the available literature, PSNR is increased in this system through the intelligent use of key 1 and key 2. Moreover, these keys not only provide high security but also increase the capacity. The main drawback of IWT based data hiding is its computational overhead, but the present implementation overcomes this problem by using a field programmable gate array (FPGA), which provides a high-speed implementation because of parallelism. This work is currently being extended to develop a dedicated stego processor on an FPGA chip.