Being easy to understand and simple to implement, the substitution technique for steganography has gained wide popularity among users as well as attackers. Steganography is categorized into different types based on the carrier file used for embedding data. This paper focuses on hiding data in audio files. The human ear is acutely sensitive to additive random noise: an individual can detect noise in an audio file as low as one part in 10 million. Given this sensitivity, concealing information within audio files might seem a pointless exercise. However, the human auditory system (HAS) exhibits a behavior known as the masking effect, in which the threshold of hearing for one sound is affected by the presence of another sound. Because of this property, it is possible to hide some data inside an audio file without it being noticed. In this paper, the research problem of optimizing the audio steganography technique is laid down. A methodology is then proposed that resolves the stated research problem, and the implementation results are analyzed to verify the effectiveness of the solution.

Currently, audio steganography is limited to providing solutions related to copyright and assurance of the integrity of content [

The effectiveness of a steganography technique is analyzed on three parameters, namely, capacity, imperceptibility, and robustness. In most existing techniques, adherence to one of these parameters is usually achieved at the cost of compromising the others. Low-bit encoding, or the LSB technique, is the most commonly used technique [

LSB method is prone to intentional attacks as data are embedded in LSBs only

Unintentional attacks like noise disturbances are the cause of data loss

The literature relevant to audio steganography techniques was studied [

Comparison of audio steganography techniques.

| Technique | Imperceptibility | Capacity | Robustness | Complexity |
|---|---|---|---|---|
| LSB | Medium | High | Low | Low |
| Parity coding | Medium | Medium | Low | Low |
| Phase coding | High | Low | High | Medium |
| Spread spectrum | Low | Medium | High | High |
| Echo hiding | Low | Medium | Medium | Medium |
| Tone insertion | Low | Low | Medium | High |

The objective is to design an optimized method with a capacity similar to that of the LSB technique but, unlike the LSB method, with high robustness. However, there is always a tradeoff between achieving capacity and maintaining high robustness. Thus, the research problem is stated in the following points:

An audio steganography technique is desired that has a capacity comparable to the LSB technique while maintaining high robustness and imperceptibility.

Most audio steganography techniques that possess satisfactory robustness have low hiding capacity. Therefore, the selected technique is customized to meet the requirements.

As a consequence of the customization for high capacity, more distortion is induced in the cover file after embedding. To minimize that distortion, an optimization algorithm is used in coordination with the embedding algorithm.

The sequence in which message bits are embedded in the audio file should not be obvious. The embedding pattern should be as unpredictable as possible to make the technique more secure.

The proposed methodology addresses the research problem through a coordinated approach that combines spread spectrum steganography, social impact theory optimization (SITO), and chaos theory so that the objectives of the research problem are satisfied. The underlying representation of the proposed method is shown in Figure

Proposed solution representation.

The proposed method comprises the following basic modules:

Chaotic system: a chaotic map is used to provide the required randomness, because pseudorandom numbers generated through a standard random library function eventually repeat a pattern, which would make the substitution pattern predictable. The cross-correlation between two consecutive numbers generated through the chaotic map approaches zero. The key is obtained using the logistic map of chaos theory.

Spread spectrum: the spread spectrum technique of audio steganography is one of the prime components of the proposed solution. Because it is inherently robust, difficult to intercept, and resistant to interference, it is the first choice for solving the given research problem.

Social impact theory optimization: to minimize the distortion induced in audio samples by embedding, social impact theory optimization (SITO) is used. In recent research, SITO has outperformed the genetic algorithm (GA) and particle swarm optimization (PSO) in achieving optimum results. The algorithm evolves iteration by iteration based on an objective function and is expected to reach a nearly optimal solution by the end. The stopping criterion for the algorithm in the current research is the number of iterations.
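The chaotic key generation described in the first module can be illustrated with a minimal sketch of the logistic map, x_{n+1} = r·x_n·(1 − x_n). The control parameter r = 3.99, the burn-in length, the 0.5 threshold, and the function names are assumptions chosen for illustration; the paper does not state the values used, and this is not the CODO library.

```python
def logistic_sequence(x0, n, r=3.99, burn_in=100):
    """Return n iterates of the logistic map x -> r*x*(1-x).

    The first burn_in iterates are discarded so that the output does not
    start right at the seed. r=3.99 is an assumed value in the chaotic regime.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chips_from_sequence(values, threshold=0.5):
    """Map chaotic values to a bipolar (+1/-1) key sequence."""
    return [1 if v >= threshold else -1 for v in values]


# Hypothetical seed; in practice the seed is the secret shared by both parties.
key = chips_from_sequence(logistic_sequence(x0=0.3141, n=16))
print(key)
```

Because the map is deterministic, the receiver can regenerate the same key from the seed alone, while a tiny change in the seed produces an unrelated sequence.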

The methodology starts with a capacity check to verify that the chosen cover audio file can embed a secret message of the given size. After that, the cover audio file is transformed into its equivalent spread spectrum. Simultaneously, the secret message is converted into its binary equivalent. The random sequence that acts as a key to guide the spreading of message bits over the entire spectrum of the cover file is obtained using the logistic map of chaos theory. The embedding step is repeated until the complete message is hidden inside the cover file. Comparatively more data is embedded per sample of the audio file so that capacity is increased. The initial values of the evaluation parameters and of the objective function are then calculated. The proposed algorithm is given in Algorithm

Step 1: procedure Embed_Optimize(aud_file, img_file)

Step 2: if(capacity_check(aud_file, img_file) = TRUE) then

Step 3: Temp_aud⟵transform(aud_file)

Step 4: Temp_img⟵Binary(img_file)

Step 5: seed⟵random()

Step 6: key⟵seed

Step 7: count⟵0

Step 8: while no_of_iteration ≤ 100 do

Step 9: if count < length(Temp_img) then

Step 10: key⟵SITO_OPT(key)

Step 11: hide(Temp_aud, key, Temp_img)

Step 12: count⟵update(count)

Step 13: else

Step 14: Write “Message got embedded successfully”.

Step 15: repeat

Step 16: return Temp_aud

Step 17: else

Step 18: exit()

Step 19: return
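The hide step in the algorithm above can be illustrated with a minimal time-domain direct-sequence sketch. This is not the authors' MATLAB implementation: the function names, the scaling factor `alpha`, the logistic-map parameter, and the non-blind extraction (which needs the original cover) are all assumptions made to keep the example short and deterministic.

```python
import math


def logistic_chips(x0, n, r=3.99):
    """Bipolar chips (+1/-1) from the logistic map x -> r*x*(1-x)."""
    x, chips = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        chips.append(1 if x >= 0.5 else -1)
    return chips


def embed(cover, bits, chips, alpha=0.01):
    """Spread each message bit over a run of consecutive cover samples."""
    # Capacity check: at least one sample per bit and one chip per sample.
    if not bits or len(cover) < len(bits) or len(chips) < len(cover):
        raise ValueError("cover too short for the message, or too few chips")
    span = len(cover) // len(bits)  # samples carrying each bit
    stego = list(cover)
    for i, bit in enumerate(bits):
        symbol = 1 if bit else -1
        for k in range(i * span, (i + 1) * span):
            stego[k] = cover[k] + alpha * symbol * chips[k]
    return stego


def extract(stego, cover, chips, n_bits):
    """Non-blind extraction: correlate the embedding residual with the chips."""
    span = len(cover) // n_bits
    bits = []
    for i in range(n_bits):
        corr = sum((stego[k] - cover[k]) * chips[k]
                   for k in range(i * span, (i + 1) * span))
        bits.append(1 if corr > 0 else 0)
    return bits


# Synthetic cover signal and a short message, both hypothetical.
cover = [0.5 * math.sin(0.05 * k) for k in range(800)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
chips = logistic_chips(0.3141, len(cover))
stego = embed(cover, message, chips)
print(extract(stego, cover, chips, len(message)) == message)
```

A smaller `alpha` lowers the distortion but also lowers the margin against noise, which is the tradeoff the optimization stage is meant to balance.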

To optimize the value of the objective function, SITO is used. The values of the evaluation parameters improve with the iterations of the optimization algorithm. SITO terminates when an acceptable value of the evaluation parameters is achieved or the maximum number of iterations is reached. The flow chart given in Figure

Flow chart of the proposed methodology.
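To give a feel for how a social-impact-style optimizer iterates, the following is a heavily simplified sketch over binary vectors. It is neither the SITO library used in the paper nor the full published algorithm: the ring neighbourhood, the strength formula, the spontaneous flip probability, and all names are assumptions made for illustration.

```python
import random


def sito_like_minimize(cost, n_bits, pop_size=12, iterations=100,
                       flip_prob=0.02, seed=0):
    """Simplified social-impact-style search (assumed ring neighbourhood).

    Each individual holds a binary vector. For every bit it compares the
    summed strength of neighbours that disagree (persuaders) with that of
    neighbours that agree (supporters, including itself) and flips the bit
    when the persuaders win.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=cost)[:]
    best_cost = cost(best)
    for _ in range(iterations):
        costs = [cost(ind) for ind in pop]
        worst, least = max(costs), min(costs)
        spread = (worst - least) or 1.0
        # Lower cost gives higher strength; the small offset avoids exact ties at zero.
        strength = [(worst - c) / spread + 1e-9 for c in costs]
        new_pop = []
        for i, ind in enumerate(pop):
            neighbours = [(i - 1) % pop_size, (i + 1) % pop_size]
            child = ind[:]
            for d in range(n_bits):
                support = strength[i] + sum(strength[j] for j in neighbours
                                            if pop[j][d] == ind[d])
                persuade = sum(strength[j] for j in neighbours
                               if pop[j][d] != ind[d])
                if persuade > support:
                    child[d] = 1 - child[d]
                if rng.random() < flip_prob:  # spontaneous change keeps diversity
                    child[d] = 1 - child[d]
            new_pop.append(child)
        pop = new_pop
        candidate = min(pop, key=cost)
        if cost(candidate) < best_cost:
            best, best_cost = candidate[:], cost(candidate)
    return best, best_cost


# Toy objective (hypothetical): Hamming distance to a target pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0]
hamming = lambda v: sum(a != b for a, b in zip(v, target))
solution, value = sito_like_minimize(hamming, len(target))
print(solution, value)
```

In the setting of the paper, the cost function would be the aggregate objective built from MSE, PSNR, and SSIM, and the loop would stop after the fixed number of iterations.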

The objective is to minimize MSE and maximize PSNR and SSIM. The aggregate function used here is a minimizing function given as

MATLAB is a multiparadigm environment developed primarily for numerical computing. Over time, it has been equipped with toolboxes developed for specific needs. Initially, the command window was the only way to interact with it and execute files; later, a graphical interface was added. Because of MATLAB's broad applicability, popularity, and effectiveness, the proposed algorithm is implemented in it. Open-source libraries are used for implementing chaos theory and SITO. To compare and analyze the effectiveness of the proposed algorithm, comparison graphs are generated for PSNR, MSE, and SSIM. MATLAB 2014 is used as the simulation environment. The details of the secret messages embedded, the cover audio file, and the libraries used are as follows:

Secret message: (i) JPG image of size 8.78 KB and dimension 244 × 250; (ii) PNG image of size 38.6 KB and dimension 512 × 512

Cover audio: MP4 file of size 5.07 MB, length 3 min 21 s, bit rate 206 kbps, 2 channels, and audio sample rate 44.1 kHz

CODO library: library used for chaos-based random number generation

SITO library: library used for the social impact theory optimizer
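As a rough plausibility check of the capacity requirement for the files listed above, the arithmetic can be sketched as follows. It assumes that the cover is decoded to PCM samples, that 1 KB equals 1024 bytes, and that the whole image file is embedded bit by bit; none of these details is stated in the paper, and the function names are hypothetical.

```python
def pcm_sample_count(duration_s, sample_rate_hz, channels):
    """Number of decoded PCM samples across all channels."""
    return int(duration_s * sample_rate_hz * channels)


def message_bits(size_kb):
    """Message length in bits, assuming 1 KB = 1024 bytes."""
    return int(size_kb * 1024 * 8)


# Cover audio: 3 min 21 s, 44.1 kHz, 2 channels.
samples = pcm_sample_count(3 * 60 + 21, 44100, 2)

for name, size_kb in [("JPG", 8.78), ("PNG", 38.6)]:
    bits = message_bits(size_kb)
    print(name, bits, "bits;", samples // bits, "cover samples available per bit")
```

Under these assumptions the cover holds roughly 17.7 million samples, so even the larger PNG message leaves dozens of samples per message bit for spreading.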

The effectiveness of the proposed solution to the formulated problem is evaluated through the following methods:

Histogram analysis of original message and extracted message

Frequency spectrum analysis of cover audio and stego audio

Peak signal to noise ratio (PSNR)

Mean square error (MSE)

Structural similarity index (SSIM)

Visual inspection of the original message and extracted message

Robustness measurement using correlation coefficient

Comparison of variation of PSNR, MSE, and SSIM with some existing techniques

An image is taken as secret data to be embedded in audio. Histogram representation of an image is a popular way of analyzing images.

Histograms can be taken either of a single channel out of RGB or of the RGB channels in aggregate. The given histograms are obtained for the blue channel of the image before embedding and of the extracted image. The histograms of the red and green channels can be analyzed similarly. The histogram comparisons in Figure

Histograms of (a) original message and (b) extracted message.
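A single-channel histogram comparison of the kind described above can be sketched in a few lines. The pixel representation (a list of 8-bit RGB tuples), the difference measure, and the sample data are assumptions for illustration only.

```python
def channel_histogram(pixels, channel, bins=256):
    """Count how often each 8-bit intensity occurs in one channel."""
    counts = [0] * bins
    for px in pixels:
        counts[px[channel]] += 1
    return counts


def histogram_difference(h1, h2):
    """Sum of absolute bin differences; 0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


BLUE = 2  # index of the blue channel in an (R, G, B) tuple

# Tiny hypothetical images: one blue value differs by a single level.
original = [(10, 20, 30), (10, 20, 31), (200, 100, 30), (0, 0, 255)]
extracted = [(10, 20, 30), (10, 20, 31), (200, 100, 30), (0, 0, 254)]

diff = histogram_difference(channel_histogram(original, BLUE),
                            channel_histogram(extracted, BLUE))
print(diff)
```

A difference of zero means the two histograms match exactly; small values indicate that only a few pixels changed intensity during embedding and extraction.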

Obtaining the frequency spectrum is a universal and straightforward technique to analyze an electric signal. Figure

Frequency spectrum of cover audio (a) and stego audio (b).

PSNR, MSE, and SSIM are widely accepted measures of quality. To verify the quality of the proposed method, the variation of the PSNR, MSE, and SSIM values is obtained. PSNR and MSE measure absolute error, whereas SSIM measures error in structure.

SSIM also considers intensity, brightness, and other parameters related to the structure of an image. PSNR is measured in dB, and the acceptable value is greater than 45 dB, whereas MSE is expressed in the square of the unit of the quantity being measured. The desired value of MSE approaches 0. The value of SSIM lies between −1 and +1, and the desired value approaches +1. A value of 0 indicates no similarity, while a value of +1 indicates that both images are identical. The results show that the improvement is significant and that it increases with the number of iterations of the proposed method. Figure

(a) PSNR, (b) MSE, and (c) SSIM variation.
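For reference, MSE and PSNR as described above can be computed as in the following sketch, using the standard definitions with a peak value of 255 for 8-bit data. The function names and the flat-list input format are assumptions; SSIM is omitted because it needs windowed local statistics and is usually taken from an image-processing library.

```python
import math


def mse(a, b):
    """Mean square error between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Hypothetical 8-bit data in which every value is off by one level.
reference = [100, 120, 140, 160]
degraded = [101, 121, 141, 161]
print(mse(reference, degraded), round(psnr(reference, degraded), 2))
```

An error of one intensity level on every 8-bit value gives an MSE of 1 and a PSNR of about 48.13 dB, which sits just above the 45 dB acceptance level quoted in the text.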

The secret messages before embedding and the hidden messages extracted from stego file are compared and analyzed for any visually noticeable differences in Figure

(a) Original message and (b) extracted message.

Robustness is the property of watermarked data that keeps its presence undetected even after it is subjected to general signal processing attacks [_{avg}) comes out to be 0.85.
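A correlation coefficient between the original and the extracted message can be computed as in the sketch below. It uses the Pearson form as an assumption; the paper's exact definition and the set of attacks over which its average is taken are not recoverable from the text, and the sample data are hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pixel values before embedding and after extraction from attacked audio.
original = [12, 40, 90, 200, 255, 33]
extracted = [14, 38, 95, 190, 250, 30]
print(correlation_coefficient(original, extracted))
```

Values close to 1 indicate that the extracted message closely follows the original; averaging the coefficient over several attacked versions of the stego audio gives a single robustness figure.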

It is essential to compare the efficacy of the proposed research with that of preexisting techniques in a similar domain. The proposed work of optimizing audio steganography using social impact theory optimization is compared with some of the existing methods on parameters such as PSNR, MSE, and SSIM. The comparison is made against the following: (i) the traditional LSB method, (ii) another optimized steganography technique in which the optimization algorithm is GA, and (iii) the implementation work done by Chen and Huang [

PSNR variation is compared in Figure

Comparison of PSNR variation with (a) LSB, (b) GA, and (c) the method of Chen and Huang [

However, the graph diverges further when the number of iterations increases. The proposed work outperforms the LSB technique and the method of Chen and Huang [

The comparison of PSNR variation also shows that the LSB technique was the least effective, GA was comparable to the proposed method, and the method of Chen and Huang performed moderately.

It was observed while comparing the proposed work with the LSB method, GA, and the method of Chen and Huang [

Starting from a very high MSE, the proposed algorithm drops abruptly and attains the acceptable value sooner. One more fact observed from the graphs given in Figure

Comparison of MSE variation with (a) LSB, (b) GA, and (c) the method of Chen and Huang [

Structural similarity index (SSIM) is a way of measuring degradation in the quality of an image because of some processing task.

The acceptable value of SSIM should be close to one. From Figure

Comparison of SSIM variation with (a) LSB, (b) GA, and (c) the method of Chen and Huang [

The attainment of imperceptibility and robustness depends largely on the scaling parameter chosen for embedding. Most audio steganography implementations use a static value of the scaling parameter for simplicity. In the work done by Su et al. [

PSNR attainment.

MSE attainment.

SSIM attainment.

The problem of finding an audio steganography technique with acceptable values of capacity, robustness, and imperceptibility is resolved using the proposed methodology. The characteristics of spread spectrum that make it secure against interception and interference provide the required robustness. The prime security of steganography lies in the difficulty of learning the hiding pattern used for embedding. Thus, security is ensured by making the embedding pattern hard to predict through the logistic map of chaos theory. Capacity is increased by spreading a greater number of bits over the entire spectrum of a sample. This enhancement also increases the distortion. SITO keeps the distortion at an optimum level and optimizes the audio steganography technique to achieve a satisfactory value of the objective function. The analysis of the different graphs shows that significant improvement is achieved by using the social impact theory optimizer. The performance of the proposed work improves further as the number of SITO iterations increases. The quality measures (PSNR, MSE, and SSIM) reach satisfactory values. In some cases, GA performed close to SITO, but the performance gain observed with SITO as the optimization algorithm is more significant. The proposed methodology achieves the research objective of optimizing an audio steganography algorithm in such a way that each of robustness, capacity, and imperceptibility is attained satisfactorily.

The data used to support the findings of this study are included within the article.

The authors declare that there are no conflicts of interest regarding the publication of this paper.