Fourier transform infrared (FTIR) microspectroscopy images contain information from the whole infrared spectrum used for microspectroscopic analyses. In combination with the FTIR image, visible light images are used to depict the area from which the FTIR spectral image was sampled. These two images are traditionally acquired as separate files. This paper proposes a histogram shifting-based data hiding technique that embeds the visible light image in the FTIR spectral image, producing a single entity. The primary objective is to improve data management efficiency. Secondary objectives are confidentiality, availability, and reliability. Since the integrity of biomedical data is vital, the proposed method applies reversible data hiding: after extraction of the embedded data, the FTIR image is restored to its original state. Furthermore, the proposed method applies authentication tags generated with keyed Hash-Based Message Authentication Codes (HMAC) to detect tampered or corrupted areas of FTIR images. The experimental results show that the FTIR spectral images carrying the payload maintain good perceptual fidelity and that the payload can be reliably recovered even after bit flipping or cropping attacks. It has also been shown that extraction successfully removes all modifications caused by the payload. Finally, authentication tags successfully indicated tampered FTIR image areas.
Capturing image information from the whole infrared spectrum limited only by the near diffraction limit of light has now allowed scientists to further exploit the possibilities of microspectroscopic analysis. To give an example, analyzing the infrared spectrum enables the acquisition of biochemical information out of a tissue for the diagnosis and the assessment of cell functionality [
Data management efficiency refers to storing and transferring multiple data entities as efficiently as possible. In this case, by multiple entities, we refer to the FTIR images and their corresponding visible light images, as well as other possible Electronic Patient Records (EPR). Because of regular transfer of biomedical material, there is a risk of data loss. Constant transfer and storage of biomedical data raises other security issues. Some among the most important are data confidentiality, availability, and reliability [
This paper proposes a data hiding method for embedding the visible light image in the FTIR spectral image, including authentication tags of the original FTIR image generated using Hash-Based Message Authentication Codes (HMAC) for authentication and tamper proofing purposes. Moreover, all necessary side information is also included in the payload. Prior to embedding, all payload data is encoded using error correcting codes. Because of the sensitivity of the host, after the extraction of the payload from the FTIR image, all modifications are removed and therefore its original state is recovered. The error correcting codes used in this method were designed to recover data from two attack scenarios. The first one is least significant bit flipping due to noise and the other one is removal attacks. Both attacks might take place in one, more, or all spectral components of FTIR pixels. All experiments in this paper used research data collected from Oulu university hospital taken from human articular cartilage samples, that is, the connective tissue that provides resistance to compressive forces during joint movements [
In this method, combining all the entities associated with FTIR microspectroscopy in a single package increases data management efficiency, since the different entities are accessed through a single FTIR image. Furthermore, storing all components in the same structure saves space, and linkage between files is no longer necessary because everything is included in this same structure. Access to the embedded data is given only to entitled users with access to the extracting code. In addition, all information such as patient details or EPR can be encrypted before it is hidden, and therefore confidentiality is maintained. Availability is guaranteed since all the data is combined in the same entity, and thus the visible light image for a given FTIR segment is always available. Last, reliability is assured since tampered areas can be revealed using authentication tags. To sum up, for data management efficiency and security purposes, this paper proposes a method of embedding related data in FTIR spectral images, including authentication tags. This enables constant availability of data and tamper detection capabilities.
The paper is organized as follows. Section
Before the reference to biomedical applications, this section begins with the background of hyperspectral image data hiding and watermarking. Hyperspectral imaging concerns the collection and process of information originating from the whole electromagnetic spectrum. The purpose of hyperspectral imaging is to obtain the spectrum for each pixel in the image of a scene in order to find objects, identify materials or tissues, and detect processes [
In 2003, Tamhankar et al. [
In 2003, Kaarna and Toivanen [
In 2006, Kbaier and Belhadj [
The literature review did not reveal any hyperspectral watermarking techniques proposed for biomedical data. Nevertheless, reversible data hiding for data tampering and recoverability is considered by Liew and Zain [
Other techniques more similar to FTIR spectral image data hiding technique are biomedical video watermarking methods. That is because the FTIR structure is similar to the video, but instead of the time dimension of the video, the third dimension is the one going over spectral components. Two examples are the data hiding method proposed by Dey et al. in 2012 [
The proposed data hiding method makes use of a few key technological components. Some of them are established methods, such as those used for error correction, and others were developed specifically for the proposed data hiding method. This section gives the necessary information about them to clarify the step by step description of the data hiding method in the next section.
The payload of the data hiding method includes authentication tags created using HMAC. These are generated from the original FTIR image before the data is inserted. After extraction and reversion, authentication tags are generated again from the FTIR image and compared with the ones just acquired from the extracted payload. This comparison indicates whether some locations or spectral components of the FTIR image may have been tampered with. If there are no such indications, the reversion ran successfully and the modifications caused by the hidden data have been removed. Furthermore, the tags can indicate tampering caused by noise or other forms of attack on the host FTIR spectral image.
In cryptography, a keyed Hash-Based Message Authentication Code or, for short, HMAC, is a specific type of message authentication code (MAC) involving a cryptographic hash function (hence, the “H”) in combination with a secret cryptographic key. HMACs are used for data integrity and authentication of messages.
Cryptographic hash functions are typically used to ensure the integrity of data. A cryptographic hash function can be applied on a message of arbitrary length to generate a short fixed sized message digest that can be subsequently used for integrity testing. MACs are keyed cryptographic hash functions that additionally guarantee authenticity of the data. For a MAC, a secret key is needed in order to generate a valid digest. The same key is also needed to check the validity of the digest.
In our case, both the data and the message digest are embedded, which means that a hash function is not sufficient (an attacker could easily modify both the data and its digest). We apply a keyed HMAC using the SHA256 hash function to ensure the authenticity of the data. HMAC [
HMAC-SHA256 is, in this case, used to create unique identifiers, that is, authentication tags for bundles of spectral components, as well as groups of pixels to locate spatial and spectral location of tampered data.
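The difference between a plain digest and a keyed tag can be illustrated with a minimal sketch using Python's standard `hmac` and `hashlib` modules. The key, the byte strings, and the helper names `make_tag` and `verify_tag` are hypothetical and chosen only for illustration; they are not taken from the implementation described in this paper.

```python
import hashlib
import hmac


def make_tag(key: bytes, data: bytes) -> bytes:
    """Return the 32-byte HMAC-SHA256 tag of `data` under `key`."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_tag(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)


# Hypothetical key and data, for illustration only.
key = b"hypothetical-secret-key"
original = b"spectral data bundle"
tag = make_tag(key, original)

# An attacker who modifies the data can recompute a plain SHA-256 digest,
# but cannot produce a valid HMAC tag without knowing the key.
tampered = b"spectral data bundlf"
forged_digest = hashlib.sha256(tampered).digest()

print(verify_tag(key, original, tag))            # True
print(verify_tag(key, tampered, tag))            # False
print(verify_tag(key, tampered, forged_digest))  # False
```

Because both the data and its tag travel in the same carrier, only the keyed variant prevents an attacker from replacing both consistently.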
This paper analyzes two basic threat scenarios. The first one includes attacks that cause bit flips, and the second one includes cropping and removal attacks that cause missing bits. For the first scenario, Reed-Solomon codes are the most appropriate solution.
Reed-Solomon codes are a group of error correcting codes introduced by Reed and Solomon in 1960 [
The second threat scenario analyzed in this paper is the one with cropping or removal attacks. In this case, since we are using FTIR spectral images, one, more, or all spectral components are removed from certain pixels. As a result, bits will most likely be missing from the extracted bit stream. The end result is a deletion channel.
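To illustrate why a deletion channel calls for a different kind of correction than bit flips, the following sketch compares the effect of one flipped bit and one deleted bit on a byte stream. It is a toy demonstration only; the helper names and the 16-byte test message are assumptions made for this example and are unrelated to the actual payload format.

```python
def to_bits(data: bytes) -> list[int]:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def to_bytes(bits: list[int]) -> bytes:
    """Pack bits back into bytes, dropping an incomplete trailing byte."""
    usable = len(bits) - len(bits) % 8
    out = bytearray()
    for start in range(0, usable, 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def flip_bit(bits: list[int], index: int) -> list[int]:
    """Bit flip: the stream keeps its length, one bit changes."""
    copy = list(bits)
    copy[index] ^= 1
    return copy


def delete_bit(bits: list[int], index: int) -> list[int]:
    """Deletion: every bit after `index` moves one position forward."""
    return bits[:index] + bits[index + 1:]


def count_wrong_bytes(a: bytes, b: bytes) -> int:
    """Count differing byte positions, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


message = bytes(range(1, 17))  # hypothetical 16-byte payload
bits = to_bits(message)
flipped = to_bytes(flip_bit(bits, 10))
deleted = to_bytes(delete_bit(bits, 10))

print("bytes damaged by one flip:    ", count_wrong_bytes(message, flipped))
print("bytes damaged by one deletion:", count_wrong_bytes(message, deleted))
```

A single flip damages exactly one byte, whereas a single deletion misaligns everything that follows it, which is why the extracted data appears shifted after removal attacks.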
For the deletion channel correction, the proposed method uses the implementation by Duda using correction trees. The correction trees are large trees that contain all the possible solutions. The encoding method used in this implementation is based on encoding previously used for the binary symmetric channel [
This scenario may have exceptions, including bit flips instead of missing bits. Such errors can be accepted in most of the payload data but not in the side information. A single error in the extracted side information, which is necessary in each iteration of the extracting procedure, can cause the procedure to halt, and therefore a solution had to be designed. This solution is based on a time-consuming trial and error approach, which is why it was applied only to the side information and not to the whole payload. Given a chunk of data, the method detects the existence of bit errors that the deletion channel correcting code is unable to repair. Then, it attempts to proceed by removing parts of this chunk until the problematic bits are removed and replaced by the original ones. This relies on the fact that the problem has been converted to a deletion channel.
Reversibility and high capacity were the most important requirements in this method. Therefore, a histogram shifting-based approach was preferred. This concept was introduced by Ni et al. [
(a) The original histogram. (b) The histogram as a result of shifting to the right the segment for each index
Tsai et al. [
In digital images, in order to form digital function, the gray-level values have to be converted into discrete quantities. This process of assigning the intensity levels to discrete values is called quantization [
The tested FTIR image samples were in 24-bit floating-point format, which for the difference matrix of two spectral components means
This section describes the proposed reversible data hiding method step by step. The proposed method includes the two basic data hiding procedures, that is, the embedding procedure, where the payload is hidden in the FTIR spectral image, and the extracting/reversing procedure, where the payload is extracted and the FTIR image reversed.
The payload
Embedding the payload
The payload
Extracting the payload
Using the double histogram shifting approach with the modification described in Section
For the attack scenario of bit flipping by increasing or decreasing values of one or more spectral components of pixels by one unit, Reed-Solomon codes were selected. For each component of the payload, Reed-Solomon
Last, concerning the authentication tags, HMAC-SHA256 is used to create two separate groups of tags from the input FTIR image. The first group is created from sets of spectral components to later determine the wavelength range where tampering has taken place. In our implementation, we used one authentication tag for every set of five consecutive spectral components. The second group of authentication tags is generated from spatial areas to determine the pixels that have undergone tampering. In our case, we used one tag for every 8 × 8 block of pixels. This allows a two-level description of the tampered area that includes both spatial and spectral information. Comparing the tags generated from the original FTIR image, which are extracted with the rest of the payload, to the tags generated from the reversed FTIR image reveals areas that failed to return to their original state or were modified by attacks. This is done by indicating the ranges of five spectral components that contain the error and the 8 × 8 blocks of pixels where it occurred.
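The two groups of tags described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the image is modelled as a nested list `cube[component][row][col]` of integers, values are serialized as 32-bit integers with `struct`, and the function names, the key, and the small 10 × 16 × 16 test cube are all hypothetical.

```python
import hashlib
import hmac
import struct

BAND_GROUP = 5  # spectral components per tag, as in the text
BLOCK = 8       # 8 x 8 pixel blocks, as in the text


def _tag(key, values):
    """HMAC-SHA256 over the values serialized as little-endian int32."""
    data = struct.pack(f"<{len(values)}i", *values)
    return hmac.new(key, data, hashlib.sha256).digest()


def spectral_tags(key, cube):
    """One tag per group of BAND_GROUP consecutive spectral components."""
    tags = []
    for start in range(0, len(cube), BAND_GROUP):
        group = cube[start:start + BAND_GROUP]
        tags.append(_tag(key, [v for band in group for row in band for v in row]))
    return tags


def spatial_tags(key, cube):
    """One tag per BLOCK x BLOCK pixel block, over all spectral components."""
    rows, cols = len(cube[0]), len(cube[0][0])
    tags = {}
    for r0 in range(0, rows, BLOCK):
        for c0 in range(0, cols, BLOCK):
            values = [v for band in cube
                      for row in band[r0:r0 + BLOCK]
                      for v in row[c0:c0 + BLOCK]]
            tags[(r0, c0)] = _tag(key, values)
    return tags


def locate_tampering(key, cube, ref_spectral, ref_spatial):
    """Return first indices of mismatching groups and corners of mismatching blocks."""
    bad_groups = [i * BAND_GROUP
                  for i, t in enumerate(spectral_tags(key, cube))
                  if not hmac.compare_digest(t, ref_spectral[i])]
    bad_blocks = [pos for pos, t in spatial_tags(key, cube).items()
                  if not hmac.compare_digest(t, ref_spatial[pos])]
    return bad_groups, bad_blocks


# Hypothetical 10-component, 16 x 16 pixel cube and key.
key = b"hypothetical-secret-key"
cube = [[[b * 1000 + r * 16 + c for c in range(16)] for r in range(16)]
        for b in range(10)]
ref_spectral, ref_spatial = spectral_tags(key, cube), spatial_tags(key, cube)

tampered = [[row[:] for row in band] for band in cube]
tampered[7][9][3] += 1  # change one value in component 7, pixel (9, 3)

print(locate_tampering(key, tampered, ref_spectral, ref_spatial))
# ([5], [(8, 0)])
```

The single modified value is reported twice: once as the group of components starting at index 5 and once as the pixel block whose top-left corner is (8, 0), which corresponds to the combined spectral and spatial description of the tampered area.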
The embedding procedure embeds data by accessing pairs of consecutive spectral components in reverse order, starting from (
Demonstration of the embedding procedure by accessing spectral components in reverse order.
Step by step, the embedding procedure works as follows.
Read FTIR image Read visible light image
Convert Encode For every set of 5 spectral components
generate authentication tags. For every 8 × 8 block of pixels
generate authentication tags. Encode all tags using Reed-Solomon ( Encode all possible additional payload using Reed-Solomon; Combine everything in a single payload
Initiate Loop
Calculate spectral component difference (see (
Calculate quantization step Using step
for Calculate histogram Calculate left peak and zero points and left peak and zero indexes Calculate right peak and zero points and right peak and zero indexes
From second iteration onwards if
if
encode else if
encode else
encode Payload
Convert current
(a) Original histogram. (b) Shifted histogram. (c) Histogram with payload. (d) Reversed histogram.
For (i) if if (ii) if if
(i) For 1 ≤ (1) (2) if (a) if (b) if (ii) Use an alternative method to store the peak and zero points for the two last spectral components hiding it in
Return
The extracting procedure extracts hidden data and reverses the image by accessing pairs of consecutive spectral components in increasing order, starting from (
Read FTIR image
Extract the initial Initiate Loop (1) Calculate spectral component difference (2) Calculate (3) Capacity
(i) (ii) For 1 ≤ (1) (2) If (a) if if (b) if if (I) (II) (3) If (a) if if (b) if if (I) (II)
(i) For (1) if if (2) if if (ii) Get the image payload data and add it in (iii)
In the reverse way to embedding Step For every set of 5 spectral components
generate authentication tag from For every 8 × 8 block of pixels
generate authentication tag from
Return
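The core idea behind the embedding and extracting procedures can be illustrated with a simplified sketch of histogram shifting with a single peak/zero pair, in the style of Ni et al. It deliberately omits what the proposed method adds on top: two histogram pairs, difference matrices of consecutive spectral components, quantization, and side information handling. The function names and the sample values are assumptions for this example, and overflow at the top of the value range is ignored.

```python
from collections import Counter


def find_peak_and_zero(values):
    """Return the most frequent value and the first empty bin to its right."""
    hist = Counter(values)
    peak = max(hist, key=lambda v: (hist[v], -v))  # ties: smallest value
    zero = peak + 1
    while hist[zero] > 0:
        zero += 1
    return peak, zero


def embed(values, bits):
    """Shift bins between peak and zero right by one, then hide bits in the peak bin."""
    peak, zero = find_peak_and_zero(values)
    if len(bits) > values.count(peak):
        raise ValueError("payload exceeds capacity of the peak bin")
    bit_iter = iter(bits)
    marked = []
    for v in values:
        if peak < v < zero:
            marked.append(v + 1)                  # make room next to the peak
        elif v == peak:
            marked.append(v + next(bit_iter, 0))  # 1 -> peak + 1, 0 -> peak
        else:
            marked.append(v)
    return marked, peak, zero


def extract(marked, peak, zero):
    """Read the bits back and undo the shift, restoring the original values."""
    bits, restored = [], []
    for v in marked:
        if v == peak:
            bits.append(0)
            restored.append(v)
        elif v == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < v <= zero:
            restored.append(v - 1)
        else:
            restored.append(v)
    return bits, restored


values = [3, 4, 4, 5, 4, 6, 4, 3, 5, 4]  # hypothetical sample values
payload = [1, 0, 1, 1]
marked, peak, zero = embed(values, payload)
recovered, restored = extract(marked, peak, zero)

print(marked)                     # [3, 5, 4, 6, 5, 7, 5, 3, 6, 4]
print(recovered[:len(payload)])   # [1, 0, 1, 1]
print(restored == values)         # True
```

The peak and zero points play the role of the side information that must survive extraction without errors: without them the shift cannot be undone. Each value changes by at most one unit, and capacity equals the number of values in the peak bin, which is why using a second pair raises capacity at some cost to fidelity.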
In addition to Reed-Solomon codes, we studied and verified the effectiveness of deletion channel correction codes for robustness against removal and cropping attacks; the method remains the same, but every application of Reed-Solomon codes is replaced by deletion channel correction codes. Those are applied as described in Section
Note that in this case a key is generated for each single use of the deletion channel error correcting codes. Those keys are stored in a file and used as input for decoding during the extracting/reversing procedure. Thus, unlike the default version, where the only side information required is for the generation of the authentication tags, here side information is also required for error correction.
For the experimental purposes, FTIR image and visible light image samples have been collected from our university hospital following research procedures with informed consent. All the FTIR image samples were of size 64 × 64, including 1,556 spectral components and a bit depth of 24 bits. The embedded visible light images were 100 × 100 RGB image blocks with a bit depth of 8 bits. Seven different attack test scenarios with different characteristics were studied. Every test was repeated five times using different data each time, that is, different pairs of FTIR images and visible light images. Firstly, the functionality of the method was tested by running it without applying any attacks, to verify reversibility. Secondly, in order to test robustness and tamper detection, there were three different bit flipping attacks and four cropping/removal attacks. More details will follow in Section
Before applying attacks to the host FTIR image, we confirmed reversibility by extracting/reversing an intact FTIR image that contained hidden data. In this test, the payload image was extracted intact and, most importantly, the authentication tags did not indicate any possibly tampered areas or spectral components, showing that the method not only extracted the payload successfully but also restored the FTIR image to its exact initial state. This was also confirmed by subtracting the reversed FTIR image from the original one.
Capacity depends on the type of error correcting codes that were utilized for the different attack scenarios. This is because the side information size is different, depending on whether Reed-Solomon or deletion channel correction encoding is applied. The following capacity results include the mean capacity and the standard deviation from five repetitions with different data.
Beginning with the use of Reed-Solomon codes by running five different sample images, full capacity is 5,842,880 ± 34,320 bits, while capacity with the side information subtracted is 2,758,483 ± 32,613 bits. The encoded payload visible light image’s size is 1,760,000 bits and the authentication tags’ size is 765,000 bits.
In the second case where deletion channel correction codes are employed for robustness against cropping and removal attacks, full capacity remained the same but the real capacity after subtracting the side information is 5,051,443 ± 34,160 bits. This time, the size of the encoded visible light image is 1,920,000 bits, while the size of the authentication tags is 768,000 bits.
Fidelity was good as the FTIR image containing hidden data was very close to the original one. Specifically, Peak Signal to Noise Ratio (PSNR) values using five sample images were 34.4 ± 1.5 dB when Reed-Solomon codes were employed. As a demonstration, a single spectral component before and after data hiding is shown in Figure
(a) Original FTIR image’s spectral component (2900 cm−1). (b) The same spectral component (2900 cm−1) after data has been hidden.
(a) Original FTIR spectrum from a single pixel. (b) Spectrum from the same pixel after the data has been hidden.
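For reference, PSNR between an original and a modified set of intensity values can be computed as in the following sketch. The text does not state which peak value was used for the FTIR data, so `peak` is left as an explicit parameter; the function name and the toy inputs are likewise assumptions.

```python
import math


def psnr(original, marked, peak):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    if len(original) != len(marked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: every value off by one unit, 8-bit peak chosen for illustration.
print(round(psnr([0, 0, 0, 0], [1, 1, 1, 1], peak=255), 2))  # 48.13
```

Since the result scales with the chosen peak, PSNR values are only comparable between experiments that use the same peak and bit depth.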
The data hiding method offers tamper detection through the comparison of authentication tags generated from the original FTIR image using HMAC-SHA256 and embedded in the FTIR image with tags generated from the reversed FTIR image after the payload has been extracted.
Specifically, tamper detection is realized by comparing the authentication tag produced of each set of five consecutive spectral components but also for each set of 8 × 8 blocks of image pixels. For instance, the tamper detection function could output that there is a possibility of tampering in spectral components 350–354 and the block of pixels with coordinates
The following experiments were performed to test all possible attacks described in Table
FTIR image attacks.
# | Error probability | Modified values | Error correction |
---|---|---|---|
1 | 0.01% | | Reed-Solomon |
2 | 0.1% | | Reed-Solomon |
3 | 1% | | Reed-Solomon |
4 | 0.001% | | Deletion channel |
5 | 0.01% | | Deletion channel |
6 | 0.025% | | Deletion channel |
7 | 0.0004% | | Deletion channel |
Experiments
Beginning with the first component of the payload, that is, the authentication tags, certain cases included errors resulting in false positives, as already described in Section
Another component inside the payload was the side information which enabled the selected histogram shifting data hiding method to run successfully. In this case, there was zero error tolerance and that is the reason why the solution presented in Section
Table
Payload and FTIR images’ recoverability after attacks.
# | Payload image (dB) | Intensity value errors (spectral component pixels) |
---|---|---|
1 | Intact | 626 ± 32 |
2 | 42 ± 0.0 | 6355 ± 62 |
3 | 37.5 ± 0.7 | 63865 ± 216 |
4 | 21.4 ± 8.3 | 167 ± 23 |
5 | 11.8 ± 0.5 | 2166 ± 824 |
6 | 11 ± 1.7 | 1553 ± 7 |
7 | 18.6 ± 5.8 | 165 ± 13 |
The extracted visible light image from 7 experiments.
As demonstrated by the experimental results, the presented method is able to hide data with high capacity and good fidelity, reliably extract it, and restore the FTIR image to its initial state to make it safe for analyses. In case of attacks or modifications in FTIR images, the authentication tags indicate possibly tampered areas, so that those locations can be excluded from analyses.
High capacity was necessary for hiding visible light image data. Furthermore, error correcting codes multiply the capacity requirements. For this reason, double histogram shifting, that is, shifting with two histogram point pairs, on difference matrices was proposed as the data hiding scheme. The choice of two histogram pairs, combined with the application on the difference matrix and the use of quantization, trades fidelity for capacity. However, the experimental results demonstrate that fidelity was sufficient to make the FTIR images perfectly acceptable for viewing. For analysis purposes, the images are restored to their initial state, and thus the level of modification in the images that contain hidden data does not have any effect on the diagnoses. For reference, experimental results showed that, using the above scheme, the current approach offered an average pure capacity of 0.92 bits/value, where a value is one pixel of one spectral component. After subtraction of side information, the capacity was 0.43 bits/value and 0.79 bits/value with the use of Reed-Solomon and deletion channel correction codes, respectively. As for the fidelity between the original FTIR image and the one containing hidden data, at the above capacity the PSNR was over 34 dB.
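The bits/value figures follow directly from the mean capacities and the image dimensions reported in the experimental section. A short check, using only numbers stated in this paper (the variable names are chosen for this example):

```python
# Image dimensions reported for the test samples.
WIDTH, HEIGHT, COMPONENTS = 64, 64, 1556
values = WIDTH * HEIGHT * COMPONENTS  # 6,373,376 values per FTIR image

# Mean capacities in bits, as reported in the capacity results.
full_capacity = 5_842_880
net_reed_solomon = 2_758_483
net_deletion_channel = 5_051_443


def bits_per_value(bits: int) -> float:
    """Capacity normalized by the number of spectral component pixels."""
    return bits / values


print(round(bits_per_value(full_capacity), 2))         # 0.92
print(round(bits_per_value(net_reed_solomon), 2))      # 0.43
print(round(bits_per_value(net_deletion_channel), 2))  # 0.79
```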
Since the method was based on difference matrices of spectral components and FTIR images have higher bit depth and are overall different, there can be no direct comparison with the other histogram shifting schemes. However, some information can be acquired by comparing the capacity per unit and PSNR. Ni et al. [
In the first attack scenario, Reed-Solomon codes were able to extract the payload with high precision, and the false positives from the extracted authentication tags were limited. Errors in the extracted visible light image were more prevalent under cropping/removal attacks with deletion channel correcting codes, although most of those errors were shifts due to missing bits; thus the content of the image was visible but its colors were distorted. This problem was most significant in the extraction of the authentication tags. The tags were able to detect that there were errors in the FTIR image, but they could not isolate the tampered areas, as they also indicated that the whole or most of the FTIR image structure was tampered. This is an issue that can be fixed with a solution similar, for instance, to the one applied for the side information, as described in Section
Comparing authentication tags showed that the FTIR image can be restored successfully and that there are no error artifacts in the spectral or spatial domain. It should always be taken into account, however, that there might be some false positives: errors in the extracted authentication tags can flag a failed reversion in an area that has in fact been completely recovered.
This paper presented a method for hiding visible light images or other information such as EPR in FTIR images, with the main purpose of providing efficiency in data management and storage. Furthermore, the paper addressed security issues, that is, confidentiality, availability, and reliability of content. Tamper proofing and recoverability capabilities are additionally provided, and small alterations of the host can be detected. With the proper quantization settings, the data hiding method can provide high capacity and good fidelity between the original FTIR structure and its version that carries hidden data. Moreover, the method is reversible: along with the extraction of data, it restores the FTIR image that carries hidden data to its original state. Last, two different approaches for error correction were suggested to enable correction of the payload data after extraction.
Future work should investigate a more optimized method for error correction. It was noted that bit errors cause problems in deletion channel correction encoding. The solution presented here is time-consuming, and thus it was used only for the side information, the part of the payload with the highest priority. Specifically, with the current setup this solution added a delay of approximately 40 seconds in the iterations where bit errors were detected. Improvements here would allow a substantial increase in the quality of the extracted visible light images, as well as in the precision of tamper detection.
The authors declare that there are no conflicts of interest regarding the publication of this article.
This research was supported by Infotech Oulu and the Nokia Foundation. The authors further thank Dr. Lassi Rieppo from the Research Unit of Medical Imaging, Physics and Technology of the University of Oulu for providing the hospital research data for the experiments.