Image hashing has attracted much attention from the multimedia security community in recent years. It has been successfully used in social event detection, image authentication, copy detection, image quality assessment, and other applications. This paper presents a novel image hashing algorithm based on low-rank representation (LRR) and ring partition. The proposed hashing finds the saliency map with the spectral residual model and exploits it to construct a visual representation of the preprocessed image. Next, it computes the low-rank recovery of the visual representation by LRR and extracts a rotation-invariant hash from the low-rank recovery by ring partition. Hash similarity is finally measured by the L2 norm. Extensive experiments validate the effectiveness of the proposed hashing. The results demonstrate that it reaches a good balance between robustness and discrimination and outperforms some state-of-the-art hashing algorithms in terms of the area under the receiver operating characteristic curve.
1. Introduction
With the popularity of social network platforms such as Facebook and Twitter, more and more digital images are transmitted via the Internet and stored in cyberspace. Efficient techniques are therefore required for processing massive numbers of images and protecting content security. For example, when an important event happens, such as an opening ceremony of the Olympic Games, many people forward the same image of the event within their social networks. These forwarded images may undergo digital operations, such as compression and enhancement. Consequently, many copies of an image of a hot event exist in cyberspace, and finding a hot event in a social network by detecting image copies is an important task [1]. In recent years, a useful technique called image hashing [2, 3], which extracts a short code called a hash from the visual content of the input image regardless of its detailed bit representation, has attracted much attention from the multimedia security community. It can be applied not only to social event detection [1] but also to many other applications [4–8], e.g., image retrieval, image authentication, image copy detection, and image quality assessment. In practice, the hash is used to represent the input image; since the hash is a short representation, image hashing enables efficient data processing. In this paper, we study a novel hashing algorithm based on the low-rank representation model and ring partition.
The most important properties of an image hashing algorithm are robustness and discrimination [9]. Robustness requires that the algorithm map visually similar images to the same or similar hashes, no matter whether their bit representations are the same. In other words, the algorithm should be robust to normal digital operations, e.g., compression, filtering, and enhancement, because these operations alter the bit representations of digital images while keeping their visual appearance unchanged. Discrimination requires that the algorithm map different images to completely different hashes. Since the number of different images is far larger than the number of similar images in practice, good discrimination means a low error rate in judging different images as similar. This property is helpful in many applications, such as social event detection and image retrieval. Note that the two properties restrict each other, so a high-performance algorithm should reach a good balance between them.
In the past years, many researchers have devoted themselves to developing image hashing algorithms. Some typical techniques are briefly introduced below. Swaminathan et al. [10] combined the Fourier-Mellin transform and a randomization method to develop secure image hashing. Monga and Evans [11] proposed to build the hash from feature points detected by the end-stopped wavelet. These two hashing algorithms [10, 11] are resilient to small-angle rotation. Lv and Wang [12] combined SIFT features and Harris points to construct the hash. This scheme is robust to large-angle rotation and brightness adjustment, but its robustness against additive noise and blurring needs improvement. Zhao et al. [13] derived a robust image hash for authentication from global features determined by Zernike moments and local features based on the position information of salient features. Tang et al. [14] calculated the image hash by jointly using the color vector angle (CVA) and discrete wavelet transform (DWT). Both of the above methods [13, 14] resist only small-angle rotation. Laradji et al. [15] extracted the hash of a color image using hypercomplex numbers and the quaternion Fourier transform (QFT). This approach has good discrimination, but its robustness against rotation needs to be improved. Wang et al. [16] jointly exploited the Watson model and SIFT features to design a hashing algorithm for authentication. This method performs better than the hashing method in [13].
Recently, Yan et al. [17] introduced quaternion techniques, namely the quaternion Fourier transform and quaternion Fourier-Mellin moments, to image hashing design. In another study, Yan et al. [18] exploited adaptive local image features to design multiscale hashing. Both hashing schemes [17, 18] reach good performance in tampering detection. Tang et al. [19] constructed a feature matrix via the DCT (discrete cosine transform) and learned the hash code from the DCT-based matrix by local linear embedding. This hashing resists image rotation only within 5°. In [20], Tang et al. proposed to extract local statistical features from image rings and compress the statistical features by calculating vector distances. As the contents of image rings are rotation-invariant, the rotation robustness of this hashing [20] is enhanced. Davarzani et al. [21] combined SVD (singular value decomposition) with CSLBP (center-symmetric local binary patterns) to design a hashing scheme for authentication. The scheme is robust to additive noise and JPEG compression, but it is sensitive to rotation. Huang et al. [22] introduced the random walk to hash generation for improving security. In another work, Qin et al. [23] used SVD for preprocessing and extracted hybrid features based on circle-based and block-based features to construct the hash. The hybrid feature-based hashing has good robustness against compression and filtering, but its computational cost should be reduced. Tang et al. [24] were the first to construct a three-order tensor from image blocks and extract the image hash from the tensor by Tucker decomposition. This approach can resist small-angle rotation. Shen and Zhao [25] proposed to compute the image hash using the color opponent component and quadtree structure features. This method shows good performance in image authentication, but it is fragile to image rotation.
From the above review, it can be found that many algorithms do not achieve a good balance between rotation robustness and discrimination. To address this problem, we propose a new image hashing algorithm based on low-rank representation and ring partition. The main contributions of the proposed algorithm are as follows:
We calculate the visual representation of the preprocessed image based on the saliency map extracted by the spectral residual model. Since the saliency map can indicate visual attention of human beings, the visual representation using the saliency map can effectively describe salient regions of the image. Consequently, hash generation with the visual representation can improve robustness of the proposed algorithm
We propose to incorporate low-rank representation into ring partition. The low-rank representation can capture the global structure of data, which is helpful to make the discriminative hash. Since ring partition can produce a set of image rings invariant to rotation, hash code extraction based on image rings can reach good rotation robustness
Many experiments are done to validate the effectiveness of the proposed algorithm. The results illustrate that the proposed algorithm can resist many digital operations, including large-angle image rotation. Performance comparisons with some state-of-the-art algorithms are also conducted. The receiver operating characteristic (ROC) curve results show that the proposed algorithm has better classification performance, in terms of discrimination and robustness, than the compared algorithms
The structure of the rest of the paper is as follows. Section 2 describes the image hashing algorithm proposed in this paper. Sections 3 and 4 discuss experimental results and performance comparisons, respectively. Section 5 summarizes this paper.
2. Proposed Image Hashing
The proposed image hashing can be divided into four steps: preprocessing, visual representation calculation, low-rank representation, and ring partition. Figure 1 is the diagram of our algorithm. The preprocessing is to produce a normalized image, and the visual representation calculation is to generate an image representation which can indicate salient regions of the image. The use of low-rank representation is to extract principal image features for making the discriminative hash, and the use of ring partition can make the extracted feature code invariant to image rotation. These steps are described in detail in the following sections.
Diagram of the proposed hashing algorithm.
2.1. Preprocessing
This step includes three operations: bilinear interpolation, Gaussian low-pass filtering, and color space conversion. The bilinear interpolation resizes the input image to a standard size n×n. This operation makes our algorithm resilient to image scaling. The 3×3 Gaussian low-pass filtering is then applied to the resized image. This operation alleviates the influence of some digital operations, such as image noise and JPEG compression, on the resized image. Finally, the filtered image in the RGB color space is converted to the HSV color space [26], and the brightness component in the HSV color space is used to represent the input image. The HSV color space, also called the hexagonal cone model, has shown good performance in many existing hashing algorithms [26]. Let H1, S1, and V1 be the hue, saturation, and brightness of a pixel in the HSV color space, respectively. The formulas for converting from RGB to HSV are as follows:
(1)
H1 = (G1 − B1)·(π/3)/(Max(R1, G1, B1) − Min(R1, G1, B1))          if R1 = Max(R1, G1, B1),
H1 = (B1 − R1)·(π/3)/(Max(R1, G1, B1) − Min(R1, G1, B1)) + 2π/3   if G1 = Max(R1, G1, B1),
H1 = (R1 − G1)·(π/3)/(Max(R1, G1, B1) − Min(R1, G1, B1)) + 4π/3   if B1 = Max(R1, G1, B1),
H1 = undefined                                                     if R1 = G1 = B1,

S1 = (Max(R1, G1, B1) − Min(R1, G1, B1))/Max(R1, G1, B1)   if Max(R1, G1, B1) ≠ 0,
S1 = 0                                                      if Max(R1, G1, B1) = 0,

V1 = Max(R1, G1, B1),

where R1, G1, and B1 are the red, green, and blue components of the pixel, respectively, and Max(R1, G1, B1) and Min(R1, G1, B1) represent the maximum and minimum of R1, G1, and B1, respectively.
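As a concrete sketch, equation (1) can be implemented per pixel as follows. This is a minimal illustration: the function name and the choice of returning 0 for the undefined gray-pixel hue are our own, and the RGB components are assumed to lie in [0, 1].

```python
import math

def rgb_to_hsv(r1, g1, b1):
    """Convert RGB components in [0, 1] to (H1, S1, V1) per equation (1).
    H1 is in radians in [0, 2*pi); the undefined hue of a gray pixel
    (R1 = G1 = B1) is returned as 0.0 by convention."""
    mx, mn = max(r1, g1, b1), min(r1, g1, b1)
    delta = mx - mn
    if delta == 0:                                   # R1 = G1 = B1: hue undefined
        h = 0.0
    elif mx == r1:
        h = (math.pi / 3) * (g1 - b1) / delta
    elif mx == g1:
        h = (math.pi / 3) * (b1 - r1) / delta + 2 * math.pi / 3
    else:
        h = (math.pi / 3) * (r1 - g1) / delta + 4 * math.pi / 3
    h %= 2 * math.pi                                 # wrap negative hues into [0, 2*pi)
    s = 0.0 if mx == 0 else delta / mx               # saturation S1
    v = mx                                           # brightness V1
    return h, s, v
```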
Figure 2 shows a practical example of the preprocessing. Figure 2(a) is an input image, Figure 2(b) is the resized image, and Figure 2(c) represents the filtered image. Figures 2(d)–2(f) represent the hue, saturation, and brightness components of the filtered image in the HSV color space, respectively.
Practical example of preprocessing.
Input image
Resized image
Filtered image
Hue
Saturation
Brightness
2.2. Visual Representation Calculation
In this work, a well-known saliency detection model called the spectral residual model (SRM) [27] is exploited to find the saliency map of the image. Then, the visual image representation is determined by combining the saliency map and the brightness component of the preprocessed image. We select the SRM as the method of saliency detection because it outperforms conventional methods, such as Itti's method [28], in both detection performance and computational speed [27].
Saliency map calculation of the classical spectral residual model is based on the spectral residual of the image Rf, which is defined as follows:
(2) R(f) = L(f) − h_n(f) ∗ L(f), where ∗ denotes convolution, h_n(f) is a mean filter of size n×n, and L(f) is the log spectrum of the image. More specifically, h_n(f) is defined as follows:
(3) h_n(f) = (1/n²) [1 1 ⋯ 1; 1 1 ⋯ 1; ⋮ ⋮ ⋱ ⋮; 1 1 ⋯ 1], i.e., an n×n matrix of ones scaled by 1/n².
In addition, Lf is determined by the below formula:
(4) L(f) = ln A(f), where A(f) is the amplitude spectrum of the image, determined by the following formula:
(5) A(f) = |F(I(x))|, where I(x) denotes a given image, F(·) is the Fourier transform, and |·| denotes the amplitude. Finally, the saliency map S(x) in the spatial domain can be constructed by using the inverse Fourier transform as follows:
(6) S(x) = g(x) ∗ |F⁻¹{exp(R(f) + i·P(f))}|², where g(x) is a low-pass filter that smooths the output of the inverse Fourier transform for better visual effects (a circular averaging filter of radius 3 is used here), F⁻¹(·) is the inverse Fourier transform, and P(f) is the phase spectrum of the image defined in the equation below:
(7) P(f) = φ(F(I(x))), where φ(·) denotes the phase. In [27], it is stated that an image width (or height) of 64 pixels gives a good estimate of the scale of normal visual conditions. Following this, we resize the n×n brightness component to 64×64 and convert the calculated saliency map back to the original size n×n by bilinear interpolation. More details of SRM can be found in [27].
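A minimal numpy sketch of equations (2)–(7) is given below. This is a simplified illustration: a 3×3 mean filter stands in for h_n(f), and a second pass of the same mean filter is used as a rough stand-in for the circular averaging filter g(x) of radius 3; the exact kernels of the original implementation may differ.

```python
import numpy as np

def box_mean(M, k=3):
    """k x k mean filter with edge padding (stand-in for h_n(f) and g(x))."""
    r = k // 2
    Mp = np.pad(M, r, mode='edge')
    out = np.zeros(M.shape, dtype=float)
    for i in range(k):
        for j in range(k):
            out += Mp[i:i + M.shape[0], j:j + M.shape[1]]
    return out / (k * k)

def spectral_residual_saliency(img):
    """Saliency map of a 2-D grayscale image, equations (2)-(7).
    The image is assumed to be already resized to 64 x 64 as in the text."""
    F = np.fft.fft2(img)
    A = np.abs(F)                        # amplitude spectrum A(f), eq. (5)
    P = np.angle(F)                      # phase spectrum P(f), eq. (7)
    L = np.log(A + 1e-8)                 # log spectrum L(f), eq. (4)
    R = L - box_mean(L)                  # spectral residual R(f), eq. (2)
    # back to the spatial domain, then smooth, eq. (6)
    S = np.abs(np.fft.ifft2(np.exp(R + 1j * P))) ** 2
    return box_mean(S)
```

In the proposed algorithm, the brightness component is resized to 64×64 before this step and the resulting map is resized back to n×n by bilinear interpolation.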
Figure 3 shows an instance of saliency map detection via the spectral residual model. Figure 3(a) is the enlarged view of Figure 2(f), and Figure 3(b) is the saliency map of Figure 3(a) detected by the spectral residual model. From the results, it can be found that salient regions of the image are successfully extracted.
An instance of saliency map detection via SRM.
Enlarged view of Figure 2(f)
Saliency map
Suppose that V is the brightness component of the preprocessed image and V(i, j) is its element in the ith row and jth column. Also, let S denote the saliency map of V, with element S(i, j). The visual representation X is then obtained by the following equation:
(8) X(i, j) = V(i, j) × S(i, j), where X(i, j) is the element of X in the ith row and jth column.
2.3. Low-Rank Representation
Low-rank representation (LRR) is a useful technique for capturing the global structure of data [29]. LRR is robust to noise and can extract the lowest-rank representation of all data [29, 30]. It has been widely used in many applications, such as subspace segmentation [31], image segmentation [32], and image classification [33]. Suppose that X is an observation matrix corrupted by noise E. Then, the LRR can be computed by solving the regularized rank minimization problem [29–31]:
(9) min_{Z,E} ‖Z‖_* + λ‖E‖_{2,1}, s.t. X = XZ + E, in which ‖·‖_* is the nuclear norm of a matrix (the sum of its singular values), λ > 0 is a parameter balancing the two terms, and ‖·‖_{2,1} is the l2,1 norm, defined as follows:
(10) ‖E‖_{2,1} = Σ_{j=1}^{n} √(Σ_{i=1}^{n} [E]_{ij}²).
Let Y be the low-rank recovery of X and let (Z∗, E∗) be the minimizer of (9). Then Y can be obtained as Y = XZ∗ or Y = X − E∗. In practice, problem (9) is converted to the following equivalent optimization problem:
(11) min_{Z,E,J} ‖J‖_* + λ‖E‖_{2,1}, s.t. X = XZ + E, Z = J.
This optimization problem can be solved via the ALM (augmented Lagrange multiplier) problem below:
(12) min_{Z,E,J,Y1,Y2} ‖J‖_* + λ‖E‖_{2,1} + tr(Y1ᵀ(X − XZ − E)) + tr(Y2ᵀ(Z − J)) + (μ/2)(‖X − XZ − E‖_F² + ‖Z − J‖_F²), in which Y1 and Y2 are the Lagrange multipliers and μ > 0 is the penalty parameter. Problem (12) can be solved by the inexact ALM method [30]. More details of LRR can be found in [29–31].
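A sketch of the inexact ALM iteration for problem (12) is given below. The update order (J-step by singular value thresholding, closed-form Z-step, column-wise l2,1 shrinkage for E, then multiplier updates) follows the standard solver of [30]; the initial μ, growth factor ρ, and stopping tolerance are conventional choices, not values taken from the paper.

```python
import numpy as np

def lrr_inexact_alm(X, lam=0.9, tol=1e-6, max_iter=500):
    """Solve problem (12) by inexact ALM:
        min ||J||_* + lam * ||E||_{2,1}   s.t.  X = X Z + E,  Z = J.
    Returns (Z, E); the low-rank recovery is Y = X @ Z (or X - E)."""
    d, n = X.shape
    Z = np.zeros((n, n)); J = np.zeros((n, n)); E = np.zeros((d, n))
    Y1 = np.zeros((d, n)); Y2 = np.zeros((n, n))
    mu, mu_max, rho = 1e-2, 1e6, 1.5
    XtX = X.T @ X
    inv = np.linalg.inv(np.eye(n) + XtX)       # fixed matrix for the Z-step
    for _ in range(max_iter):
        # J-step: singular value thresholding with threshold 1/mu
        U, s, Vt = np.linalg.svd(Z + Y2 / mu, full_matrices=False)
        J = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Z-step: closed-form solution of the quadratic subproblem
        Z = inv @ (XtX - X.T @ E + J + (X.T @ Y1 - Y2) / mu)
        # E-step: column-wise l2,1 shrinkage with threshold lam/mu
        Q = X - X @ Z + Y1 / mu
        norms = np.linalg.norm(Q, axis=0)
        E = Q * (np.maximum(norms - lam / mu, 0.0) / (norms + 1e-12))
        # multiplier and penalty updates
        R1 = X - X @ Z - E
        R2 = Z - J
        Y1 = Y1 + mu * R1
        Y2 = Y2 + mu * R2
        mu = min(mu * rho, mu_max)
        if max(np.abs(R1).max(), np.abs(R2).max()) < tol:
            break
    return Z, E
```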
In this work, the input of LRR is the visual representation X, and the low-rank recovery Y is taken for hash code extraction. The reasons for using LRR in image hashing design are as follows. The influence of digital operations (e.g., compression, filtering, and noise) on an image can be viewed as noise added to the image. Since LRR is robust to noise, hash generation from the low-rank recovery improves the robustness of our algorithm. In addition, LRR efficiently captures the global structure of the input data, which ensures the discriminative capability of the proposed algorithm.
2.4. Ring Partition
To make the proposed algorithm resilient to image rotation, a well-known technique of image segmentation called ring partition (RP) [9, 34] is exploited here. RP takes the image center as the circle center and divides the inscribed circle of the image into a set of rings. Figure 4 presents an example of RP with 4 image rings. Clearly, the contents of image rings are unchanged after image rotation. Therefore, we can calculate the hash code resistant to rotation by using the mean values of these rings. Details of hash code extraction based on RP are explained as follows.
Example of ring partition with 4 rings.
An image
Rotated image with 10°
Suppose that the low-rank recovery Y of size n×n is divided into m rings of equal area. The elements of each ring, except the innermost one, are determined by two adjacent radii. Let ri be the ith radius (i = 1, 2, ⋯, m), labeled from the smallest to the largest, so that r1 is the radius of the innermost circle and rm is the radius of the outermost circle. Clearly, rm = ⌊n/2⌋, where ⌊·⌋ denotes downward rounding. To calculate the other radii, the average area of an image ring is first determined by the equation below:
(13) μa = C/m, where C = π·rm² is the area of the inscribed circle. Thus, the radius of the innermost circle is calculated by the following equation:
(14) r1 = √(μa/π).
Next, the other radii ri (i = 2, 3, ⋯, m − 1) are determined by the formula below:
(15) ri = √((μa + π·r_{i−1}²)/π).
After all radii are obtained, the elements of Y are classified into image rings by using the radii and the distances from the elements to the image center. Let Ui be the set of elements of the ith ring (i = 1, 2, ⋯, m) and p(x, y) be the element of Y in the xth row and yth column. Let (xc, yc) be the coordinates of the image center: xc = yc = n/2 + 0.5 if n is even, and xc = yc = (n + 1)/2 otherwise. The distance from p(x, y) to the image center (xc, yc) is then obtained by the equation below:
(16) d(x, y) = √((x − xc)² + (y − yc)²).
Consequently, the set Ui can be determined by one of the following equations:
(17) U1 = {p(x, y) | d(x, y) ≤ r1}, Ui = {p(x, y) | r_{i−1} < d(x, y) ≤ ri} (i = 2, 3, ⋯, m).
For each set, the mean of its elements is selected as a compact feature. Let vi be the mean of the elements in Ui (i = 1, 2, ⋯, m). It is quantized to an integer to reduce storage, by the equation below:
(18) hi = ⌊vi × 100 + 0.5⌋ (i = 1, 2, ⋯, m), where ⌊·⌋ is the rounding-down operation. Finally, our hash is obtained by concatenating these integers as follows:
(19) h = [h1, h2, ⋯, hm].
Therefore, the length of our hash is m integers.
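The ring-partition steps of equations (13)–(19) can be sketched as follows; this is a minimal numpy illustration (the function name is our own), using the 1-indexed row/column convention of the text.

```python
import numpy as np

def ring_partition_hash(Y, m):
    """Hash extraction by ring partition, equations (13)-(19): divide the
    inscribed circle of the n x n matrix Y into m rings of equal area and
    quantize each ring mean into an integer hash element."""
    n = Y.shape[0]
    rm = n // 2                                    # outermost radius r_m = floor(n/2)
    mu_a = np.pi * rm * rm / m                     # average ring area, eq. (13)
    radii = [np.sqrt(mu_a / np.pi)]                # innermost radius r_1, eq. (14)
    for _ in range(2, m):                          # r_2 .. r_{m-1}, eq. (15)
        radii.append(np.sqrt((mu_a + np.pi * radii[-1] ** 2) / np.pi))
    radii.append(float(rm))                        # r_m
    # image-centre coordinates (1-indexed), as defined in the text
    xc = n / 2 + 0.5 if n % 2 == 0 else (n + 1) / 2
    xs, ys = np.meshgrid(np.arange(1, n + 1), np.arange(1, n + 1), indexing='ij')
    d = np.sqrt((xs - xc) ** 2 + (ys - xc) ** 2)   # distances, eq. (16)
    hash_code, prev = [], 0.0
    for i, r in enumerate(radii):                  # ring sets U_i, eq. (17)
        mask = (d <= r) if i == 0 else (d > prev) & (d <= r)
        v = Y[mask].mean()                         # ring mean v_i
        hash_code.append(int(np.floor(v * 100 + 0.5)))  # quantization, eq. (18)
        prev = r
    return hash_code                               # hash h, eq. (19)
```

Because each ring is a rotation-invariant region, rotating the input (e.g., by 90°) leaves the extracted hash unchanged, which is the property the section relies on.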
2.5. Hash Similarity Computation
As our hash is composed of integers, the well-known L2 norm is exploited to measure the similarity between two hashes. Assume that h1 = [h1(1), h1(2), ⋯, h1(m)] and h2 = [h2(1), h2(2), ⋯, h2(m)] are the hash sequences of two images. The L2 norm of the two hashes is defined as follows:
(20) d_norm = √(Σ_{i=1}^{m} (h1(i) − h2(i))²), in which h1(i) is the ith element of h1 and h2(i) is the ith element of h2. Generally, a smaller L2 norm means more similar hashes of the evaluated images. If the L2 norm is larger than a threshold T, the images corresponding to the input hashes are judged as different images. Otherwise, they are viewed as similar images.
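Equation (20) and the threshold rule amount to the following minimal sketch (function names are our own; T = 15 is the threshold recommended later in the experiments):

```python
import math

def hash_distance(h1, h2):
    """L2 norm between two integer hash sequences, equation (20)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def same_image(h1, h2, T=15):
    """Threshold rule: distances no larger than T mean 'similar images'."""
    return hash_distance(h1, h2) <= T
```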
3. Experimental Results
The parameter settings of the proposed algorithm are as follows. The λ of LRR is 0.9, the input image is resized to 512×512, and the ring number is 64, i.e., n=512 and m=64. In the following experiments, Sections 3.1 and 3.2 validate the properties of robustness and discrimination, respectively. Section 3.3 analyzes our hash storage in binary form. Section 3.4 discusses the influence of the ring number on our algorithm performance.
3.1. Robustness
An open database called the Kodak dataset [35] is exploited to construct a test database of similar images. The Kodak dataset is composed of 24 color images of different categories with the size of 768×512 or 512×768. To produce similar images of these color images, some commonly used operations are applied as robustness attacks. These operations are performed with Photoshop, MATLAB, and StirMark. More specifically, Photoshop provides brightness and contrast adjustments (parameters are ±10 and ±20). MATLAB provides 3×3 Gaussian low-pass filtering (standard deviations range from 0.3 to 1.0 with a step of 0.1), gamma correction (parameters are 0.75, 0.9, 1.1, and 1.25), and salt-and-pepper and speckle noise (both parameters range from 0.001 to 0.01 with a step of 0.001). StirMark provides JPEG compression (quality factors range from 30 to 100 with a step of 10), watermark embedding (strengths range from 10 to 100 with a step of 10), image scaling (scaling ratios are 0.5, 0.75, 0.9, 1.1, 1.5, and 2.0), and image rotation (rotation angles are ±1, ±2, ±5, ±10, ±15, ±30, ±45, and ±90). Note that image rotation increases the image size and adds padded pixels to the rotated images; in this experiment, only the 361×361 central parts of the 24 original images and their rotated versions are used to evaluate rotation robustness. Therefore, the number of operations used is 10, contributing 80 manipulations in total. This implies that each original image has 80 similar images, so the total number of visually similar images is 24 × 80 = 1920, and the total number of images used is 1920 + 24 = 1944.
Figure 5 demonstrates the robustness experiments under different operations, where the x-axis is the parameter value of the digital operation and the y-axis is the L2 norm. Note that the curves in Figure 5 are the mean values of the L2 norms between the hashes of the 24 color images and their similar images. From Figure 5, it can be seen that the mean L2 norms are all smaller than 15, except for two values of the rotation operation. For image rotation, the maximum value is 17.29, which is slightly larger than those of the other operations. If the threshold is set to T = 15, our algorithm correctly detects 92.19% of similar images; without the rotated images, it recognizes all similar images. Such a high correct detection rate illustrates the good robustness of our algorithm.
Robustness test.
Brightness adjustment
Contrast adjustment
Gamma correction
3×3 Gaussian low-pass filtering
Speckle noise
Salt-and-pepper noise
JPEG compression
Watermark embedding
Image scaling
Image rotation
3.2. Discrimination
A well-known database named UCID [36] is selected to test the discrimination of our algorithm. This database contains 1338 color images of size 512×384 or 384×512. Hashes of these 1338 color images are first calculated, and the L2 norm between each pair of hashes is then computed. Therefore, the total number of valid distances is C(1338, 2) = 1338 × 1337/2 = 894453. Figure 6 illustrates the distribution of these distances, where the x-axis is the value of the L2 norm and the y-axis is its frequency. The maximum L2 norm is 163.25, the minimum is 4.80, the mean is 42.72, and the standard deviation is 18.48. From Figure 6, it can be seen that most distances are larger than the abovementioned threshold T = 15, indicating our good discrimination. Actually, the performances of discrimination and robustness are closely related to the selected threshold: different thresholds lead to different performances. Table 1 reports the robustness and discrimination under different thresholds, where robustness is measured by the correct detection rate and discrimination by the false detection rate. Note that the correct detection rate is the ratio of the number of similar images correctly detected to the total number of similar images, and the false detection rate is the ratio of the number of different images falsely judged as similar to the total number of different images. From Table 1, we select T = 15 as the recommended threshold since it yields the minimum total error rate.
Distribution of L2 norms based on 1338 images.
Performances under different thresholds.

Threshold | Correct detection rate (d1) | False detection rate (d2) | Total error rate (1 − d1 + d2)
50 | 100% | 70.89% | 70.89%
45 | 99.74% | 61.73% | 61.99%
40 | 99.69% | 50.97% | 51.28%
35 | 99.58% | 39.01% | 39.43%
30 | 99.17% | 26.64% | 27.47%
25 | 98.39% | 15.32% | 16.93%
20 | 96.88% | 6.58% | 9.70%
15 | 92.19% | 1.65% | 9.46%
10 | 80.31% | 0.13% | 19.82%
3.3. Hash Storage
To analyze the number of bits required for storing our hash, the hashes of the 1338 images in UCID are selected as the data source. Note that each hash generated by our algorithm is composed of 64 integers. Therefore, there are 1338 × 64 = 85632 integers in the data source. Figure 7 illustrates the distribution of these 85632 hash elements, where the x-axis is the element value and the y-axis is its frequency. The minimum value of these hash elements is 1 and the maximum value is 49. Since 6 bits can represent a decimal number in the range [0, 2⁶ − 1] = [0, 63], storing one hash element requires only 6 bits. Therefore, our hash requires 64 × 6 = 384 bits in total.
Distribution of 85632 hash elements.
3.4. Influence of the Ring Number on Hash Performance
To discuss the influence of the ring number on hash performance, different ring numbers are used to conduct robustness experiments and discriminative experiments. The used ring numbers include m=8, m=16, m=32, m=64, and m=80. In the experiments, only the ring number is altered, and the other parameters are unchanged. The two image databases used in Sections 3.1 and 3.2 are also selected here. To make visual and quantitative comparisons, the receiver operating characteristic (ROC) graph [37] is taken. In the ROC graph, the x-axis represents FPR (false positive rate) and the y-axis represents TPR (true positive rate). Let PFPR and PTPR be the FPR and the TPR, respectively. Thus, they can be defined as follows:
(21) P_TPR = M1/M2, P_FPR = M3/M4, in which M1 is the number of similar images correctly judged as similar, M2 is the total number of similar images, M3 is the number of different images falsely identified as similar, and M4 is the total number of different images. Obviously, P_FPR and P_TPR indicate discrimination and robustness, respectively. In the ROC graph, a curve is plotted from a set of points with coordinates (P_FPR, P_TPR). High discrimination and robustness mean a low P_FPR and a high P_TPR, so a curve near the top-left corner indicates better classification performance than one far away from it. For quantitative analysis, the AUC (area under the ROC curve) is used. The range of the AUC is [0, 1], and a larger AUC generally means better classification performance of the evaluated algorithm.
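Equation (21) amounts to counting threshold hits on the two distance populations; a minimal sketch (the function name is our own) is:

```python
def tpr_fpr(distances_similar, distances_different, T):
    """TPR and FPR at threshold T, equation (21): a pair is declared
    'similar' when its L2 distance is no larger than T."""
    m1 = sum(d <= T for d in distances_similar)    # M1: correctly accepted
    m3 = sum(d <= T for d in distances_different)  # M3: falsely accepted
    return m1 / len(distances_similar), m3 / len(distances_different)
```

Sweeping T over the observed range of distances yields the (P_FPR, P_TPR) points of the ROC curve.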
Figure 8 shows the ROC curves of different ring numbers, where the curves in the upper-left area are zoomed in for easy comparison. Obviously, the five ROC curves are all close to the upper-left corner, indicating the good classification performance of the proposed algorithm. In addition, the curves of ring numbers m = 64 and m = 80 overlap, and both lie above the curves of the other ring numbers. This means that m = 64 and m = 80 are slightly better than the other ring numbers in balancing robustness and discrimination. For quantitative analysis, the AUC of each ring number is also calculated. The AUCs of ring numbers m = 8, 16, 32, 64, and 80 are 0.98605, 0.98863, 0.99012, 0.99069, and 0.99087, respectively. Clearly, the AUCs of m = 64 and m = 80 differ only slightly, and both are larger than those of the other ring numbers. This also verifies that m = 64 and m = 80 outperform the other ring numbers.
ROC curves of different ring numbers.
To evaluate the computational time of different ring numbers, the proposed algorithm is implemented in MATLAB on a workstation with a 2.10 GHz Intel Xeon Silver 4110 CPU and 64 GB RAM. The total time of extracting 1338 image hashes is measured to obtain the average time of generating a hash. The average times for m = 8, 16, 32, 64, and 80 are 25.407, 27.133, 26.957, 29.361, and 33.611 seconds, respectively. Moreover, the hash lengths of different ring numbers are also compared. Note that our hash length, measured in integers, equals the ring number; therefore, the hash lengths for m = 8, 16, 32, 64, and 80 are 8, 16, 32, 64, and 80 integers, respectively. Table 2 summarizes the performances of the different ring numbers. Considering the overall performance, the ring number m = 64 reaches a good balance among the three performance indices.
Performances of different ring numbers.

m | AUC | Hash length (integer) | Computational time (s)
8 | 0.98605 | 8 | 25.407
16 | 0.98863 | 16 | 27.133
32 | 0.99012 | 32 | 26.957
64 | 0.99069 | 64 | 29.361
80 | 0.99087 | 80 | 33.611
4. Performance Comparisons
To demonstrate the advantages of our hashing algorithm, we compare it with some popular hashing algorithms: CVA-DWT hashing [14], SVD-CSLBP hashing [21], random walk-based hashing [22], and hybrid feature-based hashing [23]. The datasets used in the above robustness experiment and discrimination test are also adopted here, i.e., 1920 pairs of similar images and 1338 different images. To make a fair comparison, the parameters of the selected algorithms are set to the same values as in their original papers, and their hash similarity metrics are kept unchanged, i.e., the L2 norm for CVA-DWT hashing and hybrid feature-based hashing, the correlation coefficient for SVD-CSLBP hashing, and the normalized Hamming distance for random walk-based hashing. The result of our hashing with m = 64 is taken for comparison.
Figure 9 illustrates the ROC curves of the evaluated hashing algorithms. To see more details, the ROC curves in the upper-left area are zoomed in and placed in the bottom-right part of Figure 9. It is observed that the curves of our hashing and hybrid feature-based hashing intersect with each other. Moreover, both of them are above the curves of other evaluated hashing algorithms. To conduct quantitative analysis, the AUCs of the evaluated hashing algorithms are also calculated. It is found that the AUCs of CVA-DWT hashing, SVD-CSLBP hashing, random walk-based hashing, hybrid feature-based hashing, and our hashing are 0.97563, 0.74522, 0.95758, 0.97545, and 0.99069, respectively. From the results, it can be seen that our hashing is better than the compared algorithms in classification performance.
ROC curves of the evaluated hashing algorithms.
The average time of calculating a hash is also compared. The abovementioned computer is used again, and the compared algorithms are also implemented in MATLAB. The average times of CVA-DWT hashing, SVD-CSLBP hashing, random walk-based hashing, hybrid feature-based hashing, and our hashing are 0.05, 0.19, 0.02, 6.12, and 29.36 seconds, respectively. Our hashing is slower than the compared algorithms because the computational cost of the LRR method is relatively high. Moreover, the hash lengths of all algorithms are compared. The hash lengths of CVA-DWT hashing and random walk-based hashing are 960 and 144 bits, respectively. The hash lengths of SVD-CSLBP hashing and hybrid feature-based hashing are 64 and 104 floating-point numbers; as a floating-point number requires 32 bits of storage according to the IEEE standard [38], their hash lengths are 2048 and 3328 bits, respectively. The length of our hash is 384 bits, which is longer than that of random walk-based hashing but much shorter than those of the other compared algorithms. Table 3 summarizes the performance indices of the evaluated hashing algorithms, where the text in italic is the best result of the corresponding column. Our hashing outperforms the compared algorithms in classification performance in terms of the AUC, but it runs slower than the compared algorithms. As to hash length, it is shorter than those of all compared algorithms except random walk-based hashing.
Table 3: Performance comparisons among different algorithms (text in italic is the best result).

Algorithm                      AUC       Average time (s)   Hash length (bit)
CVA-DWT hashing                0.97563   0.05               960
SVD-CSLBP hashing              0.74522   0.19               2048
Random walk-based hashing      0.95758   0.02               144
Hybrid feature-based hashing   0.97545   6.12               3328
Our hashing                    0.99069   29.36              384
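The bit lengths in Table 3 follow directly from the storage cost of each hash element: binary-string hashes are counted bit by bit, while each floating-point element occupies 32 bits under IEEE 754 single precision [38]. A small sketch of that conversion (element counts taken from the comparison above):

```python
# Hash length in bits = number of elements × bits per element.
BITS_PER_FLOAT32 = 32  # IEEE 754 single-precision storage width [38]

hashes = {
    # algorithm: (element count, bits per element)
    "CVA-DWT hashing":              (960, 1),                # 960-bit binary string
    "SVD-CSLBP hashing":            (64, BITS_PER_FLOAT32),  # 64 floats
    "Random walk-based hashing":    (144, 1),
    "Hybrid feature-based hashing": (104, BITS_PER_FLOAT32), # 104 floats
    "Our hashing":                  (384, 1),
}

for name, (count, bits) in hashes.items():
    print(f"{name}: {count * bits} bits")
```

This reproduces the 2048 and 3328 bit figures quoted for the two floating-point hashes.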
5. Conclusions
In this paper, we have proposed a novel image hashing with low-rank representation (LRR) and ring partition (RP). An important contribution is the construction of the visual representation based on the saliency map determined by the spectral residual model (SRM); generating the hash from this visual representation improves robustness. Another significant contribution is the combination of LRR and RP, which makes the discriminative hash invariant to rotation. Extensive experiments with two well-known databases have been carried out. The results have shown that the proposed hashing is both robust and discriminative. ROC curve comparisons have illustrated that the proposed hashing outperforms the compared hashing algorithms in classification performance. In addition, hash length comparisons have shown that the proposed hashing is better than all compared algorithms except random walk-based hashing. As to running speed, the proposed hashing runs slower than the compared algorithms due to the high computational cost of the LRR method. In the future, we plan to design fast hashing algorithms, deep learning-based hashing algorithms, and hashing algorithms for image authentication.
Data Availability
The image datasets used to support the findings of this study can be downloaded from the public websites whose hyperlinks are provided in this paper.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China (61962008, 61762017, and 61966004), the Guangxi “Bagui Scholar” Team for Innovation and Research, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Natural Science Foundation (2017GXNSFAA198222), the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing, and the Innovation Project of Guangxi Graduate Education (YCSW2020109).
References

[1] P. J. McParlane, A. J. McMinn, and J. M. Jose, "'Picture the scene...': visually summarising social media events," in Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14), Shanghai, China, 2014, pp. 1459–1468. doi: 10.1145/2661829.2661923.
[2] R. Venkatesan, S.-M. Koon, M. H. Jakubowski, and P. Moulin, "Robust image hashing," in Proceedings 2000 International Conference on Image Processing, Vancouver, BC, Canada, 2000, pp. 664–666. doi: 10.1109/ICIP.2000.899541.
[3] Z. Tang, Z. Huang, X. Q. Zhang, and H. Lao, "Robust image hashing with multidimensional scaling," Signal Processing, vol. 137, pp. 240–250, 2017. doi: 10.1016/j.sigpro.2017.02.008.
[4] D. Wu, Z. Lin, B. Li, M. Ye, and W. Wang, "Deep supervised hashing for multilabel and large-scale image retrieval," in Proceedings of the 2017 ACM International Conference on Multimedia Retrieval, Bucharest, Romania, 2017, pp. 150–158. doi: 10.1145/3078971.3078989.
[5] Z. Tang, Z. Huang, H. Yao, X. Q. Zhang, L. Chen, and C. Yu, "Perceptual image hashing with weighted DWT features for reduced-reference image quality assessment," The Computer Journal, vol. 61, no. 11, pp. 1695–1709, 2018. doi: 10.1093/comjnl/bxy047.
[6] J. Ouyang, Y. Liu, and H. Shu, "Robust hashing for image authentication using SIFT feature and quaternion Zernike moments," Multimedia Tools and Applications, vol. 76, no. 2, pp. 2609–2626, 2017. doi: 10.1007/s11042-015-3225-x.
[7] Z. Zhou, Q. M. J. Wu, Y. Yang, and X. Sun, "Region-level visual consistency verification for large-scale partial-duplicate image search," ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 16, no. 2, article 54, 2020. doi: 10.1145/3383582.
[8] Z. Zhou, Y. Mu, and Q. M. J. Wu, "Coverless image steganography using partial-duplicate image retrieval," Soft Computing, vol. 23, no. 13, pp. 4927–4938, 2019. doi: 10.1007/s00500-018-3151-8.
[9] Z. Tang, X. Q. Zhang, and S. Zhang, "Robust perceptual image hashing based on ring partition and NMF," IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 3, pp. 711–724, 2014. doi: 10.1109/TKDE.2013.45.
[10] A. Swaminathan, Y. Mao, and M. Wu, "Robust and secure image hashing," IEEE Transactions on Information Forensics and Security, vol. 1, no. 2, pp. 215–230, 2006. doi: 10.1109/TIFS.2006.873601.
[11] V. Monga and B. L. Evans, "Perceptual image hashing via feature points: performance evaluation and tradeoffs," IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3452–3465, 2006. doi: 10.1109/tip.2006.881948.
[12] X. Lv and Z. J. Wang, "Perceptual image hashing based on shape contexts and local feature points," IEEE Transactions on Information Forensics and Security, vol. 7, no. 3, pp. 1081–1093, 2012. doi: 10.1109/TIFS.2012.2190594.
[13] Y. Zhao, S. Wang, X. P. Zhang, and H. Yao, "Robust hashing for image authentication using Zernike moments and local features," IEEE Transactions on Information Forensics and Security, vol. 8, no. 1, pp. 55–63, 2013. doi: 10.1109/TIFS.2012.2223680.
[14] Z. Tang, Y. Dai, X. Q. Zhang, L. Huang, and F. Yang, "Robust image hashing via colour vector angles and discrete wavelet transform," IET Image Processing, vol. 8, no. 3, pp. 142–149, 2014. doi: 10.1049/iet-ipr.2013.0332.
[15] I. Laradji, L. Ghouti, and E. Khiari, "Perceptual hashing of color images using hypercomplex representations," in 2013 IEEE International Conference on Image Processing, Melbourne, VIC, Australia, 2013, pp. 4402–4406. doi: 10.1109/icip.2013.6738907.
[16] X. Wang, K. Pang, X. Zhou, Y. Zhou, L. Li, and J. Xue, "A visual model-based perceptual image hash for content authentication," IEEE Transactions on Information Forensics and Security, vol. 10, no. 7, pp. 1336–1349, 2015. doi: 10.1109/TIFS.2015.2407698.
[17] C. Yan, C. Pun, and X. Yuan, "Quaternion-based image hashing for adaptive tampering localization," IEEE Transactions on Information Forensics and Security, vol. 11, no. 12, pp. 2664–2677, 2016. doi: 10.1109/TIFS.2016.2594136.
[18] C. Yan, C. Pun, and X. Yuan, "Multi-scale image hashing using adaptive local feature extraction for robust tampering detection," Signal Processing, vol. 121, pp. 1–16, 2016. doi: 10.1016/j.sigpro.2015.10.027.
[19] Z. Tang, H. Lao, X. Q. Zhang, and K. Liu, "Robust image hashing via DCT and LLE," Computers & Security, vol. 62, pp. 133–148, 2016. doi: 10.1016/j.cose.2016.07.006.
[20] Z. Tang, X. Q. Zhang, X. Li, and S. Zhang, "Robust image hashing with ring partition and invariant vector distance," IEEE Transactions on Information Forensics and Security, vol. 11, no. 1, pp. 200–214, 2016. doi: 10.1109/TIFS.2015.2485163.
[21] R. Davarzani, S. Mozaffari, and K. Yaghmaie, "Perceptual image hashing using center-symmetric local binary patterns," Multimedia Tools and Applications, vol. 75, no. 8, pp. 4639–4667, 2016. doi: 10.1007/s11042-015-2496-6.
[22] X. Huang, X. Liu, G. Wang, and M. Su, "A robust image hashing with enhanced randomness by using random walk on zigzag blocking," in 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 2016, pp. 14–18. doi: 10.1109/trustcom.2016.0040.
[23] C. Qin, M. Sun, and C. Chang, "Perceptual hashing for color images based on hybrid extraction of structural features," Signal Processing, vol. 142, pp. 194–205, 2018. doi: 10.1016/j.sigpro.2017.07.019.
[24] Z. Tang, L. Chen, X. Q. Zhang, and S. Zhang, "Robust image hashing with tensor decomposition," IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 3, pp. 549–560, 2019. doi: 10.1109/TKDE.2018.2837745.
[25] Q. Shen and Y. Zhao, "Perceptual hashing for color image based on color opponent component and quadtree structure," Signal Processing, vol. 166, article 107244, 2020. doi: 10.1016/j.sigpro.2019.107244.
[26] Z. Tang, X. Li, J. Song, M. Wei, and X. Q. Zhang, "Colour space selection in image hashing: an experimental study," IETE Technical Review, vol. 34, no. 4, pp. 440–447, 2017. doi: 10.1080/02564602.2016.1200500.
[27] X. Hou and L. Zhang, "Saliency detection: a spectral residual approach," in 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 2007, pp. 1–8. doi: 10.1109/cvpr.2007.383267.
[28] L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254–1259, 1998. doi: 10.1109/34.730558.
[29] G. Liu, Z. Lin, and Y. Yu, "Robust subspace segmentation by low-rank representation," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 2010, pp. 663–670.
[30] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, "Robust recovery of subspace structures by low-rank representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171–184, 2013. doi: 10.1109/TPAMI.2012.88.
[31] J. Chen and J. Yang, "Robust subspace segmentation via low-rank representation," IEEE Transactions on Cybernetics, vol. 44, no. 8, pp. 1432–1445, 2014. doi: 10.1109/TCYB.2013.2286106.
[32] B. Cheng, G. Liu, J. Wang, Z. Huang, and S. Yan, "Multi-task low-rank affinity pursuit for image segmentation," in 2011 International Conference on Computer Vision, Barcelona, Spain, 2011, pp. 2439–2446. doi: 10.1109/iccv.2011.6126528.
[33] Q. Wang, X. He, and X. Li, "Locality and structure regularized low rank representation for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 2, pp. 911–923, 2019. doi: 10.1109/TGRS.2018.2862899.
[34] Z. Tang, X. Q. Zhang, L. Huang, and Y. Dai, "Robust image hashing using ring-based entropies," Signal Processing, vol. 93, no. 7, pp. 2061–2069, 2013. doi: 10.1016/j.sigpro.2013.01.008.
[35] Kodak lossless true color image suite, June 2020, http://r0k.us/graphics/kodak/.
[36] G. Schaefer and M. Stich, "UCID - an uncompressed colour image database," in Proceedings of SPIE 5307, Storage and Retrieval Methods and Applications for Multimedia 2004, San Jose, CA, USA, 2004, pp. 472–480. doi: 10.1117/12.525375.
[37] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006. doi: 10.1016/j.patrec.2005.10.010.
[38] IEEE Std 754-2008, IEEE Standard for Floating-Point Arithmetic, IEEE, 2008.