Fast Image Search with Locality-Sensitive Hashing and Homogeneous Kernels Map

Fast image search with efficient additive kernels and kernel locality-sensitive hashing has been proposed. To accommodate kernel functions, recent work has explored methods for constructing locality-sensitive hash functions, which guarantee our approach's linear time; however, existing methods still do not fully solve the problems of the locality-sensitive hashing (LSH) algorithm and sacrifice accuracy of the search results in order to allow fast queries. To improve the search accuracy, we show how to apply explicit feature maps to homogeneous kernels, which help in feature transformation, and combine them with kernel locality-sensitive hashing. We evaluate our method on several large datasets and show that it improves accuracy relative to commonly used methods and makes object classification and content-based retrieval faster and more accurate.


Introduction
In the era of Web 2.0 applications, we are experiencing rapid growth of information and are confronted with large amounts of user-generated content on the Internet. Because anyone can publish and upload information, handling this content has become an urgent problem, and organizing and accessing the visual data on the Internet has attracted considerable attention. Fast search and indexing for large video or image databases is therefore very important for multimedia information retrieval such as visual search, especially given the big data in certain domains, such as travel photos from websites, social network images, and other image archives.
With the growth of visual data, we focus on two important aspects of the problem: nearest neighbor search and similarity metric learning. For metric learning, researchers have proposed algorithms such as information-theoretic metric learning [1]. For nearest neighbor search, the most common task is to locate the image most similar to a query in an image database. The most common method is to compute the similarity between the query and every item in the database and then sort the items. However, the time complexity of this approach is too high for it to be practical. For image or video data in particular, computing the distance between two items in a high-dimensional space is expensive and visual data are sparse, so the search cannot be completed in limited time.
Linear scanning is the common approach to this problem, but it is not suitable for large-scale datasets, which has promoted the development of approximate nearest neighbor (ANN) search. LSH is widely used in ANN algorithms. It gives fast query response for high-dimensional input vectors [1-5], at the cost of some accuracy. To assure a high probability of collision for similar objects, randomized hash functions must be computed, as in many notable locality-sensitive hashing algorithms [6, 7].
Although LSH has played an important role in object similarity search, some other issues have been neglected, even though they are very common in image retrieval, recognition, and search tasks. To verify our idea, we conducted several experiments on object search tasks.
On the Caltech-101 dataset [8], we demonstrate that our algorithm outperforms existing hashing methods, including traditional LSH approaches, because the hash functions can be computed over many kernels. Our scheme accommodates arbitrary kernels in ANN search; in fact, many similarity hashing functions become available for content-based visual search tasks.

Homogeneous Kernel
In this paper, we mainly focus on a family of similar kernels: the intersection, Jensen-Shannon, Hellinger's, and $\chi^2$ kernels. These kernels are often used as learning kernels in machine learning and visual search. They share two attributes: they are homogeneous and additive. This section also introduces the idea of a kernel signature, which connects naturally to these kernels, since each of them can be represented by a pure function. These attributes are applied in Section 3 to obtain kernel feature maps, through which we obtain approximate expressions for the kernels.

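Homogeneous Kernels. A kernel $k_h : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}$ is $\gamma$-homogeneous if

$$\forall c \ge 0: \quad k_h(cx, cy) = c^{\gamma} k_h(x, y). \qquad (1)$$

When $\gamma = 1$, we say that $k_h(x, y)$ is homogeneous. Letting $c = 1/\sqrt{xy}$ and using the symmetry of $k_h$, a $\gamma$-homogeneous kernel can be written as

$$k_h(x, y) = c^{-\gamma} k_h(cx, cy) = (xy)^{\gamma/2}\, k_h\left(\sqrt{y/x}, \sqrt{x/y}\right) = (xy)^{\gamma/2}\, \mathcal{K}(\log y - \log x). \qquad (2)$$

Here the pure function

$$\mathcal{K}(\lambda) = k_h\left(e^{\lambda/2}, e^{-\lambda/2}\right), \quad \lambda \in \mathbb{R},$$

is called the kernel signature.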
In machine learning and computer vision, the most commonly used homogeneous kernels are the Jensen-Shannon, intersection, $\chi^2$, and Hellinger's kernels, all of which can also be viewed as additive kernels. In the next section, we focus on these kernels and their kernel maps. Table 1 shows the details [9].
$\chi^2$ Kernel. We define $k(x, y) = 2xy/(x + y)$ as the $\chi^2$ kernel [10, 11]. The $\chi^2$ distance is then defined as $D^2(x, y) = \chi^2(x, y)$.
$\gamma$-Homogeneous Parameters. In previous research, homogeneous kernels have been used with parameters $\gamma = 1$ and $\gamma = 2$. When $\gamma = 2$, the kernel becomes $k(x, y) = xy$. In this paper, we derive the $\gamma$-homogeneous kernel from formula (2).
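As an illustration of the definitions in this section, the following is a minimal Python sketch, not the authors' implementation. It assumes positive scalar inputs and the standard forms of the $\chi^2$, intersection, and Hellinger's kernels; the function names (`signature`, `from_signature`, `additive`) are hypothetical. It evaluates the signature $\mathcal{K}(\lambda) = k(e^{\lambda/2}, e^{-\lambda/2})$ and rebuilds a $\gamma$-homogeneous kernel as $(xy)^{\gamma/2}\mathcal{K}(\log y - \log x)$.

```python
import math

def chi2(x, y):
    """Chi-squared kernel k(x, y) = 2xy / (x + y) for positive scalars."""
    return 2.0 * x * y / (x + y)

def intersection(x, y):
    """Intersection kernel k(x, y) = min(x, y)."""
    return min(x, y)

def hellinger(x, y):
    """Hellinger's kernel k(x, y) = sqrt(xy)."""
    return math.sqrt(x * y)

def signature(kernel, lam):
    """Kernel signature K(lambda) = k(exp(lambda/2), exp(-lambda/2))."""
    return kernel(math.exp(lam / 2.0), math.exp(-lam / 2.0))

def from_signature(kernel, x, y, gamma=1.0):
    """Rebuild a gamma-homogeneous kernel as (xy)^(gamma/2) * K(log y - log x)."""
    lam = math.log(y) - math.log(x)
    return (x * y) ** (gamma / 2.0) * signature(kernel, lam)

def additive(kernel, xs, ys):
    """Additive kernel on histograms: sum the scalar kernel over the bins.

    Bins where either entry is zero contribute nothing for these kernels.
    """
    return sum(kernel(a, b) for a, b in zip(xs, ys) if a > 0 and b > 0)
```

With $\gamma = 1$, `from_signature(chi2, x, y)` reproduces `chi2(x, y)`, and with the constant signature of Hellinger's kernel and $\gamma = 2$ it reproduces the linear kernel $xy$.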

Homogeneous Kernel Map
When handling low-dimensional data that are not separable, we construct a kernel feature map $\Psi(x)$ for the kernel $k$ so that the input data are mapped from the low-dimensional space to a relatively high-dimensional (Hilbert) space with inner product $\langle \cdot, \cdot \rangle$:

$$\forall x, y: \quad k(x, y) = \langle \Psi(x), \Psi(y) \rangle.$$

To compute the feature maps and obtain approximate feature map expressions for the homogeneous kernels, we apply Bochner's theorem to the $\gamma$-homogeneous expression. Note that if a homogeneous kernel is positive definite [13], its signature $\mathcal{K}$ is also positive definite, which is the condition Bochner's theorem places on a stationary kernel. Thus, from formula (2) and Bochner's theorem (9), we can derive $k(x, y)$ and a closed-form feature map. The kernel density $\kappa(\omega)$ and the closed-form feature map can be computed for most machine learning kernels [9]. Table 1 illustrates the results. Consider

$$\mathcal{K}(\lambda) = \int_{-\infty}^{+\infty} e^{-i\omega\lambda}\, \kappa(\omega)\, d\omega, \qquad (9)$$

$$k(x, y) = (xy)^{\gamma/2} \int_{-\infty}^{+\infty} e^{-i\omega\lambda}\, \kappa(\omega)\, d\omega, \quad \lambda = \log\frac{y}{x},$$

$$\Psi_{\omega}(x) = e^{-i\omega \log x} \sqrt{x^{\gamma}\, \kappa(\omega)}.$$

The Scientific World Journal

Table 1: Common kernels, signature, and their feature maps.
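To make the feature map concrete, the following minimal Python sketch builds a finite-dimensional approximation for the $\chi^2$ kernel by sampling the frequency axis at regular intervals. This is an illustration under stated assumptions, not the authors' implementation: the density $\kappa(\omega) = \mathrm{sech}(\pi\omega)$ is the known density of the $\chi^2$ signature $\mathrm{sech}(\lambda/2)$, while the sampling step `L`, the number of frequencies `n`, and all function names are hypothetical choices. The complex exponential is split into cosine and sine components so that the features are real.

```python
import math

def chi2_density(omega):
    """Kernel density of the chi-squared kernel: kappa(omega) = sech(pi * omega)."""
    return 1.0 / math.cosh(math.pi * omega)

def feature_map(x, n=5, L=0.5, gamma=1.0, density=chi2_density):
    """Map a nonnegative scalar x to a (2n + 1)-dimensional real feature vector.

    The frequencies 0, L, 2L, ..., nL sample the density; each nonzero
    frequency contributes a cosine and a sine component.
    """
    if x <= 0.0:
        return [0.0] * (2 * n + 1)
    log_x = math.log(x)
    feats = [math.sqrt(x ** gamma * L * density(0.0))]
    for j in range(1, n + 1):
        amp = math.sqrt(2.0 * x ** gamma * L * density(j * L))
        feats.append(amp * math.cos(j * L * log_x))
        feats.append(amp * math.sin(j * L * log_x))
    return feats

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Compare the approximate and exact chi-squared kernels on one pair of values.
x, y = 0.3, 0.7
approx = dot(feature_map(x), feature_map(y))
exact = 2.0 * x * y / (x + y)
```

For histogram inputs, the maps of the individual bins can be concatenated, since the kernels considered here are additive. A smaller step `L` with more frequencies `n` reduces the approximation error at the cost of a longer feature vector.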

Kernelized Locality-Sensitive Hashing
To build the data association, we adopt kernelized LSH (KLSH) [14], which is also a hashing-based approach. As with LSH, the key problem in KLSH is constructing the hash functions: to compute the collision probability between an input query and the database points, we must compute the similarity between them, as proposed in [15].
KLSH Principle. Any locality-sensitive hashing algorithm is based on the probability distribution of a family of hash functions. We therefore consider the collision probability of a pair of points, for example, $x$ and $y$:

$$\Pr_{h \in \mathcal{H}}\left[h(x) = h(y)\right] = \mathrm{sim}(x, y). \qquad (10)$$

This can also be viewed as computing the similarity between the objects $x$ and $y$. Here $\mathrm{sim}(x, y)$ is the similarity measure, while $h(x)$ and $h(y)$ are computed with a hash function selected at random from the hash function family $\mathcal{H}$. The intuition is that similar $x$ and $y$ collide in the same hash bucket, so objects that are significantly similar are more likely to be stored together in the hash table and eventually collide [1].
We can derive the similarity function from the vector inner product:

$$\mathrm{sim}(x, y) = x^{T} y.$$

In [15, 16], the definition of the LSH function has been extended from formula (10) as

$$h_{\vec{r}}(x) = \begin{cases} 1, & \text{if } \vec{r}^{\,T} x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$

Here $\vec{r}$ is a random hyperplane vector drawn from a zero-mean multivariate Gaussian distribution $\mathcal{N}(0, \Sigma_p)$, with the same dimensionality as the input vector $x$. Each hash function is therefore matched to the statistical characteristics of the input vectors.

Meanwhile, the locality-sensitive property of this function has been verified in detail in [17]. Projecting a point $x$ onto $\vec{r}$ and taking the sign gives one hash function; repeating this $b$ times creates a set of hashes, which we also call a hash bucket. The hash bucket can be written as

$$g(x) = \langle h_1(x), \ldots, h_i(x), \ldots, h_b(x) \rangle.$$
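The random-hyperplane hash described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes an identity covariance for the Gaussian that generates the hyperplanes, and the function names are hypothetical. For this hash family the collision probability of two vectors is known to be $1 - \theta/\pi$, where $\theta$ is the angle between them, and the sketch estimates that rate empirically.

```python
import math
import random

def random_hyperplane(dim, rng):
    """Draw r from a zero-mean Gaussian (identity covariance is an assumption)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def hash_bit(r, x):
    """h_r(x) = 1 if r^T x >= 0, else 0."""
    return 1 if sum(ri * xi for ri, xi in zip(r, x)) >= 0.0 else 0

def hash_code(planes, x):
    """Concatenate one bit per hyperplane into a bucket key g(x)."""
    return tuple(hash_bit(r, x) for r in planes)

def collision_rate(x, y, planes):
    """Fraction of hyperplanes on which x and y receive the same bit."""
    same = sum(hash_bit(r, x) == hash_bit(r, y) for r in planes)
    return same / len(planes)

# Two vectors separated by an angle of pi / 4.
rng = random.Random(0)
planes = [random_hyperplane(3, rng) for _ in range(5000)]
x, y = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
cosine = sum(a * b for a, b in zip(x, y)) / (math.hypot(*x) * math.hypot(*y))
expected = 1.0 - math.acos(cosine) / math.pi  # theoretical collision probability
rate = collision_rate(x, y, planes)           # empirical estimate
```

Similar vectors share most bits, so they fall into the same or nearby buckets; a short prefix such as `hash_code(planes[:8], x)` can serve as the key into a hash table.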
Given the value of $b$, we obtain all the hash functions in the bucket. Because the similarity measure is computed in a high-dimensional kernel space, the similarity function can be extended and written as

$$\mathrm{sim}(x, y) = \kappa(x, y) = \phi(x)^{T} \phi(y). \qquad (16)$$
In formula (16), the kernel function $\kappa(x, y)$ is built from the kernel mapping $\phi(\cdot)$ of the points $x$ and $y$, and $\phi(x)^{T}\phi(y)$ is the inner product of their images mapped from the input space. The problem is that nothing is known about the data in kernel space, so $\vec{r}$ cannot be generated directly from $\mathcal{N}(0, \Sigma_p)$. Therefore, to construct the hash function, $\vec{r}$ must be built so that $\vec{r}^{\,T}\phi(x)$ can be computed quickly from the kernel. As with the ordinary $\vec{r}$, we use only kernel evaluations to approximate $\vec{r}^{\,T}\phi(x)$, and we select a subset of the database to construct $\vec{r}$. By the central limit theorem, if we choose $t$ database items from the whole database to form the set $S$, and the kernel data follow a distribution with mean $\mu$ and covariance $\Sigma$, the sample mean can be written as

$$z_t = \frac{1}{t} \sum_{i \in S} \phi(x_i). \qquad (17)$$

As $t$ grows, the theorem tells us that the vector $\tilde{z}_t = \sqrt{t}\,(z_t - \mu)$ follows a Gaussian distribution.
We use the whitening transform to obtain $\vec{r}$:

$$\vec{r} = \Sigma^{-1/2}\, \tilde{z}_t.$$

This yields the LSH function

$$h(\phi(x)) = \mathrm{sign}\left(\phi(x)^{T} \Sigma^{-1/2}\, \tilde{z}_t\right).$$

As analyzed above, the database is represented through the kernel function, so statistics such as the mean and covariance are unknown. To estimate $\mu$ and $\Sigma$, we sample data from the database and use KPCA and the eigendecomposition in [18]. Letting $\Sigma = V \Lambda V^{T}$ and $\Sigma^{-1/2} = V \Lambda^{-1/2} V^{T}$, we obtain the hash function

$$h(\phi(x)) = \mathrm{sign}\left(\phi(x)^{T} V \Lambda^{-1/2} V^{T} \tilde{z}_t\right).$$

This shows how to construct the hash function for kernel matrix inputs. In this case, we decompose the kernel matrix as $K = U \Omega U^{T}$.
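Here $\Omega$ and $\Lambda$ have the same nonzero eigenvalues, so $K$ can be viewed as another form of the kernel input. From [18], we compute the projection

$$v_k^{T} \phi(x) = \sum_{i=1}^{p} \frac{1}{\sqrt{\Omega_{kk}}}\, u_k(i)\, \phi(x_i)^{T} \phi(x).$$

Here $u_k$ and $v_k$ are, respectively, the $k$th eigenvectors of the kernel matrix and of the covariance matrix.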
From the projection formula above, we obtain $\vec{r} = \sum_{i=1}^{p} w(i)\, \phi(x_i)$, which follows a Gaussian distribution. Substituting (17) into the weights $w(i)$ and neglecting the factor $\sqrt{t}$, the simplified $w$ yields (27):

$$w = K^{-1/2} e_S. \qquad (27)$$

Here $e_S$ represents the unit (indicator) vector for $S$. The hash function for kernel input is therefore

$$h(\phi(x)) = \mathrm{sign}\left(\sum_{i=1}^{p} w(i)\, \kappa(x, x_i)\right),$$

where $\kappa(x, x_i)$ is the kernel value for the points $x$ and $x_i$ in the feature space. After several iterations, the hash functions form a hash bucket.
To obtain suitable parameters for this process, we run the query matching for several iterations. The detailed algorithm is given in Algorithm 1.
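The construction in this section can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1: each hash bit uses a weight vector of the form $K^{-1/2} e_S$ for a random subset $S$ of $t$ sampled items, and the bit for a query $x$ is the sign of $\sum_i w(i)\,\kappa(x, x_i)$. Subtracting $1/p$ from the indicator vector (a sample-mean estimate of $\mu$) is an assumption added so that the bits are balanced, and the function names, the synthetic data, and the parameter values are hypothetical.

```python
import numpy as np

def chi2_kernel(a, b):
    """Additive chi-squared kernel between two nonnegative histograms."""
    denom = a + b
    mask = denom > 0
    return float(np.sum(2.0 * a[mask] * b[mask] / denom[mask]))

def kernel_matrix(points, kernel):
    """Symmetric p x p kernel matrix over the sampled database items."""
    p = len(points)
    K = np.empty((p, p))
    for i in range(p):
        for j in range(i, p):
            K[i, j] = K[j, i] = kernel(points[i], points[j])
    return K

def inv_sqrt(K, eps=1e-8):
    """K^(-1/2) via eigendecomposition, dropping near-zero eigenvalues."""
    vals, vecs = np.linalg.eigh(K)
    keep = vals > eps
    return (vecs[:, keep] / np.sqrt(vals[keep])) @ vecs[:, keep].T

def build_hash_weights(K, b, t, rng):
    """One weight vector per hash bit, each from a random t-subset S."""
    p = K.shape[0]
    K_inv_sqrt = inv_sqrt(K)
    W = np.zeros((b, p))
    for bit in range(b):
        S = rng.choice(p, size=t, replace=False)
        e_S = np.zeros(p)
        e_S[S] = 1.0
        # Assumption: e_S / t - 1 / p accounts for the mean mu in z_t - mu.
        W[bit] = K_inv_sqrt @ (e_S / t - 1.0 / p)
    return W

def hash_code(x, sample, W, kernel):
    """Bits sign(sum_i w(i) * kappa(x, x_i)), encoded as 0/1."""
    k_x = np.array([kernel(x, s) for s in sample])
    return (W @ k_x >= 0).astype(np.uint8)

# Usage on synthetic histograms (illustrative data, not from the paper).
rng = np.random.default_rng(0)
sample = rng.random((20, 8))
sample /= sample.sum(axis=1, keepdims=True)
K = kernel_matrix(sample, chi2_kernel)
W = build_hash_weights(K, b=16, t=5, rng=rng)
code = hash_code(sample[0], sample, W, chi2_kernel)
```

Only kernel evaluations against the sampled items are needed at query time, so any kernel function can be plugged in for `chi2_kernel` without knowing its feature embedding.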

Experimental Result
In the experiments, we evaluate the proposed homogeneous kernel hashing algorithm and verify its efficiency on the dataset. In our scheme, the homogeneous kernel KLSH method makes it possible to work with unknown feature embeddings. We use these features in a visual search task that locates the most similar items in the database, and the neighbors found in the task vote on the tags with their scores. The method proved to be more effective and accurate than the linear scan search.
In this part, we evaluate our algorithm on the Caltech-101 dataset (Figure 1). Caltech-101 is a benchmark for image recognition and classification with 101 object categories and about 100 images per category, about 10,000 images in total. In recent years, many researchers have done useful work on this dataset, such as proposing important image representation kernels [19]. Many published papers focus on this dataset, some of which are of lasting value; for example, papers [20-22] each state their contribution to the dataset. The authors of [21] proposed a pyramid kernel matching method for image histograms, while Berg [20] proposed the CORR kernel on local image features, using geometric blur to match local image similarity. In this paper, we apply our algorithm to visual classification and similarity search tasks. Our benchmark platform is a 4-core 3.6 GHz Intel CPU with 16 GB of memory and a 2 TB hard disk.
We used the $\chi^2$ kernel for the $\gamma$-homogeneous kernel maps ($\gamma = 1/2$) and applied the nonlinear RBF-$\chi^2$ kernel designed in [19, 23] to the SIFT-based local features. We then applied and learned the homogeneous kernel map on top of it. Compared with the non-learned kernel, our learned kernel is more accurate. We use a k-NN classifier with both KLSH and linear scan to compute the classification accuracy. We also compare with CORR [24], and our results prove to be better; here we use 15 images per class for training.
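The classification step can be illustrated with a small Python sketch of a k-NN majority vote. It is a simplified stand-in for the evaluation protocol, not the authors' code: the function names and the toy data are hypothetical. With `candidates=None` every database item is ranked (linear scan); passing the indices returned by a hashing step ranks only those items, which is how a KLSH-based search would restrict the comparison.

```python
from collections import Counter

def chi2_similarity(a, b):
    """Additive chi-squared kernel between two nonnegative histograms."""
    return sum(2.0 * x * y / (x + y) for x, y in zip(a, b) if x + y > 0)

def knn_classify(query, items, labels, similarity, k=3, candidates=None):
    """Majority vote among the k items most similar to the query.

    candidates=None ranks all items (linear scan); otherwise only the
    listed indices, e.g. those retrieved from hash buckets, are ranked.
    """
    idx = range(len(items)) if candidates is None else candidates
    ranked = sorted(idx, key=lambda i: similarity(query, items[i]), reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy two-bin histograms with two classes (illustrative data only).
items = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = ["a", "a", "b", "b"]
query = [0.85, 0.15]
linear_scan_label = knn_classify(query, items, labels, chi2_similarity, k=3)
restricted_label = knn_classify(query, items, labels, chi2_similarity, k=1,
                                candidates=[2, 3])
```

The second call shows the trade-off discussed in this section: when the candidate set returned by hashing misses the true neighbors, the vote can differ from the linear scan result.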
From Figure 2 we can see that accuracy is closely related to the parameter values. Accuracy increases with one of the three parameters, while it depends little on the other two. The parameter values were chosen as 300, 300, and 30, respectively, the best found through a series of experiments.
We find that this combination of parameters gives better performance on the large-scale dataset. It can also be seen that our approach with the homogeneous kernel map has higher accuracy than CORR-KLSH with metric learning [25]. Figure 3 illustrates that our method is superior to other existing approaches [25-28] tested on this dataset. Compared with other kernel classifiers, our classifier with the RBF-$\chi^2$ kernel for local features performs better. Table 2 shows that our results have higher accuracy with 15 and 30 training images per class than the results of other papers, including [24], which obtains 61% with 15 images and 69.6% with 30 images. In other words, it improves on the results of several years ago by 16%. To find the best parameters for NN search in our scheme, we must balance performance against CPU time. We therefore analyzed the performance and CPU time for different values of $k$ ($k = 2, 3, \ldots, 20$) in NN search. Figure 4 illustrates the accuracy and CPU time for each $k$ on our dataset.
The authors of [26] proposed a method that combines KPCA with standard LSH, that is, computing the hashes on top of KPCA. However, this method has an apparent disadvantage: although KPCA reduces the dimensionality during processing, it loses input information, whereas KLSH preserves the integrity of the input information when computing the LSH. We therefore found that our method has higher accuracy and better performance than the algorithm in [26].

Conclusions
In this paper, we use the concept of homogeneous kernel maps to solve the problem of approximating kernels, including those commonly used in machine learning such as the $\chi^2$, JS, Hellinger's, and intersection kernels. Combined with the KLSH scheme, this gives the hash functions access to any kernel function. Although our approach is inferior to linear scan search in time, it can guarantee that the search accuracy is not affected. Moreover, we do not need to consider the distribution of the input data, so to some extent the approach is applicable to many other databases, such as Flickr and Tiny Images. Experimental results demonstrate that it is superior to the standard KLSH algorithm.