We propose fast image search based on efficient additive kernels and kernel locality-sensitive hashing (KLSH). To accommodate kernel functions, recent work has developed locality-sensitive hashing schemes that guarantee sublinear query time; however, existing locality-sensitive hashing (LSH) methods sacrifice search accuracy in order to allow fast queries. To improve search accuracy, we show how to apply explicit feature maps for homogeneous kernels as a feature transformation and combine them with kernel locality-sensitive hashing. We evaluate our method on several large datasets and show that it improves accuracy relative to commonly used methods, making object classification and content-based retrieval both faster and more accurate.
1. Introduction
In the era of Web 2.0 applications, we are experiencing explosive growth of information and are confronted with large amounts of user-generated content from the Internet. Since anyone can publish and upload content, organizing and indexing this visual data has become a pressing concern. Fast search and indexing over large image or video databases is therefore a central and urgent task in multimedia information retrieval, especially now that certain domains produce big data: travel photos from websites, social-network images, and other image archives.
With the growth of visual data, we focus on two related problems: nearest neighbor search and similarity metric learning. For metric learning, researchers have proposed algorithms such as information-theoretic metric learning [1]. For nearest neighbor search, the most common task is to locate the images most similar to a query in an image database. The naive method computes the similarity between the query and every item in the database and then sorts the results. However, the time complexity of this exhaustive approach is impractical for large collections. For image and video data in particular, distances must be computed in a high-dimensional and often sparse feature space, so an exact scan cannot finish in acceptable time.
Linear scanning can solve this problem in principle, but it is unsuitable for large-scale datasets; this limitation has driven the development of approximate nearest neighbor (ANN) methods. LSH is a standard ANN technique for obtaining fast query responses on high-dimensional input vectors [1–5], at the cost of some accuracy. To ensure a high probability of collision for similar objects, randomized hash functions must be constructed; this idea underlies many notable locality-sensitive hashing algorithms [6, 7].
Although LSH plays an important role in object similarity search, two issues that are common to image retrieval, recognition, and search tasks have been largely neglected:
(i) traditional LSH approaches only guarantee a relatively high collision probability for items that are nearby in the raw feature space. Many vision datasets carry rich side information: category tags are attached to YouTube and Flickr data, and class labels are attached to Caltech-101 images. However, there is a well-known gap between low-level features and high-level semantics. To narrow this gap, we intend to exploit this additional side information when constructing the hash tables;
(ii) to handle nonlinear data that is not linearly separable, kernel methods are widely used in vision tasks. For instance, objects are often modeled as bags of features (BOF), and the kernel trick maps such data from a low-dimensional space into a high-dimensional one for classification. However, constructing hash tables in an arbitrary kernel space is a hard problem.
To verify our idea, we conduct several object search experiments. In particular, we report results on the Caltech-101 [8] dataset and demonstrate that our approach outperforms existing hashing methods and traditional LSH approaches. Since the hash functions can be computed for arbitrary kernels, our scheme makes many similarity functions accessible to ANN-based, content-based visual search.
2. Homogeneous Kernel
In this paper we mainly focus on a family of kernels including the intersection, Jensen-Shannon, Hellinger's, and χ2 kernels, which are widely used as learning kernels in machine learning and visual search. These kernels share two properties: they are homogeneous and additive. In this section we connect them to the notion of a kernel signature, a pure function that represents each kernel; in Section 3 these properties are used to obtain approximate kernel feature maps.
Homogeneous Kernels. A kernel $k : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}$ is $\gamma$-homogeneous if

(1) $\forall m \ge 0:\; k(ma, mb) = m^{\gamma}\, k(a, b).$

When $\gamma = 1$, the kernel is simply called homogeneous. Letting $m = 1/\sqrt{ab}$, a $\gamma$-homogeneous kernel can be written as

(2) $k(a,b) = m^{-\gamma}\, k(ma, mb) = (ab)^{\gamma/2}\, k\!\left(\sqrt{b/a}, \sqrt{a/b}\right) = (ab)^{\gamma/2}\, \kappa(\log b - \log a).$

Here the pure function

(3) $\kappa(\lambda) = k\!\left(e^{\lambda/2}, e^{-\lambda/2}\right), \quad \lambda \in \mathbb{R}$

is called the kernel signature.
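As a quick sanity check, the decomposition in (2) can be verified numerically for the χ2 kernel, whose signature is sech(λ/2) (see Table 1). This is a minimal sketch of our own, not part of the derivation:

```python
import math

def chi2(a, b):
    # chi-squared kernel k(a,b) = 2ab / (a + b), gamma = 1
    return 2 * a * b / (a + b)

def signature(lam):
    # kernel signature kappa(lambda) = k(e^{lambda/2}, e^{-lambda/2}) = sech(lambda/2)
    return 1.0 / math.cosh(lam / 2)

# formula (2): k(a,b) = (ab)^{gamma/2} * kappa(log b - log a)
a, b, gamma = 0.3, 1.7, 1.0
lhs = chi2(a, b)
rhs = (a * b) ** (gamma / 2) * signature(math.log(b) - math.log(a))
assert abs(lhs - rhs) < 1e-9
```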
Stationary Kernels. A kernel $k_s : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is called stationary if

(4) $\forall l \in \mathbb{R}:\; k_s(l + a, l + b) = k_s(a, b).$

Letting $l = -(a+b)/2$, $k_s(a,b)$ can be represented as

(5) $k_s(a,b) = k_s(l + a, l + b) = k_s\!\left(\frac{a - b}{2}, \frac{b - a}{2}\right) = \kappa(a - b),$

(6) $\kappa(\lambda) = k_s\!\left(\frac{\lambda}{2}, -\frac{\lambda}{2}\right), \quad \lambda \in \mathbb{R}.$

We call the pure function in formula (6) the kernel signature of the stationary kernel.
In machine learning and computer vision, the most commonly used homogeneous kernels are the Jensen-Shannon, intersection, χ2, and Hellinger's kernels; all of them are also additive. In the next section we focus on these kernels and their kernel maps. Table 1 lists the details [9].
Table 1: Common kernels, their signatures, and their feature maps.

| Kernel | $k(a,b)$ | Signature $\kappa(\lambda)$ | $\kappa(\omega)$ | Feature $\psi_\omega(a)$ |
|---|---|---|---|---|
| Hellinger's | $\sqrt{ab}$ | $1$ | $\delta(\omega)$ | $\sqrt{a}$ |
| $\chi^2$ | $\dfrac{2ab}{a+b}$ | $\operatorname{sech}(\lambda/2)$ | $\operatorname{sech}(\pi\omega)$ | $e^{-i\omega\log a}\sqrt{a\operatorname{sech}(\pi\omega)}$ |
| Intersection | $\min\{a,b\}$ | $e^{-\lvert\lambda\rvert/2}$ | $\dfrac{2}{\pi}\dfrac{1}{1+4\omega^2}$ | $e^{-i\omega\log a}\sqrt{\dfrac{2a}{\pi}\dfrac{1}{1+4\omega^2}}$ |
| JS | $\dfrac{a}{2}\log_2\dfrac{a+b}{a}+\dfrac{b}{2}\log_2\dfrac{a+b}{b}$ | $\dfrac{e^{\lambda/2}}{2}\log_2\!\left(1+e^{-\lambda}\right)+\dfrac{e^{-\lambda/2}}{2}\log_2\!\left(1+e^{\lambda}\right)$ | $\dfrac{2}{\log 4}\dfrac{\operatorname{sech}(\pi\omega)}{1+4\omega^2}$ | $e^{-i\omega\log a}\sqrt{\dfrac{2a}{\log 4}\dfrac{\operatorname{sech}(\pi\omega)}{1+4\omega^2}}$ |
χ2 Kernel. We define k(a,b) = 2ab/(a + b) as the χ2 kernel [10, 11]. The associated χ2 distance is D2(a,b) = χ2(a,b).
Jensen-Shannon (JS) Kernel. We define k(a,b) = (a/2)log2((a+b)/a) + (b/2)log2((a+b)/b) as the JS kernel. The JS distance D2(a,b) is given by D2(a,b) = KL(a ∥ (a+b)/2) + KL(b ∥ (a+b)/2), where KL denotes the Kullback-Leibler divergence, KL(a ∥ b) = ∑_{l=1}^{d} a_l log2(a_l/b_l).
Intersection Kernel. We define k(a,b) = min{a,b} as the intersection kernel [12]. The associated distance D2(a,b) = ‖a − b‖₁ is the l1 distance between a and b.
Hellinger's Kernel. We define k(a,b) = √(ab) as Hellinger's kernel, with the distance metric D2(a,b) = ‖√a − √b‖₂² (Hellinger's distance) between a and b. Its signature κ(λ) = 1 is constant.
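The four additive kernels above can be evaluated directly on histogram inputs. The sketch below (our own illustration, assuming L1-normalized histograms and the 0·log 0 = 0 convention) computes each per-dimension sum; note that every one of these kernels evaluates to 1 on identical normalized histograms:

```python
import numpy as np

def hellinger(a, b):
    # Hellinger's kernel: sum_l sqrt(a_l * b_l)
    return float(np.sum(np.sqrt(a * b)))

def chi2(a, b):
    # chi-squared kernel: sum_l 2 a_l b_l / (a_l + b_l), skipping empty bins
    m = (a + b) > 0
    return float(np.sum(2 * a[m] * b[m] / (a[m] + b[m])))

def intersection(a, b):
    # intersection kernel: sum_l min(a_l, b_l)
    return float(np.sum(np.minimum(a, b)))

def js(a, b):
    # Jensen-Shannon kernel, with the 0 * log 0 = 0 convention
    m = (a > 0) & (b > 0)
    a, b = a[m], b[m]
    return float(np.sum(a / 2 * np.log2((a + b) / a)
                        + b / 2 * np.log2((a + b) / b)))

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.3, 0.3, 0.4])
# For identical L1-normalized histograms, all four kernels return 1.
```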
γ-Homogeneous Parameters. Previous work mostly uses homogeneous kernels with the parameters γ = 1 and γ = 2; for example, when γ = 2 the Hellinger-type kernel becomes k(a,b) = ab. In this paper, a γ-homogeneous version of any of the above kernels can be derived via formula (2).
3. Homogeneous Kernel Map
When handling low-dimensional data that is not linearly separable, we construct a kernel feature map ψ(x) so that input data from the low-dimensional space is mapped into a high-dimensional (Hilbert) space with inner product ⟨·,·⟩:

(7) $\forall a, b \in \mathbb{R}^D:\; K(a,b) = \langle \psi(a), \psi(b) \rangle.$
To compute approximate closed-form feature maps for the homogeneous kernels, we use Bochner's theorem together with the γ-homogeneous decomposition (2). Note that if a homogeneous kernel is positive definite [13], its signature is a positive definite function as well; the same holds for stationary kernels. Thus, from formula (2) and Bochner's theorem in (8), we can derive k(a,b) and a closed-form feature map.
For most machine learning kernels we can compute the kernel spectrum κ(ω) and a closed-form feature map [9]; Table 1 lists the results. Consider

(8) $k(a,b) = (ab)^{\gamma/2} \int_{-\infty}^{+\infty} e^{-i\omega\lambda}\, \kappa(\omega)\, d\omega = \int_{-\infty}^{+\infty} \left( e^{-i\omega\log a}\, a^{\gamma/2} \sqrt{\kappa(\omega)} \right)^{*} \left( e^{-i\omega\log b}\, b^{\gamma/2} \sqrt{\kappa(\omega)} \right) d\omega, \quad \lambda = \log\frac{b}{a},$

(9) $\psi_\omega(a) = e^{-i\omega\log a}\, a^{\gamma/2} \sqrt{\kappa(\omega)}.$
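The integral in (8) can be approximated by sampling ψ_ω(a) on a finite frequency grid, in the spirit of the explicit maps of [9, 19]. The sketch below is our own real-valued variant (the grid step L and the order n are illustrative choices); it builds the sampled χ2 feature map and compares ⟨ψ(a), ψ(b)⟩ with the exact kernel value:

```python
import numpy as np

def chi2_feature_map(x, n=3, L=0.5):
    # Sampled version of the closed-form feature map (9) for the chi2
    # kernel (gamma = 1), whose spectrum is kappa(omega) = sech(pi * omega).
    kappa = lambda w: 1.0 / np.cosh(np.pi * w)
    feats = [np.sqrt(L * kappa(0.0) * x)]          # omega = 0 component
    for j in range(1, n + 1):
        w = j * L
        r = np.sqrt(2 * L * kappa(w) * x)          # combines +/- frequencies
        feats.append(r * np.cos(w * np.log(x)))    # real part of psi_omega
        feats.append(r * np.sin(w * np.log(x)))    # imaginary part
    return np.array(feats)

a, b = 0.4, 0.9
approx = chi2_feature_map(a) @ chi2_feature_map(b)
exact = 2 * a * b / (a + b)
# approx converges to exact as n grows and L shrinks;
# with n = 3, L = 0.5 the error is already below 1e-2
```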
4. Kernelized Locality-Sensitive Hashing
To perform the data association we adopt Kernelized LSH (KLSH) [14], which is also a hash-table-based algorithm. KLSH builds on the LSH algorithm and is more efficient and accurate for query search and matching: given an input query, it quickly locates candidate nearest neighbors in the hash table and matches against them. In addition, KLSH has another characteristic: whereas traditional LSH methods support only a limited family of kernels, KLSH can construct hash functions for arbitrary kernels. Moreover, KLSH has been applied to large-scale visual search tasks on datasets such as Tiny Images [14].
As with LSH, constructing the hash functions is the key problem in KLSH. To relate the collision probability of an input query and the database points, we must quantify the similarity between them, as proposed by [15].
KLSH Principle. A locality-sensitive hashing scheme is defined by a distribution over a family of hash functions such that the collision probability of a pair of points, say m and n, equals their similarity:

(10) $\Pr[h(m) = h(n)] = \operatorname{sim}(m, n).$
Here sim(m, n) is the similarity measure, and h(m), h(n) are computed with a hash function drawn at random from the family H. The intuition is that similar points m and n are likely to collide in the same hash bucket; significantly similar objects therefore tend to be stored together in the hash table [1].
Taking the similarity to be the vector inner product,

(11) $\operatorname{sim}(m, n) = m^T n,$

the LSH function family of [15, 16] realizes (10) as

(12) $h_{\vec r}(m) = \begin{cases} 1, & \text{if } \vec r^{\,T} m \ge 0, \\ 0, & \text{otherwise}. \end{cases}$

Here $\vec r$ is a random hyperplane vector drawn from a zero-mean multivariate Gaussian $N(0, \Sigma_p)$ of the same dimensionality as the input vector m, so each hash function is matched to the statistics of the input; this LSH property is analyzed in detail in [17]. Projecting a point m and taking the sign yields one hash function; repeating this k times produces a k-bit hash key

(13) $g(m) = \left[ h_1(m), \ldots, h_t(m), \ldots, h_k(m) \right].$

Repeating the construction b times gives b independent hash keys

(14) $g_j(m) = \left[ h_{1,j}(m), \ldots, h_{t,j}(m), \ldots, h_{k,j}(m) \right], \quad 1 \le j \le b,$

which together form the b × k array

(15) $g(m) = \begin{bmatrix} h_{1,1}(m) & h_{2,1}(m) & \cdots & h_{k,1}(m) \\ \vdots & \vdots & & \vdots \\ h_{1,j}(m) & h_{2,j}(m) & \cdots & h_{k,j}(m) \\ \vdots & \vdots & & \vdots \\ h_{1,b}(m) & h_{2,b}(m) & \cdots & h_{k,b}(m) \end{bmatrix}, \quad 1 \le j \le b.$

Because we compute the similarity measure in a high-dimensional kernel space, the similarity function can be extended and written as

(16) $\operatorname{sim}(m_i, m_j) = \kappa(m_i, m_j) = \phi(m_i)^T \phi(m_j).$

In formula (16), the feature map φ(·) carries the points m_i and m_j into the kernel space, and $\phi(m_i)^T \phi(m_j)$ is the inner product there. The problem is that nothing is known about the data distribution in kernel space, so $\vec r$ cannot be drawn directly from $N(0, \Sigma_p)$. Therefore, to construct the hash function, $\vec r$ must be built so that $\vec r^{\,T} \phi(m)$ can be computed quickly using kernel evaluations alone.
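The collision behavior of the sign hash in (12) can be checked empirically: for random hyperplanes drawn from an isotropic Gaussian, two points collide with probability 1 − θ/π, where θ is the angle between them [6, 17]. A minimal sketch of our own (dimension and hash counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 20000          # feature dimension, number of random hyperplanes

m = rng.normal(size=d)
n = m + 0.4 * rng.normal(size=d)   # a nearby point

R = rng.normal(size=(k, d))        # rows ~ N(0, I): random hyperplanes
h_m = (R @ m >= 0)                 # one sign bit per hyperplane, as in (12)
h_n = (R @ n >= 0)

collisions = np.mean(h_m == h_n)
theta = np.arccos(m @ n / (np.linalg.norm(m) * np.linalg.norm(n)))
expected = 1 - theta / np.pi
# the empirical collision rate matches 1 - theta/pi up to sampling noise
```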
Following the usual whitening construction, we can approximate $\vec r^{\,T} \phi(m)$ using only kernel evaluations of φ(m). We select a subset S of a items from the database; by the central limit theorem, if the sampled kernel-space features have mean μ and covariance Σ, the sample average can be written as

(17) $z_a = \frac{1}{a} \sum_{i \in S} \phi(m_i).$

As a grows, the central limit theorem tells us that the rescaled vector $\tilde z_a = \sqrt{a}\,(z_a - \mu)$ approaches a zero-mean Gaussian distribution.
We then apply a whitening transform to obtain $\vec r$:

(18) $\vec r = \Sigma^{-1/2}\, \tilde z_a,$

which yields the LSH function

(19) $h(\phi(m)) = \begin{cases} 1, & \text{if } \phi(m)^T \Sigma^{-1/2} \tilde z_a \ge 0, \\ 0, & \text{otherwise}. \end{cases}$

As analyzed above, the database is represented only through the kernel function, so the mean and covariance statistics are unknown. To estimate μ and Σ we sample data from the database and use KPCA and the eigendecomposition of [18]: writing Σ = VΛV^T and Σ^{-1/2} = VΛ^{-1/2}V^T, the hash function becomes

(20) $h(\phi(m)) = \operatorname{sign}\!\left( \phi(m)^T V \Lambda^{-1/2} V^T \tilde z_a \right).$

It remains to express this in terms of the kernel matrix. Let K = UΘU^T be the eigendecomposition of the kernel matrix over the sampled points; Θ and Λ have the same nonzero eigenvalues. From [18], the projection onto the tth eigenvector is

(21) $v_t^T \phi(m) = \sum_{i=1}^{n} \frac{1}{\sqrt{\theta_t}}\, u_t(i)\, \phi(m_i)^T \phi(m),$

where u_t and v_t are, respectively, the tth eigenvectors of the kernel matrix and of the covariance matrix.
As mentioned before, we choose n data points from the database to form the ϕ(m_i); expanding over the n eigenvectors, the argument of the sign in (20) becomes

(22) $\phi(m)^T V \Lambda^{-1/2} V^T \tilde z_a = \sum_{t=1}^{n} \frac{1}{\sqrt{\theta_t}} \left( v_t^T \phi(m) \right) \left( v_t^T \tilde z_a \right).$

Substituting (21) into (22) yields

(23) $\sum_{t=1}^{n} \frac{1}{\sqrt{\theta_t}} \left( \sum_{i=1}^{n} \frac{1}{\sqrt{\theta_t}}\, u_t(i)\, \phi(m_i)^T \phi(m) \right) \left( \sum_{j=1}^{n} \frac{1}{\sqrt{\theta_t}}\, u_t(j)\, \phi(m_j)^T \tilde z_a \right).$

Simplifying (23) yields

(24) $\sum_{i=1}^{n} \sum_{j=1}^{n} \left( \phi(m_i)^T \phi(m) \right) \left( \phi(m_j)^T \tilde z_a \right) \sum_{t=1}^{n} \theta_t^{-3/2}\, u_t(i)\, u_t(j).$

Since $K^{-3/2}_{ij} = \sum_{t=1}^{n} \theta_t^{-3/2}\, u_t(i)\, u_t(j)$, further simplification of (24) yields

(25) $h(\phi(m)) = \operatorname{sign}\!\left( \sum_{i=1}^{n} w(i)\, \phi(m_i)^T \phi(m) \right), \quad \text{where } w(i) = \sum_{j=1}^{n} K^{-3/2}_{ij}\, \phi(m_j)^T \tilde z_a.$
Through the derived w(i) we obtain $\vec r = \sum_{i=1}^{n} w(i)\, \phi(m_i)$, which is approximately Gaussian as required. Substituting (17) into w(i), and taking μ ≈ 0 for centered data so that $\phi(m_j)^T \tilde z_a = \frac{1}{\sqrt a} \sum_{l \in S} K_{jl}$, gives

(26) $w(i) = \frac{1}{\sqrt a} \sum_{j=1}^{n} \sum_{l \in S} K^{-3/2}_{ij}\, K_{jl}.$

In matrix form, with e_S the indicator vector of the subset S, this simplifies to (27); note that the positive factor $1/\sqrt a$ does not affect the sign.
The hash function for kernel inputs is therefore

(27) $w = \frac{K^{-1/2}\, e_S}{\sqrt a},$

(28) $h(\phi(m)) = \operatorname{sign}\!\left( \sum_{i=1}^{n} w(i)\, \kappa(m, m_i) \right),$

where κ(·,·) is the kernel evaluated between m and the sampled points m_i. After several repetitions, the hash functions form a hash key.
To obtain suitable parameters, we run query matching over several iterations; the detailed procedure is given in Algorithm 1. Each table's hash key is

(29) $g_j(m) = \left[ h_{1,j}(\phi(m)),\, h_{2,j}(\phi(m)),\, \ldots,\, h_{t,j}(\phi(m)),\, \ldots,\, h_{k,j}(\phi(m)) \right], \quad 1 \le j \le b.$
Algorithm 1: KLSH algorithm.
(1) Process Data
(a) Compute the kernel matrix K over n points sampled from the database.
(b) For each hash function, obtain e_S by randomly sampling a subset S of size a from {1, 2, …, n}.
(c) Compute w = K^{−1/2}·e_S/√a.
(d) Form the hash function h(ϕ(m)) by projecting w onto the points in kernel space, as in (28).
(e) Repeat (b)–(d) to obtain the hash keys g_j(m).
(2) Query Processing
(a) Look up the matching hash bucket of (29) in the database.
(b) Use ANN search within the bucket for query matching.
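A compact sketch of Algorithm 1 under simplifying assumptions (a generic PSD kernel, the sample mean taken as zero after centering the kernel matrix; all function names are our own illustration, not the authors' code) might look as follows:

```python
import numpy as np

def klsh_hash_functions(X, kernel, n=100, a=30, num_bits=32, seed=0):
    """Data-processing step of Algorithm 1, following Kulis & Grauman [14].
    X: database points; kernel(x, y): any positive semidefinite kernel."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n, replace=False)
    anchors = X[idx]
    # (a) kernel matrix over the n sampled points, centered so mu ~ 0
    K = np.array([[kernel(p, q) for q in anchors] for p in anchors])
    H = np.eye(n) - np.ones((n, n)) / n
    K = H @ K @ H
    # K^{-1/2} via eigendecomposition, clipping tiny eigenvalues for stability
    theta, U = np.linalg.eigh(K)
    inv_sqrt = np.where(theta > 1e-8,
                        1.0 / np.sqrt(np.clip(theta, 1e-8, None)), 0.0)
    K_inv_sqrt = U @ np.diag(inv_sqrt) @ U.T
    # (b)-(c) one weight vector w = K^{-1/2} e_S / sqrt(a) per hash bit
    W = []
    for _ in range(num_bits):
        e_s = np.zeros(n)
        e_s[rng.choice(n, size=a, replace=False)] = 1.0
        W.append(K_inv_sqrt @ e_s / np.sqrt(a))
    return anchors, np.array(W)

def klsh_hash(x, anchors, W, kernel):
    # (d) formula (28): one sign bit per weight vector
    kx = np.array([kernel(x, p) for p in anchors])
    return (W @ kx >= 0).astype(int)
```

Querying then amounts to hashing the query with the same anchors and weights and probing the matching bucket.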
5. Experimental Results
In our experiments we verify the efficiency of the proposed homogeneous-kernel hashing algorithm on real datasets. Our homogeneous-kernel KLSH scheme makes otherwise implicit feature embeddings accessible; we use these features in a visual search task to locate the most similar items in the database, and the retrieved neighbors vote on the tags. The method proves far more efficient than linear scan search while largely preserving its accuracy.
In this part, we design and verify our algorithm on the Caltech-101 dataset (Figure 1). Caltech-101 is a benchmark for image recognition and classification with 101 object categories and roughly 40 to 800 images per category, about 9,000 images in total. In recent years researchers have done much useful work on this dataset, including the design of important image-representation kernels [19], and many published papers target it, some of historical significance. For example, [20–22] each state their contributions on this dataset: the authors of [21] proposed the pyramid match kernel over image histograms, while Berg [20] proposed the CORR kernel, which matches local image features using geometric blur.
Figure 1: Example images from the Caltech-101 dataset.
We apply our algorithm to visual classification and similarity search tasks. Our benchmark platform is a 4-core 3.6 GHz Intel CPU with 16 GB of memory and a 2 TB hard disk.
We used the χ2 kernel with a γ-homogeneous kernel map (γ = 1/2) and applied the nonlinear RBF-χ2 kernel of [19, 23] to SIFT-based local features, learning the homogeneous kernel map on top of it. The learnt kernel proved more accurate than the non-learnt one. We then used a KNN classifier, with KLSH and with linear scan respectively, to compute classification accuracy. We also compared against CORR [24] and obtained better results, using 15 training images per class.
Figure 2 shows how the parameters relate to accuracy: accuracy increases with n, while it depends only weakly on t and b. Through a series of experiments we chose (n, t, b) = (300, 30, 300) as the best setting.
Figure 2: Hashing with an RBF-χ2 kernel over SIFT features, based on the homogeneous χ2 kernel map (γ = 1/2); we choose t = 30, n = 300, and b = 300 in our experiment.
We find that this parameter combination performs well even on the large-scale dataset. Moreover, our approach with the homogeneous kernel map achieves higher accuracy than CORR-KLSH with metric learning [25].
Figure 3 illustrates that our method is superior to other existing approaches [25–28] tested on this dataset. Compared with other kernel classifiers, our classifier with the RBF-χ2 kernel over local features performs better. Table 2 shows that our results at T = 15 and T = 30 training images are more accurate than previously published results, including [24], which reports 61% for T = 15 and 69.6% for T = 30; our result improves on results from several years earlier by 16%.
Table 2: Accuracy on Caltech-101 (%).

| #train | Ours | [18] | [29] | [30] | [31] | [32] | [33] |
|--------|------|------|------|------|------|------|------|
| 15 | 68.5 | 59.05 | 56.4 | 52 | 51 | 49.52 | 44 |
| 30 | 75.2 | 66.23 | 64.6 | N/A | 56 | 58.23 | 63 |
Figure 3: Comparison against existing techniques on Caltech-101.
To find the best parameters for NN search in our scheme, we must balance accuracy against CPU time. We therefore analyzed the accuracy and CPU time of different k (k = 2, 3, …, 20) for NN search; Figure 4 illustrates the accuracy and CPU time for each k on our dataset.
Figure 4: Classification accuracy versus CPU load.
The authors of [26] proposed combining KPCA with standard LSH, that is, computing hashes on top of KPCA projections. This method has an apparent disadvantage: although KPCA reduces dimensionality, it loses input information, whereas KLSH preserves the integrity of the input when computing the hashes. Accordingly, we find that our method achieves higher accuracy and better performance than the algorithm of [26].
6. Conclusions
In this paper we use homogeneous kernel maps to approximate kernels commonly used in machine learning, such as the χ2, JS, Hellinger's, and intersection kernels. Combined with the KLSH scheme, this gives us access to arbitrary kernel functions for hashing. Our approach is much faster than linear scan search while keeping search accuracy essentially unaffected. Moreover, it makes no assumption about the distribution of the input data, so it is applicable to many other databases such as Flickr and Tiny Images. Experimental results demonstrate that it is superior to the standard KLSH algorithm.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors would like to thank Brian Kulis from UC Berkeley for helpful comments on the experiments and Professor Yan Shuicheng for advice on the paper. The work was supported in part by the National Program on Key Basic Research Project (973 Program) 2010CB731403/2010CB731406.
References
[1] B. Kulis, P. Jain, and K. Grauman, "Fast similarity search for learned metrics," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 12, pp. 2143–2157, 2009.
[2] G. Shakhnarovich, P. Viola, and T. Darrell, "Fast pose estimation with parameter-sensitive hashing," in Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV '03), Nice, France, October 2003, pp. 750–757.
[3] G. Shakhnarovich, T. Darrell, and P. Indyk, The MIT Press, Cambridge, Mass, USA, 2006.
[4] A. Bardera, J. Rigau, I. Boada, M. Feixas, and M. Sbert, "Image segmentation using information bottleneck method," IEEE Transactions on Image Processing, vol. 18, no. 7, pp. 1601–1612, 2009.
[5] O. Chum, J. Philbin, and A. Zisserman, "Near duplicate image detection: min-Hash and tf-idf weighting," in Proceedings of the British Machine Vision Conference (BMVC '08), September 2008.
[6] M. S. Charikar, "Similarity estimation techniques from rounding algorithms," in Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC '02), Montreal, Canada, May 2002, pp. 380–388.
[7] A. Gionis, P. Indyk, and R. Motwani, "Similarity search in high dimensions via hashing," in Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99), 1999, pp. 518–529.
[8] L. Fei-Fei, R. Fergus, and P. Perona, "Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories," in Proceedings of the Workshop on Generative Model Based Vision, 2004.
[9] M. Hein and O. Bousquet, "Hilbertian metrics and positive definite kernels on probability measures," in Proceedings of the 10th Workshop on Artificial Intelligence and Statistics (AISTATS '05), January 2005.
[10] J. Puzicha, J. M. Buhmann, Y. Rubner, and C. Tomasi, "Empirical evaluation of dissimilarity measures for color and texture," in Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV '99), September 1999, pp. 1165–1172.
[11] D. R. Martin, C. C. Fowlkes, and J. Malik, "Learning to detect natural image boundaries using local brightness, color, and texture cues," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 5, pp. 530–549, 2004.
[12] A. Barla, F. Odone, and A. Verri, "Histogram intersection kernel for image classification," in Proceedings of the International Conference on Image Processing (ICIP '03), September 2003, pp. 513–516.
[13] B. Scholkopf and A. J. Smola, The MIT Press, 2002.
[14] B. Kulis and K. Grauman, "Kernelized locality-sensitive hashing for scalable image search," in Proceedings of the 12th International Conference on Computer Vision (ICCV '09), Kyoto, Japan, October 2009, pp. 2130–2137.
[15] Z. Chen, J. Samarabandu, and R. Rodrigo, "Recent advances in simultaneous localization and map-building using computer vision," vol. 21, no. 3-4, pp. 233–265, 2007.
[16] M. Weems, 2011.
[17] M. X. Goemans and D. P. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming," Journal of the ACM, vol. 42, no. 6, pp. 1115–1145, 1995.
[18] B. Schölkopf, A. Smola, and K. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Computation, vol. 10, no. 5, pp. 1299–1319, 1998.
[19] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman, "Multiple kernels for object detection," in Proceedings of the 12th International Conference on Computer Vision (ICCV '09), October 2009, pp. 606–613.
[20] A. C. Berg, T. L. Berg, and J. Malik, "Shape matching and object recognition using low distortion correspondences," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), San Diego, Calif, USA, June 2005, pp. 26–33.
[21] K. Grauman and T. Darrell, "Discriminative classification with sets of image features," in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), October 2005, pp. 1458–1465.
[22] A. D. Holub, M. Welling, and P. Perona, "Combining generative models and Fisher kernels for object recognition," in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), October 2005, pp. 136–143.
[23] M. Varma and D. Ray, "Learning the discriminative power-invariance trade-off," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), October 2007.
[24] H. Zhang, A. C. Berg, M. Maire, and J. Malik, "SVM-KNN: discriminative nearest neighbor classification for visual category recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), June 2006, pp. 2126–2136.
[25] A. Torralba, R. Fergus, and Y. Weiss, "Small codes and large image databases for recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '08), 2008, pp. 1–8.
[26] J. Yang, X. Gao, and D. Zhang, "Kernel ICA: an alternative formulation and its application to face recognition," Pattern Recognition, vol. 38, no. 10, pp. 1784–1787, 2005.
[27] T. Serre, L. Wolf, and T. Poggio, "Object recognition with features inspired by visual cortex," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), June 2005, pp. 994–1000.
[28] N. Rasiwasia and N. Vasconcelos, "Latent Dirichlet allocation models for image classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2665–2679, 2013.
[29] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), June 2006, pp. 2169–2178.
[30] A. C. Berg, Computer Science Division, University of California, Berkeley, Calif, USA, 2005.
[31] J. Mutch and D. Lowe, "Multiclass object recognition using sparse, localized features," in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR '06), 2006.
[32] K. Grauman and T. Darrell, "Pyramid match kernels: discriminative classification with sets of image features," Technical Report CSAIL-TR-2006-020, MIT, 2006.
[33] G. Wang, Y. Zhang, and F.-F. Li, "Using dependent regions for object categorization in a generative framework," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), New York, NY, USA, June 2006, pp. 1597–1604.