International Journal of Enhanced Research in Science, Technology & Engineering
ISSN: 2319-7463, Vol. 4 Issue 7, July-2015
Image Retrieval using Entropy and Earth Mover Distance Measures
Page | 7
International Journal of Enhanced Research in Science, Technology & Engineering
ISSN: 2319-7463, Vol. 4 Issue 7, July-2015
Vyshnavi Katireddi, Dr.Renuka Devi S.M
M.Tech final year student, Department of ECE, G. Narayanamma Institute of Technology and Science Hyderabad,
India,
Associate Professor, Department of ECE, G. Narayanamma Institute of Technology and Science Hyderabad, India,
Page | 7
International Journal of Enhanced Research in Science, Technology & Engineering
ISSN: 2319-7463, Vol. 4 Issue 7, July-2015
Page | 7
International Journal of Enhanced Research in Science, Technology & Engineering
ISSN: 2319-7463, Vol. 4 Issue 7, July-2015
Page | 7
International Journal of Enhanced Research in Science, Technology & Engineering
ISSN: 2319-7463, Vol. 4 Issue 7, July-2015
ABSTRACT
Image similarity or distance measures play a crucial role in content based image retrieval, classification, segmentation and object tracking. The histogram is normally used for image modeling and different similarity measures are studied for long time. Recent approaches include using GMM for image modeling and. image similarity is done between these GMM’s. This paper presents different image similarity measures of both histograms and GMM’s.
Keywords: Dissimilarity measure, Earth mover distance (EMD), Image retrieval, Kullback-Liebler (KL) , sparse EMD
1. INTRODUCTION
The similarity or distance measure between Gaussian mixture models (GMM) plays a crucial role in image matching. The histogram has been commonly used for image modeling, while the similarity measures between them have been studied for decades. These similarity functions mainly include information theoretic measures such as Kullback-Leibler (K-L) or Jesson- Shannon divergence and the statistic-based measures such as χ2- distance, and Lp-norm based ones. The Earth Mover’s Distance (EMD) [1] is a cross bin measure of histograms (or signatures), which has proven its advantages over the conventional bin-to-bin based measures. Recently, many methods have been proposed to improve the efficiency and accuracy of EMD for histogram matching [2,10,9]. An alternative and widely used image modeling method is the parametric Gaussian Mixture Model (GMM) [6,7,5]. The K-L divergence between GMMs is often adopted for distribution matching. As there is no closed form solution, approximating K-L divergence via Monte-Carlo procedure is a natural choice but this is unfortunately computationally intensive. Goldberger et al. [6] also focused an approximation of K-L divergence and studied two strategies, which are based on the analytical-form solution between component Gaussians and on unscented transform, respectively. In [7] a variational approximation based on K-L divergence was presented for GMM matching.
Adopting EMD as a similarity measure between GMMs was first used in [12] for music classification and later in vision fields of texture classification [15] and visual tracking [16]. As opposed to the great advances of EMD in comparing histograms [8,2,10,9], the potentials of EMD in measuring similarity between GMMs remain unclear and appear to be rarely explored. In this paper SR-EMD methodology for GMM based image matching is introduced.
2. Related Work:
The different approaches of image matching come under the category of non parametric, Information theory and ground distance are analyzed.
A. Distance Measures for image matching:
In this paper different distance measures based on histograms and Gaussian mixture models are discussed
Heuristic
Aheuristictechnique approach is problem solving, learning, or discovery using a practical methodology not guaranteed to be optimal or perfect, but sufficient for the immediate goals. When finding an optimal solution is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution.
Minkowski-form
dMKD(i,j)=λk=0n-1yi,k-yj,kλ
Minkowski distance [1] is the distance between two data recordsiandjnthe total number of variablesyandλthe order of the Minkowski metric. Order one is normally used for dissimilarity measure and other commonly used are 2 and infinite.
B. Non parametric:
Chi square distance
This is bin by bin dissimilarity measure between two histograms hi and ki [1]
dx2(H,K)=i(hi-mi)mi
mi=hi+ki2
Where H={hi} and k={ki} are two histograms . These measures do not match perceptual similarity well. The major drawback of these measures is that they account only for the correspondence between bins with the same index and do not use information across bins
Kolmogorov-smirnovt distance:
This is cross bin dissimilarity measure between two histograms hi and ki
dksH,K=max(h^i-k^i) where k^i and h^i are cumulative pdf.
C. Information theory:
Information theoryis a branch ofapplied mathematics,electrical engineering, andcomputer scienceinvolving thequantificationofinformation. Information theory was developed byClaude E. Shannonto find fundamental limits onsignal processingoperations such ascompressing dataand on reliablystoringandcommunicatingdata..
Kullback–Leibler divergence:
Ininformation theory, theKullback–Leibler divergence[4] is a non-symmetric measure of the difference between two probability distributionsPandQ. Specifically, the Kullback–Leibler divergence ofQfromP, denoted DKL(P‖Q), is a measure of the information lost whenQis used to approximateP.The Kullback–Leibler divergence measures the expected number ofextrabits required tocodesamples fromPwhen using a code optimized forQ, rather than using the true code optimized forP.
Fordiscrete probability distributionsPandQ, the Kullback–Leibler divergence ofQfromPis defined to be
DKL(PQ=iP(i)InP(i)Q(i)
In words, it is the expectation of the logarithmic difference between the probabilitiesPandQ, where the expectation is taken using the probabilitiesP. The Kullback–Leibler divergence is defined only ifQ(i)=0 impliesP(i)=0, for alli.
D. Ground distance measures:
Histogram intersection:
The histogram intersection [1] is attractive because of its ability to handle partial matches when the areas of the two histograms are different. It is shown in [1] that when the areas of the two histograms are equal, the histogram intersection is equivalent to the (normalized) L1 distance.
dᴖH,K=1-imin(hi,ki)iki
Where Hand K are two histograms
Earth Mover Distance (EMD):
The EMD [1][12] is the minimum cost of changing one mixture into another when the cost of moving probability mass from component m in the first mixture to component n in the second mixture. A common choice of cost is the KL [4] distance between the individual Gaussian components.
Computation of the EMD is based on a solution to the well-knowntransportation problem.Suppose that severalsuppliers, each with a given amount of goods, are required to supply several consumers, each with a given limited capacity. For each supplier-consumer pair, the cost of transporting a single unit of goods is given. The transportation problem is then to find a least-expensive flow of goods from the suppliers to the consumer that satisfies the consumers demand. Signature matching can be naturally cast as a transportation problem by defining one signature as the supplier and the other as the consumer, and by setting the cost for a supplier-consumer pair to equal the ground distance between an element in the first signature and an element in the second. Intuitively, the solution is then the minimum amount of ‘work’ required to transform one signature into the other.
This can be formalized as the following linear programming problem. Let P= {(p1,wp1.).....( pm,wpm.)...... }be the first signature withmclusters, wherepiis the cluster representative, andwp1.is the weight of the cluster& similarly Q= {(q1,wq1.).....( qm,wqn.)...... }the second signature withnclusters; andD=[dij]the ground distance matrix wheredijis the ground distance between clusterspiandqj. We want to find a flowbetweenpiandqj, that minimizes the overall cost
WORK (P, Q, F) =i=1mj=1ndijfij
Subject to the following constraints:
fij≥0 1≤i≤m,1≤j≤n (1)
j=1nfij≤wpi 1≤i≤m (2)
j=1nfij≤wqi 1≤j≤m (3)
i=1mj=1nfij=min(i=1mwpii=1mwpi) (4)
The first constraint allows moving supplies fromPtoQand not vice versa. The next two constraints limits the amount of supplies that can be sent by the clusters inPto their weights, and the clusters inQto receive no more supplies than their weights; and the last constraint forces to move the maximum amount of supplies possible. We call this amount thetotal flow. Once the transportation problem is solved, and we have found the optimal flowF, the earth mover's distance is defined as the work normalized by the total flow
EMD (P,Q)=i=1mj=1ndijfiji=1mj=1nfij
E. Sparse representation based EMD:
Adopting EMD as a similarity measure between GMMs was first used in [12] for music classification and later in vision fields of texture classification [15] and visual tracking [16]. As opposed to the great advantages of EMD in comparing histograms [17, 18,19], the potentials of EMD in measuring similarity between GMMs remain unclear and appear to be rarely explored. In [14] have proposed a novel EMD based on sparse representation.
SR-EMD:
The solution to TP is sparse, enabling sparse representation-based EMD.
1. Tolerable to noise
2. Low complexity
Suppose that we have two GMMs Gp=gip,wipi=1….np and Gq=gjq,wjqi=1….nq.Let di,jpq denote the ground distance betweengipand gjq The EMD is the minimal cost moving the “goods” from GMM Gp to GMM Gqwith unit transportation cost of di,jpq . It is modeled as a classical transportation problem, which can
be written in the standard matrix form
EMD (Gp, Gq) =mincpqTzpq,
s.t Apqzpq = ypq, zpq>= 0,
Where cpq is an npnq-dimensional vector which is the vectorization of ground distance matrix
SR-EMD(Gp,Gq)=min12ypq-Apqcpq-1xpq22+λxpq
3. Different distance measures on histograms:
We have analyzed few distance measures in table 1 which are based on bin to bin dissimilarity and cross bin dissimilarity measures. Minkowski-form, histogram intersection, kl divergence, chi square come under bin to bin dissimilarity measure. Quadratic form, Kolmogorov smirnov and EMD come under cross bin dissimilarity measure.
Table 1 Different distance measures on histograms
Different distance measures on histogramsDifferent distance measures / Description
Minkowski-Form / · It can be considered as a generalization of both theEuclidean distanceand theManhattan distance
· With order k=2 it is Euclidean distance and for k=1 it is Manhattan distance
c 2 (Chi Square) / · Drawback is perceptual dissimilarity and sensitivity to bin size
Kolmogorov-smirnov / · It is statistical measure
· Cross bin dissimilarity measure
· Measure Only for one dimension
Kullback -Liebler / · It is non symmetric and sensitive to histogram binning
Quadratic form distance / · Perceptually more meaningful dissimilarity measure
Earth mover
distance / · More computational Complexity
Different distance measures on GMM’s
In table (2) gives distance measures based on GMM. First the image features are extracted and then GMM is constructed by using Expectation-maximization algorithm. Then after dissimilarity measure is done between those GMM’s
Table 2 Different distance measures on GMM’s
Different distance measures on GMM’sDistance measures / Formula / Description
KL divergence / Dvariational(f//g)=aloga'πe-D(fa//fa')bwbe-D(fa//gb) / · There is no close form expression for kl so approximation is done for kl divergence
· Different approximation methods include Variational, unscented, Monte carlo, Gaussian approximations, etc
· This approximation has good results then other like unscented transformation
Earth mover distance / W=i=1mj=1ndpiqjfpiqj
dpiqj=piqj+qjpi+(µpi-µqj)^2 *(1pi+1qj)
EMD(P,Q)= i=1mj=1ndpiqjfpiqji=1mj=1nfpiqj / · Have good results when compared with kl
· Main drawback is transportation problem
Sparse Earth mover distance / SR-EMD(Gp,Gq)=min12ypq-Apqcpq-1xpq22+λxpq / · EMD is not clear for GMM’s so sparse representation EMD is taken for image retrieval
· Here transportation problem is reduced
4. Experimental evaluation:
In this paper Corel Wang database is taken for performance evaluation in image retrieval. Corel Wang database consists of 1,000 images from 10 different classes. As this database contains large intra-class variations, it is considered as very difficult benchmark database. Here gray level histograms are taken for image retrieval. To demonstrate the goodness of our weight factor we experimented with a database of different number of semantic categories. An image in the retrieved list is considered to be relevant if that image comes from the same category as the query image, otherwise non-relevant. The more diverse and larger the collection of images in the database the more chance that it contains images similar to the query.
Precision=no of relavent retrieved imagesno of retrieved images Recall=no of relavent retrieved imagesno of images similar to query in database
Table 3 Performance measures between histograms
Performance measures between histogramsDifferent distance measures between histograms / Precision (%) / Recall (%)
20 / 40 / 60 / 20 / 40 / 60
Minkowski-form / 51.5 / 50.3 / 49.6 / 11.3 / 19.15 / 25.91
Chi square distance / 52.6 / 51.2 / 49.8 / 11.9 / 20.83 / 27.7
Kolmogorov-Smirnov / 52.9 / 51.85 / 50.25 / 12.14 / 20.93 / 28.46
Kullback-Liebler / 53.67 / 52.8 / 51.65 / 12.67 / 21.9 / 29.94
Jeffrey –divergence / 53.98 / 52.95 / 51.79 / 13.78 / 22.1 / 30.23
Histogram intersection / 55.67 / 53.6 / 52.05 / 14.89 / 22.89 / 31.78
Earth mover distance / 55.98 / 53.97 / 52.89 / 15.14 / 23.16 / 32.67
Fig 1 Precision for different distance measures on histogram
We produce a GMM of 5 component Gaussians for each single image and after that dissimilarity measures are done.
The performance evaluation is done by calculating precession and recall. In this paper L2ECM features are taken for constructing GMM’s. Given an image I(x; y), the raw feature image f (x; y) is first extracted, then the tensor-valued image C(x; y) is obtained by computing the covariance matrix for every pixel, then after the logarithm of C(x; y), the symmetric matrix logC(x; y) is vectorized to get the 6-D vector-valued image denoted by vlogC(x; y)
Table 4 Performance measures between GMM’s
Performance measures between GMM’sDifferent distance measures between GMM / Precession (%) / Recall (%)
20 / 40 / 60 / 20 / 40 / 60
Kullback- Liebler divergence / 66.78 / 63.7 / 62.89 / 16.56 / 24.89 / 34.26
Earth mover distance / 78.89 / 74.89 / 73.15 / 17.46 / 26.17 / 35.89
Sparse earth mover distance / 83.14 / 80.13 / 79.16 / 18.2 / 28.34 / 37.43
Fig 2 Precision for different distance measures on gmm
Figure 3 is a PR (Precision-Recall) graph for all the three methods Kullback-Liebler divergence, Earth Mover Distance, Sparse Earth Mover Distance between GMM’s. Where recall helps to analyze how early the image is retrieved in a given method.
Figure 3 precision vs recall
5. CONCLUSION:
This paper presents various distance measures for image retrieval on histogram and GMM. Various distance measures on histogram are analyzed which are bin to bin and cross bin dissimilarity measures. Earth mover distance on GMM is studied but it is not clear between GMM so a novel Earth mover distance is introduced which has sparse representation EMD. This novel method has two new things one is sparse representation and another is novel ground distance between Gaussian components based on information geometry .unlike the traditional ones, the new ground distance can characterize the intrinsic distance of the underlying Riemannian manifold of the space of Gaussian.