Mining User Queries with Markov Chains Application to Online Image Retrieval

Mining User Queries with Markov Chains Application to Online Image Retrieval

banner jpg

Mining User Queries with Markov Chains: Application to Online Image Retrieval


We propose a novel method for automatic annotation, indexing and annotation-based retrieval of images. The new method,that we call Markovian Semantic Indexing (MSI), is presented in the context of an online image retrieval system. Assuming such asystem, the users’ queries are used to construct anAggregate Markov Chain(AMC) through which the relevance between thekeywords seen by the system is defined. The users’ queries are also used to automatically annotate the images. A stochastic distancebetween images, based on their annotation and the keyword relevance captured in theAMC is then introduced. Geometricinterpretations of the proposed distance are provided and its relation to a clustering in the keyword space is investigated. By means ofa new measure of Markovian state similarity, themean first cross passage time(CPT), optimality properties of the proposed distanceare proved. Images are modeled as points in a vector space and their similarity is measured with MSI. The new method is shown topossess certain theoretical advantages and also to achieve better Precision versus Recall results when compared to Latent SemanticIndexing (LSI) and probabilistic Latent Semantic Indexing (pLSI) methods in Annotation-Based Image Retrieval (ABIR) tasks.


Annotation-Based Image Retrieval (ABIR) systems are anattempt to incorporate more efficient semantic content intoboth text-based queries and image captions (i.e. GoogleImage Search, Yahoo! Image Search). The Latent SemanticIndexing (LSI)-based approaches that were initially appliedwith increased success in document indexing and retrievalwere incorporated into the ABIR systems to discover a morereliable concept association.


While theformer gap brings in the issue of users’ interpretations ofimages and how it is inherently difficult to capture them invisual content, the latter gap makes recognition from imagecontent challenging due to limitations in recording anddescription capabilities.

A reason for this lies in thesparsity of the per-image keyword annotation data incomparison to the number of keywords that are usuallyassigned to documents.


We introduce the Markovian Semantic Indexing (MSI), anew method for automatic annotation and annotationbased image retrieval. The properties of MSI make itparticularly suitable for ABIR tasks when the per imageannotation data is limited. The characteristics of the methodmake it also particularly applicable in the context of onlineimage retrieval systems.


The targeting ismore accurate, compared to other systems that use externalmeans of non-dynamic or non-adaptive nature to definekeyword relevance.

MSI achievesbetter retrieval results in sparsely annotated image datasets. A comparison to LSI on 64 images gathered from theGoogle Image Search and annotated in a transparent wayby the proposed system, revealed certain advantages for theMSI method, mainly in retrieving images with deeperdependencies than simple keyword co-occurrence



  • System: Pentium IV 2.4 GHz.
  • Hard Disk: 40 GB.
  • Monitor: 15 inch VGA Colour.
  • Mouse: Logitech Mouse.
  • Ram: 512 MB
  • Keyboard: Standard Keyboard


  • Operating System: Windows XP.
  • Coding Language: ASP.NET, C#.Net.
  • Database: SQL Server 2005


Konstantinos A. Raftopoulos,Member, IEEE, Klimis S. Ntalianis, Dionyssios D. Sourlas, andStefanos D. Kollias,Member, IEEE “Mining User Queries with Markov Chains:Application to Online Image Retrieval” -IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 25, NO. 2, FEBRUARY 2013.