Content Based Image Retrieval
Dhavraj Kumar, Natacha Gueorguieva
Department of Computer Science, College of Staten Island/CUNY
Abstract
The need to retrieve images from a collection arises in a variety of fields, including medicine, crime prevention, design engineering, art history, fashion and publishing. While the requirements of users vary, image queries can be categorized [1,2] into three levels of abstraction: primitive features such as color or shape, logical (derived) features, and abstract attributes. Content Based Image Retrieval (CBIR) uses the visual contents of images to search and retrieve them from an image database. CBIR operates on a different principle of indexing: the images are indexed by features derived directly from their visual content. These features contain low-level information such as the color, texture, shapes or spatial relations the image contains. CBIR systems now available include IBM's QBIC, ImageMiner, Netra and Excalibur's Image Retrieval ware. The prime application areas include crime prevention (fingerprint and face recognition), journalism, advertising and web searching [3]. The effectiveness of current CBIR systems is limited by the fact that they can operate only at the primitive image feature level, though combining primitive image features with text, keywords or hyperlinks can overcome some of the problems.
Some form of cataloguing and indexing is necessary because locating a desired image in a large and varied collection is difficult: while it is feasible to identify a desired image in a small collection by browsing, more effective techniques are needed for collections containing thousands of items. Images can be categorized into different levels based on the nature of image queries. Potentially these images have many types of attributes that could be used for retrieval.
Level 1: Comprises retrieval by primitive features such as color, texture, shape or the spatial location of image elements, e.g. finding images containing blue stars arranged in a ring. This level of retrieval uses features (such as a shade of blue) that are low-level and derived directly from the image.
Level 2: Comprises retrieval by derived features involving some degree of logical inference about the objects depicted (e.g. retrieval of individual objects or persons).
Level 3: Comprises retrieval by abstract attributes, involving a significant amount of high-level reasoning (e.g. retrieval of named events or types of activity, such as "find pictures of a baseball game").
As the level of these image categories increases, retrieval becomes more and more complex. CBIR is effective in retrieving images at Level 1, based on color, shape, texture or spatial location. The following are some of the image retrieval systems:
2.1. Excalibur: This system (Fig. 1) supports search by color, shape and texture. Each attribute is represented in a feature vector, which is constructed with a set of Gaussian derivative filters. The system uses the Shadow Play algorithm to match images: the algorithm takes an image imprint (extracting the feature vector) and then searches a database for other images with similar imprint patterns (comparing distances). Users can give their order of preference regarding the image attributes and can also specify the brightness at each point in the image. Other query features include measures of the hue, saturation and brightness at each point in the image, and the ratio of the image's width to its height.
Figure 1. Demo: http://vrw.excalib.com/cgi-bin/sdk/cst/cst2.bat
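The imprint-then-search idea described for Excalibur can be illustrated with a minimal sketch. It is not the Shadow Play algorithm, whose details are not given here: the function names imprint and nearest, the choice of features (mean hue, saturation, brightness and width-to-height ratio) and the toy pixel data are all assumptions made for illustration, and a plain Euclidean distance stands in for whatever distance the real system uses.

```python
import colorsys
import math

def imprint(pixels, width, height):
    """Toy 'imprint' (hypothetical): mean hue, saturation, brightness and aspect ratio.

    pixels is a list of (r, g, b) tuples with channel values in 0..255.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    mean_h = sum(h for h, _, _ in hsv) / n  # naive mean; ignores that hue is circular
    mean_s = sum(s for _, s, _ in hsv) / n
    mean_v = sum(v for _, _, v in hsv) / n
    return (mean_h, mean_s, mean_v, width / height)

def nearest(query, index, k=2):
    """Return the k image names whose imprints are closest to the query imprint."""
    return sorted(index, key=lambda name: math.dist(query, index[name]))[:k]

# Made-up database: two reddish square images and one blue wide image.
index = {
    "poster": imprint([(255, 0, 0)] * 4, 2, 2),
    "banner": imprint([(0, 0, 255)] * 4, 4, 1),
    "tile": imprint([(250, 5, 5)] * 4, 2, 2),
}
query = imprint([(255, 10, 10)] * 4, 2, 2)
result = nearest(query, index, k=2)
print(result)
```

In this toy run the two reddish square images are returned ahead of the blue banner, because both their color statistics and their aspect ratio are close to those of the query.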
2.2. ImageMiner: ImageMiner automatically analyzes images and generates textual content descriptions of them. The analysis module processes images with respect to color, texture and contour features. This information triggers the recognition of objects, which can in turn be composed into complex objects. The retrieval module uses the text retrieval product IBM SearchManager to manage the integrated retrieval of images. Color attributes are represented in color rectangles and texture attributes in texture rectangles; shape attributes are represented through contour-based shape descriptions; and objects are recognized based on the generated annotations of the three low-level features: color, texture and shape. ImageMiner uses a weighted correlation measure to compute the similarity between two feature vectors. It offers four levels of detail for composing a query: color (using text, examples or a color editor), texture (using text or examples), contours (using text or examples) and concepts (using text). Users can query on multiple attributes, and search results are ranked by relevance. The search engine supports hybrid queries, which mix Boolean terms with text. Users can look for images using conceptual queries such as "War Scene" or "Country Side".
Demo: http://www.tzi.uni-bremen.de/BV/imageminer/Gui/queryland.html
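The exact form of ImageMiner's weighted correlation measure is not given above. As one plausible reading, the sketch below computes a weighted Pearson correlation between two feature vectors; the function name weighted_correlation, the formula and the sample vectors are assumptions for illustration, not ImageMiner's actual implementation.

```python
import math

def weighted_correlation(x, y, w):
    """Weighted Pearson correlation of feature vectors x and y (assumed formula).

    w holds one non-negative weight per feature; larger weights make that
    feature count more in the similarity. Returns a value in [-1, 1].
    """
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

# Made-up feature vectors: y is a scaled copy of x, z runs in the opposite direction.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
z = [4.0, 3.0, 2.0, 1.0]
weights = [1.0, 1.0, 2.0, 2.0]
print(weighted_correlation(x, y, weights), weighted_correlation(x, z, weights))
```

A correlation-style measure ignores overall scale and offset, so two vectors with the same profile score as highly similar even when their magnitudes differ.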
2.3. Netra: Netra, a system developed at the University of California [5], supports image retrieval by color, contour (shape), texture and location. For color, a compact color feature is used, which represents each image region by a subset of colors from a color codebook. A distance-based similarity measure is applied to compute the color distance between image regions. Advanced queries such as "Find all image regions that have 40% yellow and 30% blue" are supported. Netra supports three types of contour representation: the curvature function, the centroid distance and the complex coordinate function. Fourier transforms are applied to these representations to define the contour. The texture feature of Netra is based on Gabor filters, which can be considered orientation- and scale-tunable edge and line (bar) detectors. The statistics of these micro features in a given region are often used to characterize the underlying texture information. The Mean Character Difference measure is used to compute the difference between two texture vectors. For a query consisting of more than one image feature, the intersection of the results of the searches on the individual features is computed and then sorted based on a weighted similarity measure. Netra uses an implicit ordering of the image features, based on user specifications, to narrow down the search space.
Demo: http://vivaldi.ece.ucsb.edu/Netra/netra.html
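The multi-feature query step described for Netra (intersect the per-feature results, then sort by a weighted similarity) can be sketched as follows. The function name combined_query, the region identifiers, the distances and the weights are all made up for illustration, and a weighted sum of per-feature distances is assumed as the ranking score.

```python
def combined_query(results_by_feature, weights):
    """Intersect per-feature result sets and rank the survivors.

    results_by_feature maps a feature name to {region_id: distance}, where a
    smaller distance means a closer match. weights maps each feature name to
    its importance. Returns region ids sorted from best to worst match.
    """
    # Keep only regions returned by every individual feature search.
    common = set.intersection(*(set(r) for r in results_by_feature.values()))

    def score(region):
        # Assumed scoring rule: weighted sum of the per-feature distances.
        return sum(weights[f] * results_by_feature[f][region] for f in results_by_feature)

    return sorted(common, key=score)

# Hypothetical per-feature search results for four image regions.
results = {
    "color": {"r1": 0.1, "r2": 0.4, "r3": 0.2},
    "texture": {"r2": 0.1, "r3": 0.5, "r4": 0.05},
}
ranking = combined_query(results, {"color": 0.7, "texture": 0.3})
print(ranking)  # ['r3', 'r2']
```

Only r2 and r3 appear in both result sets; with color weighted more heavily than texture, r3 ranks first even though r2 is the better texture match.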
2.4. Query By Image Content (QBIC): This system [4] supports queries on image content, including the colors, textures, shapes and locations of user-specified objects. The location features are the x and y coordinates of the object's centroid. A weighted Euclidean distance measure is used for similarity matching of two feature vectors [6, 9]. In QBIC, the returned results are ranked and shown in order, with the best result in the leftmost position, the next best in the next position, and so on.
Figure 2. Demo: http://www.qbic.almaden.ibm.com/cgi=bin/photo.demo
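The weighted Euclidean matching and best-first ordering described for QBIC can be written down directly. In the sketch below the function names, the three-component feature vectors (x centroid, y centroid, one color value) and the weights are assumptions for illustration; QBIC's actual feature vectors are not specified here.

```python
import math

def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def ranked_results(query, database, weights):
    """Return image names ordered best match first (smallest distance first)."""
    return sorted(database, key=lambda name: weighted_euclidean(query, database[name], weights))

# Hypothetical feature vectors: (x centroid, y centroid, color value), all in 0..1.
query = (0.5, 0.5, 0.2)
database = {
    "a": (0.5, 0.5, 0.8),  # same location as the query, different color
    "b": (0.9, 0.9, 0.2),  # same color as the query, different location
}
equal = ranked_results(query, database, (1.0, 1.0, 1.0))
location_heavy = ranked_results(query, database, (4.0, 4.0, 1.0))
print(equal, location_heavy)
```

With equal weights image b is the closer match; once the location features are weighted more heavily, image a moves to the first position, which shows how the weights express the user's priorities.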
2.5. PicSOM: PicSOM, a system developed at the Helsinki University of Technology, Finland, uses tree structured self-organizing maps (TS-SOMs) to retrieve images similar to a given set of reference images in a database. In PicSOM, image queries are performed through the World Wide Web. The queries are iteratively refined as the system exposes more images from its database to the user. During this process, PicSOM tries to adapt to the user's preferences regarding the similarity of images. This is accomplished by using a separate TS-SOM for every type of feature vector extracted from the images. The closer to each other the images accepted by the user are mapped on a particular SOM layer, the more the system favors the images proposed by that SOM. The system marks the images selected by the user with a positive value and the non-selected images with a negative value in its internal data structure; these values are then summed in their best-matching SOM units in each of the TS-SOM maps. PicSOM's inherent ability to use more than one reference image as input for retrieval is important, and it distinguishes PicSOM from other content-based image retrieval systems.
Figure 3. Demo: http://www.cis.hut.fi/~markus/picsom.fcgi/ftp.suet.se/98
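The feedback step described for PicSOM (positive values for selected images, negative values for the others, summed at their best-matching units on every map) can be sketched as below. This is a simplified illustration, not PicSOM's implementation: the maps are assumed to be already trained and are represented only by a precomputed image-to-unit lookup, the values +1 and -1 are assumed, and the names unit_scores and rank_candidates are hypothetical.

```python
from collections import defaultdict

def unit_scores(bmu_of, selected, shown):
    """Sum +1 per selected image and -1 per shown-but-rejected image at its best-matching unit."""
    scores = defaultdict(float)
    for image in shown:
        scores[bmu_of[image]] += 1.0 if image in selected else -1.0
    return scores

def rank_candidates(maps, selected, shown):
    """Rank images not yet shown by the summed score of their units across all maps."""
    totals = defaultdict(float)
    for bmu_of in maps.values():
        scores = unit_scores(bmu_of, selected, shown)
        for image, unit in bmu_of.items():
            if image not in shown:
                totals[image] += scores.get(unit, 0.0)
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical lookups: image name -> (row, column) of its best-matching unit on each map.
maps = {
    "color": {"a": (0, 0), "b": (0, 0), "c": (3, 3), "d": (0, 0), "e": (3, 3)},
    "texture": {"a": (1, 1), "b": (2, 2), "c": (1, 1), "d": (2, 2), "e": (0, 3)},
}
shown = {"a", "b", "c"}
selected = {"a", "b"}
ranking = rank_candidates(maps, selected, shown)
print(ranking)  # ['d', 'e']
```

On the color map the two selected images share a unit, so that map contributes a large positive score to image d, which sits on the same unit. On the texture map the selected images are spread out and contribute less, which mirrors the way the system favors maps where accepted images cluster.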
3. Comparisons
Table 1: Comparison of image retrieval systems.
4. Conclusions
From the study of the image retrieval systems, it is observed that common content attributes, including color, texture and shape, are supported by all the systems. While each system differs from the others in some methods of operation, all of them are powerful systems with widespread applications. An ideal system should require the least possible knowledge of the query language from users and yet return the desired image(s). Future image retrieval systems would be more effective if they supported queries based on spatial relationships; they should integrate region features and their spatial relationships into a unified representation.
Combining various low-level image attributes (color, texture, shape, spatial relationship, etc.) to construct a visual thesaurus that can index an image database would be an interesting idea. Advanced query methods, such as drawing shapes and user feedback, are preferred; at the same time, response time is another consideration.
References:
[1] Bird, C et al (1996) "User interfaces for content-based image retrieval" in Proceedings of IEE Colloquium on Intelligent Image Databases, IEE, London, 8/1-8/4
[2] Carson, C S et al (1997) "Region-based image querying" in Proceedings of IEEE Workshop on Content-Based Access of Image and Video Libraries, San Juan, Puerto Rico, 42-49
[3] Jain, A K and Vailaya, A (1996) "Image retrieval using color and shape" Pattern Recognition 29(8), 1233-1244
[4] Niblack, W et al (1993) "The QBIC project: querying images by color, texture and shape" IBM Research Report RJ-9203
[5] Ma, W Y and Manjunath, B S (1997) "Netra: a toolbox for navigating large image databases" Proc IEEE International Conference on Image Processing (ICIP97), 1, 568-571
[6] Flickner, M et al (1995) "Query by image and video content: the QBIC system" IEEE Computer 28(9), 23-32
[7] Gudivada, V N and Raghavan, V V (1995a) "Content-based image retrieval systems" IEEE Computer 28(9), 18-22
[8] Hermes, T et al (1995) "Image retrieval for information systems" in Storage and Retrieval for Image and Video Databases III (Niblack, W R and Jain, R C, eds), Proc SPIE 2420, 394-405
[9] Niblack, W, Barber, R, Equitz, W, Flickner, M, Glasman, E, Petkovic, D and Yanker, P (1993) "The QBIC project: querying images by content using color, texture and shape" Proc SPIE, San Jose, February
[10] Salton, G and McGill, M J (1983) Introduction to Modern Information Retrieval, McGraw-Hill