Query by Image Content Using Color Histogram Techniques

DOI : 10.17577/IJERTV2IS111006


Prof. Sweety M Maniar

Research Scholar, Gujarat Technological University; Assistant Professor, V.V.P. Engineering College.

Darshita Pathak

PG Student, V.V.P. Engineering College, Gujarat Technological University.

Mahipat Kadvani

PG Student, V.V.P. Engineering College, Gujarat Technological University.

Dr. J. S. Shahsir

Guide, Principal, Gujarat Technological University

Abstract

With the extensive digitization of images, diagrams and paintings, traditional keyword-based search has been found to be inefficient for retrieving the required data. A Content-Based Image Retrieval (CBIR) system takes an image as the query and relies on image content, using techniques from computer vision and image processing to interpret and understand it, and techniques from information retrieval and databases to rapidly locate and retrieve images that suit the input query.

In this paper, we present and evaluate a Content-Based Image Retrieval (CBIR) system. Various methods have been proposed for CBIR using low-level image features such as color (color histogram, color layout), texture and shape. The CBIR system proposed in this paper uses the color histogram feature. The histogram of the query image is compared with the histogram of each database image and an error is computed; if the error is beyond the threshold the image is not retrieved, otherwise it is retrieved. After retrieval, precision and recall are calculated for each query image to assess the output.

Index Terms: Content-Based Image Retrieval (CBIR), Color Histogram, Precision, Recall.

  1. INTRODUCTION

    The amount of image data that has to be stored, managed, searched, and retrieved grows continuously in many fields of industry and research. In the most commonly used search engines, such as Google, image search is text based: images are retrieved according to the keywords that the user types and that are attached to the image as text. In text-based retrieval, typing the name of an object, such as "water lilies", retrieves images of lilies only, while a fuller description, such as "lilies flowers in pond", is needed to identify the image the user actually wants. The user therefore has to give a complete description of the image. The text-based approach has the following limitations. The first is the problem of image annotation: large image databases cannot practically be annotated by hand, and the user must know the language of the annotations to retrieve an image. The second is the problem of human perception: descriptions are subjective, and more responsibility falls on the end user. The third is the problem of deeper needs: some queries cannot be described in words at all and can only be expressed through the visual features of the images.

    Figure 1. Text based query image

    A Content-Based Image Retrieval (CBIR) system searches using a query image rather than text, so images are retrieved based on their content. An example is given in Figure 2.

    Figure 2. CBIR examples

  2. WORKING OF CBIR SYSTEM

The CBIR system performs two major tasks. The first is feature extraction (FE), where a set of features, called a feature vector, is generated to accurately represent the content of each image in the database. The second is similarity measurement (SM), where the distance between the feature vector of the query image and that of each image in the database is computed and used to retrieve the closest images.

Feature Extraction

Feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (much data, but not much information), it is transformed into a reduced representation, a set of features (also called a feature vector). Transforming the input data into this set of features is called feature extraction. Several techniques for feature extraction are described below.

  1. Color

    1. Color Average:

      Average R, G, B values are calculated in five separate zones of the image. Dividing the image area in this way increases the discriminating power by providing a simple color layout scheme. The result is a 15-dimensional color feature vector that not only describes the average color of the image but also carries information on its color composition.

      Figure 3. Zone based representation of image

      Figure 4. Separation of image in five different zones
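The zone-based average can be sketched in a few lines of Python. This is only an illustration: the text does not specify the geometry of the five zones, so the layout used here (four quadrants plus an overlapping central rectangle) is an assumption, as are the function name `zone_average_colors` and the image representation (a list of rows of `(R, G, B)` tuples, at least 2 x 2 pixels).

```python
def zone_average_colors(image):
    """Return a 15-dimensional feature: mean (R, G, B) for each of five zones.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    ASSUMPTION: the zones are the four quadrants plus a central rectangle;
    the paper's own zone layout may differ.
    """
    h, w = len(image), len(image[0])
    zones = [
        (0, h // 2, 0, w // 2),                      # top-left quadrant
        (0, h // 2, w // 2, w),                      # top-right quadrant
        (h // 2, h, 0, w // 2),                      # bottom-left quadrant
        (h // 2, h, w // 2, w),                      # bottom-right quadrant
        (h // 4, 3 * h // 4, w // 4, 3 * w // 4),    # central rectangle
    ]
    feature = []
    for r0, r1, c0, c1 in zones:
        pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        for channel in range(3):  # 0 = R, 1 = G, 2 = B
            feature.append(sum(p[channel] for p in pixels) / len(pixels))
    return feature


# Toy 4 x 4 image: left half red, right half blue.
toy_image = [[(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)] for _ in range(4)]
toy_feature = zone_average_colors(toy_image)
```

For the toy image the left-hand zones come out pure red, the right-hand zones pure blue, and the central zone an even mix, which is the "color composition" information a single global average would lose.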

    2. Color Histogram:

    A color histogram is a representation of the distribution of colors in an image. In digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges, that span the image's color space, the set of all possible colors.

    The color histogram can be built for any kind of color space, for example three-dimensional spaces like RGB or HSV. For multi-spectral images, where each pixel is represented by an arbitrary number of measurements (for example, beyond the three measurements in RGB), the color histogram is N-dimensional, with N being the number of measurements taken. Each measurement has its own wavelength range of the light spectrum, some of which may be outside the visible spectrum.

    Figure 5. Color histogram

    If the set of possible color values is sufficiently small, each of those colors may be placed in a range by itself; the histogram is then merely the count of pixels that have each possible color. Most often, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values. The color histogram may also be represented and displayed as a smooth function defined over the color space that approximates the pixel counts.

    The histogram of a digital image is a discrete function $h(r_k) = n_k$, where $r_k$ is the k-th intensity value and $n_k$ is the number of pixels in the image with intensity $r_k$. The normalized histogram is $p(r_k) = n_k / n$ for $k = 0, 1, 2, \ldots, L-1$, where $n$ is the total number of pixels in the image, $L$ is the number of intensity levels, and $p(r_k)$ is an estimate of the probability of occurrence of intensity level $r_k$ in the image. The sum of all components of a normalized histogram is equal to 1.
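As a small illustration of these two definitions, the following Python sketch computes the histogram and the normalized histogram of a flat list of intensity values. The function names and the tiny four-level example are assumptions made for the illustration; they are not taken from the paper's MATLAB implementation.

```python
def histogram(intensities, levels=256):
    """h(r_k) = n_k: number of pixels at each intensity level k = 0..levels-1."""
    counts = [0] * levels
    for value in intensities:
        counts[value] += 1
    return counts


def normalized_histogram(intensities, levels=256):
    """p(r_k) = n_k / n, where n is the total number of pixels."""
    counts = histogram(intensities, levels)
    n = len(intensities)
    return [n_k / n for n_k in counts]


# Eight pixels with L = 4 intensity levels.
pixels = [0, 0, 1, 3, 3, 3, 2, 0]
h = histogram(pixels, levels=4)              # [3, 1, 1, 3]
p = normalized_histogram(pixels, levels=4)   # [0.375, 0.125, 0.125, 0.375]
```

The entries of `p` sum to 1, as stated above. For a color image the same counting can be applied per channel or to quantized (R, G, B) triples.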

  2. Texture

    Texture features are described using two approaches:

    1. Structured Approach:

      A structured approach sees an image texture as a set of primitive texels in some regular or repeated pattern. This works well for artificial textures. A structured description is obtained by characterizing the spatial relationship of the texels.

      Figure 6. Example of Texture

    2. Statistical Approach:

    A statistical approach sees an image texture as a quantitative measure of the arrangement of intensities in a region. In general this approach is easier to compute and is more widely used, since natural textures are made of patterns of irregular sub-elements.

  3. Shape

The shape of an object is represented by a binary image of the object. Shape representation techniques used in similarity retrieval are generally characterized as being region-based or boundary-based. The former represents the shape as a set of two-dimensional regions, while the latter represents the shape by its outline. Region-based representations often result in shorter feature vectors and simpler matching algorithms, but generally they fail to produce efficient similarity retrieval. On the other hand, feature vectors extracted from boundary-based representations provide a better description of the shape. The shape attributes are given below. A description of the geometric properties of a region can be obtained by measuring properties of the points belonging to the region. Examples of such properties are:

  1. Area : can be measured as the count of internal pixels.

  2. Bounding rectangle : is the minimum rectangle enclosing the object.

  3. Aspect ratio: is computed as the ratio of the width to the length of the bounding rectangle, and is therefore invariant to the scale of the object.

  4. Roundness (also called circularity) is defined as:

    $$\text{Roundness} = \frac{1}{\text{Formfactor}} = \frac{P^2}{4 \pi A} \qquad (4.1)$$

    where P is the perimeter of a contour and A is the area of the enclosed region.

  5. Compactness : is very similar to roundness defined above. It is defined as the ratio of the perimeter of a circle with an area equal to the area of the original object to the perimeter of the object itself, i.e.

    $$\text{comp} = \frac{P_{\text{circle}}}{P} = \frac{2 \sqrt{\pi A}}{P} \qquad (4.2)$$

  6. Elongation: is defined as the ratio between the squared perimeter and area.

  7. Convexity: a convex hull is the minimal cover able to encase the object.
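A few of these attributes can be sketched in Python for a binary mask. The sketch assumes the mask is a list of rows of 0/1 values with at least one object pixel, takes the aspect ratio as width divided by height of the bounding rectangle, and assumes roundness is $P^2 / (4 \pi A)$ and compactness is $2\sqrt{\pi A} / P$; the perimeter P is passed in rather than measured, since the text does not say how the contour length is computed. All function names are hypothetical.

```python
import math


def shape_attributes(mask):
    """Area, bounding rectangle and aspect ratio of the object in a 0/1 mask."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": len(coords),                                   # count of object pixels
        "bounding_rectangle": (min(rows), min(cols), height, width),
        "aspect_ratio": width / height,                        # assumed width / height
    }


def roundness(perimeter, area):
    """Assumed form: P^2 / (4 * pi * A); equals 1 for a perfect circle."""
    return perimeter ** 2 / (4 * math.pi * area)


def compactness(perimeter, area):
    """Assumed form: perimeter of the equal-area circle divided by P."""
    return 2 * math.sqrt(math.pi * area) / perimeter


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
attributes = shape_attributes(mask)
```

Under these assumed forms, a unit circle (P = 2π, A = π) gives a roundness and a compactness of 1, and roundness grows while compactness shrinks as the outline becomes less circular.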

    Similarity Measure

    Many current retrieval systems take a simple approach, typically using norm-based distances (e.g., Euclidean distance) on the extracted feature set as a similarity function. The main premise behind these CBIR systems is that, given a good set of features extracted from the images in the database, two images are similar when their extracted features are close to each other.

    Euclidean distance and the correlation coefficient are the most commonly used similarity measures in CBIR. The correlation coefficient measures the cosine of the angle between two vectors and, for non-negative feature vectors such as histograms, varies between 0 and 1. When it is 1 both vectors are aligned, but their magnitudes may differ. In contrast, the Euclidean measure gives the distance between the vectors; when it is 0, the vectors are not only aligned but also have the same magnitude. Here we have preferred Euclidean distance as the similarity measure. The direct Euclidean distance between a database image P and a query image Q is

    $$ED = \sqrt{\sum_{i=1}^{n} (V_{p_i} - V_{q_i})^2}$$

    where $V_{p_i}$ and $V_{q_i}$ are the i-th components of the feature vectors of P and Q, and n is the length of the feature vectors.
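The difference between the two measures can be seen in a short Python sketch. The function names and the example vectors are assumptions chosen for illustration: the two vectors point in the same direction but have different magnitudes.

```python
import math


def euclidean_distance(p, q):
    """Square root of the summed squared differences of the components."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_similarity(p, q):
    """Cosine of the angle between p and q (both must be non-zero vectors)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


p_vec = [1.0, 2.0, 3.0]
q_vec = [2.0, 4.0, 6.0]   # same direction as p_vec, twice the magnitude

similarity = cosine_similarity(p_vec, q_vec)   # close to 1: vectors are aligned
distance = euclidean_distance(p_vec, q_vec)    # sqrt(14): magnitudes differ
```

The cosine measure treats the two vectors as identical, while the Euclidean distance is zero only when a vector is compared with itself.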

    PROPOSED ALGORITHM

    Step-1: Create a database of sample images against which the query image is searched.

    Step-2: Input the query image.

    Step-3: Extract the color histogram feature for the database images and the query image.

    Step-4: Calculate the Euclidean distances between the feature vector of the query image and the feature vectors of the images in the database.

    Step-5: Apply a threshold to the resulting error (distance) and retrieve the matching images from the database.

    Step-6: Measure the performance of the algorithm using the precision and recall of each class of images.
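Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in the paper: images are flat lists of `(R, G, B)` tuples, the histogram is a joint RGB histogram with 4 bins per channel, and the threshold value 0.5 is arbitrary. The names `color_histogram` and `retrieve` are hypothetical.

```python
import math


def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a list of (R, G, B) tuples (0-255).

    ASSUMPTION: uniform quantization; bins_per_channel must divide 256.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
    return [count / len(pixels) for count in hist]


def retrieve(query_pixels, database, threshold):
    """Return names of database images whose distance to the query is within
    the threshold, sorted by ascending Euclidean distance."""
    query_hist = color_histogram(query_pixels)                      # Step 3
    matches = []
    for name, pixels in database.items():
        hist = color_histogram(pixels)                              # Step 3
        error = math.sqrt(sum((a - b) ** 2 for a, b in zip(query_hist, hist)))  # Step 4
        if error <= threshold:                                      # Step 5
            matches.append((error, name))
    return [name for _, name in sorted(matches)]


# Step 1: a toy database; Step 2: a toy query image.
database = {
    "red_1": [(250, 10, 10)] * 4,
    "red_2": [(200, 30, 20)] * 3 + [(10, 10, 240)],
    "blue_1": [(10, 10, 240)] * 4,
}
query = [(255, 0, 0)] * 4
results = retrieve(query, database, threshold=0.5)   # ["red_1", "red_2"]
```

With this toy data the mostly blue image falls outside the threshold and is not retrieved. Raising the threshold returns more images at the cost of precision, which is the trade-off Step 6 measures.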

    IMPLEMENTATION

    The proposed CBIR method is tested using a test bed of 650 variable-size images spread across 5 categories and taken from an image database. The categories and distribution of the images are shown in the figure below. Programming is done in MATLAB 7.0 on a computer with an Intel Core 2 Duo T8100 processor (2.1 GHz) and 2 GB RAM. Figure 7 shows sample database images from all the categories considered in the test bed image database.

    Figure 7. Sample of Image database

    As part of the implementation, any one of the following images can be selected as the query image.

    Figure 8. Color Histogram of given image

    We have selected the following images as query images; the results of the retrieval are shown below.

    Figure 9. Selected Query Image for Retrieval

    The result for the first query image is shown below.

    RESULT

    To test the performance of the proposed CBIR technique, 5 queries are fired on the generic image database of 650 variable-size images. The query and database image matching is done using Euclidean distance. Precision and recall are used as statistical comparison parameters for the proposed CBIR technique. The standard definitions of these two measures are given by the following equations.

    Precision = (Number of relevant images retrieved) / (Total number of images retrieved)

    Recall = (Number of relevant images retrieved) / (Total number of relevant images in database)
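A minimal Python sketch of the two measures is given below. The function name and the example image identifiers are assumptions for illustration only; they are not results from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of retrieved image names and a
    list of all relevant image names in the database."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)          # relevant images retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Four images retrieved, two of them relevant; five relevant images in total.
retrieved_images = ["a", "b", "c", "d"]
relevant_images = ["a", "b", "e", "f", "g"]
precision, recall = precision_recall(retrieved_images, relevant_images)
# precision = 2 / 4 = 0.5, recall = 2 / 5 = 0.4
```

Returning 0.0 when nothing is retrieved, or when no relevant image exists, is a convention chosen here to avoid division by zero.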

    Precision and recall are computed over the set of retrieved images, sorted in ascending order of Euclidean distance from the query image. Table 1 gives the precision and recall calculated for the five query images using the equations given. The results are also shown in Graph 1. Precision is highest for the beach image, and recall is highest for the flag image.

    Figure 10. Result of 1st Query image

    The result for the second query image is shown below.

    Figure 11. Result of 2nd Query image

    These are the outputs for two query images; the same procedure can be repeated for the remaining queries to retrieve their results.

    Table 1. Results

    Graph 1. Resulting Graph

    CONCLUSION

    Content-Based Image Retrieval (CBIR) systems retrieve images from a database based on the content of the image. The CBIR system using the color histogram feature was evaluated on five query images. The precision and recall for those five images show that precision is highest for the beach image and lowest for the red flower image, while recall is highest for the flag image and lowest for the beach image. The result therefore depends strongly on the colors present in the image: when the query image has distinctive color content, good results are retrieved.

