An Integrated Approach to Content based Image Retrieval

DOI : 10.17577/IJERTV5IS050223


Nitish Saini1

1M.Tech Scholar, Department of CSE,

UTU Dehradun, Uttarakhand

Diwaker Mourya2

2Assistant Professor, Department of CSE,

UTU Dehradun, Uttarakhand

Abstract – Content-based image retrieval (CBIR) is a challenging task that retrieves similar images from a large database according to the objective visual content of the images themselves. In this paper, color, texture and shape features are used to retrieve images from a database. The proposed CBIR system integrates the YCbCr color space, color moments (CM) to extract the color feature of an image, the local binary pattern (LBP) to extract the texture feature, and shape descriptors as global descriptors with SURF salient points as local descriptors to enhance the results. It is tested on standard image databases such as the Wang and UCID databases. Experimental work shows that the proposed approach improves the precision and recall of the retrieval results compared to other approaches reported in the recent literature.

1. INTRODUCTION

Image retrieval is the area of research in which images are searched for and retrieved from an image database. Information retrieval is the process of organizing and storing information according to a specified procedure and of finding related information according to the needs of users; it is also called information storage and retrieval [1]. In the 1970s, database experts began to look for effective methods to manage image data. With the broad development of the Internet and the availability of image capturing devices such as digital cameras, smart mobile phones and image scanners, digital image collections are growing rapidly. Users from various domains, including remote sensing, fashion, crime prevention, publishing, medicine and architecture, require efficient tools for image searching, browsing and retrieval. To meet this need, many image retrieval systems have been developed. Two major research communities, database management and computer vision, study image retrieval from different points of view: one is text-based and the other is visual-based [2]. Text-based, or conventional, image retrieval techniques use text descriptors to describe the content of an image, and a database management system uses these descriptors to perform retrieval. These text descriptors are often ambiguous and inadequate for image database search and query processing.

To overcome the problems of text-based image retrieval (TBIR), content-based image retrieval (CBIR) was introduced in the 1980s. Visual-based or content-based image retrieval uses visual features to describe the content of images. "Content-based" means analyzing the contents of the image rather than the metadata attached to it, such as tags, keywords, or descriptions like the image name or image type. The term content refers to low-level features such as textures, colors, shapes, or any other information that can be extracted from the image itself. The basic function of CBIR is to extract these contents or features of an image. The crucial difference between content-based and text-based retrieval systems is that human interaction is an indispensable part of the latter [3]. Humans mostly use high-level features such as keywords and text descriptors to obtain information about images and to measure their similarity, whereas the features that are automatically extracted using computer vision techniques are mostly low-level features. Using these extracted features, searching, browsing and similarity matching between the images in a database are performed. The main advantage of a CBIR system is that it works on features extracted from the image content rather than on the raw image itself.

    1. Features used in CBIR

      Two types of features are used to retrieve information from images: low-level features and high-level features. High-level features are used in text-based image retrieval: humans are able to use high-level features (concepts), such as keywords and text descriptors, to extract information from images and to measure their similarity. Low-level features are used in content-based image retrieval and are the basis of CBIR systems. To increase the performance of CBIR, image features can be extracted either from the entire image or from regions. The low-level features are given below:

      1. Color

        Color is the most basic feature, and almost all systems employ it because it is invariant to image size and orientation. Colors are defined on a selected color space. The color feature is used in CBIR because it can provide descriptors that identify an object in a scene, which can then be used to extract the needed objects. The colors in an image can carry a great deal of information, and this information is very useful for image retrieval. A variety of color spaces are available, and they are often used for different applications. Most images are stored in the red, green, blue (RGB) color space. RGB space is rarely used for indexing and querying because it does not correspond well to human color perception. It only seems reasonable for images taken under exactly the same conditions each time, such as trademark images. Other spaces such as hue, saturation, value (HSV) or the CIE Lab and Luv spaces correspond much better to human perception and are more frequently used.
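As a small illustration of moving from RGB to a more perceptual space, the sketch below converts 8-bit RGB values to HSV with Python's standard `colorsys` module. The helper name `rgb8_to_hsv` is an assumption made for this example and is not part of the proposed system.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB values (0-255) to HSV, each component in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

# A bright red and a darker red: hue and saturation agree, only value differs.
print(rgb8_to_hsv(255, 0, 0))
print(rgb8_to_hsv(128, 0, 0))
```

In HSV the two reds differ in a single component, whereas in RGB their difference is tied to the red channel intensity. This is one reason such spaces are often preferred for indexing.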

      2. Texture

        There is no formal definition of texture, but texture provides measures of image properties such as coarseness, regularity and smoothness. Texture can be defined as the repeated pixel patterns within an image. Partly because of the incomplete definition and understanding of what visual texture actually is, there is an even larger variety of texture measures than of color measures. Some measures for extracting the texture of images are wavelets and Gabor filters; Gabor filters seem to perform better and correspond well to the properties of the human visual cortex for edge detection. Gabor features can also be applied to operations such as segmentation, texture analysis, target detection and image recognition. These texture measures try to capture the characteristics of the image or image parts with respect to changes in certain directions and the scale of the changes. This is most useful for regions or images with homogeneous texture.

      3. Shape

        Shape-based image retrieval is the process of measuring the similarity between the shapes in two images by using image features. The shape feature is crucial because it uses the region of interest in images to perform retrieval. Shape is an important visual feature of the objects present in an image. Shape features are often combined with color and texture because shape alone does not produce effective results. Techniques for describing shape fall into two categories: contour-based and region-based descriptors. Region-based descriptors use the whole object for the extraction process, while contour-based descriptors focus on the boundary of the object.

    1. Local Binary Pattern (LBP)

      The local binary pattern was introduced by Ojala et al. [28] for texture analysis. In its simplest form, a basic LBP descriptor [12] is created by thresholding the values of the 3×3 neighborhood of a pixel against the central pixel. Compared with other texture analysis algorithms, the LBP descriptor is not computationally expensive. To create the LBP representation, we first convert the color (RGB) image into a grayscale image. We then divide the image into 3×3 sub-blocks, read the gray-level pixel values of each block, and calculate its LBP value. These LBP values form an LBP image, and the histogram of the calculated LBP codes is computed as shown in figure 3.
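The procedure described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the grayscale conversion uses the common 0.299/0.587/0.114 luma weights, the 3×3 neighborhood slides over every interior pixel, a neighbor contributes a 1 when it is greater than or equal to the central pixel, and bits are ordered clockwise from the top-left neighbor. The names `to_gray`, `OFFSETS` and `lbp_histogram` are hypothetical.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of gray levels (assumed luma weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

# Neighbor offsets (row, col), clockwise from the top-left; offset k has weight 2**k.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(gray):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            centre = gray[r][c]
            code = 0
            for k, (dr, dc) in enumerate(OFFSETS):
                # Threshold each neighbor against the central pixel.
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << k
            hist[code] += 1
    return hist

# Tiny usage example on a 3x3 image with a single interior pixel.
tiny = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(lbp_histogram(tiny)[0])  # 1: every neighbor is darker than the centre
```

The 256-bin histogram is what would serve as the texture feature vector in this sketch. Border pixels are skipped because they lack a full neighborhood.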

      [Figure 3: Feature histogram]

      Figure 4.3 below shows the gray-scale values of a 3×3 sub-block.

      180  168  149
      175  150  120
      157  100  133

      Figure 4.3: A 3×3 sub-block of the image

        0    1    2
        7   cp    3
        6    5    4

      Figure 4.4: Bit positions of a 3×3 sub-block of the image

      Figure 4.4 shows the bit positions around the central pixel of the sub-block. All the pixel values are compared with the central pixel 'cp' to calculate the LBP codes.


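      Each of the bit positions bp0, bp1, ..., bp7 is then compared with the central pixel cp according to the following formula:

      $$ s(bp_i) = \begin{cases} 1, & bp_i - cp \ge 0 \\ 0, & bp_i - cp < 0 \end{cases}, \qquad i = 0, 1, \dots, 7 \tag{1} $$

      Finally, we find the value of the binary string that represents the pattern using the following formula:

      $$ LBP = \sum_{i=0}^{7} s(bp_i) \cdot 2^{i} \tag{2} $$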
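As a worked example, the sketch below applies formulas (1) and (2) to the sub-block of Figure 4.3, using the bit positions of Figure 4.4. It assumes that a neighbor yields 1 when its value is greater than or equal to cp and that bit position i carries weight 2^i. The names `BIT_POSITIONS` and `lbp_code` are introduced here for illustration only.

```python
# Gray values of the 3x3 sub-block from Figure 4.3, row by row.
block = [
    [180, 168, 149],
    [175, 150, 120],
    [157, 100, 133],
]

# (row, col) of bit positions 0..7 from Figure 4.4: clockwise from the top-left.
BIT_POSITIONS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_code(block):
    """Return the thresholded bits (formula 1) and the LBP code (formula 2)."""
    cp = block[1][1]  # central pixel
    bits = [1 if block[r][c] >= cp else 0 for (r, c) in BIT_POSITIONS]
    code = sum(bit << i for i, bit in enumerate(bits))
    return bits, code

bits, code = lbp_code(block)
print(bits)  # [1, 1, 0, 0, 0, 0, 1, 1]
print(code)  # 195
```

With cp = 150, only the neighbors 180, 168, 157 and 175 (bit positions 0, 1, 6 and 7) pass the threshold, so the code is 1 + 2 + 64 + 128 = 195.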

    2. Color Moment

Color moments are a technique used to extract color features from an image and to differentiate images based on color. They are used to check the color similarity between images. The basis of color moments is that the color distribution of an image can be interpreted as a probability distribution. If the color of an image follows a certain probability distribution, the moments of that distribution can be calculated and used as features to match that image based on color.

For image retrieval, color moments are a simple and effective method to extract color features. The first-order moment (mean), the second-order moment (standard deviation) and the third-order moment (skewness) have proved to be very effective in representing the color distribution of images. If the value of the i-th color channel at the j-th image pixel is $p_{ij}$ and the image has N pixels, then the three color moments are defined as below:

Moment 1 (Mean):

$$ E_i = \frac{1}{N} \sum_{j=1}^{N} p_{ij} \tag{4} $$

The mean is the average color value in the image.

Moment 2 (Standard deviation):

$$ \sigma_i = \sqrt{ \frac{1}{N} \sum_{j=1}^{N} \left( p_{ij} - E_i \right)^2 } \tag{5} $$

The standard deviation is the square root of the variance of the distribution.

Moment 3 (Skewness):

$$ S_i = \sqrt[3]{ \frac{1}{N} \sum_{j=1}^{N} \left( p_{ij} - E_i \right)^3 } \tag{6} $$

Skewness is a measure of the degree of asymmetry in the distribution.
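The three moments can be computed per channel as in the following sketch. It assumes that each channel is given as a flat list of pixel values and that the cube root in the skewness keeps the sign of a negative third central moment. The function names `color_moments` and `color_feature` are hypothetical.

```python
import math

def color_moments(channel):
    """Return (mean, standard deviation, skewness) of one color channel.

    `channel` is a flat list of the pixel values of a single channel.
    """
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((p - mean) ** 2 for p in channel) / n
    third = sum((p - mean) ** 3 for p in channel) / n
    std = math.sqrt(variance)
    # Real cube root that preserves the sign of the third central moment.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

def color_feature(pixels):
    """Concatenate the three moments of each channel of a list of 3-tuples."""
    feature = []
    for i in range(3):
        feature.extend(color_moments([p[i] for p in pixels]))
    return feature

# Usage: a 9-dimensional color feature from two illustrative pixels.
print(len(color_feature([(10, 20, 30), (40, 50, 60)])))  # 9
```

For a three-channel image this gives a compact 9-dimensional color feature, three moments for each channel.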

    1. Similarity Measurement

      The similarity measurement of images has two categories: distance measurement and correlation measurement. Here we use distance measurement only. A distance measurement expresses the similarity of two compared images as a distance between their feature vectors [8]. One such measure is the Euclidean distance, given by the following formula:

      $$ E_d(F\_V_{q}, F\_V_{DBIm}) = \left\{ \sum_{i=0}^{L-1} \left( F\_V_{q}(i) - F\_V_{DBIm}(i) \right)^2 \right\}^{1/2} \tag{7} $$

      where $F\_V_{q} = (F\_V_{q}(0), F\_V_{q}(1), \dots, F\_V_{q}(L-1))$ is the feature vector of the query image, $F\_V_{DBIm} = (F\_V_{DBIm}(0), F\_V_{DBIm}(1), \dots, F\_V_{DBIm}(L-1))$ is the feature vector of a database image, and L is the dimension of the image feature.
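Equation (7) translates directly into code. The sketch below assumes that both feature vectors are plain Python sequences of the same dimension L. The function name `euclidean_distance` is chosen here for illustration.

```python
import math

def euclidean_distance(f_query, f_db):
    """Euclidean distance between a query and a database feature vector."""
    if len(f_query) != len(f_db):
        raise ValueError("feature vectors must have the same dimension L")
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(f_query, f_db)))

print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

A smaller distance means that the two images are more alike in the chosen feature space, and identical feature vectors give a distance of zero.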

      For each preprocessed query image Iq do:

      1. Apply the Color Moment algorithm to extract the color feature of the query image, i.e.

        F1 = CM(Iq)

      2. Apply the Local Binary Pattern algorithm to extract the texture feature of the query image, i.e.

        F2 = LBP(Iq)

      3. Create a combined feature vector of the query image that contains its color and texture feature vectors, i.e.

        F_Vq = {F1, F2}

      4. For each image in the database DBIm, repeat steps 1 to 3 to get the feature vectors (F_VDBIm) of the database images, where

        DBIm = {IMDB1, IMDB2, IMDB3, ..., IMDBn} and n is the total number of images in the database.

      5. Apply the Euclidean distance to perform similarity matching:

        $$ E_d(F\_V_{q}, F\_V_{DBIm}) = \left\{ \sum_{i=0}^{L-1} \left( F\_V_{q}(i) - F\_V_{DBIm}(i) \right)^2 \right\}^{1/2} $$

      6. Retrieve the topmost k images whose similarity satisfies a given threshold Th, such that

        Rs(Image) = {Irs1, Irs2, ..., Irsk}

        where Ed(Irsk) > Th.

        First, the query image entered through the user interface is preprocessed to increase its contrast, and then feature extraction is applied. Color moments are used as the color descriptor and the local binary pattern is used as the texture descriptor. These techniques produce a color feature vector and a texture feature vector, which are combined into a third feature vector. Using this combined feature vector, a histogram is generated and similarity matching is then performed to retrieve the relevant results from the database.
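The matching and ranking stage of the steps above can be sketched as follows, assuming that the combined feature vectors have already been extracted. The sketch ranks database images by ascending Euclidean distance and returns the k nearest ones. It does not apply the threshold Th, and the feature values are toy numbers invented for the example, not data from the experiments. The names `retrieve` and `database` are hypothetical.

```python
import math

def retrieve(query_feature, database_features, k):
    """Return the k (name, distance) pairs closest to the query feature."""
    scored = []
    for name, feature in database_features.items():
        distance = math.sqrt(sum((q - d) ** 2
                                 for q, d in zip(query_feature, feature)))
        scored.append((name, distance))
    # Smaller distance means a more similar image, so sort ascending.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# Hypothetical combined feature vectors (color + texture), toy values only.
database = {
    "img_a": [0.1, 0.2, 0.9],
    "img_b": [0.8, 0.7, 0.1],
    "img_c": [0.1, 0.25, 0.8],
}

for name, distance in retrieve([0.1, 0.2, 0.85], database, k=2):
    print(name, round(distance, 4))
```

In practice the color and texture parts may need to be normalized before concatenation so that neither dominates the distance. That choice is outside what the text specifies.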

    2. EXPERIMENTAL RESULTS

      The proposed work is simulated in MATLAB 7.12.0 (R2011a). James Z. Wang [31] has provided a database known as the Wang database, which is a subset of the Corel database. Each category of the Wang database is shown in figure 5.1. We use the complete database in the testing phase of our implementation. The classification of images into classes makes the evaluation easier. Our proposed system is implemented using the Image Processing Toolbox.

      The Wang database has 10 categories of images, as shown in figure 5.1, and each category has 100 images, so the database contains 1000 images. Two features (color and texture) of all database images are extracted using the proposed algorithm and stored in a cell array for retrieval. The performance of the proposed system is analyzed in two ways: visual analysis and quantitative analysis.

    3. Quantitative Analysis

To perform the quantitative analysis, two metrics are used: precision and recall. These are the basic measures used to evaluate a retrieved result. Precision is the ratio of the number of retrieved relevant images to the total number of retrieved images, and it is expressed as a percentage. We denote precision by Pr. Let the database contain T relevant images, let Ar be the number of retrieved relevant images and let At be the total number of retrieved images; then precision is calculated by the following equation:

$$ P_r = \frac{A_r}{A_t} \tag{8} $$

We denote recall by Rc, and it is calculated as:

$$ R_c = \frac{A_r}{T} \tag{9} $$

Here we run the system on some random images of any category and calculate the precision and recall to check the effectiveness of the system. The quantitative analysis is performed in three phases. The first calculates precision and recall using the results retrieved by color moments only, the second using the results retrieved by the local binary pattern only, and the third using the results retrieved by the hybrid approach.
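Equations (8) and (9) can be evaluated as in the sketch below. The image names and counts are illustrative only and are not results from the paper: 20 images are retrieved, 15 of them belong to a category that holds T = 100 relevant images. The function name `precision_recall` is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of retrieved images.

    `retrieved` is the list of returned image names (its length is At) and
    `relevant` is the set of all relevant images in the database (its size is T).
    """
    a_r = sum(1 for img in retrieved if img in relevant)  # retrieved relevant images
    precision = a_r / len(retrieved) if retrieved else 0.0
    recall = a_r / len(relevant) if relevant else 0.0
    return precision, recall

# Illustrative data: one category of 100 images, 20 retrieved, 15 of them relevant.
relevant = {f"bus_{i}" for i in range(100)}
retrieved = [f"bus_{i}" for i in range(15)] + [f"other_{i}" for i in range(5)]

print(precision_recall(retrieved, relevant))  # (0.75, 0.15)
```

Multiplying both values by 100 gives the percentage form used in the text.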

  1. CONCLUSIONS

    In this study, a content-based image retrieval system is implemented using the color and texture content of images. We have implemented a hybrid technique that combines two techniques, LBP and CM. Color moments (CM) are used to extract the color feature of an image, and the local binary pattern (LBP) is used to extract the texture feature. After implementing the two methods separately, an integrated approach is implemented using both techniques. Finally, the Euclidean distance is used to find similar images in the Wang database. The accuracy of LBP, CM and the integrated approach is compared. The experimental results show that using color or texture extraction alone does not produce good results, whereas the integrated approach produces much better results. We have applied different feature extraction methods, but for the matching procedure we have used only the Euclidean distance.

  2. REFERENCES

  1. M.H. Saad, H.I. Saleh, H. Konbor, M. Ashour, "Image Retrieval based on Integration between YCbCr Color Histogram and Shape Feature", International Computer Engineering Conference (ICENCO), 2011.

  2. M.H. Saad, H.I. Saleh, H. Konbor, M. Ashour, "Image Retrieval based on Integration between YCbCr Color Histogram and Text Feature", International Journal of Computer Theory and Engineering (IJCTE), Vol. 3, No. 5, 2011.

  3. Aman Chadha, Sushmit Malik, Ravdeep Johar, "Comparative Study and Optimization of Feature-Extraction Techniques for Content based Image Retrieval", International Journal of Computer Applications, Vol. 52, No. 20, August 2012.

  4. Dipti Jadhav, Gargi Phadke, Satish Devane, "Novel Weight Allocation Technique for Image Retrieval Based On Higher Order Colour Moments and CCM Texture Features", International Conference on Audio, Language and Image Processing (ICALIP), pp. 129-133, July 2012.

  5. Rajshree S. Dubey, Rajnish Choubey, Joy Bhattacharjee, "Multi Feature Content Based Image Retrieval", International Journal on Computer Science and Engineering (IJCSE), Vol. 02, No. 06, pp. 2145-2149, 2010.

  6. R. Senthil Kumar, M. Senthilmurugan, "Content-Based Image Retrieval System in Medical Applications", International Journal of Engineering Research & Technology (IJERT), Vol. 2, No. 3, 2013.

  7. A. Ramesh Kumar, D. Saravanan, "Content Based Image Retrieval Using Color Histogram", International Journal of Computer Science and Information Technologies (IJCSIT), Vol. 4, No. 2, pp. 242-245, 2013.

  8. Niket Amoda, Ramesh K Kulkarni, "Efficient image retrieval using region based image retrieval", Signal & Image Processing: An International Journal (SIPIJ), Vol. 4, No. 3, June 2013.

  9. Ying Liu, Dengsheng Zhang, Guojun Lu, Wei-Ying Ma, "A survey of content-based image retrieval with high-level semantics", Pattern Recognition, Vol. 40, pp. 262-282, 2007.

  10. Loris Nanni, Alessandra Lumini, Sheryl Brahnam, "Survey on LBP based texture descriptors for image classification", Expert Systems with Applications, Vol. 39, No. 3, pp. 3634-3641, 2012.
