Review of Image Classification Methods and Techniques

DOI : 10.17577/IJERTV2IS80157


Maneela Jain Pushpendra Singh Tomar

Lnct, Bhopal, Lnct, Bhopal,

Abstract

Unsupervised classification has become one of the most challenging areas in image processing. Classifying data, objects, and images is a very important task in image processing. If an image is noisy or blurry, it is very difficult to classify. Classification is simply the grouping of data of the same kind into the same category. Digital image processing offers many techniques for classifying data, but they often fail to give satisfactory results when the image is blurry or noisy. This survey considers three main classification approaches: supervised learning, unsupervised learning, and semi-supervised learning. The main aim of this literature survey is to give a brief comparison of different image classification techniques and methods. It concludes that Semi-Supervised Biased Maximum Margin Analysis classifies images more accurately, even when they are blurry or noisy.

  1. Introduction

    Every day millions of images are produced. Each image needs to be classified so that it can be found easily and quickly. Humans can classify images more easily than computers. A simple classification system consists of a camera fixed high above the zone of interest, where images are captured and subsequently processed [1]. Classification is the procedure of assigning images to several categories based on their similarities. Classifying images helps us understand and analyse our surroundings. It is not always easy to classify an image, however, especially when it contains noisy or blurry content. In a classification system the user deals with a database containing patterns or images that are predefined or are yet to be classified. Image classification is a critical and important task for many applications.

    Sometimes it is very hard to identify an object in an image, especially when the image contains noise or background clutter or is of poor quality. If an image includes more than one object, the task becomes even more difficult. The main principle of image classification is therefore to recognize the features occurring in an image. This paper discusses three major image classification techniques and some related methods.

    The first technique is supervised classification. In supervised learning, labelled data points are used; in other words, training is required. The second is unsupervised classification, which uses no labelled data, so no training is required; any randomly sampled data can be used. The third is semi-supervised classification, which has several advantages over supervised and unsupervised classification. It uses unlabelled data points to reduce the need for extensive interaction with domain scientists and to deal with the bias that results from poorly representative labelled data. The treatment of semi-supervised learning in this survey follows [3]. As noted above, the main principle of image classification is to recognize the features occurring in an image. Classification is generally done by computer, with the help of different mathematical techniques. Classification proceeds according to the following steps, which are shown in Figure 1:

    Definition of Classification Classes

    Classification classes depend on the properties and purpose of the image, and they should be clearly defined.

    Selection of Features

    Multi-spectral and multi-temporal properties should be used to establish the differences between the classes. Features differ from one class to another.

    Sampling of Training Data

    To obtain a correct decision rule it is necessary to sample training data. Different classification techniques, such as supervised, unsupervised, and semi-supervised learning, are used according to the data.

    Estimation of Universal Statistics

    Several classification techniques are compared on the data, and an appropriate method is selected.

    Classification Method

    The appropriate classification method is applied to the data. The methods discussed in this paper are SVM, DAG SVM, BMMA, linear discriminant analysis, ANN, and fuzzy decision tree.

    Verification of Result

    Finally, the result is verified.
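    The six steps above can be sketched end to end as follows. This is only an illustrative sketch, not a method from the surveyed papers: the class names, the two band features, the sample values, and the nearest-centroid rule (standing in for the "classification method" step) are all assumptions made for the example.

```python
from statistics import mean

def train_centroids(samples):
    """Estimate one mean feature vector (centroid) per class from labelled samples."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }

def classify(centroids, features):
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, samples):
    """Fraction of held-out samples whose predicted label matches the true label."""
    hits = sum(classify(centroids, features) == label for features, label in samples)
    return hits / len(samples)

# Steps 1-3: hypothetical classes, two hypothetical band features, labelled samples.
training = [([0.10, 0.20], "water"), ([0.15, 0.25], "water"),
            ([0.70, 0.80], "forest"), ([0.75, 0.85], "forest")]
held_out = [([0.12, 0.22], "water"), ([0.72, 0.79], "forest")]

centroids = train_centroids(training)   # step 4: class statistics
predicted = classify(centroids, [0.11, 0.21])   # step 5: apply the decision rule
print(predicted, accuracy(centroids, held_out))  # step 6: verify on held-out data
```

    Any of the classifiers discussed later could replace the nearest-centroid rule without changing the overall flow.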

    Section 2 of this paper examines related survey work to show which classification method is most suitable. Section 3 compares the three data sampling methods, i.e. supervised, unsupervised, and semi-supervised, to show which is most suitable. Section 4 compares the other classification methods.

    Figure 1. Steps of image classification: Definition of Classification Classes, Selection of Features, Sampling of Training Data, Estimation of Universal Statistics, Classification Method, Verification of Result.

  2. Related Work

    Wei-jiu Zhang, Li Mao and Wen-bo Xu [6], in "Automatic Image Classification Using the Classification Ant-Colony Algorithm", define an ant-colony-based classification method to improve the versatility, robustness, and convergence rate of automatic image classification. The model adapts and improves the traditional ant-colony algorithm according to the characteristics of image classification. It defines two types of ants that have different search strategies and refreshing mechanisms. The stochastic ants identify new categories, construct the category tables, and determine the clustering centre of each category. The experiments indicate that the ant-colony algorithm improves the efficiency and accuracy of the results.

    D. Lu and Q. Weng [7] surveyed image classification techniques and methods. Image classification is a complex process that may be affected by many factors. They examine current practices, problems, and prospects of image classification. The emphasis is placed on summarizing the major advanced classification approaches and the techniques used for improving classification accuracy.

    Jipsa Kurian and V. Karunakaran [5] surveyed image classification methods and found that image classification is one of the most complex areas in image processing. An image is more difficult to classify if it contains blurry and noisy content. Several methods give good classification results in general but fail to give satisfactory results when the image contains blurry and noisy content. The two main approaches to image classification are supervised and unsupervised classification; each has its own advantages and disadvantages. It is harder to obtain good results with a noisy and blurry image than with a normal image.

    Saurabh Agrawal, N. K. Verma, Prateek Tamrakar and Pradip Sircar [8] improve classification using a support vector machine. Traditional classification approaches perform poorly on content-based image classification tasks, one reason being the high dimensionality of the feature space. In their paper, color image classification is performed on features extracted from histograms of color components. The benefits of using color image histograms are better efficiency and insensitivity to small changes in camera viewpoint, i.e. translation and rotation.
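    The idea of histogram features can be illustrated with a small sketch. It is not the feature extractor of [8]; the bin count, the tiny 2x2 image, and the helper names are assumptions for illustration. Because a histogram only counts pixel values and ignores their positions, moving pixels around (here, a circular column shift) leaves the feature vector unchanged.

```python
def color_histogram(pixels, bins=4):
    """Concatenate normalised R, G and B histograms into one feature vector.

    pixels: list of (r, g, b) tuples with values in 0..255.
    Returns a list of length 3 * bins; each channel's bins sum to 1.
    """
    counts = [0] * (3 * bins)
    width = 256 // bins
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            bin_index = min(value // width, bins - 1)
            counts[channel * bins + bin_index] += 1
    return [count / len(pixels) for count in counts]

def flatten(image):
    """Turn a list of pixel rows into a flat list of pixels."""
    return [pixel for row in image for pixel in row]

# Hypothetical 2x2 RGB image and a copy whose columns are circularly shifted.
image = [[(255, 0, 0), (255, 0, 0)],
         [(0, 0, 255), (0, 255, 0)]]
shifted = [row[1:] + row[:1] for row in image]

original_features = color_histogram(flatten(image))
shifted_features = color_histogram(flatten(shifted))
print(original_features == shifted_features)  # True: pixel positions do not matter
```

    A vector like this could then be passed to any classifier, such as the SVM used in [8].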

    Qian Du [10] proposed a constrained linear discriminant analysis (CLDA) approach for classifying remotely sensed hyperspectral images. Its basic idea is to design an optimal linear transformation operator that maximizes the ratio of inter-class to intra-class distance while satisfying the constraint that the different class centres after transformation are aligned along different directions. Its major advantage over traditional Fisher's linear discriminant analysis is that classification can be achieved simultaneously with the transformation. CLDA is a supervised approach, i.e., the class spectral signatures need to be known a priori. In practice, however, this information may be difficult or even impossible to obtain. The CLDA algorithm is therefore extended into an unsupervised version, in which the class spectral signatures are generated directly from an unknown image scene.
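    For orientation, the ratio of inter-class to intra-class distance mentioned above is usually written, in classical Fisher linear discriminant analysis, as the criterion below. This is the standard textbook form, not the constrained objective of [10]; the symbols (projection vector w, K classes C_k with n_k samples, class means mu_k, and overall mean mu) are notation assumed for this illustration.

```latex
\[
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \,\mathbf{w}}{\mathbf{w}^{\top} S_W \,\mathbf{w}},
\qquad
S_B = \sum_{k=1}^{K} n_k (\boldsymbol{\mu}_k - \boldsymbol{\mu})(\boldsymbol{\mu}_k - \boldsymbol{\mu})^{\top},
\qquad
S_W = \sum_{k=1}^{K} \sum_{\mathbf{x} \in C_k} (\mathbf{x} - \boldsymbol{\mu}_k)(\mathbf{x} - \boldsymbol{\mu}_k)^{\top}
\]
```

    Maximizing J pushes the projected class means apart (numerator) while keeping each projected class compact (denominator). CLDA adds to this the requirement that the transformed class centres point along different directions.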

    Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh and H. S. Yazdi [2] proposed an improved Directed Acyclic Graph Support Vector Machine (DAG SVM). They suggest a weighted multi-class classification technique that divides the input space into several subspaces. In the training phase, a DAG SVM is trained for each subspace and its probability density function (pdf) is estimated. In the test phase, the degree to which each input pattern fits every subspace is calculated using the pdf of the subspace and used as the weight of each DAG SVM. Finally, a fusion operation is defined and applied to the DAG SVM outputs to decide the class label of the given input pattern. Evaluation results show the advantage of their method for multi-class classification compared with DAG SVM.
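    As background, the plain DAG evaluation on which this work builds can be sketched as below. The sketch covers only the basic elimination scheme, not the weighted, subspace-based variant of [2]. The pairwise deciders are hypothetical stand-ins for trained binary SVMs, and the class names and one-dimensional feature are assumptions. Each pairwise test eliminates one class, so k classes need k - 1 evaluations.

```python
from itertools import combinations

def dag_predict(classes, pairwise, x):
    """Walk a decision DAG of one-vs-one classifiers.

    pairwise[(a, b)](x) must return either a or b. At each node the first and
    last remaining classes are compared and the loser is eliminated.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = pairwise[(first, last)](x)
        if winner == first:
            remaining.pop()      # eliminate the last class
        else:
            remaining.pop(0)     # eliminate the first class
    return remaining[0]

# Hypothetical stand-in for trained binary SVMs: pick the nearer class centre.
centres = {"sky": 0.0, "grass": 5.0, "road": 10.0}

def make_pair(a, b):
    return lambda x: a if abs(x - centres[a]) <= abs(x - centres[b]) else b

classes = list(centres)
pairwise = {(a, b): make_pair(a, b) for a, b in combinations(classes, 2)}

print(dag_predict(classes, pairwise, 4.2))  # grass
```

    In a real system each entry of `pairwise` would be a binary SVM trained on the samples of its two classes only.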

    S. Moustakidis, G. Mallinis, N. Koutsias and John B. Theocharis [11] propose an SVM-based fuzzy decision tree (FDT-SVM), in which the node discriminations are implemented via binary SVMs. The tree structure is determined via a class grouping algorithm. In addition, effective feature selection is incorporated within the tree building process, selecting the feature subsets required for each node discrimination individually. FDT-SVM exhibits a number of attractive merits such as enhanced classification accuracy, an interpretable hierarchy, and low model complexity. Furthermore, it provides hierarchical image segmentation and has reasonably low computational and data storage demands.

    L. Zhang, L. Wang and W. Lin [12] discuss semi-supervised biased maximum margin analysis for interactive image retrieval. A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of content-based image retrieval (CBIR) systems. Among the various RF approaches, support vector machine (SVM)-based RF is one of the most popular techniques in CBIR. Despite its success, directly using SVM as an RF scheme has two main drawbacks. First, it treats positive and negative feedbacks equally, which is not appropriate since the two groups of training feedbacks have distinct properties. Second, most SVM-based RF techniques do not take the unlabelled samples into account, although they are very helpful in constructing a good classifier. To overcome these two drawbacks, the authors propose a biased maximum margin analysis (BMMA) and a semi-supervised BMMA (Semi BMMA), which integrate the distinct properties of feedbacks and use the information in unlabelled samples for SVM-based RF schemes. BMMA differentiates positive feedbacks from negative ones based on local analysis, whereas Semi BMMA integrates the information of unlabelled samples by introducing a Laplacian regularizer to BMMA. They formulate the problem as a general subspace learning task and propose an automatic approach for determining the dimensionality of the embedded subspace for RF. Extensive experiments on a large real-world image database demonstrate that the proposed scheme combined with SVM RF can significantly improve the performance of CBIR systems.
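    A Laplacian regularizer of the kind mentioned above is commonly written in the following general form. This is the generic graph-regularization term, not the exact Semi BMMA objective of [12]; the symmetric similarity weights W_ij between samples i and j (labelled or unlabelled) and the values f_i assigned to the samples are notation assumed for this illustration.

```latex
\[
R(\mathbf{f}) = \frac{1}{2} \sum_{i,j} W_{ij} \,(f_i - f_j)^2 = \mathbf{f}^{\top} L \,\mathbf{f},
\qquad
L = D - W, \quad D_{ii} = \sum_{j} W_{ij}
\]
```

    The term is small when samples that are similar (large W_ij) receive similar values, which is how unlabelled samples can shape the solution even though they carry no labels.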

    Ajay Kumar Singh, Shamik Tiwari and V. P. Shukla [13], in "Wavelet Based Multi Class Image Classification Using Neural Network", perform feature extraction and classification of multiclass images using the Haar wavelet transform and a back-propagation neural network. The wavelet features are extracted from the original texture images and the corresponding complementary images. The features are made up of different combinations of sub-band images, which offer a better discriminating strategy for image classification and enhance the classification rate.
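    A one-level 2-D Haar decomposition and a simple sub-band feature can be sketched as follows. This is an illustrative sketch only: the averaging (unnormalised) form of the Haar filters, the sub-band naming, the mean-energy feature, and the 4x4 striped texture are assumptions, not details taken from the paper summarized above.

```python
def haar_step(signal):
    """One level of the averaging Haar transform on an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def transform_columns(block):
    """Apply haar_step down every column; return (low, high) blocks as row lists."""
    per_column = [haar_step(list(column)) for column in zip(*block)]
    low = [list(row) for row in zip(*(pair[0] for pair in per_column))]
    high = [list(row) for row in zip(*(pair[1] for pair in per_column))]
    return low, high

def haar2d(image):
    """One-level 2-D Haar transform: filter along rows, then along columns."""
    row_low, row_high = zip(*(haar_step(row) for row in image))
    ll, lh = transform_columns(row_low)    # low-pass along rows
    hl, hh = transform_columns(row_high)   # high-pass along rows
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

def subband_energy(band):
    """Mean squared coefficient of a sub-band, used here as a texture feature."""
    values = [value for row in band for value in row]
    return sum(value * value for value in values) / len(values)

# Hypothetical 4x4 texture with vertical stripes.
texture = [[9, 1, 9, 1]] * 4
bands = haar2d(texture)
features = [subband_energy(bands[name]) for name in ("LL", "LH", "HL", "HH")]
print(features)  # [25.0, 0.0, 16.0, 0.0]
```

    The stripes vary only along each row, so all the detail energy lands in the sub-band that is high-pass along rows; a feature vector like this could then be fed to a neural network classifier.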

    Discussion

    The related work shows that all of these methods classify images efficiently. The semi-supervised BMMA method, however, is more efficient than the other methods for two reasons. First, it is a semi-supervised classification, which gives accurate and cost-effective results. Second, BMMA overcomes the disadvantages of SVM-based RF.

  3. Comparison between Data Sampling Methods

    Supervised Classification

    Supervised classification depends on data created from domain knowledge. In supervised learning, labelled data points are used. To determine an accurate categorization of an image, pre-labelled samples are required. Because training or expert knowledge is required, this technique is time consuming, which is why it is not suitable in some areas. In order to determine a decision rule for classification, it is necessary to know the spectral characteristics or features with respect to the population of each class [4].

    Advantages

    • Errors can be detected by operators, who can often remedy them [5].

    • Expert knowledge is used, so this method gives accurate results.

      Disadvantages

    • Not suitable for big data, because each area requires its own experts.

    • Very time consuming: identifying pre-labelled samples takes a long time.

      Unsupervised Classification

      Some situations provide little information about the area to be classified, so only image properties are used, as follows:

      1. Randomly sampled data are divided mechanically into several groups (classes) by using clustering techniques.

      2. These clustered classes are later used for determining population statistics. This kind of classification is called unsupervised classification.
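      The two steps above can be sketched with k-means clustering. The text does not name a particular clustering technique, so the choice of k-means, the sample points, and the fixed initial centres (used instead of random ones so that the run is reproducible) are assumptions for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centres, iterations=10):
    """Lloyd's algorithm: alternate nearest-centre assignment and mean update."""
    clusters = []
    for _ in range(iterations):
        # Step 1: group the sampled points by their nearest centre.
        clusters = [[] for _ in centres]
        for point in points:
            nearest = min(range(len(centres)),
                          key=lambda k: squared_distance(point, centres[k]))
            clusters[nearest].append(point)
        # Step 2: population statistics (here the mean) of each cluster.
        centres = [
            [sum(column) / len(cluster) for column in zip(*cluster)] if cluster else centre
            for cluster, centre in zip(clusters, centres)
        ]
    return centres, clusters

# Hypothetical two-feature samples forming two groups.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
          [8.0, 8.0], [8.2, 7.8], [7.8, 8.2]]
final_centres, final_clusters = kmeans(points, centres=[[0.0, 0.0], [10.0, 10.0]])
print(final_centres)
print([len(cluster) for cluster in final_clusters])
```

      No labels are used anywhere; an analyst still has to decide afterwards what each cluster represents.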

      Advantages

    • Scientists spend less time classifying the domain. As a result only the required images are classified.

    • This approach is very suitable for classifying large data.

      Disadvantages

    • No training is given in this method, so it requires great knowledge about the area or about the method that is suitable for the desired area.

    • With large data sets the computation time is large, and the resulting classifier may be useless.

      Semi Supervised Classification

      This method uses unlabelled samples to assist the supervised classification method, and so also draws on unsupervised classification. It can therefore deal with situations in which labelled data points are scarce and unlabelled data points are abundant. Sometimes neither supervised nor unsupervised methods obtain an efficient result, whereas the semi-supervised approach aims at an accurate result with an emphasis on efficiency, which is the principle of semi-supervised classification. A semi-supervised method performs classification in three steps. First, it selects the labelled and unlabelled data points (data point selection). Second, it creates an initial classifier, which is used in the third step. Third, it clusters the data points to find the final classifier. The semi-supervised technique is well suited to many applications and can give accurate results.
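      The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any surveyed paper: the initial classifier is a nearest-centroid rule, the clustering step reassigns unlabelled points and refits the centroids, and the one-dimensional brightness values and class names are hypothetical.

```python
def centroid(values):
    """Mean of a list of one-dimensional feature values."""
    return sum(values) / len(values)

def semi_supervised_centroids(labelled, unlabelled, rounds=5):
    """Refine class centroids using both labelled and unlabelled points.

    labelled: list of (value, label); unlabelled: list of values.
    """
    # Step 1: data point selection (both sets are passed in by the caller).
    groups = {}
    for value, label in labelled:
        groups.setdefault(label, []).append(value)
    # Step 2: initial classifier built from the labelled points only.
    centres = {label: centroid(values) for label, values in groups.items()}
    # Step 3: cluster the unlabelled points around the centres and refit.
    for _ in range(rounds):
        assigned = {label: list(values) for label, values in groups.items()}
        for value in unlabelled:
            nearest = min(centres, key=lambda label: abs(value - centres[label]))
            assigned[nearest].append(value)
        centres = {label: centroid(values) for label, values in assigned.items()}
    return centres

# Hypothetical brightness feature: two labelled points, five unlabelled ones.
labelled = [(1.0, "dark"), (9.0, "bright")]
unlabelled = [2.0, 3.0, 4.0, 7.0, 8.0]
centres = semi_supervised_centroids(labelled, unlabelled)
print(centres)  # {'dark': 2.5, 'bright': 8.0}
```

      With the labelled points alone the centroids would stay at 1.0 and 9.0; the unlabelled points move them to where the data actually lie.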

      Advantages

      • The cost of classification decreases because labelled data are combined with abundant unlabelled data.

      • Focuses on accurate results and efficiency.

        Disadvantages

      • Global maximum problem.

        This comparison shows that semi-supervised classification is much better than both supervised and unsupervised classification.

  4. Comparison between Classification Methods

    | S. No. | Method | Description | Disadvantages |
    |---|---|---|---|
    | 1 | ANN | An Artificial Neural Network is a kind of artificial intelligence that imitates some functions of the human mind. It is a non-parametric approach: it makes no assumptions about the data, and its correctness depends on the number of inputs and on the network. An ANN learns from its environment and stores the experiential knowledge. An ANN is a collection of layers; basically it has two layers, i.e. input and output, but some systems also have hidden layers. Every layer has a number of neurons, which are connected with each other by weighted links. | An ANN requires training, which is costly and time consuming. Because an ANN is a network architecture, it is sometimes very hard to choose which network is most suitable for the approach at hand. |
    | 2 | DAG-SVM | The Directed Acyclic Graph based Support Vector Machine performs better classification in comparison with other binary multi-class classification schemes. Feature data are mapped by the graph partition technique applied by the DAG. When the feature data are mapped correctly, the voting process of classification improves automatically. | DAG has some problems with mapping the data space into the feature selection process. Performance evaluation shows that SVM-DAG is not a better classifier. |
    | 3 | SVM | SVM is a binary, non-parametric classifier. Some SVMs also support multiclass classification. SVMs [14] are learning systems that use a hypothesis space of linear functions in a high-dimensional feature space. An SVM is trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. The aim of classification via SVM is to find a computationally efficient way of learning good separating hyperplanes in this space, where good hyperplanes are those that optimize the generalization bounds, and computationally efficient means algorithms able to deal with very large sample sizes [8]. | Training is required, which is time consuming. The final result has very little transparency. |
    | 4 | FDT | The Fuzzy Decision Tree approach is a non-parametric, unsupervised approach based on a hierarchical rule-based method. The fuzzy nature of this method makes it more reliable, because fuzzy logic uses a stochastic approach. FDT uses the advantages of both methods, i.e. fuzzy logic and decision trees. | The main disadvantage of this method is that it does not use training, so prior knowledge about the desired area is required. It leads to complicated calculations when various undecided outputs are correlated. |
    | 5 | BMMA | Semi-supervised BMMA enhances the content-based image retrieval approach. Semi BMMA easily distinguishes between positive and negative feedback, based on local analysis. It also uses the benefits of the semi-supervised approach. Semi BMMA forms RF by incorporating unlabelled samples and removes the overfitting problem of labelled samples. | Semi-supervised BMMA has many advantages, but its main problem is that it suffers from the global maximum problem. |

  5. Conclusion

This paper discusses image classification techniques and methods and provides detailed information about them. The main classification techniques are divided into three categories: supervised, unsupervised, and semi-supervised classification. Among the related methods, ANN and SVM are supervised approaches and have some disadvantages. FDT is an unsupervised classification method and also has some advantages and disadvantages. BMMA is a semi-supervised approach and is more suitable than all the other methods because it takes advantage of both supervised and unsupervised techniques.

  6. References

  1. Pooja Kamavisdar, Sonam Saluja, Sonu Agrawal. A survey on image classification approaches and techniques, Department of Computer Science & Applications, SSCST, Bhilai, India, IJARCCE, Vol.2, Issue.1, Jan 2013.

  2. Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh, H. S. Yazdi, Improved DAG SVM a new method for multiclass svm classification, Department of computer science & Engineering, Ferdowsi University of Mashhad, Iran ICAI09I.

  3. Xiaojin Zhu, Semi-supervised learning literature survey, Technical Report 1530, Department of Computer Sciences, University of Wisconsin-Madison, 2005,

    http://www.cs.wise.edu/~jerryzhu/pub/ss1_survey.pdf

  4. http://www.jars1974.net/pdf/12_Chapter11.pdf

  5. Jipsa Kurian, V. Karunakaran, A survey on image classification methods, Department of Computer Science, Karunya University, Coimbatore, India, IJARECE, Volume 1, Issue.4, Oct 2012.

  6. Wei-jiu Zhang, Li Mao and Wen-bo Xu. Automatic image classification using the classification ant- colony algorithm, Jiangnan University, Wuxi, China, 2009 International Conference on Environmental Science and Information Application Technology.

  7. D. Lu and Q. Weng, A survey of image classification methods and techniques for improving classification performance, Department of Geography, Geology, and Anthropology, Indiana State University, USA, IJRS, Vol.28, No.5, 10 March 2007.

  8. Saurabh Agrawal, Nishchal K Verma, Prateek Tamrakar, Pradip Sircar, Content based color image classification using SVM, Department of Electrical Engg., IIT, Kanpur, India, 2011 8th international conference on information technology.

  9. S. Lazebnik, C. Schmid and J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, IEEE Conf. Computer Vision and Pattern Recognition, Vol.2, pp.2169-2178, March 2006.

  10. Qian Du, Unsupervised real time constrained linear discriminant analysis to hyperspectral image classification, Department of Electrical & Computer Engg, Mississippi State University, USA, www.sciencedirect.com, Pattern Recognition (2005) 361-368.

  11. Serafeim Moustakidis, Giorgos Mallinis, Nikos Koutsias, John B. Theocharis, Member IEEE and Vasilios Petridis, Member IEEE, SVM based fuzzy decision tree for classification of high spatial resolution remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, Vol.50, No.1, January 2012.

  12. Lining Zhang, Student Member, IEEE, Lipo Wang, Senior Member, IEEE, Weisi Lin, Senior Member, IEEE, Semi supervised biased maximum margin analysis for interactive image retrieval, IEEE Transactions on Image Processing, Vol.21, No.4, April 2012.

  13. M. Pesaresi and J. A. Benediktsson, A new approach for the morphological segmentation of high-resolution satellite imagery, IEEE Trans. Geosci. Remote Sens., Vol.39, No.2, pp. 309-320, Feb 2001.

  14. Christopher J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2, pp. 121-167, 1998.
