- Open Access
- Authors : Maneela Jain, Pushpendra Singh Tomar
- Paper ID : IJERTV2IS80157
- Volume & Issue : Volume 02, Issue 08 (August 2013)
- Published (First Online): 16-08-2013
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Review of Image Classification Methods and Techniques
Maneela Jain, Pushpendra Singh Tomar
LNCT, Bhopal
Abstract
Classifying data, objects, and images is an important task in image processing, and unsupervised classification has become one of its most challenging areas. Classification is the assignment of data of the same kind to the same category. When an image is noisy or blurry, it is very difficult to classify. Digital image processing offers many classification techniques, but most of them do not give satisfactory results on blurry or noisy images. This survey considers three main classification approaches: supervised learning, unsupervised learning, and semi-supervised learning. Its aim is to give a brief comparison of different image classification techniques and methods. The comparison suggests that Semi-Supervised Biased Maximum Margin Analysis classifies images more accurately than the other methods surveyed, even when the images are blurry or noisy.
Introduction
Millions of images are produced every day, and each of them needs to be classified so that it can be retrieved easily and quickly. Humans classify images far more easily than computers do. A simple classification system consists of a camera fixed high above the zone of interest, where images are captured and then processed [1]. Classification is the procedure of assigning images to categories based on their similarities, and classifying images helps us understand and analyse our surroundings. It is not always easy, however, especially when an image contains noisy or blurry content. In a classification system the user works with a database that contains predefined patterns or images, together with the images that are to be classified. Image classification is a difficult but important task in many applications.
Sometimes it is very hard to identify an object in an image, especially when the image contains noise or background clutter or is of poor quality. The task becomes even more difficult when the image contains more than one object. The main principle of image classification is therefore to recognize the features that occur in an image. This paper discusses three major approaches to image classification and some related techniques.
The first approach is supervised classification, which uses labelled data points; in other words, training is required. The second is unsupervised classification, which uses no labelled data, so no training is required and any randomly sampled data can be used. The third is semi-supervised classification, which has several advantages over both. It uses unlabelled data points to remove the need for extensive interaction with domain scientists and to deal with the bias that results from poorly representative labelled data. The treatment of semi-supervised learning in this survey follows [3]. Classification is generally done by computer, with the help of different mathematical techniques. It proceeds according to the following steps, which are shown in Figure 1:
Definition of Classification Classes
Classification classes depend on the properties of the image and the objective of the classification, and they should be clearly defined.
Selection of Features
Multi-spectral and multi-temporal properties should be used to establish the differences between the classes, since features differ from one class to another.
Sampling of Training Data
To obtain a correct decision rule it is necessary to sample training data. Supervised, unsupervised, or semi-supervised learning is then used, depending on the data.
Estimation of Universal Statistics
Several classification techniques are compared on the data, and an appropriate method is selected.
Classification Method
The selected classification method is applied to the data. The methods discussed in this paper are SVM, DAG SVM, BMMA, Linear Discriminant Analysis, ANN, and Fuzzy Decision Tree.
Verification of Result
Finally, the result is verified.
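The six steps can be illustrated with a minimal sketch. It assumes a nearest-mean decision rule, two hypothetical classes, and made-up two-band reflectance values; none of these come from the surveyed papers, and any of the classifiers discussed later could take the place of the decision rule.

```python
from statistics import mean

# Step 1: define the classes (hypothetical labels for illustration).
CLASSES = ("water", "vegetation")

# Steps 2-3: choose features and sample labelled training data.
# Each sample is a hypothetical (red, near-infrared) reflectance pair.
training = {
    "water": [(0.10, 0.05), (0.12, 0.04), (0.08, 0.06)],
    "vegetation": [(0.05, 0.50), (0.07, 0.55), (0.06, 0.45)],
}

def class_means(samples_by_class):
    """Step 4: estimate per-class statistics (here, the mean feature vector)."""
    return {
        label: tuple(mean(axis) for axis in zip(*samples))
        for label, samples in samples_by_class.items()
    }

def classify(pixel, means):
    """Step 5: assign the class whose mean is nearest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(means, key=lambda label: sq_dist(pixel, means[label]))

def accuracy(labelled_pixels, means):
    """Step 6: verify the result on held-out labelled pixels."""
    hits = sum(classify(p, means) == label for p, label in labelled_pixels)
    return hits / len(labelled_pixels)

means = class_means(training)
held_out = [((0.09, 0.05), "water"), ((0.06, 0.52), "vegetation")]
print(classify((0.11, 0.07), means), accuracy(held_out, means))
```

The nearest-mean rule is used only because it keeps the example short; the structure of the pipeline is the point, not the choice of classifier.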
Section 2 of this paper reviews related work to show which classification method is most suitable. Section 3 compares the three data sampling approaches, i.e. supervised, unsupervised, and semi-supervised, to show which is most suitable. Section 4 compares the other classification methods.
Figure 1: Steps in image classification (Definition of Classification Classes, Selection of Features, Sampling of Training Data, Estimation of Universal Statistics, Classification Method, Verification of Result).
Related Work
Wei-jiu Zhang, Li Mao and Wen-bo Xu [6], in "Automatic Image Classification Using the Classification Ant-Colony Algorithm", define an ant-colony based classification model to improve the versatility, robustness, and convergence rate of automatic image classification. The model adapts and improves the traditional ant-colony algorithm according to the characteristics of image classification. It defines two types of ants that have different search strategies and refreshing mechanisms. The stochastic ants identify new categories, construct the category tables, and determine the clustering centre of each category. Their experiments indicate that the ant-colony algorithm improves both the efficiency and the accuracy of the result.
D. Lu and Q. Weng [7] surveyed image classification methods and techniques. Image classification is a complex process that may be affected by many factors. They examine current practices, problems, and prospects of image classification, with emphasis on summarizing the major advanced classification approaches and the techniques used to improve classification accuracy.
Jipsa Kurian and V. Karunakaran [5] surveyed image classification methods and found that image classification is one of the most complex areas of image processing. An image is more difficult to classify when it contains blurry and noisy content. Several methods give good results on normal images but fail to give satisfactory results when the image is blurry or noisy. The two main approaches are supervised and unsupervised classification, each with its own advantages and disadvantages. It is harder to obtain good results on noisy and blurry images than on normal images.
Saurabh Agrawal, N. K. Verma, Prateek Tamrakar and Pradip Sircar [8] improve classification using a support vector machine. Traditional classification approaches perform poorly on content-based image classification tasks, one reason being the high dimensionality of the feature space. In their paper, color image classification is performed on features extracted from histograms of the color components. The benefits of color histograms are better efficiency and insensitivity to small changes in camera viewpoint, i.e. translation and rotation.
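A small sketch may help show why histogram features tolerate translation. It assumes 8-bit RGB pixels and a hypothetical bin count of four per channel; it is not the feature extractor of [8]. A cyclic shift stands in for a small translation here, which makes the invariance exact; with a real camera shift the border content changes slightly, so the histogram is only approximately preserved.

```python
def color_histogram(pixels, bins=4):
    """Concatenate per-channel histograms of 8-bit RGB pixels into one feature vector."""
    width = 256 // bins
    hist = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp the index so that values near 255 fall into the last bin.
            hist[channel * bins + min(value // width, bins - 1)] += 1
    total = len(pixels)
    return [count / total for count in hist]

# A tiny hypothetical image stored as rows of (R, G, B) tuples.
image = [
    [(255, 0, 0), (250, 10, 5)],
    [(0, 0, 255), (5, 5, 240)],
]
# Shift each row cyclically by one pixel to mimic a small translation.
shifted = [row[1:] + row[:1] for row in image]

flat = [p for row in image for p in row]
flat_shifted = [p for row in shifted for p in row]
features = color_histogram(flat)
print(features == color_histogram(flat_shifted))
```

The resulting vector has a fixed length (three channels times the number of bins) regardless of image size, which is what keeps the feature space small enough for an SVM.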
Qian Du [10] proposed a constrained linear discriminant analysis (CLDA) approach for classifying remotely sensed hyperspectral images. Its basic idea is to design an optimal linear transformation operator that maximizes the ratio of inter-class to intra-class distance while satisfying the constraint that the different class centres after transformation are aligned along different directions. Its major advantage over traditional Fisher's linear discriminant analysis is that the classification can be achieved simultaneously with the transformation. CLDA is a supervised approach, i.e. the class spectral signatures need to be known a priori. In practice, however, this information may be difficult or even impossible to obtain. The author therefore extends the CLDA algorithm into an unsupervised version, where the class spectral signatures are generated directly from an unknown image scene.
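For orientation, the ratio of inter-class to intra-class distance can be written in the form used by classical Fisher discriminant analysis. The symbols are assumptions introduced here: w is the projection direction, mu_c and n_c are the mean and size of class c, and mu is the overall mean. This is the unconstrained Fisher criterion only; the additional CLDA constraint on the directions of the class centres is not reproduced.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \,\mathbf{w}}{\mathbf{w}^{\top} S_W \,\mathbf{w}},
\qquad
S_B = \sum_{c} n_c \,(\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^{\top},
\qquad
S_W = \sum_{c} \sum_{\mathbf{x} \in c} (\mathbf{x} - \boldsymbol{\mu}_c)(\mathbf{x} - \boldsymbol{\mu}_c)^{\top}
```

Maximizing J favours directions along which the class means are far apart relative to the scatter within each class.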
Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh and H. S. Yazdi [2] proposed an improved Directed Acyclic Graph Support Vector Machine (DAG SVM): a weighted multi-class classification technique that divides the input space into several subspaces. In the training phase, a DAG SVM is trained for each subspace and its probability density function (pdf) is estimated. In the test phase, the degree to which each input pattern fits each subspace is calculated using the pdf of the subspace and is used as the weight of the corresponding DAG SVM. Finally, a fusion operation is defined and applied to the DAG SVM outputs to decide the class label of the input pattern. The authors' evaluation shows that their method outperforms standard DAG SVM in multi-class classification.
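The basic DAG evaluation that this method builds on can be sketched as follows. The sketch covers only the plain decision DAG, not the weighted, subspace-based fusion of [2]. The pairwise rule `nearest_centre_rule` and the class centres are hypothetical stand-ins for trained binary SVMs.

```python
def dag_predict(x, classes, prefers_first):
    """Walk a decision DAG: each node tests the first against the last remaining
    class and eliminates the loser, so k classes need k - 1 binary decisions."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefers_first(x, first, last):
            remaining.pop()      # 'last' is eliminated
        else:
            remaining.pop(0)     # 'first' is eliminated
    return remaining[0]

# Stand-in for trained pairwise SVMs (an assumption for illustration):
# prefer the class whose hypothetical 1-D centre is closer to x.
CENTRES = {"A": 0.0, "B": 5.0, "C": 10.0}

def nearest_centre_rule(x, a, b):
    return abs(x - CENTRES[a]) <= abs(x - CENTRES[b])

label = dag_predict(6.0, ["A", "B", "C"], nearest_centre_rule)
print(label)
```

Because each decision removes exactly one candidate class, the number of binary evaluations per input grows linearly with the number of classes, even though one binary classifier is trained per class pair.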
S. Moustakidis, G. Mallinis, N. Koutsias and John B. Theocharis [11] propose a fuzzy decision tree (FDT) in which the node discriminations are implemented via binary SVMs. The tree structure is determined via a class grouping algorithm. In addition, effective feature selection is incorporated within the tree building process, selecting the feature subsets required for each node discrimination individually. FDT-SVM exhibits a number of attractive merits such as enhanced classification accuracy, an interpretable hierarchy, and low model complexity. Furthermore, it provides hierarchical image segmentation and has reasonably low computational and data storage demands.
L. Zhang, L. Wang and W. Lin [12] discuss semi-supervised biased maximum margin analysis for interactive image retrieval. A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of content-based image retrieval (CBIR) systems. Among the various RF approaches, support vector machine (SVM) based RF is one of the most popular techniques in CBIR. Despite its success, directly using SVM as an RF scheme has two main drawbacks. First, it treats positive and negative feedback equally, which is not appropriate since the two groups of training feedback have distinct properties. Second, most SVM-based RF techniques do not take the unlabelled samples into account, although they are very helpful in constructing a good classifier. To overcome these two drawbacks, the authors propose a biased maximum margin analysis (BMMA) and a semi-supervised BMMA (Semi BMMA), which integrate the distinct properties of the feedback and use the information in unlabelled samples for SVM-based RF schemes. BMMA differentiates positive feedback from negative feedback based on local analysis, whereas Semi BMMA integrates information from unlabelled samples by introducing a Laplacian regularizer into BMMA. They formulate the problem as a general subspace learning task and propose an automatic approach for determining the dimensionality of the embedded subspace for RF. Extensive experiments on a large real-world image database demonstrate that the proposed scheme combined with SVM RF can significantly improve the performance of CBIR systems.
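A Laplacian regularizer is usually a graph smoothness penalty of the following general form. The notation is an assumption introduced here, not taken from [12]: W is a symmetric similarity matrix over labelled and unlabelled samples, f_i is the value the learned function assigns to sample i, and D is the diagonal degree matrix. The exact objective of Semi BMMA is not reproduced.

```latex
\frac{1}{2} \sum_{i,j} W_{ij}\,(f_i - f_j)^2 = \mathbf{f}^{\top} L \,\mathbf{f},
\qquad L = D - W, \qquad D_{ii} = \sum_{j} W_{ij}
```

The penalty is small when similar samples (large W_ij) receive similar values, which is how unlabelled samples can shape the solution without having labels.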
Ajay Kumar Singh, Shamik Tiwari and V. P. Shukla [13], in "Wavelet based Multi Class Image Classification Using Neural Network", perform feature extraction and classification of multi-class images using the Haar wavelet transform and a back-propagation neural network. The wavelet features are extracted from the original texture images and the corresponding complementary images. The features are made up of different combinations of sub-band images, which offer a better discriminating strategy for image classification and enhance the classification rate.
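As a rough illustration of what the sub-band images are, the following sketch computes one level of a 2-D Haar transform. It assumes the unnormalized average/difference form of the Haar filters and an image with even width and height; the sub-band names follow one common convention, and the feature combinations used in [13] are not reproduced.

```python
def haar_pairs(values):
    """One Haar step on a sequence of even length: pairwise averages, then differences."""
    averages = [(a + b) / 2 for a, b in zip(values[0::2], values[1::2])]
    details = [(a - b) / 2 for a, b in zip(values[0::2], values[1::2])]
    return averages, details

def haar_2d(image):
    """One-level 2-D Haar transform of an image with even width and height.
    Returns four sub-bands as lists of rows. Naming used here: the first letter
    is the filter along rows, the second the filter along columns (L low, H high)."""
    # Transform every row into a low-pass half and a high-pass half.
    low_rows, high_rows = zip(*(haar_pairs(row) for row in image))

    def transform_columns(rows):
        # Transpose, transform each column, and transpose the two halves back.
        low_cols, high_cols = zip(*(haar_pairs(col) for col in zip(*rows)))
        return [list(r) for r in zip(*low_cols)], [list(r) for r in zip(*high_cols)]

    ll, lh = transform_columns(low_rows)
    hl, hh = transform_columns(high_rows)
    return ll, lh, hl, hh

# Hypothetical 4x4 grey-level image.
image = [
    [4, 6, 10, 12],
    [4, 6, 10, 12],
    [8, 8, 2, 2],
    [8, 8, 2, 2],
]
ll, lh, hl, hh = haar_2d(image)
print(ll, hl)
```

The LL band is a half-resolution approximation of the image, while the other three bands hold detail in different orientations; statistics of these bands are typical inputs to a classifier.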
Discussion
The related work shows that all of these methods classify images efficiently. The semi-supervised BMMA method, however, appears more efficient than the others for two reasons. First, it is a semi-supervised approach, which gives accurate and cost-effective results. Second, BMMA overcomes the drawbacks of SVM-based RF.
Comparison between Data Sampling Methods
Supervised Classification
Supervised classification depends on data created from domain knowledge: it uses labelled data points. To categorize an image accurately, pre-labelled samples are required. Because training and expert knowledge are needed, the technique is time consuming, which is why it is unsuitable in some areas. To determine a decision rule for classification, it is necessary to know the spectral characteristics or features of the population of each class [4].
Advantages
- Errors can be detected by operators, who can often remedy them [5].
- Expert knowledge is used, so the method gives accurate results.
Disadvantages
- Not suitable for big data, because each area requires area experts.
- Very time consuming: identifying pre-labelled samples takes a long time.
Unsupervised Classification
Some situations provide little information about the area to be classified, so only image properties are used, as follows:
- Randomly sampled data are divided automatically into several groups of similar samples by clustering techniques.
- These clustered classes are later used to determine population statistics.
This kind of classification is called unsupervised classification.
Advantages
- Scientists spend less time classifying the domain, since only the required images are classified.
- The approach is well suited to classifying large data sets.
Disadvantages
- No training is given in this method, so it requires considerable knowledge about the area or about which method suits the desired area.
- With large data sets the computation time is large, and the resulting classifier may be useless.
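The clustering step of unsupervised classification can be sketched with k-means, one common clustering technique; the source does not name a specific algorithm, so this choice is an assumption. The sample values are hypothetical, and the initial centres are fixed rather than random so that the example is repeatable.

```python
def k_means(points, centres, iterations=10):
    """Plain k-means: alternate between assigning points to the nearest centre
    and moving each centre to the mean of its assigned points."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centres)), key=lambda k: sq_dist(p, centres[k]))
            for p in points
        ]
        new_centres = []
        for k, old in enumerate(centres):
            members = [p for p, a in zip(points, assignment) if a == k]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            )
        if new_centres == centres:
            break  # converged: no centre moved
        centres = new_centres
    return centres, assignment

# Hypothetical unlabelled two-feature samples forming two loose groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centres, assignment = k_means(points, centres=[(0.0, 0.0), (6.0, 6.0)])
print(assignment)
```

The cluster means returned here play the role of the population statistics mentioned above; giving the clusters meaningful names still requires knowledge of the area.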
Semi Supervised Classification
This method uses unlabelled samples to assist supervised classification, and so draws on unsupervised classification as well. It is suited to situations where unlabelled data points are abundant and labelled ones are scarce. When neither supervised nor unsupervised methods give good results, the semi-supervised approach can give accurate results while remaining efficient, which is its guiding principle. Semi-supervised classification proceeds in three steps. First, it selects the labelled and unlabelled data points (data point selection). Second, it creates an initial classifier, which is used in the third step. Third, it clusters the data points to find the final classifier. The semi-supervised technique suits many applications and gives accurate results.
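The three steps can be sketched with a simple self-training loop. This is an illustrative assumption, not the procedure of [3] or of Semi BMMA: the classifier is a nearest-centroid rule on one-dimensional features, and all values and labels are hypothetical.

```python
def centroid(values):
    return sum(values) / len(values)

def self_train(labelled, unlabelled, rounds=5):
    """Nearest-centroid self-training on 1-D features.
    labelled: list of (value, label) pairs; unlabelled: list of values."""
    # Step 2: initial classifier built from the labelled points only.
    labels = sorted({label for _, label in labelled})
    centres = {
        label: centroid([v for v, lab in labelled if lab == label]) for label in labels
    }
    for _ in range(rounds):
        # Step 3: pseudo-label the unlabelled points with the current classifier,
        # then refit the centres on labelled and pseudo-labelled points together.
        pseudo = [
            (v, min(labels, key=lambda lab: abs(v - centres[lab]))) for v in unlabelled
        ]
        combined = labelled + pseudo
        centres = {
            label: centroid([v for v, lab in combined if lab == label]) for label in labels
        }
    return centres

# Step 1: select a few labelled points and many unlabelled ones (hypothetical values).
labelled = [(1.0, "dark"), (9.0, "bright")]
unlabelled = [0.5, 1.5, 2.0, 2.5, 7.0, 7.5, 8.0, 8.5]
centres = self_train(labelled, unlabelled)
print(centres)
```

With only one labelled point per class the initial centres sit at 1.0 and 9.0; the unlabelled points pull them towards the actual group centres, which is the benefit the semi-supervised approach aims for.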
Advantages
- The cost of classification decreases because the labelled data are combined with plentiful unlabelled data.
- Focuses on accurate results and efficiency.
Disadvantages
- Global maximum problem.
This comparison suggests that semi-supervised classification performs better than both supervised and unsupervised classification.
Comparison between Classification Methods
| S. No. | Method | Description | Disadvantages |
|---|---|---|---|
| 1 | ANN | An Artificial Neural Network (ANN) is a kind of artificial intelligence that imitates some functions of the human mind. It is a non-parametric approach, i.e. it makes no assumption about the data, and its correctness depends on the number of inputs and on the network. An ANN learns from its environment and stores the experiential knowledge. It is a collection of layers: basically an input layer and an output layer, although some systems also have hidden layers. Every layer has a number of neurons, which are connected to each other by weighted links. | An ANN requires training, which is costly and time consuming. Because an ANN is defined by its network architecture, it is sometimes very hard to choose which network is most suitable for the approach at hand. |
| 2 | DAG-SVM | The Directed Acyclic Graph based Support Vector Machine performs better classification in comparison with other binary-based multi-class classification schemes. The feature data are mapped by the graph portion technique applied by the DAG. When the feature space is mapped correctly, the voting process of classification improves automatically. | DAG has some problems with mapping the data space into the feature selection process. The performance evaluation shows that DAG-SVM is not a better classifier. |
| 3 | SVM | SVM is a binary, non-parametric classifier, and some SVMs also support multi-class classification. SVMs [14] are learning systems that use a hypothesis space of linear functions in a hyperspace, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. The aim of classification via SVM is to find a computationally efficient way of learning good separating hyperplanes in a hyperspace, where good hyperplanes are those that optimize the generalization bounds and computationally efficient means able to deal with very large sample sizes [8]. | Training is required, which is time consuming. The final result has very little transparency. |
| 4 | FDT | The Fuzzy Decision Tree (FDT) approach is a non-parametric, unsupervised approach based on a hierarchical rule-based method. The fuzzy nature of the method makes it more reliable, because fuzzy logic uses a stochastic approach. FDT combines the advantages of both fuzzy methods and decision trees. | The main disadvantage of this method is that it does not involve training, so prior knowledge about the desired area is required. The calculations become complicated when various undecided outputs are correlated. |
| 5 | BMMA | Semi-supervised BMMA enhances the content-based image retrieval approach. Semi BMMA easily distinguishes between positive and negative feedback, based on local analysis. It also uses the benefits of the semi-supervised approach: Semi BMMA forms the RF scheme by incorporating unlabelled samples, and it removes the over-fitting problem of the labelled samples. | Semi-supervised BMMA has many advantages, but its main problem is that it suffers from the global maximum problem. |
Conclusion
This paper discusses image classification techniques and methods and compares them. The main approaches fall into three categories: supervised, unsupervised, and semi-supervised classification. Among the related methods, ANN and SVM are supervised approaches and have some disadvantages. FDT is an unsupervised classification method with its own advantages and disadvantages. BMMA is a semi-supervised approach and appears more suitable than the other methods because it takes advantage of both supervised and unsupervised techniques.
References
[1] Pooja Kamavisdar, Sonam Saluja, Sonu Agrawal, A survey on image classification approaches and techniques, Department of Computer Science & Applications, SSCST, Bhilai, India, IJARCCE, Vol. 2, Issue 1, Jan 2013.
[2] Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh, H. S. Yazdi, Improved DAG SVM: a new method for multiclass SVM classification, Department of Computer Science & Engineering, Ferdowsi University of Mashhad, Iran, ICAI09I.
[3] Xiaojin Zhu, Semi-supervised learning literature survey, Department of Computer Science & Engg, University of Wisconsin-Madison, 1530, 2005, http://www.cs.wise.edu/~jerryzhu/pub/ss1_survey.pdf
[4] http://www.jars1974.net/pdf/12_Chapter11.pdf
[5] Jipsa Kurian, V. Karunakaran, A survey on image classification methods, Department of Computer Science, Karunya University, Coimbatore, India, IJARECE, Volume 1, Issue 4, Oct 2012.
[6] Wei-jiu Zhang, Li Mao and Wen-bo Xu, Automatic image classification using the classification ant-colony algorithm, Jiangnan University, Wuxi, China, 2009 International Conference on Environmental Science and Information Application Technology.
[7] D. Lu and Q. Weng, A survey of image classification methods and techniques for improving classification performance, Department of Geography, Geology, and Anthropology, Indiana State University, USA, IJRS, Vol. 28, No. 5, 10 March 2007.
[8] Saurabh Agrawal, Nishchal K Verma, Prateek Tamrakar, Pradip Sircar, Content based color image classification using SVM, Department of Electrical Engg., IIT Kanpur, India, 2011 8th International Conference on Information Technology.
[9] S. Lazebnik, C. Schmid and J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, IEEE Conf. Computer Vision and Pattern Recognition, Vol. 2, pp. 2169-2178, March 2006.
[10] Qian Du, Unsupervised real time constrained linear discriminant analysis to hyperspectral image classification, Department of Electrical & Computer Engg, Mississippi State University, USA, www.sciencedirect.com, Pattern Recognition (2005) 361-368.
[11] Serafeim Moustakidis, Giorgos Mallinis, Nikos Koutsias, John B. Theocharis and Vassilios Petridis, SVM based fuzzy decision tree for classification of high spatial resolution remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, Vol. 50, No. 1, January 2012.
[12] Lining Zhang, Lipo Wang, Weisi Lin, Semi supervised biased maximum margin analysis for interactive image retrieval, IEEE Transactions on Image Processing, Vol. 21, No. 4, April 2012.
[13] M. Pesaresi and J. A. Benediktsson, A new approach for the morphological segmentation of high-resolution satellite imagery, IEEE Trans. Geosci. Remote Sens., Vol. 39, No. 2, pp. 309-320, Feb 2001.
[14] Christopher J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2, pp. 121-167, 1998.