Review of Image Classification Methods and Techniques

Maneela Jain; Pushpendra Singh Tomar

doi:10.17577/IJERTV2IS80157

Volume 02, Issue 08 (August 2013)


Open Access
Paper ID : IJERTV2IS80157
Published (First Online): 16-08-2013
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


LNCT, Bhopal

Abstract

Unsupervised classification has become one of the most challenging areas in image processing. Classifying data, objects and images is an important task in this field, and it becomes very difficult when an image is noisy or blurry. Classification is simply the grouping of data of the same kind into the same category. Digital image processing offers many techniques for classifying data, but most of them fail to give satisfactory results on blurry or noisy images. This survey considers three main classification approaches: supervised learning, unsupervised learning and semi-supervised learning. Its main aim is to give a brief comparison of different image classification techniques and methods. The survey concludes that semi-supervised Biased Maximum Margin Analysis (BMMA) classifies images more accurately than the other methods reviewed, even when the images are blurry or noisy.

Introduction

Every day millions of images are produced. Each image needs to be classified so that it can be found easily and quickly. Humans classify images far more easily than computers. A simple classification system consists of a camera fixed high above the zone of interest, where images are captured and subsequently processed [1]. Classification is a procedure that sorts images into several categories based on their similarities. Classifying images helps us understand and analyse our surroundings. It is not always easy, however, especially when an image contains noisy or blurry content. In a classification system the user works with a database of patterns or images that are either predefined or still to be classified. Image classification is a difficult but important task for many applications.

It is sometimes very hard to identify an object in an image, especially when the image contains noise or background clutter or is of poor quality. The task becomes harder still when the image includes more than one object. The main principle of image classification is therefore to recognize the features occurring in an image. This paper discusses three major image classification techniques and some related methods.

The first technique is supervised classification, which uses labelled data points; in other words, supervised learning requires training. The second is unsupervised classification, which uses no labelled data and therefore needs no training; any random sample of the data can be used. The third is semi-supervised classification, which has several advantages over both supervised and unsupervised classification. It uses unlabelled data points to remove the need for extensive interaction with domain scientists and to deal with the bias that results from labelled data that represent the population poorly. The account of semi-supervised learning in this survey is taken from [3]. Classification is generally done by computer, with the help of different mathematical techniques. It proceeds according to the following steps, which are shown in Figure 1:

Definition of Classification Classes

Classification classes depend on the properties of the image and on the objective of the classification, and they should be clearly defined.

Selection of Features

Multi-spectral and multi-temporal properties should be used to establish the differences between the classes, since features differ from one class to another.

Sampling of Training Data

Training data must be sampled in order to obtain a correct decision rule. Supervised, unsupervised or semi-supervised learning is then used, depending on the data.

Estimation of Universal Statistics

Several classification techniques are compared on the data, and the appropriate method is selected.

Classification Method

The appropriate classification method is applied to the data. The methods discussed in this paper are SVM, DAG SVM, BMMA, linear discriminant analysis, ANN and fuzzy decision tree.

Verification of Result

Finally, the result is verified.
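The six steps above can be illustrated with a very small sketch. It is not taken from any of the surveyed papers: the class names, the two features and the minimum-distance rule are assumptions chosen only to keep the example short.

```python
import math

# Step 1: define the classes (hypothetical labels for illustration).
CLASSES = ("water", "vegetation")

# Steps 2-3: choose features and sample labelled training data.
# Each sample is a made-up (red, near-infrared) reflectance pair.
TRAINING = {
    "water": [(0.10, 0.05), (0.12, 0.08), (0.08, 0.06)],
    "vegetation": [(0.20, 0.70), (0.25, 0.65), (0.22, 0.75)],
}


def class_means(training):
    """Step 4: estimate per-class statistics (here only the mean vector)."""
    means = {}
    for label, samples in training.items():
        count = len(samples)
        means[label] = tuple(sum(values) / count for values in zip(*samples))
    return means


def classify(features, means):
    """Step 5: assign the class whose mean is closest (minimum-distance rule)."""
    return min(means, key=lambda label: math.dist(features, means[label]))


def accuracy(test_set, means):
    """Step 6: verify the result on held-out labelled samples."""
    correct = sum(classify(x, means) == y for x, y in test_set)
    return correct / len(test_set)


MEANS = class_means(TRAINING)
TEST_SET = [((0.09, 0.07), "water"), ((0.23, 0.68), "vegetation")]
print(classify((0.11, 0.06), MEANS), accuracy(TEST_SET, MEANS))
```

Any of the classifiers discussed later could replace the minimum-distance rule in step 5 without changing the other steps.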

The rest of this paper is organized as follows. Section 2 examines related survey work to show which classification method is most suitable. Section 3 compares the three data sampling methods, i.e. supervised, unsupervised and semi-supervised, to show which is most suitable. Section 4 compares the other classification methods.

Figure 1: Steps of image classification: Definition of Classification Classes, Selection of Features, Sampling of Training Data, Estimation of Universal Statistics, Classification Method, Verification of Result.

Related Work

Wei-jiu Zhang, Li Mao and Wen-bo Xu [6], in "Automatic Image Classification Using the Classification Ant-Colony Algorithm", define an ant-colony based classification to improve the versatility, robustness and convergence rate of automatic image classification. The model adapts and improves the traditional ant-colony algorithm according to the characteristics of image classification. It defines two types of ants that have different search strategies and refreshing mechanisms. The stochastic ants identify new categories, construct the category tables and determine the clustering centre of each category. The experiments indicate that the ant-colony algorithm improves both the efficiency and the accuracy of the result.

D. Lu and Q. Weng [7] surveyed image classification methods and techniques. Image classification is a complex process that may be affected by many factors. They examine current practices, problems and prospects of image classification, with emphasis on summarizing the major advanced classification approaches and the techniques used to improve classification accuracy.

Jipsa Kurian and V. Karunakaran [5] surveyed image classification methods and found image classification to be one of the most complex areas in image processing, all the more so when an image contains blurry and noisy content. Several methods classify ordinary images well but fail to give satisfactory results when the image is blurry or noisy. The two main approaches to image classification are supervised and unsupervised classification, and each has its own advantages and disadvantages. It is harder to obtain good results with noisy and blurry images than with normal images.

Saurabh Agrawal, Nishchal K. Verma, Prateek Tamrakar and Pradip Sircar [8] improve classification using a support vector machine. Traditional classification approaches perform poorly on content-based image classification tasks, one reason being the high dimensionality of the feature space. In their paper, color images are classified on features extracted from histograms of the color components. The benefits of color histograms are better efficiency and insensitivity to small changes in camera viewpoint, i.e. translation and rotation.
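The histogram features described above can be sketched as follows. This is only an illustration of the idea, not the feature extractor of [8]: the bin count, the toy pixel list and the function name `channel_histograms` are assumptions.

```python
def channel_histograms(pixels, bins=4):
    """Return normalized per-channel histograms for (r, g, b) pixels in 0-255."""
    width = 256 // bins
    counts = [[0] * bins for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            counts[channel][min(value // width, bins - 1)] += 1
    total = len(pixels)
    # Concatenate the three channel histograms into one feature vector.
    return [count / total for channel in counts for count in channel]


image = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (5, 30, 250)]
shifted = image[2:] + image[:2]  # the same pixels at different positions

features = channel_histograms(image)
print(features)
print(features == channel_histograms(shifted))
```

Because a histogram only counts pixel values, moving the same pixels to other positions leaves the feature vector unchanged, which is the insensitivity to translation mentioned above. A vector of this kind would then be passed to the SVM.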

Qian Du [10] proposed a constrained linear discriminant analysis (CLDA) approach for classifying remotely sensed hyperspectral images. Its basic idea is to design an optimal linear transformation operator that maximizes the ratio of inter-class to intra-class distance while satisfying the constraint that the different class centres after transformation are aligned along different directions. Its major advantage over the traditional Fisher's linear discriminant analysis is that the classification can be achieved simultaneously with the transformation. CLDA is a supervised approach, i.e. the class spectral signatures need to be known a priori. In practice, however, this information may be difficult or even impossible to obtain. The author therefore extends the CLDA algorithm into an unsupervised version, in which the class spectral signatures are generated directly from an unknown image scene.
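The ratio of inter-class to intra-class distance mentioned above can be made concrete with a small sketch. It evaluates the plain Fisher criterion for two classes along a given direction; it does not implement the alignment constraint that distinguishes CLDA, and the sample values and function names are assumptions.

```python
def project(samples, direction):
    """Project each feature vector onto a direction (dot product)."""
    return [sum(x * w for x, w in zip(sample, direction)) for sample in samples]


def fisher_ratio(class_a, class_b, direction):
    """Squared distance between the projected class means divided by
    the summed within-class scatter of the projections."""
    proj_a, proj_b = project(class_a, direction), project(class_b, direction)
    mean_a = sum(proj_a) / len(proj_a)
    mean_b = sum(proj_b) / len(proj_b)
    within = sum((p - mean_a) ** 2 for p in proj_a)
    within += sum((p - mean_b) ** 2 for p in proj_b)
    return (mean_a - mean_b) ** 2 / within


# Made-up two-band samples: the classes differ in the first band only.
class_a = [(1.0, 2.0), (1.2, 3.0), (0.8, 4.0)]
class_b = [(3.0, 2.1), (3.2, 3.1), (2.8, 3.9)]

ratio_first_band = fisher_ratio(class_a, class_b, (1.0, 0.0))
ratio_second_band = fisher_ratio(class_a, class_b, (0.0, 1.0))
print(ratio_first_band, ratio_second_band)
```

A discriminant method searches for the direction with the largest ratio; here the first band separates the classes far better than the second.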

Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh and H. S. Yazdi [2] proposed an improved Directed Acyclic Graph Support Vector Machine (DAG SVM). It is a weighted multi-class classification technique that divides the input space into several subspaces. In the training phase, a DAG SVM is trained for each subspace and its probability density function (pdf) is estimated. In the test phase, the degree to which each input pattern fits every subspace is calculated using the pdf of the subspace, and this value serves as the weight of each DAG SVM. Finally, a fusion operation is defined and applied to the DAG SVM outputs to decide the class label of the given input pattern. The evaluation results show that the method outperforms the standard DAG SVM in multi-class classification.

S. Moustakidis, G. Mallinis, N. Koutsias and John B. Theocharis [11] propose a fuzzy decision tree (FDT) in which the node discriminations are implemented via binary SVMs. The tree structure is determined via a class grouping algorithm. In addition, effective feature selection is incorporated within the tree building process, selecting the feature subsets required for each node discrimination individually. FDT-SVM exhibits a number of attractive merits such as enhanced classification accuracy, an interpretable hierarchy and low model complexity. Furthermore, it provides hierarchical image segmentation and has reasonably low computational and data storage demands.
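How a plain DAG SVM reaches a multi-class decision from binary classifiers can be sketched as below. The sketch shows only the basic DAG evaluation, not the weighted subspace fusion of [2], and the pairwise rule is a nearest-centroid stand-in for trained binary SVMs; the class names and centroids are hypothetical.

```python
import math

# Hypothetical class centroids. In a real DAG SVM every pair of classes
# would have its own trained binary SVM instead of this stand-in rule.
CENTROIDS = {"sky": (0.2, 0.8), "grass": (0.3, 0.2), "sand": (0.9, 0.6)}


def pairwise_winner(x, class_a, class_b):
    """Stand-in binary classifier: pick the class with the nearer centroid."""
    dist_a = math.dist(x, CENTROIDS[class_a])
    dist_b = math.dist(x, CENTROIDS[class_b])
    return class_a if dist_a <= dist_b else class_b


def dag_predict(x, classes):
    """Walk the decision DAG: each node tests the first remaining class
    against the last one and eliminates the loser."""
    remaining = list(classes)
    tests = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise_winner(x, first, last) == first:
            remaining.pop()       # the last class is eliminated
        else:
            remaining.pop(0)      # the first class is eliminated
        tests += 1
    return remaining[0], tests


label, tests = dag_predict((0.85, 0.55), list(CENTROIDS))
print(label, tests)
```

With K classes the walk evaluates K - 1 binary classifiers, whereas voting over all pairs would evaluate K(K - 1)/2.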

L. Zhang, L. Wang and W. Lin [12] discuss semi-supervised biased maximum margin analysis for interactive image retrieval. A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of content-based image retrieval (CBIR) systems. Among the various RF approaches, support vector machine (SVM) based RF is one of the most popular techniques in CBIR. Despite its success, directly using SVM as an RF scheme has two main drawbacks. First, it treats positive and negative feedbacks equally, which is not appropriate since the two groups of training feedbacks have distinct properties. Second, most SVM-based RF techniques do not take the unlabelled samples into account, although they are very helpful in constructing a good classifier. To overcome these two drawbacks, the authors propose a biased maximum margin analysis (BMMA) and a semi-supervised BMMA (SemiBMMA), which integrate the distinct properties of the feedbacks and use the information of unlabelled samples for SVM-based RF schemes. BMMA differentiates positive feedbacks from negative ones based on local analysis, whereas SemiBMMA integrates the information of unlabelled samples by introducing a Laplacian regularizer to BMMA. The authors formulate the problem as a general subspace learning task and then propose an automatic approach for determining the dimensionality of the embedded subspace for RF. Extensive experiments on a large real-world image database demonstrate that the proposed scheme combined with SVM RF can significantly improve the performance of CBIR systems.

Ajay Kumar Singh, Shamik Tiwari and V. P. Shukla [13], in "Wavelet based Multi Class image classification using Neural Network", present feature extraction and classification of multiclass images using the Haar wavelet transform and a back-propagation neural network. The wavelet features are extracted from the original texture images and the corresponding complementary images. The features are made up of different combinations of sub-band images, which offer a better discriminating strategy for image classification and enhance the classification rate.
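A single level of the 2-D Haar transform, from which such sub-band features are built, can be sketched as follows. The averaging normalization (division by 4), the sub-band names and the mean-absolute-value feature are assumptions made for illustration and are not taken from [13].

```python
def haar_level(image):
    """One level of a 2-D Haar transform for an image with even height
    and width. Returns four sub-bands, each half the size of the input."""
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(image), 2):
        rows = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4)  # local average
            rows[1].append((a + b - d - e) / 4)  # top row minus bottom row
            rows[2].append((a - b + d - e) / 4)  # left column minus right column
            rows[3].append((a - b - d + e) / 4)  # diagonal difference
        for band, row in zip((approx, row_diff, col_diff, diag), rows):
            band.append(row)
    return approx, row_diff, col_diff, diag


def band_energy(band):
    """Mean absolute coefficient, a simple texture feature per sub-band."""
    values = [abs(v) for row in band for v in row]
    return sum(values) / len(values)


image = [
    [8, 8, 2, 2],
    [8, 8, 2, 2],
    [5, 1, 5, 1],
    [5, 1, 5, 1],
]
sub_bands = haar_level(image)
features = [band_energy(band) for band in sub_bands]
print(features)
```

The striped lower half of the toy image shows up only in the left-minus-right sub-band, which is the kind of texture cue a neural network classifier can then use.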

Discussion

The related work shows that all of these methods classify images efficiently. The semi-supervised BMMA method, however, is more efficient than the others for two reasons. First, it is a semi-supervised classification, which gives accurate and cost-effective results. Second, BMMA overcomes the disadvantages of SVM-based RF.

Comparison between Data Sampling Methods

Supervised Classification

Supervised classification depends on data created from knowledge of the domain. It uses labelled data points: pre-labelled samples are required to determine the accurate categorization of an image. Because training or expert knowledge is required, the technique is time consuming, which is why it is unsuitable in some areas. To determine a decision rule for classification, it is necessary to know the spectral characteristics or features of the population of each class [4].

Advantages

Errors can be detected by operators, who can often remedy them [5].
Expert knowledge is used, so the method gives accurate results.

Disadvantages
Not suitable for big data, because each area requires its own experts.
Very time consuming: identifying pre-labelled samples takes a long time.

Unsupervised Classification

Some situations call for classification when little information is available about the area to be classified, so only the image properties are used, as follows:
1. Randomly sampled data are divided automatically into groups of similar items by using clustering techniques.
2. These clustered classes are later used for determining population statistics. This kind of classification is called unsupervised classification.
Advantages
Scientists spend less time classifying the domain, since only the required images are classified.
The approach is well suited to classifying large data sets.

Disadvantages
No training is given in this method, so it requires considerable knowledge of the area or of the method that suits the desired area.

With large data sets the computation time is long and the resulting classifier may be useless.
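The clustering step of unsupervised classification can be sketched with k-means, one common clustering technique. The paper does not name a specific algorithm, so the choice of k-means, the deterministic initialization and the toy points are assumptions.

```python
import math


def kmeans(points, k, iterations=10):
    """Group unlabelled points into k clusters without any training labels."""
    # Assumption: the first k points serve as the initial cluster centres.
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels


POINTS = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
centroids, labels = kmeans(POINTS, 2)
print(labels)
print(centroids)
```

The resulting cluster centres play the role of the population statistics mentioned in step 2; a domain expert still has to decide what each cluster means.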

Semi Supervised Classification

This method uses non-labelled samples to assist the supervised classification method, and in this way it also draws on unsupervised classification. It can therefore deal with situations where labelled data points are scarce and unlabelled data points are abundant. Sometimes neither supervised nor unsupervised methods give efficient results, whereas the semi-supervised approach gives accurate results and focuses on efficiency, which is its guiding principle. Semi-supervised classification proceeds in three steps. First, it selects the labelled and unlabelled data points (data point selection). Second, it creates an initial classifier, which is used in the third step. Third, it clusters the data points to find the final classifier. The semi-supervised technique is well suited to many applications and gives accurate results.
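The three steps can be sketched with a simple cluster-and-refine scheme. This is one possible reading of the description, built on a nearest-centroid classifier; it is not BMMA, and the data, labels and function names are assumptions.

```python
import math


def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))


def predict(point, centroids):
    """Assign the label of the nearest class centroid."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


def refine(labelled, unlabelled, rounds=5):
    # Step 2: initial classifier built from the labelled points only.
    centroids = {label: mean(pts) for label, pts in labelled.items()}
    for _ in range(rounds):
        # Step 3: cluster the unlabelled points around the current
        # centroids and recompute each centroid from its enlarged group.
        groups = {label: list(pts) for label, pts in labelled.items()}
        for p in unlabelled:
            groups[predict(p, centroids)].append(p)
        centroids = {label: mean(pts) for label, pts in groups.items()}
    return centroids


# Step 1: select the data points (one labelled sample per class, six unlabelled).
labelled = {"a": [(0.0, 0.0)], "b": [(4.0, 0.0)]}
unlabelled = [(1.0, 0.0), (1.5, 0.2), (1.8, -0.1), (5.0, 0.0), (5.5, 0.1), (6.0, -0.2)]

initial = {label: mean(pts) for label, pts in labelled.items()}
refined = refine(labelled, unlabelled)
query = (2.4, 0.0)
print(predict(query, initial), predict(query, refined))
```

In this toy case the unlabelled points shift the decision boundary, so the query point receives a different label once they are taken into account.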

Advantages

The cost of classification decreases, because a small amount of labelled data is combined with abundant unlabelled data.
Focuses on accurate results and efficiency.

Disadvantages
Global maximum problem.

This comparison shows that semi-supervised classification performs much better than both supervised and unsupervised classification.

Comparison between Other Classification Methods

S. No.	Method	Description	Disadvantages
1	ANN	An Artificial Neural Network is a kind of artificial intelligence that imitates some functions of the human mind. It is a non-parametric approach: it makes no assumption about the data, and its correctness depends on the number of inputs and on the network. An ANN learns from the environment and stores the experiential knowledge. It is a collection of layers; basically it has two layers, input and output, but some systems also have hidden layers. Every layer has a number of neurons, which are connected with each other by weighted links.	An ANN requires training, which is costly and time consuming. Because an ANN is a network architecture, it is sometimes very hard to choose which network is most suitable for the approach at hand.
2	DAG-SVM	The Directed Acyclic Graph based Support Vector Machine performs better classification in comparison with other binary multi-class classification schemes. The feature data are mapped by the graph portion technique applied by the DAG. Correct mapping of the feature data automatically improves the voting process of classification.	DAG suffers from some problems with mapping the space data in the feature selection process. Performance evaluation shows that DAG-SVM is not a better classifier.
3	SVM	SVM is a binary, non-parametric classifier. Some SVMs also support multiclass classification. SVMs [14] are learning systems that use a hypothesis space of linear functions in a hyperspace. An SVM is trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. The aim of classification via SVM is to find a computationally efficient way of learning good separating hyperplanes in a hyperspace, where good hyperplanes are those that optimize the generalization bounds, and computationally efficient means able to deal with very large sample sizes [8].	Training is required, which is time consuming. The final result offers very little transparency.
4	FDT	The Fuzzy Decision Tree approach is a non-parametric, unsupervised approach based on a hierarchical rule-based method. The fuzzy nature of the method makes it more reliable, because fuzzy logic uses a stochastic approach. FDT combines the advantages of both methods, i.e. fuzzy logic and decision trees.	The main disadvantage of this method is that it does not use training, so prior knowledge about the desired area is required. It leads to complicated calculations when various undecided outputs are correlated.
5	BMMA	Semi-supervised BMMA enhances the content-based image retrieval approach. Semi BMMA easily distinguishes between positive and negative feedback, based on local analysis. It also uses the benefits of the semi-supervised approach: Semi BMMA forms RF by incorporating unlabelled samples and removes the over-fitting problem of labelled samples.	Semi-supervised BMMA has many advantages, but its main problem is that it suffers from the global maximum problem.

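The ANN description in the table (layers of neurons connected by weighted links) can be made concrete with a minimal forward pass. The weights below are hypothetical and untrained, and the sigmoid activation is an assumption; training by back-propagation is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias and applies a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, network):
    """Pass the inputs through every layer in turn."""
    activations = list(inputs)
    for weights, biases in network:
        activations = layer_output(activations, weights, biases)
    return activations


# Hypothetical, untrained weights: 2 inputs -> 2 hidden neurons -> 1 output.
NETWORK = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),
    ([[1.2, -0.7]], [0.05]),
]
output = forward([0.9, 0.1], NETWORK)
print(output)
```

The single output lies between 0 and 1 and could be thresholded to choose between two classes; choosing the number of layers and neurons is the architecture problem listed as a disadvantage in the table.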

Conclusion

This paper discusses image classification techniques and methods and gives detailed information about them. The main classification techniques fall into three categories: supervised, unsupervised and semi-supervised classification. Among the related methods, ANN and SVM are supervised approaches and have some disadvantages. FDT is an unsupervised classification method with its own advantages and disadvantages. BMMA is a semi-supervised approach and is more suitable than all the other methods because it takes advantage of both supervised and unsupervised techniques.

References

[1] Pooja Kamavisdar, Sonam Saluja, Sonu Agrawal, A survey on image classification approaches and techniques, Department of Computer Science & Applications, SSCST, Bhilai, India, IJARCCE, Vol. 2, Issue 1, Jan 2013.
[2] Mostafa Sabzekar, Mohammad Ghasemigol, Mahmoud Naghibzadeh, H. S. Yazdi, Improved DAG SVM: a new method for multiclass SVM classification, Department of Computer Science & Engineering, Ferdowsi University of Mashhad, Iran, ICAI09I.
[3] Xiaojin Zhu, Semi supervised learning literature survey, Department of Computer Science & Engg., University of Wisconsin-Madison, 1530, 2005, http://www.cs.wise.edu/~jerryzhu/pub/ss1_survey.pdf
[4] http://www.jars1974.net/pdf/12_Chapter11.pdf
[5] Jipsa Kurian, V. Karunakaran, A survey on image classification methods, Department of Computer Science, Karunya University, Coimbatore, India, IJARECE, Volume 1, Issue 4, Oct 2012.
[6] Wei-jiu Zhang, Li Mao and Wen-bo Xu, Automatic image classification using the classification ant-colony algorithm, Jiangnan University, Wuxi, China, 2009 International Conference on Environmental Science and Information Application Technology.
[7] D. Lu and Q. Weng, A survey of image classification methods and techniques for improving classification performance, Department of Geography, Geology, and Anthropology, Indiana State University, USA, IJRS, Vol. 28, No. 5, 10 March 2007.
[8] Saurabh Agrawal, Nishchal K. Verma, Prateek Tamrakar, Pradip Sircar, Content based color image classification using SVM, Department of Electrical Engg., IIT Kanpur, India, 2011 8th International Conference on Information Technology.
[9] S. Lazebnik, C. Schmid and J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, IEEE Conf. Computer Vis. Pattern Recognition, Vol. 2, pp. 2169-2178, March 2006.
[10] Qian Du, Unsupervised real time constrained linear discriminant analysis to hyperspectral image classification, Department of Electrical & Computer Engg., Mississippi State University, USA, www.sciencedirect.com, Pattern Recognition (2005) 361-368.
[11] Serafeim Moustakidis, Giorgos Mallinis, Nikos Koutsias, John B. Theocharis, Member IEEE, and Vesalius Petridis, Member IEEE, SVM based fuzzy decision tree for classification of high spatial resolution remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, Vol. 50, No. 1, January 2012.
[12] Lining Zhang, Student Member, IEEE, Lipo Wang, Senior Member, IEEE, Weisi Lin, Senior Member, IEEE, Semi supervised biased maximum margin analysis for interactive image retrieval, IEEE Transactions on Image Processing, Vol. 21, No. 4, April 2012.
[13] M. Pesaresi and J. A. Benediktsson, A new approach for the morphological segmentation of high-resolution satellite imagery, IEEE Trans. Geosci. Remote Sens., Vol. 39, No. 2, pp. 309-320, Feb 2001.
[14] Christopher J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2, pp. 121-167, 1998.
