- Open Access
- Authors : Lisha Singla, Shamandeep Singh
- Paper ID : IJERTV3IS061240
- Volume & Issue : Volume 03, Issue 06 (June 2014)
- Published (First Online): 26-06-2014
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Offline Handwritten Devanagari Numerals Recognition using GLCM Features & Neural Networks
Lisha Singla
Dept. of Computer Science and Engineering Chandigarh University
Shamandeep Singh
Dept. of Computer Science and Engineering Chandigarh University
Abstract: A handwritten Devanagari numeral recognition system using GLCM features and a neural network is presented in this paper. Handwritten character recognition is one of the most fascinating and challenging research areas in the field of image processing. It is the ability of a computer to receive and interpret handwritten input from various sources and convert it into an editable text format. Feature extraction plays an important role in handwritten character recognition because of its effect on the capability of classifiers. In this paper, the GLCM (Gray Level Co-occurrence Matrix) technique is used for feature extraction. The features of a numeral are computed by counting how often pairs of pixels with specific values occur in a specified spatial relationship in the image. Recognition and classification of the numerals are then performed by a neural network. The recognition rate of the proposed system has been found to be quite high.
Keywords: Handwritten Character Recognition; Devanagari Numerals; Feature Extraction; Gray Level Co-occurrence Matrix (GLCM); Neural Networks
-
INTRODUCTION
Handwriting has long been a means by which people communicate with each other. Although e-mail is growing very fast, postal letters retain their own importance. Optical character recognition is the field of automatic recognition of characters from a document image. It converts a machine-printed, hand-printed or handwritten document into an editable text format. Handwriting recognition is one of the most interesting and challenging research areas in the field of pattern recognition and process automation [5]. Several research works have focused on new techniques and methods with the aim of achieving higher recognition accuracy in this area.
In general, handwriting recognition can be categorized into two types: on-line and off-line. In an on-line recognition system, handwriting is captured with a special pen used in conjunction with an electronic surface. In off-line recognition, handwriting is usually captured by optically scanning the input from a surface such as a sheet of paper and is stored digitally. On-line recognition is easier than off-line recognition because of the temporal information available in the former.
Fig 1: Types of Handwriting Recognition
-
Applications
Handwritten numeral recognition has numerous applications [3], including postal address recognition, bank cheque processing, document verification, job application form sorting, automatic scoring of tests containing multiple choice questions, and data entry.
-
Devanagari Script
A lot of research has been done in handwriting recognition for English, European and Chinese languages, but there is still a need to carry out research in Indian languages. Devanagari is the most widely used Indian script; it is used to write many languages such as Hindi, Nepali, Marathi, Sindhi and Sanskrit, and around 300 million people use it [1]. This work specifically considers the recognition of offline handwritten Devanagari numerals. Sample images of handwritten Devanagari numerals are shown in Fig 2. Because human handwriting styles are diverse, the input characters can vary in size, thickness, orientation, shape, dimensions and format. These variations make the recognition task difficult.
Fig 2: Sample handwritten Devanagari numerals
-
Recent work
In recent years many attempts have been made to develop text recognition systems for almost all the scripts and languages of the world. For the Devanagari script, attempts were made as early as 1977 in a research report on constrained hand-printed Devanagari characters [11]. Since then, researchers have done a lot of work on the Devanagari script using different feature extraction methods and different classifiers.
Sutasinee Iamsa-at et al. [4] used HOG (Histogram of Oriented Gradients) features in deep learning of artificial neural networks for the recognition of Devanagari numerals. The system was found to exhibit an accuracy of 78.4% on raw pixels, while 80% accuracy was achieved with HOG features. Ved Prakash Agnihotri [3] used diagonal-based feature extraction for handwritten Devanagari script. Neha Sahu et al. [7] mainly used an artificial neural network technique to design a system that preprocesses, segments and recognizes Devanagari characters; this system exhibits an accuracy of 75.6% on noisy characters. Shruti Agarwal et al. used a template matching algorithm to recognize Devanagari characters. Vikas J. Dongre [1] presented a Devanagari numeral recognition method based on statistical discriminant functions, i.e. linear, quadratic, diaglinear, diagquadratic and Mahalanobis, and obtained an average accuracy of 78.87%. Ujjwal Bhattacharya et al. [2] presented a research report dealing with the problem of isolated handwritten numeral recognition for major Indian scripts. They proposed a scheme in which a numeral is subjected to three MLP classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurs at the highest resolution, another MLP is used as the final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages.
Various feature extraction methods and classifiers have been proposed in the literature for the classification of handwritten numerals and characters. Feature extraction methods include chain code, Gabor filter, Hough transform, moments, structural features and principal component analysis. Classifiers include minimum mean distance, k-nearest neighbor (KNN), support vector machines (SVM), neural networks (NN), fuzzy-based approaches and their combinations [7].
-
-
PROPOSED HANDWRITTEN DEVANAGARI NUMERAL RECOGNITION SYSTEM
In this section the proposed recognition system is described. A typical handwriting recognition system consists of image acquisition, image pre-processing, feature extraction, recognition and classification, and post-processing stages. A schematic diagram of the proposed recognition system is shown in Fig 3.
Fig 3: Flow process of handwritten Devanagari numerals recognition system
-
Image Acquisition
In image acquisition, the recognition system acquires a scanned image as its input. The image should be in a specific format such as .jpeg or .bmp. It is acquired through a scanner, digital camera or any other suitable digital input device. The present investigation used an off-line handwritten Devanagari numerals dataset [2].
-
Image Pre-Processing
Pre-processing is a series of operations performed on the scanned image; it can also be defined as cleaning the document image [5] and making it appropriate for the feature extraction step. The major steps under pre-processing are:
- Normalization
- Noise Removal
- Thinning
- Normalization: All input images must have a standard dimension for meaningful comparison of the features, so the images are normalized. In the present work, all data were resized to a pre-determined scale of 32 × 32 pixels.
- Image Thinning: Thinning is the transformation of a digital image into a simplified but topologically equivalent image. Image thinning is done to minimize processing time.
After performing the above pre-processing steps, the image is ready for the extraction of features.
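The normalization step can be illustrated with a minimal Python sketch. The paper's implementation was in MATLAB and its resizing method is not stated, so the nearest-neighbour sampling, the function name `resize_nearest` and the synthetic test image below are all assumptions made for illustration.

```python
import numpy as np

def resize_nearest(image, size=(32, 32)):
    """Resize a 2-D grayscale image to `size` by nearest-neighbour sampling."""
    image = np.asarray(image)
    rows, cols = image.shape
    out_rows, out_cols = size
    # Map every output row/column back to the source row/column it samples.
    row_idx = np.arange(out_rows) * rows // out_rows
    col_idx = np.arange(out_cols) * cols // out_cols
    return image[row_idx[:, None], col_idx[None, :]]

# Hypothetical 64x48 scan: a dark vertical stroke on a white background.
scan = np.full((64, 48), 255, dtype=np.uint8)
scan[:, 20:28] = 0

normalized = resize_nearest(scan)
print(normalized.shape)
```

Nearest-neighbour sampling keeps the original pixel values, which is convenient for binarized numerals; an interpolating resize would be an equally plausible choice.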
-
-
-
Feature Extraction
Feature extraction is done to find the set of parameters that can be used to define each character uniquely and precisely. Feature extraction plays an important role because of its effect on the capability of classifiers and ultimately on the final accuracy. In this paper, the Gray Level Co-occurrence Matrix (GLCM) technique is used for feature extraction.
The GLCM method is a statistical method for global feature extraction. The GLCM functions characterize the texture of an image by calculating how often pairs of pixels with specific values and in a specified spatial relationship occur in an image, creating a GLCM, and then extracting statistical measures from this matrix.
To create a GLCM, the MATLAB graycomatrix function is used. The graycomatrix function creates a gray-level co-occurrence matrix by calculating how often a pixel with the intensity (gray-level) value i occurs in a specific spatial relationship to a pixel with the value j. Each element (i,j) in the resulting GLCM is the number of times that a pixel with value i occurred in the specified spatial relationship to a pixel with value j in the input image.
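The counting described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the idea, not the graycomatrix function itself. The function name `glcm`, the 1-based gray levels and the example matrix are assumptions chosen for the sketch, and the matrix is not necessarily the one shown in the paper's figure.

```python
import numpy as np

def glcm(image, levels, offset=(0, 1)):
    """Count co-occurrences of gray levels (i, j) at a (row, col) offset.

    `image` holds integer gray levels in 1..levels (1-based, as in the text);
    the result is a levels x levels matrix of raw counts.
    """
    image = np.asarray(image)
    d_row, d_col = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            # Only count pairs whose neighbour lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c] - 1, image[r2, c2] - 1] += 1
    return counts

# (row, col) offsets for the four directions at distance d = 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

# Illustrative 4x5 image with gray levels 1..8.
example = np.array([[1, 1, 5, 6, 8],
                    [2, 3, 5, 7, 1],
                    [4, 5, 7, 1, 2],
                    [8, 5, 1, 2, 5]])

horizontal = glcm(example, levels=8, offset=OFFSETS[0])
print(horizontal[0, 0], horizontal[0, 1])  # prints: 1 2
```

For this image the horizontally adjacent pair (1, 1) occurs once and the pair (1, 2) occurs twice, so the first two entries of the first GLCM row are 1 and 2.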
GLCM is based on a matrix that shows the distribution of co-occurring gray levels in the selected image. The method is a statistical approach in which the co-occurrence matrix describes second-order statistics of texture images. A gray-level co-occurrence matrix is a two-dimensional histogram in which element (i, j) indicates the frequency with which event i occurs with event j. P(i, j, d, θ) denotes the co-occurrence frequency of a pair of pixels with gray levels i and j, where d is the distance between the two pixels and θ is the direction. The angle θ can be 0°, 45°, 90° or 135°, as shown in Fig 4.
Fig 4: 4 directions of adjacency from pixel of interest
The GLCM stores the number of occurrences of each pair of values in adjacent pixels. This is illustrated in Fig 5 by taking an example image matrix and showing its corresponding GLCM. Element (1,1) in the GLCM contains the value 1 because there is one instance of this pair in the image, and element (1,2) contains the value 2 because there are two instances in the image.
Fig 5: A sample GLCM matrix
After creating the GLCM, several statistics can be derived from it using the graycoprops function. In the formulas below, p(i, j) denotes element (i, j) of the normalized GLCM. In this paper, the features extracted from the GLCM are:
- Contrast: It measures the local variations in the gray level co-occurrence matrix, i.e. the intensity contrast between a pixel and its neighbor over the whole image. Formula is:
$$\text{Contrast} = \sum_{i,j} |i - j|^2 \, p(i, j)$$
- Correlation: It measures the joint probability of occurrence of the specified pixel pairs, i.e. how correlated a pixel is to its neighbor over the whole image. Formula is:
$$\text{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j)\, p(i, j)}{\sigma_i \sigma_j}$$
where $\mu_i$, $\mu_j$ and $\sigma_i$, $\sigma_j$ are the means and standard deviations of the row and column marginal distributions of p(i, j).
- Energy: It provides the sum of squared elements in the GLCM and is also known as uniformity or the angular second moment. Formula is:
$$\text{Energy} = \sum_{i,j} p(i, j)^2$$
- Homogeneity: It measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal. Formula is:
$$\text{Homogeneity} = \sum_{i,j} \frac{p(i, j)}{1 + |i - j|}$$
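As a rough illustration of the four features listed above, the following Python sketch computes contrast, correlation, energy and homogeneity from a matrix of co-occurrence counts. It follows the standard definitions of these statistics; the function name `glcm_features` and the small 2-level count matrix are hypothetical and are not taken from the paper.

```python
import numpy as np

def glcm_features(counts):
    """Return contrast, correlation, energy and homogeneity of a GLCM."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()                      # normalise counts to probabilities
    i, j = np.indices(p.shape)           # row and column gray-level indices
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sigma_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sigma_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    return {
        "contrast": (np.abs(i - j) ** 2 * p).sum(),
        # Undefined (division by zero) when the image has a single gray level.
        "correlation": ((i - mu_i) * (j - mu_j) * p).sum() / (sigma_i * sigma_j),
        "energy": (p ** 2).sum(),
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),
    }

# Hypothetical GLCM of a two-level image (raw counts).
counts = np.array([[2, 1],
                   [1, 4]])
features = glcm_features(counts)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

One plausible way to build a feature vector for a numeral is to compute these four values for each of the four directions and concatenate them, although the paper does not state the exact layout of its feature vector.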
-
-
Recognition
Neural networks are a preferred approach for recognizers in cases of small variability of patterns. A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships [9]. Here, a feed-forward neural network is used to recognize handwritten Devanagari numerals. The features extracted during the feature extraction step are given as input to the input layer of the neural network. There are two modes of operation: training mode and testing mode. In the training mode, a neuron can be trained to fire (or not) for particular input patterns, and these trained patterns are stored in a database for later use in testing. In the testing mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not. Table I shows the parameters used for neural network training.
TABLE I. PARAMETERS OF NEURAL NETWORK TRAINING
Parameter            Value
Number of epochs     1000
Batch size           4
Learning rate        0.3
Sparsity rate        0
Sparsity target      0.01
Momentum rate        0.0011
Training time        No limit
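To make the role of the Table I parameters concrete, here is a minimal NumPy sketch of a feed-forward network trained by mini-batch gradient descent with momentum. It is not the authors' MATLAB implementation. The single hidden layer of 16 tanh units, the softmax output, the weight initialization and the toy data are all assumptions, and the sparsity parameters of Table I are not modeled.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with a max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_classes, hidden=16, epochs=1000, batch_size=4,
              learning_rate=0.3, momentum=0.0011, seed=0):
    """Train a one-hidden-layer feed-forward network with momentum SGD."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Parameters: input->hidden weights and bias, hidden->output weights and bias.
    params = [rng.normal(0.0, 0.1, (n_features, hidden)), np.zeros(hidden),
              rng.normal(0.0, 0.1, (hidden, n_classes)), np.zeros(n_classes)]
    velocity = [np.zeros_like(p) for p in params]
    targets = np.eye(n_classes)[y]                        # one-hot labels
    for _ in range(epochs):
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            W1, b1, W2, b2 = params
            hidden_out = np.tanh(X[idx] @ W1 + b1)        # forward pass
            probs = softmax(hidden_out @ W2 + b2)
            d_logits = (probs - targets[idx]) / len(idx)  # cross-entropy gradient
            d_hidden = (d_logits @ W2.T) * (1.0 - hidden_out ** 2)
            grads = [X[idx].T @ d_hidden, d_hidden.sum(axis=0),
                     hidden_out.T @ d_logits, d_logits.sum(axis=0)]
            for k, grad in enumerate(grads):
                velocity[k] = momentum * velocity[k] - learning_rate * grad
                params[k] = params[k] + velocity[k]
    return params

def predict(params, X):
    """Return the index of the most probable class for each row of X."""
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)

# Toy stand-in for GLCM feature vectors: two well-separated classes.
X = np.array([[0.10, 0.20, 0.10, 0.20], [0.20, 0.10, 0.20, 0.10],
              [0.15, 0.25, 0.10, 0.20], [0.20, 0.20, 0.10, 0.10],
              [0.80, 0.90, 0.80, 0.90], [0.90, 0.80, 0.90, 0.80],
              [0.85, 0.75, 0.90, 0.80], [0.80, 0.80, 0.90, 0.90]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

params = train_mlp(X, y, n_classes=2)
predicted = predict(params, X)
print(predicted)
```

The defaults mirror Table I (1000 epochs, batch size 4, learning rate 0.3, momentum 0.0011); for the ten numeral classes, `n_classes` would be 10.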
-
-
RESULTS AND DISCUSSION
The experiments were carried out on the Windows 7 operating system with an Intel Core i3-2350M 2.30 GHz processor and 2 GB RAM. MATLAB 2012b is the tool used for implementation. A Devanagari handwritten numerals database containing 60 numerals is taken from [2] for recognition. A GLCM feature vector is calculated for every numeral and fed to the neural network for training. From the dataset of 60 numerals, 30 are used for training the neural network, 10 for testing and the remaining 20 for validation.
The training and testing sets were divided using k-fold cross-validation. The research employed 10-fold cross-validation, simulated in MATLAB 2012b. Given a data set of N samples, the method divides it into k parts of size N/k each. In each of k rounds, the classifier is trained on k - 1 parts and checked on the remaining part: on the i-th iteration, the i-th part is used as the testing set and the rest form the training set. The overall accuracy is then the sum of the accuracies obtained over the k rounds divided by k.
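The splitting scheme can be sketched as follows. This is a generic k-fold index generator written for illustration; the function name `k_fold_indices`, the shuffling and the seed are assumptions, and the paper's MATLAB code may have partitioned the data differently.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)    # shuffle once, then cut into k parts
    folds = np.array_split(order, k)
    for i in range(k):
        test_idx = folds[i]               # the i-th part is the testing set
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

def mean_accuracy(fold_accuracies):
    """Average the per-fold accuracies over the k rounds."""
    return sum(fold_accuracies) / len(fold_accuracies)

# 60 samples split into 10 folds of N/k = 6 samples each.
splits = list(k_fold_indices(60, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # prints: 10 54 6
```

Every sample appears in exactly one testing set, so each of the 60 numerals is used for testing once and for training nine times.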
The recognition accuracy of the proposed scheme is found to be satisfactory for the handwritten Devanagari numerals database. The recognition accuracy is calculated as follows:
$$\text{Accuracy} = 1 - \frac{\mathrm{NMS}}{\mathrm{NS}}$$
where NMS is the number of misclassified samples and NS is the number of samples.
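The accuracy measure can be written as a short Python function, assuming it has the form 1 - NMS/NS. The function name `recognition_accuracy` and the label lists are hypothetical and do not represent results from the paper.

```python
def recognition_accuracy(true_labels, predicted_labels):
    """Return 1 - NMS/NS for two equal-length sequences of class labels."""
    if len(true_labels) != len(predicted_labels) or len(true_labels) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n_samples = len(true_labels)                                  # NS
    n_misclassified = sum(t != p for t, p in
                          zip(true_labels, predicted_labels))     # NMS
    return 1.0 - n_misclassified / n_samples

# Hypothetical example: one of ten numerals is misclassified (7 read as 1).
true_labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted_labels = [0, 1, 2, 3, 4, 5, 6, 1, 8, 9]
accuracy = recognition_accuracy(true_labels, predicted_labels)
print(accuracy)
```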
-
CONCLUSIONS
This paper presents a method that applies GLCM (Gray Level Co-occurrence Matrix) features for feature extraction and neural networks for the recognition of offline handwritten Devanagari numerals. The recognition accuracy obtained is satisfactory. This research work can be extended to the recognition of handwritten Devanagari characters or to other Indian scripts for which little recognition research has been conducted. Further research can combine GLCM features with other features with the aim of achieving higher accuracy.
REFERENCES
[1] Vikas J. Dongre et al., Devnagari Handwritten Numeral Recognition using Geometric Features and Statistical Combination Classifier, International Journal on Computer Science and Engineering (IJCSE), Vol. 5, No. 10, Oct 2013.
[2] Ujjwal Bhattacharya, B.B. Chaudhuri, Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals, IEEE, Vol. 31, No. 3, March 2009.
[3] Ved Prakash Agnihotri, Offline Handwritten Devanagari Script Recognition, I.J. Information Technology and Computer Science, Vol. 8, pp. 37-42, 2012.
[4] Sutasinee Iamsa-at and Punyaphol Horata, Handwritten Character Recognition Using Histograms of Oriented Gradient Features in Deep Learning of Artificial Neural Network, IEEE, 2013.
[5] Nisha Sharma, Tushar Pathak, Bhupendra Kumar, Recognition for Handwritten English Letters: A Review, International Journal of Engineering and Innovative Technology, Vol. 2, Issue 7, Jan 2013.
[6] Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, Electron spectroscopy studies on magneto-optical media and plastic substrate interface, IEEE Transl. J. Magn. Japan, Vol. 2, pp. 740-741, August 1987 [Digests 9th Annual Conf. Magnetics Japan, p. 301, 1982].
[7] Neha Sahu and Nitin Kali Raman, An Efficient Handwritten Devnagari Character Recognition System Using Neural Network, Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), IEEE, March 2013.
[8] R. Jayadevan et al., Offline Recognition of Devanagari Script: A Survey, IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, Vol. 41, No. 6, Nov 2011.
[9] Mitrakshi B. Patil et al., Recognition of Handwritten Devnagari Characters through Segmentation and Artificial Neural Networks, International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 6, ISSN: 2278-0181, Aug 2012.
[10] R. Amit et al., Preprocessing and Image Enhancement Algorithms for a Form-based Intelligent Character Recognition System, Vol. 2, No. 2, 2005.
[11] I.K. Sethi and B. Chatterjee, Machine Recognition of Constrained Hand Printed Devnagari, Pattern Recognition, Vol. 9, pp. 69-75, 1977.
[12] Raghuraj Singh et al., Optical Character Recognition (OCR) for Printed Devnagari Script Using Artificial Neural Network, International Journal of Computer Science & Communication, 02/2002.