Handwritten Character Recognition using ANN

DOI : 10.17577/IJERTV3IS070070


1Shushant Chak, 2Ambalika Sharma

1M.Tech, Department of Electrical Engineering, Indian Institute of Technology Roorkee, Uttarakhand, India.

2 Assistant Professor, Department of Electrical Engineering, Indian Institute of Technology Roorkee, Uttarakhand, India.

Abstract – Automatic recognition of handwritten characters is a problem that is currently garnering a lot of attention. The ability to efficiently process small handwriting samples, like those found on cheques and envelopes, is one of the major driving forces behind the current research. The project is a computer-driven application that converts photographs of handwritten script into text documents with minimum effort. It is based on pattern recognition techniques using artificial neural networks (ANNs). In this approach, an artificial neural network is trained to identify similarities and patterns among different handwriting samples. We explore these techniques to design an optimal handwritten English word recognition system based on character recognition (CR). A post-processing technique that uses a lexicon is employed to improve the overall recognition accuracy. A number of techniques for feature extraction and training of CR systems are available in the literature, each with its own strengths and weaknesses.

Keywords – Automatic, handwritten, character, recognition, neural network.

  1. INTRODUCTION

    It is a challenging task to develop a practical handwritten character recognition (CR) system that can maintain high recognition accuracy. In most existing systems, recognition accuracy depends heavily on the quality of the input document. In handwritten text, adjacent characters tend to touch or overlap, so it is essential to segment a given string correctly into its character components. In most existing segmentation algorithms, human writing is evaluated empirically to deduce rules [1]. But there is no guarantee that these heuristic rules give optimum results for all styles of writing. Moreover, handwriting varies from person to person, and even for the same person it varies depending on mood, speed etc. This calls for artificial neural networks, hidden Markov models and statistical classifiers that extract segmentation rules from numerical data [2][3][4]. After segmentation, the next crucial step is the representation of character classes by features. These features should have high discriminative ability, so that they differ between character classes (for example, the 26 uppercase and 26 lowercase characters of the English language). These features should also be independent of intra-class variations. The different representation methods can be categorized into three major classes [1]:

    1. Global transformation and series expansion: includes Fourier transform, Gabor transform, wavelet, moments and Karhunen-Loève expansion.

    2. Statistical representation: Zoning, crossing and distances, projections.

    3. Geometrical and topological representation: Extracting and counting topological structures, geometrical properties, coding, graphs and trees etc.

      Features based on the Fourier transform are suitable for recognizing handwritten numerals, where 96% accuracy has been achieved [5]. Gradient features have been widely used in CR for machine-printed and hand-printed binary character images, but these features are not invariant to deformations in the characters. In [6], a new gradient feature is used in which the gradient at each pixel is mapped onto 12 direction codes, with an angle span of 30 degrees between the directions. In [7], the direction feature of [8] is redesigned to describe the character contour more effectively. An additional global feature was also introduced in this technique to improve the recognition accuracy for those characters that were most frequently confused with patterns of similar appearance. The disadvantage of this technique is its failure to deal with changes in stroke width, as the features are extracted from non-thinned character images. Another crucial module in a character recognition system is its pattern recognition module, which assigns an unknown sample to a predefined class. Numerous techniques for character recognition can be classified into four general approaches of pattern recognition.
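The 12-code gradient feature described above can be illustrated with a minimal sketch. The text does not give the exact mapping used in [6], so the code below assumes that code k covers the angle range from 30k up to (but not including) 30(k+1) degrees, and it uses simple central differences for the gradient. The function names `direction_code` and `direction_histogram` are hypothetical.

```python
import math

def direction_code(gx, gy, n_codes=12):
    """Map a gradient vector (gx, gy) to one of n_codes direction codes.

    Assumption: code k covers [k * span, (k + 1) * span) degrees,
    with span = 360 / n_codes (30 degrees for 12 codes).
    Returns None for a zero gradient, which has no direction.
    """
    if gx == 0 and gy == 0:
        return None
    angle = math.degrees(math.atan2(gy, gx)) % 360.0
    span = 360.0 / n_codes
    # The final modulo guards against an angle that rounds up to 360.0.
    return int(angle // span) % n_codes

def direction_histogram(image, n_codes=12):
    """Count direction codes over interior pixels using central differences."""
    hist = [0] * n_codes
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            code = direction_code(gx, gy, n_codes)
            if code is not None:
                hist[code] += 1
    return hist
```

A histogram of this kind, computed per zone of the character image, is one plausible way to turn the per-pixel codes into a fixed-length feature vector.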

  2. SYSTEM DESIGN MODEL

      1. Generic character recognition

        A generic character recognition system may be shown in figure 1. Its different stages are as below:

        • Input: Samples are read to the system through a scanner.

        • Preprocessing: Preprocessing converts the image into a form suitable for subsequent processing and feature extraction.

        • Segmentation: The most basic step in OCR is to segment the input image into individual glyphs. This step separates out sentences from text and subsequently words and letters from sentences.

        • Feature extraction: Extraction of features of a character forms a vital part of the recognition process. Feature extraction captures the vital details of a character.

        • Classification: During classification, a character is placed in the appropriate class to which it belongs. Character classifiers are roughly categorized as sub-symbolic classifiers and symbolic classifiers. The ANN approach is classified as a sub-symbolic classifier.

        As mentioned, segmentation plays an important role in the overall process of recognizing printed and handwritten characters, and even more so with cursive writing. The success or failure of an OCR system depends on the segmentation process. The description here relates to a work that uses ANNs for the segmentation stage of an ANN-based OCR system designed exclusively for Assamese, an important language in the north-eastern (NE) region of India.

        The reasons behind the use of ANNs for segmentation are as below:

        1. Static segmentation methods suffer from a serious drawback: they cannot fix segmentation boundaries for cases where inputs have size and inclination variations.

        2. Static segmentation methods also fail to fix segmentation boundaries for cases where there are writer-induced variations in inputs. Figure 2 shows the failure of static segmentation methods in dealing with writer-induced variations.

          A solution for such cases can be provided by ANNs, which have the ability to learn shapes and in that way discriminate segmentation boundaries. ANNs have been used in several character recognition systems. Some of the segmentation methods relevant in practice are described in [3]. For cursive writing, Cheng, Liu et al. [5] provide a description of available segmentation methods. The use of ANNs for segmentation has been reported by Blumenstein [6]. Other similar works are [7], [8], [9], [10], [11], [12], [13], to name a few. Very few attempts have been reported regarding the use of ANNs for segmentation in the Indian OCR scenario. The work conceptualized an ANN-based OCR system in which segmentation is done by a multi-layered perceptron (MLP), a class of feed-forward neural network, which is trained for that purpose. The algorithm first trains an ANN with individual handwritten characters collected from different individuals. Handwritten sentences are separated out from text using a static segmentation method. From each segmented line, individual characters are separated out by first over-segmenting the entire line. Prior to all these steps, some preprocessing steps are required for the scanned image.

          These are:

          1. Noise removal: It involves noise removal using certain filtering operations.

          2. Enhancement: Here the filtered images are enhanced using high-boost filter masks and histogram equalization.

          3. Sharpening: For degraded or blurred images after noise cleaning operations sharpening may be done.

          4. Binarisation: After enhancement and sharpening, the gray-level image is converted into binary form so as to ease the computational load of the subsequent stages.

          5. Normalization: The images just before the segmentation stage are converted to certain standard sizes. If the input has inclination and skew, respective corrections are done.
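Some of the preprocessing steps listed above can be sketched in a few lines. The text does not name the specific filters or thresholds, so the choices here (a 3x3 median filter for noise removal, a fixed global threshold of 128 for binarisation, nearest-neighbour resampling for size normalization) are assumptions made for illustration, and the function names are hypothetical. Images are plain lists of rows of gray levels in the range 0 to 255.

```python
def median_filter3(img):
    """Noise removal: 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = sorted(
                img[rr][cc]
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            )
            out[r][c] = window[4]  # middle of the 9 sorted values
    return out

def binarise(img, threshold=128):
    """Binarisation: dark pixels (ink) become 1, light pixels (paper) become 0."""
    return [[1 if p < threshold else 0 for p in row] for row in img]

def resize_nearest(img, out_h, out_w):
    """Size normalization by nearest-neighbour resampling."""
    h, w = len(img), len(img[0])
    return [
        [img[r * h // out_h][c * w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

Skew and inclination correction, enhancement and sharpening are left out of this sketch because the text gives no detail on how they are performed.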

        After preprocessing, the next step is segmentation of the input. During this stage the text is first separated into lines, the lines are then separated into words, and the words are finally segmented into individual characters. The approach adopted here is a projection-based one.

        A brief outline of the static segmentation method is as below:

        1. Row-wise dissection:

          • Calculate row-wise pixel sums of the inputs

          • Obtain the row-wise projection of the inverted inputs

          • Find the minimum of the projections

          • A pair of closely lying minimum points defines one segmentation boundary.

          • Dissection boundaries give sub-images of inputs. Hold them in an array.

        2. Column-wise dissection:

        For each entry into the array as above

        • Calculate the sum of pixels column-wise.

        • Obtain the column-wise projection of the inverted sub-images.

        • Find the minimum points from these projections

        • A pair of consecutive minimum points defines one segmentation boundary.

        • Store every character dissected out of the subimage into an array. The array must also include space in between words.

        • The array holds the segmented outputs of the segmented sub-images as obtained in step 1.
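The two dissection steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a binary image in which ink pixels are 1 and background pixels are 0 (the "inverted" input), and it treats rows or columns whose projection is exactly zero as the minima that separate segments. Recording the spaces between words, which the outline requires, is omitted. The names `runs_of_ink` and `segment` are hypothetical.

```python
def runs_of_ink(profile):
    """Return (start, end) pairs of consecutive non-zero entries, end exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment(image):
    """Split a binary image (ink = 1) into lines, then each line into glyphs."""
    # Step 1: row-wise projection separates the text lines.
    row_profile = [sum(row) for row in image]
    glyphs = []
    for top, bottom in runs_of_ink(row_profile):
        line = image[top:bottom]
        # Step 2: column-wise projection separates glyphs within the line.
        col_profile = [sum(row[c] for row in line) for c in range(len(line[0]))]
        for left, right in runs_of_ink(col_profile):
            glyphs.append([row[left:right] for row in line])
    return glyphs
```

Because a zero projection is required between segments, this sketch cannot split touching or overlapping characters, which is the weakness of static segmentation that motivates the ANN-based approach.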

        Baseline character spacing can then become comparable to word spacing, which affects word separation. In such a situation morphological dilation may be used. Where modifiers are not separated from characters, especially where the modifiers lie below the middle zone (i.e. in the lower zone), the statistics of the horizontal projection are used to fix a threshold at 1.5 times the average line height. The non-zero valleys below the threshold indicate the separation boundary between the character and the modifier. This method has certain drawbacks, which are described in subsequent stages.

      2. Neural network classifier

    Digitalization and normalization are two steps that are related and often combined. In this step, the handwriting sample to be analyzed is scanned, creating an image file.

    This image file consists of a grid of zeros and ones that represent a character. The normalization process takes this image file and begins to identify and isolate features that are present in the sample image. These features will either be general, capturing characteristics like size, position, and orientation, or local, capturing specific features of an image file. One example that illustrates the identification of local characteristics is an experiment using numerals that required the extraction of 80 directional features and 45 shape features. Whether the normalization process seeks to identify general characteristics or focuses on local characteristics, the goal is to cull a portion of the handwriting sample that will be representative of the sample as a whole.

    In the context of signature verification, general characteristics can be used to describe the overall shape of the signature and to spot an obvious forgery. Local characteristics examine pieces of the signature, like the length of a curve in a given letter. These specific characteristics can help spot skilled forgeries, in which the forger has mimicked the overall shape of the signature but was unable to fully master its individual parts.

    Segmentation is the most difficult step of preprocessing, and also the most important. The goal of segmentation is to break the handwriting sample down into smaller entities. These entities may represent individual characters, or they may represent individual pieces of a character. In either case, segmentation allows the ANN to examine small pieces of handwriting samples. This aids the ANN in its analysis by allowing it to compare the local details of a suspected forgery to those of a known, genuine signature. As with other preprocessing procedures, several different methods for segmentation exist, all sharing some similarities.

    Overlapping characters are one situation with which the heuristic method has trouble. In the case of two improperly connected characters, a heuristic algorithm will be able to locate minima, but will be unable to distinguish which character they belong to. To address special cases like this, researchers are constantly striving for improved segmentation techniques. One such technique is to use ANNs in the segmentation process. Blumenstein describes an approach to segmentation using an ANN.

    First, the ANN must be trained to recognize correct and incorrect segmentations. This is done by segmenting with a standard heuristic method and then classifying the resulting segmentation points as either correct or incorrect. This allows the ANN to identify characteristics that aid in the classification of segmentation points and the removal of the incorrect ones. As a special note, Blumenstein does not use the raw pixel data for each segmentation point. Instead, a representation of the pixel density for each segmentation point is used. This pixel density representation is calculated by first extracting and normalizing a matrix of pixels for each segmentation point. Each matrix is then broken up into windows of equal size, which allows the calculation of the ratio of black pixels to the total number of pixels for each window. For example, if the matrix window is 10 pixels by 10 pixels, and there are 45 black pixels in the window, the density is 0.45. Though neural networks are not a necessary component of the preprocessing stage, they prove quite helpful by making classification and recognition of characters more efficient. With the necessary preprocessing completed, it becomes possible to begin to classify the data.
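The pixel density calculation can be made concrete with a short sketch. It assumes a binary matrix in which black pixels are 1, square non-overlapping windows, and matrix dimensions that are multiples of the window size; the function name `window_densities` is hypothetical.

```python
def window_densities(matrix, win):
    """Return the ratio of black pixels (1s) in each win x win window.

    Windows are non-overlapping and visited row by row; any partial
    window at the right or bottom edge is ignored.
    """
    h, w = len(matrix), len(matrix[0])
    densities = []
    for top in range(0, h - win + 1, win):
        for left in range(0, w - win + 1, win):
            black = sum(
                matrix[r][c]
                for r in range(top, top + win)
                for c in range(left, left + win)
            )
            densities.append(black / (win * win))
    return densities
```

For a 10 x 10 matrix containing 45 black pixels and a window size of 10, the single resulting density is 45 / 100 = 0.45, matching the example in the text.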

    One approach, studied by Xiao, uses a modified multilayer perceptron (MLP) to identify and verify signatures. The topology of Xiao's network is composed of three main components: a feed-forward path, a converter, and a feedback path. Figure 4 illustrates this network's topology. Careful analysis reveals that this network actually consists of two separate MLPs. In order to understand how this neural network functions, it is necessary to examine each piece of the network separately. The structure of the feed-forward path consists of one input layer, two hidden layers, and one output layer. The handwriting sample is divided into an N x M grid, and a 3 x 3 window is slid over the grid. The nine resulting grid squares function as inputs to the MLP and are subsequently shared among the three nodes of the first input layer. The first hidden layer is fully connected to the second hidden layer, which is fully connected to the output layer. An interesting feature of this network is the units that connect the feedback path to the feed-forward path. The converter takes the output from the feed-forward path and splits it into the three values to be used as inputs in the feedback path. The value of the first input is simply the output from the feed-forward path. The value of the second input is the complement of the output from the feed-forward path, which gives an indication of the likelihood of a negative decision. The third input is equal to 1, representing the bias signal. In the context of ANNs, bias is simply the preference of one possible outcome over another.
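Two pieces of this description lend themselves to a small illustration: the 3 x 3 window that slides over the grid, and the converter that feeds the feedback path. The sketch below assumes a stride of one cell for the window, and assumes that the feed-forward output y lies between 0 and 1 so that its complement is 1 - y; neither detail is stated in the text. The names `sliding_windows` and `converter` are hypothetical.

```python
def sliding_windows(grid, size=3):
    """Yield each size x size window of the grid, flattened row by row.

    Assumption: the window moves one grid cell at a time.
    """
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            yield [grid[r + i][c + j] for i in range(size) for j in range(size)]

def converter(y):
    """Split the feed-forward output y into the three feedback inputs.

    Returns (y, complement of y, bias), assuming y is in [0, 1] so the
    complement is 1 - y; the bias input is the constant 1.
    """
    return (y, 1.0 - y, 1.0)
```

Each window produced by `sliding_windows` has nine values, corresponding to the nine grid squares that serve as inputs to the MLP.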

    Figure 4. Topology of the neural network classifier

    The feedback path in this ANN is very similar to the feed-forward path. In fact, the feedback path mimics the structure of the feed-forward path, mirroring the input, output, and hidden layers. The only difference is that, in the feedback path, the input layer is fully connected to the first hidden layer. The outputs from the nodes in hidden layers one and two of the feedback path are directed to their respective unit counterparts in the feed-forward path. By doing this, the feedback succeeds in altering the weights of the feed-forward path. In this manner, the feedback path provides back propagation for the entire neural network. Information will continue to propagate through this network until total error is minimized. First, genuine signature samples and known signature forgeries were used to train the feed-forward path. The entire network was then trained using the same training set used to train the feed-forward path. Ideally, different training sets would be used to train the separate paths, but, in reality, genuine forgery data is hard to obtain. Known forgeries are hard to obtain because, one, forgery is hard to detect, and two, although the number of total forgeries is large, the number of forgeries for any given signature is quite small. The details of the training procedures can be found in section 4.1.4.

    In this experiment, Xiao was interested in determining two things: one, how effective the feedback path is, and two, how effective the use of artificially generated forgeries is. For each subject, two training sets were collected. The first set contained genuine samples of the signature of that individual, and some samples of random forgeries. The second set consisted of genuine signatures and artificially generated forgeries. These artificial forgeries were generated by removing certain stable pieces from genuine signatures, thus turning genuine signatures into forgeries. The testing set for each subject included genuine signatures (not used in the training sets), random forgeries, and skilled forgeries. In total, 350 genuine signatures, 158 skilled forgeries, and 230 random forgeries were used.

    The first part of the experiment consisted of comparing the performance of the network with and without the feedback path. Training set 2 was used to train the network in this evaluation, followed by the processing of the testing set. To further gauge performance, random and skilled forgeries were tested separately. Overall, it was found that the presence of the feedback path reduces the error rate of classifying genuine signatures as false by as much as 6.7%.

    The second part of the experiment was designed to determine the impact artificial forgeries had on the accuracy of an ANN. First, the network was trained using training set 1, and then tested using the test set. The network was then trained using training set 2, and again tested using the test set. It was shown that the ANN performed much better when it was trained using the set that contained artificially generated forgeries. Because some individuals had their signatures used in both training set 1 and the testing set, the random forgeries in the testing set were broken up into two groups. Untrained random forgeries were provided by individuals whose signatures were not in training set 1. Trained random forgeries were provided by individuals whose signatures were also in training set 1. For skilled forgeries, the error rate of classifying a forgery as genuine (FAR) was reduced from 38.9% to 17.0% when artificial forgeries were used. Similarly, there was a 9% reduction in the FAR for untrained random forgeries. Because none of the forgeries in set 2 were used in training, there was no need to separate random forgeries. This data supports the conclusion that the strategy of using a feedback path is complementary to the strategy of using artificial forgeries.
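The two error rates discussed in this experiment can be computed as in the sketch below. The FAR is defined in the text as the rate of classifying a forgery as genuine; the rate of classifying genuine signatures as false is commonly called the false rejection rate (FRR), a label that is an addition here and not used in the text. The function name `error_rates` and the boolean encoding are assumptions for illustration.

```python
def error_rates(labels, predictions):
    """Return (FAR, FRR) for paired boolean lists.

    labels[i] is True for a genuine signature, False for a forgery.
    predictions[i] is True if the classifier accepted the sample as genuine.
    FAR = forgeries accepted / total forgeries.
    FRR = genuine samples rejected / total genuine samples.
    """
    on_forgeries = [p for l, p in zip(labels, predictions) if not l]
    on_genuine = [p for l, p in zip(labels, predictions) if l]
    if not on_forgeries or not on_genuine:
        raise ValueError("need at least one genuine sample and one forgery")
    far = sum(1 for p in on_forgeries if p) / len(on_forgeries)
    frr = sum(1 for p in on_genuine if not p) / len(on_genuine)
    return far, frr
```

Computing the FAR separately for skilled, trained random, and untrained random forgeries, as the experiment does, only requires calling the function once per forgery group together with the genuine samples.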

  3. SIMULATION RESULTS

    The proposed CR system was tested on a database consisting of 26 word images. All of these images were given as input to the proposed CR system. The lexicon used also consisted of the same 26 words that were used for testing. Out of these 26 words, the proposed system correctly recognized 21 word images (approximately 80.8%).
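The paper does not say how the lexicon is matched against the recognizer's output. One common choice, assumed here purely for illustration, is to replace each predicted word with the lexicon entry at the smallest Levenshtein (edit) distance. The function names below are hypothetical, and the small lexicon in the usage note is made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def correct_with_lexicon(word, lexicon):
    """Return the lexicon entry closest to word; ties go to the earliest entry."""
    return min(lexicon, key=lambda entry: edit_distance(word, entry))

def word_accuracy(predicted, expected):
    """Fraction of predicted words that exactly match the expected words."""
    hits = sum(1 for p, e in zip(predicted, expected) if p == e)
    return hits / len(expected)
```

For example, with a lexicon of `["house", "horse", "mouse"]`, the misrecognized string `"hcuse"` is one substitution away from `"house"` and two away from the other entries, so it would be corrected to `"house"`. With 21 correct words out of 26, `word_accuracy` gives 21 / 26.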

    The segmentation method used was efficient. The heuristic algorithm is based on rules that are deduced empirically, and there is no guarantee that they give optimum results for different styles of writing, so their validation using a neural network becomes essential. We tried different Fourier features, such as moduli of Fourier coefficients, magnitude, phase and their various combinations, as feature vectors.
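As a rough illustration of Fourier features, the sketch below computes the moduli and phases of the first few discrete Fourier coefficients of a character contour. The paper does not describe how the contour is represented, so treating each boundary point (x, y) as the complex number x + iy is an assumption, as is the function name `fourier_features`. A direct DFT is used to keep the example dependency-free.

```python
import cmath

def fourier_features(contour, n_coeffs):
    """Return (moduli, phases) of the first n_coeffs DFT coefficients.

    contour is a list of (x, y) boundary points, treated as complex
    numbers x + iy (an assumed representation). Coefficients are
    normalized by the number of points.
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        coeffs.append(total / n)
    moduli = [abs(c) for c in coeffs]
    phases = [cmath.phase(c) for c in coeffs]
    return moduli, phases
```

Translating the contour changes only the coefficient at index 0, so the moduli from index 1 onward are unaffected by the position of the character, which is one reason moduli are a popular choice of feature.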

  4. CONCLUSION

    This paper carries out a study of various feature-based classification techniques for offline handwritten character recognition. After experimentation, it proposes an optimal character recognition technique. The proposed method involves segmentation of a handwritten word by using heuristics and artificial intelligence. Three combinations of Fourier descriptors are used in parallel as feature vectors. A support vector machine is used as the classifier. Post-processing is carried out by employing a lexicon to verify the validity of the predicted word. The results obtained by using the proposed CR system are found to be satisfactory.

  5. REFERENCES

  1. Amin, Adnan, et al. Recognition of hand-printed Latin characters based on a structural approach with a neural network classifier. Journal of Electronic Imaging, Vol. 6(3), July 1997. pp. 303-310.

  2. Bharath, Ramachandran. Neural Network Computing. McGraw-Hill, Inc., New York, 1994. pp. 4-43.

  3. Blumenstein, M. and Verma, B. An Artificial Neural Network Based Segmentation Algorithm for Off-line Handwriting Recognition. International Conference on Computational Intelligence and Multimedia Applications, 1998. pp. 306-311.

  4. Blumenstein, M. and Verma, B. Neural-based Solutions for the Segmentation and Recognition of Difficult Handwritten Words from a Benchmark Database. Proceedings of the Fifth International Conference on Document Analysis and Recognition, September 1999. pp. 281-284.

  5. Cho, Sung-Bae. Pattern recognition with neural networks combined by genetic algorithm. Fuzzy Sets and Systems, Vol. 103(2), April 1997. pp. 339-347.

  6. Gader, Paul D., et al. Neural and Fuzzy Methods in Handwriting Recognition. Computer, February 1997. pp. 79-86.

  7. Liu, Cheng-Lin, and Nakagawa, Masaki. Handwritten Numeral Recognition Using Neural Networks: Improving the Accuracy by Discriminative Training. Proceedings of the Fifth International Conference on Document Analysis and Recognition, September 1999. pp. 257-260.

  8. Luger, George F., and Stubblefield, William A. Artificial Intelligence: Structures and Strategies for Complex Problem Solving, (2nd Edition). Benjamin/Cummings Publishing Company, Inc., California, 1993, pp. 516-527.

  9. Molen, A.E., and Veelenturf, L.P.J. On-line signature validation using Kohonen's Neural Network. Neural Networks: Best Practice in Europe. World Scientific Publishing Co. Pte. Ltd., Singapore. pp. 145-148.

  10. Russell, Stuart J., and Norvig, Peter. Artificial Intelligence: A Modern Approach. Prentice-Hall, Inc, New Jersey, 1995. pp. 567-587.

  11. Skapura, David M., Building Neural Networks. ACM Press, New York. pp. 29-33.

  12. Xiao, Xu-Hong, and Leedham, Gary. Signature Verification by Neural Networks with Selective Attention. Applied Intelligence, Vol. 11(2), Sept/Oct. 1999. pp. 213-223.

  13. Stephane Armand, Michael Blumenstein and Vallipuram Muthukkumarasamy, Off-line Signature Verification based on the Modified Direction Feature, 2006 International Joint Conference on Neural Networks, pages 684-691.

  14. Rasha Abbas, Department of Computer Science, RMIT, Masters thesis, A prototype System for offline signature verification using multilayered feed forward neural networks, 1996

  15. Milton Roberto Heinen and Fernando Santos Osório, Handwritten Signature Authentication using Artificial Neural Networks, International Joint Conference on Neural Networks, 2006.

  16. Dariusz Z. Lejtman and Susan E. George, On-line handwritten signature verification using wavelets and backpropagation neural Networks, Proceedings. Sixth International Conference on Document Analysis and Recognition, 2001.
