- Open Access
- Total Downloads : 428
- Authors : Suneel K Nagavi, Mahesh S Gothe, Praveen S Totiger
- Paper ID : IJERTV4IS070130
- Volume & Issue : Volume 04, Issue 07 (July 2015)
- DOI : http://dx.doi.org/10.17577/IJERTV4IS070130
- Published (First Online): 08-07-2015
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Optical Character Recognition based Auto Navigation of Robot by Reading Signboard
Prof. Suneel K Nagavi, Department of Mechanical Engineering, Rural Engineering College, Hulkoti, Gadag, India
Prof. Mahesh S Gothe, Department of Mechanical Engineering, Rural Engineering College, Hulkoti, Gadag, India
Prof. Praveen S. Totiger, Department of Mechanical Engineering, Rural Engineering College, Hulkoti, Gadag, India
Abstract: Autonomous navigation is an essential prerequisite for successful service robots. Service robots typically need maps that support their tasks. The system described here, however, allows a robot to discover its path automatically by detecting and reading the textual information on signboards using OCR. In contexts such as homes, offices, and roadsides, places are often identified by text or signs posted throughout the environment; using OCR, this textual data can be extracted from an image of the signboard and used to navigate the robot.
Keywords: Autonomous navigation, service robot, optical character recognition.
INTRODUCTION
Optical character recognition (OCR) is an important research area in pattern recognition. The objective of an OCR system is to recognize alphabetic letters, numbers, or other characters, which are in the form of digital images, without any human intervention. This is accomplished by searching for a match between the features extracted from the given character image and a library of image models. Ideally, we would like the features to be distinct for different character images so that the computer can extract the correct model from the library without any confusion. At the same time, we also want the features to be robust enough that they are not affected by viewing transformations, noise, resolution variations, and other factors. Figure 1.1 illustrates the basic process of an OCR system.
An OCR engine is a system which loads an image, preprocesses it, extracts proper image features, computes the distances between the extracted image features and the known feature vectors stored in the image model library, and recognizes the image according to the degree of similarity between the loaded image and the image models.
Fig 1.1 Basic process of OCR system
The pre-processing stage aims to make the image suitable for different feature extraction algorithms. Some feature extraction algorithms deal only with the contours of the image, while others calculate every pixel of the image. Moreover, the initial image may be affected by noise or blurred for other reasons. The pre-processing stage, which includes thresholding, binarizing, filtering, edge detection, gap filling, and segmentation, makes the initial image more suitable for later computation.
The classification step, which corresponds to the matching stage of object recognition systems, assigns each loaded character image to one of the possible image models. The decision is made on the basis of a similarity measure: given a shape representation scheme, the degree of similarity of any two shapes can be calculated using a similarity measuring metric. [1]
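As an illustration of this matching step, the following sketch (in Java, the language the project uses elsewhere) assigns a character to the stored model whose feature vector is closest under Euclidean distance; the model library and the feature vectors here are hypothetical placeholders, not the paper's actual data.

```java
import java.util.Map;

// Minimal sketch: nearest-model classification by Euclidean distance.
// The model library maps a character label to its stored feature vector.
public class TemplateClassifier {

    // Distance between an extracted feature vector and a stored model.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Assign the input to the model with the smallest distance
    // (i.e., the highest similarity).
    static char classify(double[] features, Map<Character, double[]> models) {
        char best = '?';
        double bestDist = Double.MAX_VALUE;
        for (Map.Entry<Character, double[]> e : models.entrySet()) {
            double d = distance(features, e.getValue());
            if (d < bestDist) {
                bestDist = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```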
A fully autonomous robot has the ability to:
- Gain information about the environment.
- Work for an extended period without human intervention.
- Move either all or part of itself throughout its operating environment without human assistance.
- Avoid situations that are harmful to people, property, or itself, unless those are part of its design specifications.
Problem description
Service robots need to have maps that support their tasks. Traditional robot mapping solutions are well suited to supporting navigation and obstacle avoidance tasks by representing occupancy information. The occupancy grid is used in the development of robust mapping and navigation systems for mobile robots operating in and exploring unstructured and unknown environments. The occupancy grid is a multidimensional random field that maintains stochastic estimates of the occupancy state of the cells in a spatial lattice. To construct a sensor-derived map of the robot's world, the cell state estimates are obtained by interpreting the incoming range readings using probabilistic sensor models. One possible flow of processing for the use of occupancy grids in mobile robot mapping appears in Figure 1.2. The vehicle explores and maps its environment, acquiring information about the world. Data acquired from a single sensor reading is called a sensor view. [2]
Fig 1.2 A framework for occupancy grid based robot mapping.
Various sensor views taken from a single robot position can be composed into a local sensor map. Multiple sensor maps can be maintained separately for different sensor types, such as sonar or laser. To obtain an integrated description of the robot's surroundings, sensor fusion of the separate local sensor maps is performed to yield a robot view, which encapsulates the total sensor information recovered from a single sensing position. As the vehicle travels through its terrain of operation, robot views taken from multiple data-gathering locations are composed into a global map of the environment. One of the most important competencies for a service robot is the ability to accept commands from a human user. In unstructured terrain the robot utilizes the help of a human and navigates accordingly, but in this project the robot navigates in structured terrain (using signboards as reference) without human intervention.
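For background, the stochastic cell update at the heart of the occupancy-grid formulation above can be sketched in a few lines using a log-odds representation; this is a generic illustration under assumed sensor probabilities, not the mapping system of [2].

```java
// Minimal occupancy-grid sketch: each cell holds a log-odds estimate of
// being occupied, updated from probabilistic sensor readings.
public class OccupancyGrid {
    private final double[][] logOdds;   // 0.0 corresponds to P(occupied) = 0.5

    public OccupancyGrid(int rows, int cols) {
        logOdds = new double[rows][cols];
    }

    // Fuse one sensor reading into a cell: p is the probability that the
    // cell is occupied according to the (assumed) inverse sensor model.
    public void update(int r, int c, double p) {
        logOdds[r][c] += Math.log(p / (1.0 - p));
    }

    // Recover the occupancy probability of a cell from its log-odds.
    public double probability(int r, int c) {
        return 1.0 - 1.0 / (1.0 + Math.exp(logOdds[r][c]));
    }
}
```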
Objective of the project
- The main objective of the project is autonomous navigation of the robot using the optical character recognition technique, without human intervention.
- The proposed system allows a robot to discover its path automatically by detecting and reading the textual information on signboards using OCR.
OVERVIEW OF APPLICATION
Fig 2.1 Project structure diagram
This project provides a way to navigate the robot without any human intervention. A camera (the cell phone) is mounted on the robot. Communication between the robot and the server (PC) is through GPRS, so the distance between the control unit and the robot does not matter; communication between the cell phone and the robot is through Bluetooth.
A Java application runs on the server side and an Android application on the mobile phone.
Initially the robot moves in a particular direction. When the robot comes across an RF card, it stops immediately, takes a photograph, and sends it to the server. The server processes the image and sends navigation instructions back to the robot.
As soon as the RF card reader gets the data, the microcontroller stops the robot and sends an instruction to the cell phone through Bluetooth to capture an image. The cell phone takes the image, loads it into a Java server page, and sends it to the server for processing through GPRS. On the server side, a running Tomcat container receives the image from the cell phone and applies OCR to extract the data.
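A minimal sketch of the phone-side upload just described, assuming the captured photo is already available as JPEG bytes; the server URL is a hypothetical placeholder for the project's actual endpoint.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch: POST the captured JPEG bytes to the server over the phone's
// data connection (GPRS). On Android this must run off the UI thread.
public class ImageUploader {

    // serverUrl is hypothetical, e.g. "http://example.com/ocr/upload"
    public static int upload(String serverUrl, byte[] jpegBytes) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(serverUrl).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "image/jpeg");
        conn.setFixedLengthStreamingMode(jpegBytes.length);
        OutputStream out = conn.getOutputStream();
        out.write(jpegBytes);
        out.close();
        return conn.getResponseCode();   // 200 means the server accepted it
    }
}
```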
Fig 2.2 Servlet request and response model
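On the server side, the request and response cycle of Fig 2.2 could be realized roughly as below; `recognize` stands in for the Kohonen-based OCR described later, and the text-to-command mapping is purely illustrative.

```java
import java.io.InputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch of the Tomcat-hosted servlet: receive the uploaded image,
// run OCR on it, and reply with a navigation instruction.
public class OcrServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws java.io.IOException {
        InputStream imageStream = req.getInputStream();   // raw JPEG body
        String text = recognize(imageStream);             // hypothetical OCR call
        // Map the recognized signboard text to a robot command (illustrative).
        String command = text.toUpperCase().contains("LEFT") ? "TURN_LEFT" : "FORWARD";
        resp.setContentType("text/plain");
        resp.getWriter().write(command);
    }

    private String recognize(InputStream image) {
        // Placeholder for the Kohonen-network-based OCR described below.
        return "LEFT";
    }
}
```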
Based on the extracted data, the server sends an instruction to the robot. As soon as the robot gets the instruction from the server, it speaks its current location aloud and moves according to the instruction. If the data is something such as Restaurant, Petrol pump, or Men at work, the server sends an instruction to the robot, which speaks the received data aloud and then navigates.
OCR Application
Data collection: We have to choose the data type with great care. Here, we select the English characters A-Z, the numerals 0-9, and a few special characters such as # and $.
Image processing: This is the main stage of the character recognition process. A digital image containing an English character is generally an RGB (red, green, blue) image. In the preprocessing stage the RGB image is converted to a greyscale image, in which the value of each pixel carries intensity information varying from white at the weakest intensity to black at the strongest. The greyscale image is then converted to a binary image: a greyscale image has 256 levels between black and white, and if a pixel value is less than or equal to 128 the pixel is set to 0, meaning pure white, while if the value is greater than 128 it is set to 1, meaning pure black. A pixel containing the value 1 is thus a black spot and 0 a white one. Next, in the processing stage, the feature is extracted by pixel grabbing (collecting the spots with value 1), since pixels with value 1 are expected to contain the character. The entire image is then sampled down to the specified portion (the area containing the character). Once sampled, we obtain rows and columns of 0s and 1s; each row is collected and the rows are combined into a vector that represents the particular character. Finally, the data obtained from the processing stage is compared with the trained data by a Kohonen neural network, and the character or word is recognized.
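A minimal sketch of this preprocessing chain, assuming the character image arrives as a BufferedImage; the 9x9 sampling grid is chosen to match the 81-element vector mentioned next, while the actual sampling in the project may differ.

```java
import java.awt.image.BufferedImage;

// Sketch: RGB -> greyscale -> binary (threshold 128) -> 81-element vector,
// sampling the image on a 9x9 grid. Follows the paper's convention that
// higher intensity means darker, so 1 marks a black spot and 0 a white one.
public class FeatureExtractor {

    public static double[] toVector(BufferedImage img) {
        int w = img.getWidth(), h = img.getHeight();
        double[] vector = new double[81];
        for (int gy = 0; gy < 9; gy++) {
            for (int gx = 0; gx < 9; gx++) {
                int x = gx * w / 9, y = gy * h / 9;
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                // Darkness rises from 0 (white) to 255 (black).
                int darkness = 255 - (r + g + b) / 3;
                vector[gy * 9 + gx] = (darkness > 128) ? 1 : 0;
            }
        }
        return vector;
    }
}
```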
Since the vector length is 81, the input layer has 81 input neurons; the number of output neurons depends on the number of characters trained with the neural network, and here the output layer has 162 output neurons. The output from a Kohonen neural network does not consist of the outputs of several neurons: when a pattern is presented to a Kohonen network, one of the output neurons is selected as the "winner", and this "winning" neuron is the output of the network. [4]
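The winner-take-all recognition step can be sketched as follows; the trained weight vectors and labels are assumed to come from a training phase that is not shown, and Euclidean distance is used as the similarity measure.

```java
// Sketch of the Kohonen recognition step: present an 81-element input
// vector and select the output neuron whose weight vector is closest.
public class KohonenRecognizer {
    private final double[][] weights;   // [outputNeurons][81], from training
    private final char[] labels;        // character associated with each neuron

    public KohonenRecognizer(double[][] weights, char[] labels) {
        this.weights = weights;
        this.labels = labels;
    }

    // The "winning" neuron is the one with minimum squared distance.
    public char recognize(double[] input) {
        int winner = 0;
        double best = Double.MAX_VALUE;
        for (int n = 0; n < weights.length; n++) {
            double sum = 0.0;
            for (int i = 0; i < input.length; i++) {
                double d = input[i] - weights[n][i];
                sum += d * d;
            }
            if (sum < best) {
                best = sum;
                winner = n;
            }
        }
        return labels[winner];
    }
}
```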
The project uses a 12V, 2.5A battery which supplies power to the whole robotic design. Power from the battery is fed to a regulator, and from the regulator constant voltages of 12V and 5V are supplied to the respective electronic devices (microcontroller and Bluetooth module: 5V; RF reader, relays, and motors: 12V).
The RF reader is connected to the interrupt port, and the Bluetooth module is connected to the C6 and C7 ports, which are the USART asynchronous transmitter and receiver. D0, D1, D2, and D3 are input/output ports to which the ULN2003A relay driver is connected; it drives the relays, which in turn drive the DC motors.
Fig 6.1 Hardware circuitry of robotic design
Software Interfaces
Application: Java, J2EE, Android
Java: A widely used general-purpose programming language. The optical character recognition process is written in this language.
J2EE: A widely used platform for web-based applications. The captured image is transferred from the mobile phone to the server over the internet through a Java server page.
Android: The platform used for the mobile application.
Tomcat Server: The web container in which the application runs.
Network: The application depends on the internet and Bluetooth.
Operating System: Android operating system version 2.2 or higher is needed.
Text to speech: Used to convert text to voice; it is a built-in capability on Android mobiles. Whenever the robot gets an instruction from the server, it speaks its current location aloud and then navigates further, as sketched below.
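A minimal sketch of this text-to-speech step, using Android's built-in TextToSpeech API with the three-argument speak call available on Android 2.2:

```java
import java.util.Locale;
import android.content.Context;
import android.speech.tts.TextToSpeech;

// Sketch: speak the instruction received from the server using
// Android's built-in text-to-speech engine.
public class InstructionSpeaker implements TextToSpeech.OnInitListener {
    private final TextToSpeech tts;

    public InstructionSpeaker(Context context) {
        tts = new TextToSpeech(context, this);
    }

    @Override
    public void onInit(int status) {
        if (status == TextToSpeech.SUCCESS) {
            tts.setLanguage(Locale.ENGLISH);
        }
    }

    public void speak(String instruction) {
        // QUEUE_FLUSH drops anything still being spoken.
        tts.speak(instruction, TextToSpeech.QUEUE_FLUSH, null);
    }
}
```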
APPLICATIONS
Automatic license plate recognition: The OCR robot can be used in automatic number plate recognition. Once the number plate image has been captured, the robot applies OCR to recognize the number plate.
Automated guided vehicles: The OCR robot can be used as an automated guided vehicle with reading capability.
Librarian robot: Once the navigation system guides the robot to the correct bookshelf, a scanning strategy followed by the OCR technique recognizes the correct book.
Navigation of places with a fixed boundary: The OCR robot can be used to navigate places with a fixed boundary, such as zoos and military bases. The robot can also be used for navigation of indoor facilities such as offices, schools, and hostels.
EXPERIMENTS AND RESULTS
The signboard is designed so that the robot can perform the key functions of signboard detection and identification. The signboards used are standardized: black characters, with character sizes from 38 to 48, written in the Arial typeface. A few sample signboards are shown below.
Fig 4.1 Few sample signboards
When the robot meets all the standard conditions, it gives fair results.
The Apache Tomcat web container runs on the server; as and when it receives an image, it applies OCR and sends the instruction to the robot to navigate.
Fig 4.2 Side view of the robot
Fig 4.3 Top view of the robot
CONCLUSION
The character recognition process started with Tauschek's patented reading machine and has evolved as technology has developed, but optical character recognition is still an inexact science and demands some standardization. In future, OCR technology can be further improved so that characters can be detected without standardization.
The project work demonstrates that it is possible for a mobile robot to read signboards autonomously and hence carry out the navigation process using the characters and symbols on the signboard. A neural network was trained to identify characters, taking into consideration the various possible viewpoints. This is surely a promising starting point for increasing the reading capabilities of mobile robots.
Future scope
Optical character recognition technology can still be developed to recognize characters of nonstandard size and in uncontrolled conditions, which will resolve the difficulties in detecting and reading text accurately.
The main future work in landmark recognition involves:
- Using a better OCR technique to recognize the extracted text from landmarks.
- Applying an affine rectification to correct the perspective distortion caused by the camera view angle, in order to increase the text recognition rate.
REFERENCES
[1] S. Mori, C. Y. Suen, and K. Yamamoto, "Historical review of OCR research and development," Proceedings of the IEEE, vol. 80, pp. 1029-1058, July 1992.
[2] A. Elfes, "Using occupancy grids for mobile robot perception and navigation," Computer, vol. 22, no. 6, pp. 46-57, 1989 IEEE.
[3] D. Letourneau, F. Michaud, J.-M. Valin, and C. Proulx, "Making a mobile robot read textual messages," 2003 IEEE.
[4] N. Vishvanath, S. Somasundaram, and Nishad A. N. Krishnan, "Indian licence plate character recognition using Kohonen neural network," 2012 IEEE.
[5] J. Heaton, Introduction to Neural Networks in Java, H.R. Publications, 2002.
[6] Vishvas M., Arjun M. M., and Dinesh R., "Handwritten Kannada character recognition based on Kohonen neural network," 2012 IEEE.