Mobile Camera Based Text Detection and Translation

Ravindra Bandal; Adesh Jadhav; Vitthal Kale.

doi:10.17577/IJERTV3IS11034

Volume 03, Issue 01 (January 2014)


Open Access
Paper ID : IJERTV3IS11034
Published (First Online): 30-01-2014
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Ravindra Bandal, Adesh Jadhav, Vitthal Kale.

B.E. Computer Engineering, Navsahyadri Education Society's Group of Institutions, Pune.

Abstract –

Text in a natural image directly carries rich high-level semantic information about a scene, which can be used to assist a wide variety of applications, such as image understanding, image indexing and search, geolocation or navigation, and human-computer interaction. However, most existing text detection and recognition systems are designed for horizontal or near-horizontal text. With the increasing popularity of computing-on-the-go devices, detecting text of arbitrary orientation in images taken by such devices under less controlled conditions has become an increasingly important yet challenging task.

In this project, we use a new algorithm to detect text of arbitrary orientation in natural images. Our algorithm is based on a two-level classification scheme and utilizes two sets of features specially designed to capture both the intrinsic and the orientation-invariant characteristics of text. To better evaluate the proposed method and compare it with other existing algorithms, we generate a more extensive and challenging dataset, which includes various types of text in diverse real-world scenes.

Keywords:-

Text detection, fuzzy clustering, edge profile, signboard image, OCR.

Introduction

Text in a natural image carries information which can be used to assist a wide variety of applications, such as image understanding, image indexing and search, geolocation or navigation, and human-computer interaction. However, most existing text detection and recognition systems are designed for horizontal or near-horizontal text. With the increasing popularity of computing-on-the-go devices, detecting text of arbitrary orientation in images taken by such devices under less controlled conditions has become an increasingly important yet challenging task.

In this project, we use a new algorithm to detect text of arbitrary orientation in natural images. Our algorithm is based on a two-level classification scheme and utilizes two sets of features specially designed to capture both the intrinsic and the orientation-invariant characteristics of text. To better evaluate the proposed method and compare it with other existing algorithms, we generate a more extensive and challenging dataset, which includes various types of text in diverse real-world scenes. We also use a new evaluation protocol, which is more suitable for benchmarking algorithms designed for text of varying orientations. Experiments on conventional benchmarks and the new dataset demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal text and achieves significantly enhanced performance on text of arbitrary orientation in complex natural scenes.
Existing system

With existing systems, it is difficult to make changes to an existing document, but with the help of this application it is easy to make changes to the file.
Problem statement

Mobile camera based text detection and translation can draw on a dictionary of all the words in the English language, or on a more technical vocabulary for a specific field. This technique can be problematic if the document contains words that are not in the dictionary, such as proper nouns. Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy. The output may be a plain text stream or a file of characters, but more sophisticated OCR systems can preserve the original layout of the page and produce, for example, an annotated PDF that includes both the original image of the page and a searchable textual representation.
Proposed system

In this system we provide an effective way to make a document more editable. This technique can be problematic if the document contains words that are not in the dictionary, such as proper nouns. Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy. The output may be a plain text stream or a file of characters, but more sophisticated OCR systems can preserve the original layout of the page.
System Requirement
1. Hardware Interfaces:
  Processor : At least Pentium
  RAM : 512 MB
  Hard Disk : 2 GB
2. Software Interfaces:
  Operating System : Windows XP/7/8
  Technology : ADT, Java (JDK 1.6)
  Front End : Android
  Database : MySQL

Java (JDK 1.6)

Java is a general purpose programming language with a number of features that make the language well suited for use on the World Wide Web. Small Java applications are called Java applets and can be downloaded from a Web server and run on your computer by a Java-compatible Web browser, such as Netscape Navigator or Microsoft Internet Explorer.
System Architecture:

The proposed system consists of four stages:
1. component extraction
2. component analysis
3. candidate linking
4. chain analysis

These stages can be further categorized into two procedures, bottom-up grouping and top-down pruning, as shown in Figure 6.5. In the bottom-up grouping procedure, pixels are first grouped into connected components, and later these connected components are aggregated to form chains; in the top-down pruning procedure, non-text components and chains are successively identified and eliminated. These two procedures are applied alternately when detecting text in images.

At the chain analysis stage, the chains determined at the former stage are verified by a chain-level classifier. The chains with low classification scores (probabilities) are discarded. The chains may be in any direction, so a candidate might belong to multiple chains; the interpretation step aims to dispel this ambiguity. The chains that pass this stage are the final detected texts.

The remainder of this paper is organized as follows. Section II presents the details of the proposed method, including the algorithm pipeline and the two sets of features. Section III introduces the proposed dataset and evaluation protocol. The experimental results and discussions are given in Section IV. Section V concludes the paper and points out potential directions for future research.
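The chain analysis stage described above can be illustrated with a minimal Java sketch. The text does not specify how the interpretation step resolves a candidate that belongs to several chains, so the greedy highest-score-first rule used here is an assumption, as are the `Chain` class, the `selectChains` method, and the `minScore` parameter. The classifier itself is not shown; its scores are taken as given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical chain: ids of its candidate components plus a classifier score in [0, 1]. */
final class Chain {
    final List<Integer> componentIds;
    final double score;

    Chain(List<Integer> componentIds, double score) {
        this.componentIds = componentIds;
        this.score = score;
    }
}

final class ChainAnalysis {
    /**
     * Discards chains scoring below minScore, then resolves shared candidates
     * greedily (an assumed rule): chains are visited from highest to lowest
     * score, and a chain is kept only if none of its components has already
     * been claimed by a kept chain.
     */
    static List<Chain> selectChains(List<Chain> chains, double minScore) {
        List<Chain> survivors = new ArrayList<Chain>();
        for (Chain c : chains) {
            if (c.score >= minScore) {
                survivors.add(c);
            }
        }
        // Sort by descending score so stronger chains claim components first.
        Collections.sort(survivors, new Comparator<Chain>() {
            @Override
            public int compare(Chain a, Chain b) {
                return Double.compare(b.score, a.score);
            }
        });
        Set<Integer> claimed = new HashSet<Integer>();
        List<Chain> kept = new ArrayList<Chain>();
        for (Chain c : survivors) {
            boolean overlaps = false;
            for (Integer id : c.componentIds) {
                if (claimed.contains(id)) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                kept.add(c);
                claimed.addAll(c.componentIds);
            }
        }
        return kept;
    }
}
```

With chains scored 0.9, 0.7, 0.2 and 0.6, where the first two share a component and `minScore` is 0.5, the sketch keeps the 0.9 and 0.6 chains: the 0.2 chain fails the score test and the 0.7 chain loses the shared component to the stronger chain.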
  
Figure 6.5: System pipelined architecture

Workflow of System

Figure 6.6: Workflow of System
Algorithms for converting color to grayscale

How do you convert a color image to grayscale? If each color pixel is described by a triple (R, G, B) of intensities for red, green, and blue, how do you map that to a single number giving a grayscale value? Three algorithms are described here.

The lightness method averages the most prominent and least prominent colors: (max(R, G, B) + min(R, G, B)) / 2.

The average method simply averages the values: (R + G + B) / 3.

The luminosity method is a more sophisticated version of the average method. It also averages the values, but it forms a weighted average to account for human perception. We're more sensitive to green than to other colors, so green is weighted most heavily. The formula for luminosity is 0.21 R + 0.72 G + 0.07 B, with weights that sum to 1.

Figure: Example sunflower image converted with each method (panels: Original image, Lightness, Average, Luminosity)

The lightness method tends to reduce contrast. The luminosity method works best overall. However, some images look better using one of the other algorithms. And sometimes the three methods produce very similar results.
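The three conversions described above can be sketched in Java as follows. This is a minimal illustration, not the project's actual code: the class and method names are hypothetical, channel values are assumed to be integers in 0..255, and the luminosity weights 0.21, 0.72 and 0.07 are used so that they sum to 1. The packed-pixel helper assumes the 0xAARRGGBB layout that Android uses for color ints.

```java
final class Grayscale {
    /** Lightness: mean of the most and least prominent channels. */
    static int lightness(int r, int g, int b) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        return (max + min) / 2;
    }

    /** Average: plain mean of the three channels (integer division). */
    static int average(int r, int g, int b) {
        return (r + g + b) / 3;
    }

    /** Luminosity: weighted mean, with green weighted most heavily. */
    static int luminosity(int r, int g, int b) {
        return (int) Math.round(0.21 * r + 0.72 * g + 0.07 * b);
    }

    /** Luminosity of a packed pixel, assuming the 0xAARRGGBB layout. */
    static int luminosityOfPixel(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return luminosity(r, g, b);
    }
}
```

For a pure red pixel (255, 0, 0) the three methods give 127, 85 and 54 respectively, which shows how strongly the choice of method can change the result for saturated colors.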
Thresholding

Thresholding is the simplest method of image segmentation. From a grayscale image, thresholding can be used to create binary images. During the thresholding process, individual pixels in an image are marked as "object" pixels if their value is greater than some threshold value (assuming an object to be brighter than the background) and as "background" pixels otherwise. This convention is known as threshold above. Variants include threshold below, which is the opposite of threshold above; threshold inside, where a pixel is labeled "object" if its value is between two thresholds; and threshold outside, which is the opposite of threshold inside. Typically, an object pixel is given a value of 1 while a background pixel is given a value of 0. Finally, a binary image is created by coloring each pixel white or black, depending on the pixel's label.
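The four labeling conventions above can be written down directly. The following Java sketch uses hypothetical names and represents a grayscale image as an `int[][]` of intensities. Treating the two bounds of threshold inside as inclusive is an assumption, since the text only says "between two thresholds".

```java
final class Threshold {
    /** Threshold above: object (1) if the value is greater than t. */
    static int above(int value, int t) {
        return value > t ? 1 : 0;
    }

    /** Threshold below: the opposite labeling of threshold above. */
    static int below(int value, int t) {
        return 1 - above(value, t);
    }

    /** Threshold inside: object if lo <= value <= hi (inclusive bounds assumed). */
    static int inside(int value, int lo, int hi) {
        return (value >= lo && value <= hi) ? 1 : 0;
    }

    /** Threshold outside: the opposite labeling of threshold inside. */
    static int outside(int value, int lo, int hi) {
        return 1 - inside(value, lo, hi);
    }

    /** Builds a binary image (1 = object, 0 = background) using threshold above. */
    static int[][] binarize(int[][] gray, int t) {
        int[][] binary = new int[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            binary[y] = new int[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                binary[y][x] = above(gray[y][x], t);
            }
        }
        return binary;
    }
}
```

To display the result, each 1 would be drawn as white and each 0 as black, matching the convention that the object is brighter than the background.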

8.1 Threshold selection

The key parameter in the thresholding process is the choice of the threshold value (or values, as mentioned earlier). Several different methods for choosing a threshold exist; users can manually choose a threshold value, or a thresholding algorithm can compute a value automatically, which is known as automatic thresholding. A simple method would be to choose the mean or median value, the rationale being that if the object pixels are brighter than the background, they should also be brighter than the average. In a noiseless image with uniform background and object values, the mean or median will work well as the threshold; however, this will generally not be the case. A more sophisticated approach might be to create a histogram of the image pixel intensities and use the valley point as the threshold. The histogram approach assumes that there is some average value for the background and object pixels, but that the actual pixel values have some variation around these average values. However, this may be computationally expensive, and image histograms may not have clearly defined valley points, often making the selection of an accurate threshold difficult. One method that is relatively simple, does not require much specific knowledge of the image, and is robust against image noise
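The mean-value rule above is easy to sketch, and the sketch below pairs it with an iterative refinement. The text breaks off before naming its final method, so treating it as iterative (isodata-style) threshold selection is an assumption, not something the source confirms. The class and method names and the iteration cap are hypothetical, and the pixel array is assumed to be non-empty.

```java
final class ThresholdSelection {
    /** Simple automatic threshold: the mean intensity of all pixels. */
    static int meanThreshold(int[] pixels) {
        if (pixels.length == 0) {
            throw new IllegalArgumentException("pixels must not be empty");
        }
        long sum = 0;
        for (int p : pixels) {
            sum += p;
        }
        return (int) (sum / pixels.length);
    }

    /**
     * Iterative selection (assumed method): start from the mean, split the
     * pixels into object (> t) and background (<= t), and move t to the
     * midpoint of the two class means until it stops changing.
     */
    static int iterativeThreshold(int[] pixels) {
        int t = meanThreshold(pixels);
        for (int iteration = 0; iteration < 256; iteration++) { // cap guards against cycling
            long sumLow = 0, sumHigh = 0;
            int countLow = 0, countHigh = 0;
            for (int p : pixels) {
                if (p > t) {
                    sumHigh += p;
                    countHigh++;
                } else {
                    sumLow += p;
                    countLow++;
                }
            }
            if (countLow == 0 || countHigh == 0) {
                return t; // all pixels fall in one class; nothing to refine
            }
            int next = (int) ((sumLow / countLow + sumHigh / countHigh) / 2);
            if (next == t) {
                return t;
            }
            t = next;
        }
        return t;
    }
}
```

When dark background pixels greatly outnumber bright object pixels, the plain mean is pulled toward the background, while the iterative rule settles midway between the two class means.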
Template matching

Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as a part of quality control, as a way to navigate a mobile robot, or as a way to detect edges in images.
For templates without strong features, or when the bulk of the template image constitutes the matching image, a template-based approach may be effective. Since template-based matching may require sampling a large number of points, the number of sampling points can be reduced in two ways: by reducing the resolution of the search and template images by the same factor and performing the operation on the resulting downsized images (multiresolution, or pyramid, image processing); or by providing a search window of data points within the search image so that the template does not have to be compared at every viable data point. The two can also be combined.
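A minimal Java sketch of template-based matching follows. The text does not name a similarity measure, so the sum of absolute differences (SAD) used here is an assumption, as are the class and method names. Images are grayscale `int[][]` arrays, assumed rectangular and non-empty. The `downsample` helper illustrates the resolution reduction mentioned above by halving both dimensions.

```java
final class TemplateMatch {
    /**
     * Slides the template over every position of the image and returns
     * {row, col} of the top-left corner with the smallest SAD, or {-1, -1}
     * if the template does not fit inside the image.
     */
    static int[] bestMatch(int[][] image, int[][] template) {
        int imageHeight = image.length, imageWidth = image[0].length;
        int tmplHeight = template.length, tmplWidth = template[0].length;
        long bestSad = Long.MAX_VALUE;
        int bestRow = -1, bestCol = -1;
        for (int r = 0; r + tmplHeight <= imageHeight; r++) {
            for (int c = 0; c + tmplWidth <= imageWidth; c++) {
                long sad = 0;
                // Stop summing rows once this position is already worse than the best.
                for (int y = 0; y < tmplHeight && sad < bestSad; y++) {
                    for (int x = 0; x < tmplWidth; x++) {
                        sad += Math.abs(image[r + y][c + x] - template[y][x]);
                    }
                }
                if (sad < bestSad) {
                    bestSad = sad;
                    bestRow = r;
                    bestCol = c;
                }
            }
        }
        return new int[] {bestRow, bestCol};
    }

    /** Halves the resolution by averaging 2x2 blocks; an odd last row or column is dropped. */
    static int[][] downsample(int[][] img) {
        int h = img.length / 2, w = img[0].length / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out[y][x] = (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
                        + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4;
            }
        }
        return out;
    }
}
```

A coarse-to-fine search would call `downsample` on both the search image and the template, run `bestMatch` on the small images, and then refine the result in a small window of the full-resolution image around the doubled coordinates.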
Advantages
- Translation is easier and less expensive with a mobile device equipped with real-time machine translation.
- Real-time mobile translation can help people while travelling.
- The portability of mobile phones makes it convenient.
Disadvantages
- No cursive character detection.
- Not suitable for low image quality.
Application
- Making textual versions of printed documents more quickly.
- Converting handwriting in real time to control a computer.
Conclusion

We have presented a text detection system that detects text of arbitrary orientation in complex natural scenes. Our system compares favorably with the state-of-the-art algorithms when handling horizontal text and achieves significantly enhanced performance on text of arbitrary orientation. Furthermore, we have presented an approach for automatic detection and binarization of text for application to mobile systems. The proposed method can robustly detect and binarize the main text in signboard images. First, we detect the main text region using an edge histogram method along the horizontal and vertical directions of the edge map image. After text region verification, the detected region is segmented by fuzzy c-means clustering, and each resulting region is checked to determine whether it is a text region.
