Voice Computing: Technology for Next Technical Era

Amit Ashok Mokashi

doi:10.17577/IJERTV2IS120894

Volume 02, Issue 12 (December 2013)

Open Access
Authors : Amit Ashok Mokashi
Paper ID : IJERTV2IS120894
Volume & Issue : Volume 02, Issue 12 (December 2013)
Published (First Online): 26-12-2013
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Professor at Sharadchandra Pawar College of Engg., Otur (Pune)

Abstract

Voice computing is a newly proposed system for the latest generation of technology. Today's computer systems remain usable even when the operator is unable to speak, but they make no provision for people with other physical disabilities. As the name of the project indicates, we implement computing based entirely on voice, especially for blind or physically handicapped people: if a person is mentally fit, the weakness of a physical organ must not be a limitation on learning or operating a computer. To overcome this problem, we design a system that provides the following facilities in a single unit:

-Controlling operating system

-Talking editor

-People choice

-Mail reader

-Security module

The concept of voice computing is useful to many groups of users, such as blind people, officers, students and housewives.

Key words: Speech Recognition, Voice computing, speech, text

Introduction

This paper presents a brief survey of automatic speech recognition (ASR) and discusses the major themes and advances of the past 60 years of research, so as to provide a technological perspective and an appreciation of the fundamental progress accomplished in this important area of speech communication. After years of research and development, the accuracy of automatic speech recognition remains one of the important research challenges, because of variations in context, speaker and environment. The design of a speech recognition system requires careful attention to the following issues: definition of the various types of speech classes, speech representation, feature extraction techniques, speech classifiers, databases and performance evaluation. The existing problems in ASR and the techniques that researchers have devised to solve them are presented in chronological order. The author hopes that this work will be a contribution to the area of speech recognition. The objective of this review is to summarize and compare some of the well-known methods used in the various stages of a speech recognition system, and to identify research topics and applications at the forefront of this exciting and challenging field.

Proposed system

Language is man's most important means of communication, and speech is its primary medium. Research on speech brings together the disciplines that contribute to our understanding of its production, perception, processing, learning and use. Spoken interaction, both between human interlocutors and between humans and machines, is inescapably embedded in the laws and conditions of communication, which comprise the encoding and decoding of meaning as well as the mere transmission of messages over an acoustical channel. Here we deal with this interaction between man and machine through synthesis and recognition applications. The paper dwells on speech technology and on the conversion of speech into analog and digital waveforms that machines can process. Speech recognition, or speech-to-text, involves capturing and digitizing the sound waves, converting them to basic language units (phonemes), constructing words from the phonemes, and contextually analyzing the words to ensure correct spelling for words that sound alike.
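The last step above, choosing the correct spelling among words that sound alike, can be illustrated with a minimal sketch. It assumes a tiny hand-made corpus and hand-picked homophone groups (CORPUS, HOMOPHONE_SETS and fix_homophones are hypothetical names introduced only for this illustration); it is not the method used in the proposed software.

```python
from collections import Counter

# Hypothetical homophone groups and toy corpus, chosen only for illustration.
HOMOPHONE_SETS = [{"to", "two", "too"}, {"their", "there"}]
CORPUS = (
    "i want to open two files . open the file to read . "
    "there are two files . save their files too"
).split()

# Count how often each pair of adjacent words occurs in the corpus.
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))


def candidates(word):
    """Return all spellings that sound like `word` (or just the word itself)."""
    for group in HOMOPHONE_SETS:
        if word in group:
            return sorted(group)
    return [word]


def fix_homophones(words):
    """Pick, for each word, the homophone that best fits its neighbours."""
    result = []
    for i, word in enumerate(words):
        prev_word = result[-1] if result else None
        next_word = words[i + 1] if i + 1 < len(words) else None

        def score(candidate):
            # Context score: bigram count with the previous and the next word.
            return BIGRAMS[(prev_word, candidate)] + BIGRAMS[(candidate, next_word)]

        # On ties, keep the word the recognizer originally produced.
        result.append(max(candidates(word), key=lambda c: (score(c), c == word)))
    return result


print(fix_homophones("open to files".split()))
print(fix_homophones("want two read".split()))
```

A real recognizer would estimate such statistics from a far larger corpus; the point here is only that neighbouring words carry enough information to separate spellings that are acoustically identical.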

We drew on the key features of existing software packages and combined the most important attributes into a single system. The modules provided in our proposed software are described in detail as follows.

In the security module, we provide options for encryption and decryption of data, performed using the AES and RSA algorithms. This module offers an advanced option for strong security, because passwords alone are comparatively weak and are exposed to attacks such as shoulder surfing. Cryptography is a modern technique and is easy to implement. In encryption, the normal text is converted to cipher text and the key is provided to the receiver. At the receiver's end, decryption converts the cipher text back to normal text.
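The round trip from normal text to cipher text and back can be sketched with textbook RSA. This is a toy illustration under stated assumptions: the primes 61 and 53 and the exponent 17 are arbitrary small values, there is no padding, and the function names are hypothetical. It is not secure and does not represent the paper's implementation; a real system would rely on a vetted cryptographic library.

```python
# Toy textbook RSA: illustrative only, NOT secure (tiny primes, no padding).

def make_keys(p: int, q: int, e: int = 17):
    """Build a (public, private) key pair from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    return (e, n), (d, n)


def encrypt(text: str, public_key):
    """Convert normal text to cipher text, one integer per byte."""
    e, n = public_key
    return [pow(byte, e, n) for byte in text.encode("utf-8")]


def decrypt(cipher, private_key):
    """Convert cipher text back to normal text."""
    d, n = private_key
    return bytes(pow(value, d, n) for value in cipher).decode("utf-8")


public_key, private_key = make_keys(61, 53)
cipher = encrypt("voice", public_key)
plain = decrypt(cipher, private_key)
print(cipher)
print(plain)
```

In RSA the sender encrypts with the receiver's public key and only the receiver holds the private key; with a symmetric cipher such as AES, the same secret key must be shared with the receiver, as the paragraph above describes.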

Three primary speech technologies are used in talking word-editor applications: stored speech, text-to-speech and speech recognition. Stored speech is computer speech produced from an actual human voice recording that is held in the computer's memory and can be used in any of several ways. Speech can also be synthesized from plain text in a process known as text-to-speech, which enables talking word-processing applications to read from a textual database.

The first step in voice recognition is for an individual to produce an actual voice sample. Voice production is something we take for granted every day, but the actual process is complicated. Sound originates at the vocal cords, which are separated by a gap. When we attempt to communicate, the muscles that control the vocal cords contract. As a result the gap narrows, and as we exhale, the breath passes through the gap and creates sound. The unique pattern of an individual's voice is then produced by the vocal tract, which consists of the laryngeal pharynx, oral pharynx, oral cavity, nasal pharynx and nasal cavity. It is these unique patterns created by the vocal tract that voice recognition systems use. Even though people may sound alike to the human ear, everybody has, to some degree, a different or unique enunciation in their speech.

Current applications of voice recognition systems include physical access control and situations where remote identity verification is required. Examples include call center automation and transaction processing by telephone or computer. Popular applications in this area are financial transactions (account access, funds transfer, bill payment, trading of financial instruments) and credit card processing (address changes, balance transfers, loss prevention). Voice recognition has also made an impact in the penal system, where the technology has been used for inmates on parole, juvenile inmates and those under house arrest.

However, voice recognition technology has not been as widely adopted and utilized as the other biometric technologies examined in previous articles (iris recognition, fingerprint recognition, hand geometry recognition, and facial recognition).

Some future applications for voice recognition systems include Customer Relationship Management (CRM) applications, wireless products and Voice over IP (VoIP).

What is Voice Computing?

Voice recognition is an alternative to typing on a keyboard. Put simply, you talk to the computer and your words appear on the screen. The software has been developed to provide a fast method of writing on a computer and can help people with a variety of disabilities. It is useful for people with physical disabilities who often find typing difficult, painful or impossible. Voice-recognition software can also help those with spelling difficulties, including users with dyslexia, because recognized words are almost always correctly spelled.

Voice-Computing Software

Voice-recognition software programs work by analyzing sounds and converting them to text. They also use knowledge of how English is usually spoken to decide what the speaker most probably said. Once correctly set up, the systems should recognize around 95% of what is said if you speak clearly. Several programs are available that provide voice recognition. These systems have mostly been designed for Windows operating systems, but programs are also available for Mac OS X. In addition to third-party software, there are voice-recognition programs built in to the Windows Vista and Windows 7 operating systems. Most specialist voice applications include the software, a microphone headset, a manual and a quick reference card. You connect the microphone to the computer, either through the sound card (sockets on the back of the computer) or via a USB or similar connection.
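A figure such as "around 95%" is commonly made measurable through the word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below assumes that reading of the figure (accuracy taken as 1 minus WER); the function name and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Dynamic-programming (Levenshtein) table, kept one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical dictation sample: 20 reference words, one misrecognized ("to" heard as "two").
spoken = "please open the mail reader and read the first message to me then close the editor and lock the screen"
heard = "please open the mail reader and read the first message two me then close the editor and lock the screen"
wer = word_error_rate(spoken, heard)
print(f"WER = {wer:.2f}, word accuracy = {1 - wer:.0%}")
```

One wrong word in twenty corresponds to the 95% level quoted above, which shows that even a well-configured system still leaves a noticeable amount of correction work in a long document.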

Future scope

The system can be extended to continuous word recognition with a large vocabulary based on a phone acoustic model. The work can be carried further to add modifications and features. The current software does not support a large vocabulary; more samples will be collected in order to increase the efficiency of the software. The current version supports only a few areas of Notepad, and effort will be made to cover more. The scope of the project can also be extended to implement the system on a small microchip, so as to make it more popular, cost-effective and user-friendly.

Conclusion

In this paper we present a scheme to convert speech to text as well as text to speech. The key factor in designing such a system is the target audience. For example, physically handicapped people should be able to wear a headset and have their hands and eyes free while operating the system. A word-based acoustic model is used, which is suitable only for a limited vocabulary: as the size of the vocabulary increases, the performance of the system decreases. The system also cannot properly distinguish between similar-sounding words, such as "to" and "two", because they have similar phonemes. We conclude that this project can be used at very large scale with very few modifications. During the experimental work a medium-size vocabulary system was implemented. The system can be extended to continuous word recognition with a large vocabulary based on a phone acoustic model, using the HMM technique or other growing techniques such as artificial neural networks.
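To give a flavour of the HMM technique mentioned above, the sketch below runs Viterbi decoding over a toy hidden Markov model. Everything in it is an assumption made for illustration: the two phoneme states, the two quantized acoustic symbols and all the probabilities are invented, and the code does not describe the implemented system.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (log-probability of the best path ending in s at time t, previous state)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Best predecessor state for s at this time step.
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)

    # Backtrack from the most likely final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical two-phoneme model for a word such as "two": /t/ followed by /uw/.
states = ["t", "uw"]
start_p = {"t": 0.9, "uw": 0.1}
trans_p = {"t": {"t": 0.4, "uw": 0.6}, "uw": {"t": 0.1, "uw": 0.9}}
emit_p = {"t": {"burst": 0.8, "vowel": 0.2}, "uw": {"burst": 0.1, "vowel": 0.9}}

decoded = viterbi(["burst", "vowel", "vowel"], states, start_p, trans_p, emit_p)
print(decoded)
```

In a phone-based recognizer, the same decoding step runs over models for every phoneme, so words outside a fixed word list can be assembled from shared units; this is why a phone acoustic model scales to larger vocabularies better than a word-based one.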