- Open Access
- Authors : Vatsala Mathapati, Naveen Kumar C, Lakshmi. P
- Paper ID : IJERTCONV4IS22077
- Volume & Issue : ICACT – 2016 (Volume 4 – Issue 22)
- Published (First Online): 24-04-2018
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Application of Language Model in ATC
Vatsala Mathapati
M.Tech, Dept. of CSE, AMC Engineering College, Bengaluru, India
Naveen Kumar C
Assistant Professor, Dept. of CSE, AMCEC, Bengaluru, India
Lakshmi P
Senior Scientist, Aerospace Electronics and Systems Division, National Aerospace Laboratories (CSIR-NAL), Bengaluru, India
Abstract - Language Models have been studied for many years, yet progress in applying them is still considered unsatisfactory. A Language Model (LM) is one component of a Speech Recognition system and can be built for many languages. The main aim of applying a Language Model to Air Traffic Control is to reduce miscommunication between the Air Traffic Controller and the pilot, and to overcome accent and dialect problems. An approximate Language Model, which includes a Dictionary, is built in Java using a large collection of phoneme data. It targets Indian English Air Traffic Controllers only.
Keywords - Speech Recognition; Language Model; Air Traffic Controller commands
INTRODUCTION
This paper addresses the training of Air Traffic Controllers (ATC) [1]. To convey information from the ATC to the pilot, Speech Recognition (SR) technology is required: SR converts spoken words into text using an Acoustic Model [2] and a Language Model (LM). Before the LM is planned, the SR system and the microphone must be selected. To recognize speech, the SR system requires a high-frequency microphone that picks up the sound waves faithfully [3] and converts them into an electrical signal. The speaker should work in a calm, quiet environment so that background noise is not picked up.
Language Models are not used only in Speech Recognition. They are also useful in fields such as handwriting recognition, spelling correction, and Chinese text entry. As in speech recognition, these are areas where the input is ambiguous in some way, and a language model can help determine the most probable input [3]. We are also working on finding new uses for language models in other areas. The University of Edinburgh has a large group of researchers working on speech recognition [2]. Their work ranges from very basic research (building mathematical models of how speech works), through research into recognizing the speech of elderly users, to recognizing speech recorded with distant microphones (e.g. on a table top). Speech Recognition technology has enormous potential for human-computer interaction. Over the last ten years, recognition technology has begun to appear in a large number of devices, from home PCs to mobile phones and home security systems. It is the simplest way for a user to interact, since speech is the most common and natural form of communication, and it does not require complicated user interfaces.
Speech Recognition has many possible uses in home care. It can be used to control things such as heating, lighting, or appliances. It also forms a key part of spoken dialogue systems. People can communicate with a speech recognizer using a telephone or by wearing a clip-on microphone (which may be wireless). Microphones may also be attached to the walls or placed on a table, although this makes the recognition task harder.
Use of Language Models in a Speech Recognition System. The acoustic component of a speech recognition system produces a sequence of phonemes or phonetic segments, which can be treated as a set of hypotheses corresponding to the text recognized by the system. During recognition, the sequence of symbols generated by the acoustic component is compared with the set of words in the vocabulary in order to produce the optimal sequence of words, which forms the system's final output. It is important to introduce rules at this stage that describe the linguistic restrictions of the language and reduce the number of possible valid phoneme sequences [1,2,3]. This is accomplished through the use of a language model. A language model comprises two main components: the vocabulary, which is the set of words that the system can recognize, and the grammar, which is the set of rules governing how the words of the vocabulary can be arranged into groups and form sentences. The grammar can consist of formal linguistic rules or can be a stochastic model. Linguistic models impose strong restrictions on the allowable sequences of words but can become computationally demanding when incorporated into a speech recognition system. They also have the problem of not allowing grammatically incorrect sentences, which are often present in spontaneous speech. This makes stochastic models, which are based on the probabilities of word sequences, more attractive for speech recognition systems because of their robustness and simplicity. Building such models requires a large amount of training data in order to obtain valid statistics from which robust stochastic language models can be constructed [5].
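To make the idea of a stochastic language model concrete, the following is a minimal sketch of a bigram model that estimates the probability of a word given the previous word by relative frequency. It is an illustration only and is not the model built in this paper; the class name `BigramModel` and the training sentences used with it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal bigram language model: P(next | prev) is estimated by relative frequency.
// Illustrative sketch only; the class name and any training text are assumptions.
class BigramModel {
    private final Map<String, Integer> prevCounts = new HashMap<>();
    private final Map<String, Integer> pairCounts = new HashMap<>();

    // Counts every word that has a successor, and every adjacent word pair, in one sentence.
    void train(String sentence) {
        String[] words = sentence.trim().toLowerCase().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            prevCounts.merge(words[i], 1, Integer::sum);
            pairCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    // Maximum-likelihood estimate count(prev next) / count(prev); 0 if prev was never seen.
    double probability(String prev, String next) {
        String p = prev.toLowerCase();
        int prevCount = prevCounts.getOrDefault(p, 0);
        if (prevCount == 0) {
            return 0.0;
        }
        int pairCount = pairCounts.getOrDefault(p + " " + next.toLowerCase(), 0);
        return (double) pairCount / prevCount;
    }
}
```

A model this simple assigns zero probability to any word pair absent from the training data, which is one reason such models need large amounts of training text.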
Types of SR
Speech-to-text conversion has also been implemented on the Android platform [4]. There are two types of SR, each of which plays an important role in its own field:
Speaker Dependent SR :
In Speaker Dependent SR, the voice samples of only one person are stored in the system database, as shown in Fig 1. Only that person's voice is recognized by the system. It requires a minimal database and is typically found in small-scale industries.
Speaker Independent SR :
In Speaker Independent SR, the voice samples of multiple people are stored in the system database, as shown in Fig 2. It requires a large database because it stores the voice samples of many people. Anyone's voice can be recognized by the system. It is typically found in large-scale industries.
Fig 1: Speaker Dependent
The phonemes produced by the HMM are fed into the LM as input [6]. The LM recognizes these phonemes and starts to identify which words have been spoken.
C. Generation of raw commands in HMM
Fig 3: Generation of raw commands in HMM
The microphone picks up the sound waves, converts them into a signal, and sends it to the Signal Analyzer (SA). The SA filters out unwanted noise and sends the cleaned sound signal to the HMM, or Acoustic Model. As Fig 4 shows, the HMM receives the signal and passes it to the sound card, where it is converted to binary form; this binary form is then converted to phonemes [5].
Fig 4: Working of HMM
SYSTEM DESIGN
Module of Language Model
Models of SR
Fig 2: Speaker Independent
Both speaker-dependent and speaker-independent SR systems contain two models [5,6,7]. The two models are:
HMM :
The Hidden Markov Model (HMM) is also referred to as the Acoustic Model [5]. The HMM is based entirely on a mathematical model and probability. It recognizes the voice sample signal and breaks it into phonemes. Phonemes are the small units of spoken words.
LM :
Fig 5: Module of LM
The LM contains a Dictionary file. The Dictionary file stores thousands of phonemes and their meanings related to ATC commands. Phonemes are simply the small units of a word's pronunciation.
Problem Statement
When an ATC speaks into the microphone, there are no boundaries between words: one word runs into the next. This word boundary problem is caused by accent and dialect differences. The pronunciation of Indian English differs from that of British English, American English, and so on; this is called accent [8]. If the pronunciation of one person differs from that of another, it is called dialect. To avoid these problems, an approximate LM is built with a Dictionary and a Grammar. The Dictionary and Grammar are intended to overcome the accent and dialect problems that arise between ATC and pilot.
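The paper does not specify how word boundaries are recovered from an unbroken phoneme stream. As one possible illustration, the sketch below uses dictionary-driven segmentation with dynamic programming, a generic technique that is not taken from this work. The class `PhonemeSegmenter`, the phoneme symbols, and the dictionary entries used with it are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Splits an unbroken phoneme sequence into dictionary words (illustrative sketch).
// Dictionary keys are assumed to be space-joined phoneme symbols, e.g. "K L AY M".
class PhonemeSegmenter {
    // Returns the words covering the whole sequence, or an empty list if none exists.
    static List<String> segment(List<String> phonemes, Map<String, String> dictionary) {
        int n = phonemes.size();
        // best.get(i) holds words covering the first i phonemes, or null if unreachable.
        List<List<String>> best =
                new ArrayList<>(Collections.<List<String>>nCopies(n + 1, null));
        best.set(0, new ArrayList<>());
        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (best.get(start) == null) {
                    continue; // no valid segmentation ends at this start position
                }
                String key = String.join(" ", phonemes.subList(start, end));
                String word = dictionary.get(key);
                if (word != null) {
                    List<String> candidate = new ArrayList<>(best.get(start));
                    candidate.add(word);
                    best.set(end, candidate);
                    break; // keep the first segmentation found for this prefix
                }
            }
        }
        List<String> result = best.get(n);
        return result == null ? Collections.<String>emptyList() : result;
    }
}
```

With a dictionary that maps "K L AY M" to "climb" and "F AY V" to "five", the sequence K, L, AY, M, F, AY, V is split into the two words even though the input carries no boundary marker.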
System Overview
A. Flowchart
Fig 7: Sequence Diagram
IMPLEMENTATION
Fig 6: Architecture of LM
Fig 6 shows the system overview of the LM. The Acoustic Model picks up speech in the form of sound and converts it into phonemes using mathematical formulas and algorithms [5]. These phonemes [10] are the raw commands fed into the LM [6]. The LM looks up the word for each raw command in the Dictionary, and the resulting words are presented as the processed command. An ATC command to a pilot follows a particular order:
Aircraft, Number, Command and Unit
The aircraft name must be the first word; all remaining parts are optional.
D. Sequence diagram
The sequence diagram shows the raw commands, or phonemes, being fed into the LM. The LM sends the phonemes to the Dictionary. The Dictionary looks up the corresponding words, builds a sentence, and sends it back to the LM. The LM checks that the sentence follows the correct ATC command format and then outputs it as the processed command.
Fig 8 : Flowchart
The flowchart is a pictorial representation of the logic used to build the code. The raw commands are fed in as input and the corresponding words are searched for in the dictionary, which is created using Java properties that store keys and values. The matched words are stored in variables, assembled into a meaningful sentence in the required order, and returned as the processed command.
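A dictionary of this kind could be sketched with `java.util.Properties` as follows. This is a minimal illustration, not the authors' actual file: the class name `PhonemeDictionary`, the underscore-joined key format, and the sample entries are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Phoneme-to-word dictionary backed by java.util.Properties (illustrative sketch).
// Assumed entry format, one per line: PHONEMES_JOINED_BY_UNDERSCORE=word
class PhonemeDictionary {
    private final Properties entries = new Properties();

    // Loads entries from text written in .properties syntax.
    PhonemeDictionary(String propertiesText) {
        try {
            entries.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Normalizes input such as "k l ay m" to the key "K_L_AY_M"; returns null if absent.
    String lookup(String phonemes) {
        String key = phonemes.trim().toUpperCase().replaceAll("\\s+", "_");
        return entries.getProperty(key);
    }
}
```

Keys are joined with underscores here because an unescaped space would end the key in .properties syntax. A real dictionary would more likely be loaded from a file than from an in-memory string.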
Flowchart
Step 1: START
Step 2: INPUT: {Phonemes or raw commands}
Step 3: Find word in Sound Model dictionary
Step 4: If INPUT = raw command in dictionary, print word
Step 5: Create sentence with correctly matched words
Step 6: If first word != Aircraft name
print No Aircraft found for provided phonemes
Go to Step 8
else
print Aircraft found
Step 7: OUTPUT: Processed command
Step 8: Stop
The algorithm lays out the steps of the implementation clearly.
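The steps above can be sketched in Java roughly as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the class `CommandProcessor`, the comma used to separate the phoneme groups of individual words, the in-memory map standing in for the dictionary, and the sample aircraft names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Turns a raw command (phoneme groups) into a processed command (illustrative sketch).
class CommandProcessor {
    static final String NO_AIRCRAFT = "No Aircraft found for provided phonemes";

    private final Map<String, String> dictionary; // phoneme group -> word (assumed format)
    private final Set<String> aircraftNames;      // words accepted as aircraft names

    CommandProcessor(Map<String, String> dictionary, Set<String> aircraftNames) {
        this.dictionary = dictionary;
        this.aircraftNames = aircraftNames;
    }

    // Assumes the phoneme groups of separate words are separated by commas.
    String process(String rawCommand) {
        List<String> words = new ArrayList<>();
        for (String group : rawCommand.split(",")) {
            // Steps 3-4: look up each phoneme group and keep only matched words.
            String word = dictionary.get(group.trim().toUpperCase());
            if (word != null) {
                words.add(word);
            }
        }
        // Step 6: the first word must be an aircraft name.
        if (words.isEmpty() || !aircraftNames.contains(words.get(0))) {
            return NO_AIRCRAFT;
        }
        // Steps 5 and 7: join the matched words into the processed command.
        return "Aircraft found: " + String.join(" ", words);
    }
}
```

Phoneme groups with no dictionary entry are skipped rather than reported, which is one possible reading of Step 4. The aircraft check is applied only after all lookups, matching the order of Steps 5 and 6.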
RESULT
The input is given in the form of phonemes. If the first phonemes entered do not correspond to an aircraft, the program prints "No Aircraft found for provided phonemes".
Fig 9: First phonemes are not an aircraft
The input will be:
The input dialog box shows the phonemes given as input to obtain the processed command. If the first phonemes correspond to an aircraft, the program prints "Aircraft found" followed by the processed command.
Fig 10: Command is in the right phoneme format
The output will be:
Fig 13: HTK performance analysis graph (components shown: ATC commands, command prompt commands, VoxForge, Dictionary)
Fig 12 and Fig 13 compare the performance of the Java implementation with that of HTK (Hidden Markov Model Toolkit).
The Java performance analysis covers ATC commands, the searching concept in Java, VoxForge, and the Dictionary. The Java approach reduces the time taken and works according to program logic; it does not require any commands to be issued to search for a word. The HTK performance analysis covers ATC commands, command prompt commands, VoxForge, and the Dictionary. The HTK tool requires many commands for the operations it performs, and it is very difficult to work with so many commands. Using Java and its tools reduces this effort and makes the process less cumbersome.
CONCLUSION
Applying a Language Model in ATC helps overcome the accent and dialect problems between ATC and pilot. An accurate LM is built with a Dictionary. It is preconfigured with phonemes and their meanings using Java technology [9].
Fig 11 : command is in right phonemes format
I. PERFORMANCE ANALYSIS
Fig 12: Java performance analysis graph (components shown: ATC commands, Searching concept in Java, VoxForge, Dictionary)
ACKNOWLEDGMENT
I am grateful for the opportunity to present this paper on the Application of Language Model in ATC, and I thank everyone who encouraged me to complete it.
REFERENCES
[1] M. Dora, H. Waage, E. Thora, "Language Technology in ATC", IEEE, ISBN 0-7803-7844, 2003.
[2] S. J. Young et al., "The HTK Book", Cambridge University Engineering Department, IEEE, vol. 13, no. 6, pp. 07-161, 2006.
[3] Michael U. Gutmann and Aapo Hyvärinen, "Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics", vol. 13, no. 1, pp. 307-361, 2012.
[4] "Speech to text conversion using Android platform", vol. 3, issue 1, January-February 2013, pp. 253-258.
[5] "A Survey on Automatic Speech Recognition with an Illustrative Example on Continuous Speech Recognition of Mandarin", vol. 1, August 1996, pp. 01-36.
[6] "Unnormalized Exponential and Neural Network Language Models", IEEE, 2015.
[7] Aerospace Science and Technology 13 (2009) 423-430.
[8] Natural Language Processing, VOL: 0-7695-2457-5, S6-9, Nov. 2015.
[9] The Native Interface in Java: http://java.sun.com
[10] VoxForge Indian English: http://voxforge.org