- Open Access
- Authors : Le Hoang Thi My
- Paper ID : IJERTV13IS060119
- Volume & Issue : Volume 13, Issue 06 (June 2024)
- Published (First Online): 27-06-2024
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Proposing a Vietnamese-Hre-Co Interaction Model to Support Building Bilingual Vocabulary Databases of Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co, Co-Vietnamese
Le Hoang Thi My
University of Technology and Education, The University of Danang
Danang, Vietnam
Abstract: Preserving and promoting national cultural identity, and helping ethnic minority students acquire knowledge easily when studying in schools and other educational institutions, requires a bilingual policy that creates conditions for ethnic minorities to learn the spoken and written language of their ethnic group. Starting from this need, the article proposes a solution for building bilingual vocabulary databases for Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese based on the Viet-Hre-Co interaction model, which provides an interactive environment between users and the vocabulary database. The bilingual vocabulary databases built with the Hre-Vietnamese and Co-Vietnamese interaction models serve as the infrastructure for deploying Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary lookup applications.
Keywords: Bilingual vocabulary database; ethnic minority; Hre ethnic group; Co ethnic group; interaction model
The effectiveness of the political, economic, and cultural activities of Party and State officials in ethnic minority communities improves when the speaker can inform the listener directly about what needs to be done. For each ethnic minority citizen, being proficient in two languages, and reading and writing fluently in both the national language and the ethnic script, is a great advantage in improving cultural and scientific levels and in expanding relationships between ethnic groups.
In parallel with the bilingual policy for ethnic minorities, in the field of Information Technology, building bilingual Vietnamese-ethnic minority vocabulary databases in general, and Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases in particular, remains a challenge.
Starting from the above situation, the article proposes a solution for building Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases based on the Viet-Hre-Co interaction model, in order to contribute to overcoming the limitations of the existing Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases. The bilingual vocabulary databases built with the Vietnamese-Hre-Co interaction model serve as the infrastructure for deploying Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary lookup applications.
The article is structured as follows. First, we present the Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases. Second, we propose a Vietnamese-Hre-Co interaction model for building these bilingual vocabulary databases. Third, we report the results achieved, and finally we give the conclusion.
Data criteria
The goal is to build Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases as the infrastructure for deploying the corresponding bilingual vocabulary lookup applications. The data criteria set for the bilingual vocabulary databases are as follows:
Hre words are mainly collected and recorded in the Hre language, which is considered the easiest to hear and understand. Hre language entries partly reflect the traditional culture of the Hre people and are written in Hre script.
Co words are mainly collected and recorded in the Co language, which is considered the easiest to hear and understand. Co language entries partly reflect the traditional culture of the Co people and are written in Co script.
Examples are included to clarify the meaning and usage of words, also known as the context of the entry.
The word items are labeled with word types: label N for nouns, label V for verbs, label A for adjectives, label O for items that are not nouns, verbs or adjectives.
Polysemous words are recorded, translated and compared with different equivalent words in the target language.
When aligning words from the source language, equivalent words in the target language are chosen based on the basic meaning that is commonly used today in both languages.
Data is stored on the computer in standard Unicode fonts.
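The criteria above can be sketched as a small record type. This is only an illustration under stated assumptions: the class name `Entry`, its fields, and the choice of Unicode NFC normalization are hypothetical and are not specified in the article; only the word-type labels N, V, A and O come from the criteria.

```python
import unicodedata
from dataclasses import dataclass, field

# Word-type labels from the data criteria: N, V, A, and O for everything else.
WORD_TYPES = {"N": "noun", "V": "verb", "A": "adjective", "O": "other"}


def normalize(text: str) -> str:
    """Trim whitespace and normalize to Unicode NFC (an assumed choice)."""
    return unicodedata.normalize("NFC", text.strip())


@dataclass
class Entry:
    """Hypothetical in-memory form of one dictionary entry."""
    headword: str
    word_type: str
    examples: list = field(default_factory=list)  # context of the entry

    def __post_init__(self):
        if self.word_type not in WORD_TYPES:
            raise ValueError(f"unknown word type label: {self.word_type!r}")
        self.headword = normalize(self.headword)
        self.examples = [normalize(e) for e in self.examples]


# "e" followed by a combining circumflex is stored as the single code point U+00EA.
entry = Entry("Vie\u0302t ", "N")
```

Normalizing at entry time keeps visually identical headwords from being stored as different strings, which matters when entries are later checked for existence.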
Data sources
The Hre-Vietnamese dictionary includes 3,020 word entries, most of which belong to the basic, common vocabulary of Vietnamese. The vocabulary items are mainly words from Hre language training materials [5], [6].
The Co-Vietnamese dictionary includes 1,042 word entries, most of which belong to the basic, common vocabulary of Vietnamese. The vocabulary items are mainly words from Co language lesson materials [4].
The Vietnamese vocabulary has over 31,000 entries, inherited from the "VLSP Topic" [2].
Paper dictionary data sources are entered manually, following a common and consistent form and using standard Unicode fonts.
Bilingual vocabulary database structure
The bilingual vocabulary database is designed according to the relational database model, in which data is stored as a collection of tables. Each table of the bilingual vocabulary database is stored independently in terms of structure and data. The database consists of the following tables:
Table 1 is designed to store Hre language entries. The NOTE attribute marks items borrowed from another language.
Table 1: Stored Hre language entries
Field name
Data type
Hre index
Hre entry
Table 2 is designed to store Co language entries. The NOTE attribute marks items borrowed from another language.
Table 2: Stored Co language entries
Field name
Data type
Co index
Co entry
Table 3 is designed to store Vietnamese word entries. It has about 31,000 word entries inherited from the bilingual vocabulary database of the "VLSP Project" [2]. The ADD attribute marks Vietnamese entries, corresponding to Hre or Co entries, that are added to the vocabulary database.
Table 3: Stored Vietnamese language entries
Field name
Data type
Vietnamese index
Vietnamese entry
Table 4 is designed to store the results obtained for the Vietnamese index and the Hre index when interacting according to the Hre-Vietnamese interaction model: the Vietnamese word index, the Hre word index, the word type, and the Vietnamese-Hre examples that accompany the Vietnamese entry.
Table 4: Vietnamese-Hre vocabulary database
Field name
Data type
Vietnamese index
Hre entry
Word type
Vietnamese-Hre example
Table 5 is designed to store the results obtained for the Vietnamese index and the Co index when interacting according to the Co-Vietnamese interaction model: the Vietnamese word index, the Co word index, the word type, and the Vietnamese-Co examples that accompany the Vietnamese entry.
Table 5: Vietnamese-Co vocabulary database
Field name
Data type
Vietnamese index
Co entry
Word type
Vietnamese-Co example
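As an illustration of how Tables 1 to 5 might map onto a relational schema, the following is a minimal sketch using SQLite. The article does not give column names or data types, so every identifier and type here (`hre_index`, `add_flag`, `TEXT`, and so on) is an assumption. The bilingual tables are assumed to reference entries by index, as the step descriptions suggest.

```python
import sqlite3

# Hypothetical schema: all column names and data types are assumptions.
SCHEMA = """
CREATE TABLE hre (
    hre_index INTEGER PRIMARY KEY,
    hre_entry TEXT NOT NULL UNIQUE,
    note      TEXT                      -- NOTE: marks borrowed items
);
CREATE TABLE co (
    co_index INTEGER PRIMARY KEY,
    co_entry TEXT NOT NULL UNIQUE,
    note     TEXT                       -- NOTE: marks borrowed items
);
CREATE TABLE vietnamese (
    vi_index INTEGER PRIMARY KEY,
    vi_entry TEXT NOT NULL UNIQUE,
    add_flag INTEGER NOT NULL DEFAULT 0 -- ADD: entry added from Hre or Co data
);
CREATE TABLE vietnamese_hre (
    vi_index  INTEGER NOT NULL REFERENCES vietnamese(vi_index),
    hre_index INTEGER NOT NULL REFERENCES hre(hre_index),
    word_type TEXT NOT NULL CHECK (word_type IN ('N', 'V', 'A', 'O')),
    example   TEXT NOT NULL DEFAULT '',
    UNIQUE (vi_index, hre_index, word_type, example)
);
CREATE TABLE vietnamese_co (
    vi_index  INTEGER NOT NULL REFERENCES vietnamese(vi_index),
    co_index  INTEGER NOT NULL REFERENCES co(co_index),
    word_type TEXT NOT NULL CHECK (word_type IN ('N', 'V', 'A', 'O')),
    example   TEXT NOT NULL DEFAULT '',
    UNIQUE (vi_index, co_index, word_type, example)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created, in alphabetical order.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

The uniqueness constraints on the bilingual tables are one possible way to keep repeated imports from storing the same example twice.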
The interaction model is an environment in which users interactively convert paper dictionary data into online computer dictionaries [1], [3].
Vietnamese-Hre, Vietnamese-Co interaction model
Based on actual Hre and Co language data sources, and with the goal of building infrastructure for Hre and Co language processing, the Vietnamese-Hre, Vietnamese-Co interaction model for developing the Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases is proposed, as shown in Figure 1.
Figure 1. Viet-Hre-Co interaction model
(Diagram labels: Hre-Vietnamese dictionary; Co-Vietnamese dictionary; Hre-Vietnamese interactive model; Co-Vietnamese interactive model; Vietnamese-Hre and Hre-Vietnamese vocabulary databases; interactive environment; interacting; updating.)
Operations in the model:
The Hre-Vietnamese and Co-Vietnamese dictionary data sources are formatted before being transferred into the interactive environment.
The interactive environment reads data from the input data source and interacts with the Vietnamese, Hre and Co vocabulary databases: it checks whether each word entry already exists in the vocabulary databases, adds it if it does not, and reads the corresponding index. With the Vietnamese index and the Hre or Co index, the environment then interacts with the Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases and updates them.
The interactive environment in the model is built from two interactive modules:
The Hre-Vietnamese interactive module takes the Hre-Vietnamese dictionary text file as its input data source.
The Co-Vietnamese interactive module takes the Co-Vietnamese dictionary text file as its input data source.
Operation of the two modules
Hre-Vietnamese interactive module
The input data sources are the Hre-Vietnamese dictionary file, the Vietnamese vocabulary database and the Hre vocabulary database. The output of this module is the Vietnamese vocabulary database, the Hre vocabulary database, the Vietnamese-Hre vocabulary database and the Hre-Vietnamese vocabulary database with updated data.
Procedure of execution
Step 1: read the data (Hre word, set of Vietnamese words, word type and Hre : Vietnamese examples) on each row of the Hre-Vietnamese dictionary file.
Step 2: check the Hre word in the Hre vocabulary database; if it is not there, add it.
Step 3: read the index of the Hre word.
Step 4: separate each Vietnamese word from the set of Vietnamese words read from the second column of the row in the dictionary file. For each separated word in turn:
Step 4.1: check the separated Vietnamese word in the Vietnamese vocabulary database; if it is not there, add it and make a note identifying it as a new word added to the Vietnamese vocabulary database.
Step 4.2: read the index of the Vietnamese word.
Step 4.3: extract, from the example set read in step 1, the Hre : Vietnamese examples corresponding to the Vietnamese word separated in step 4. Convert each example from Hre : Vietnamese into Vietnamese : Hre.
Example: "Ih oi aip uh?: Ông có khỏe không?" is converted into "Ông có khỏe không?: Ih oi aip uh?"
Step 4.4: check the triple (Vietnamese word index, Hre word index, word class) in the Vietnamese-Hre vocabulary database.
If it does not exist, add the triple and the Vietnamese : Hre examples obtained in step 4.3 to the Vietnamese-Hre and Hre-Vietnamese vocabulary databases.
If it exists, check the examples extracted in step 4.3 against the set of examples stored for that triple, and add any that are missing.
Step 5: return to step 1 and read the rows of the Hre-Vietnamese dictionary file one by one until the end of the file.
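The procedure of the Hre-Vietnamese module can be sketched as follows. This is a minimal in-memory illustration, not the article's implementation: the `VocabDB` class, the helper `index_of`, the substring test used to match examples to a Vietnamese word, and the placeholder row data are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class VocabDB:
    """Hypothetical in-memory stand-in for the vocabulary databases."""
    hre: dict = field(default_factory=dict)         # Hre word -> index
    vietnamese: dict = field(default_factory=dict)  # Vietnamese word -> index
    added: set = field(default_factory=set)         # words noted as new in step 4.1
    vi_hre: dict = field(default_factory=dict)      # (vi idx, hre idx, type) -> examples


def index_of(table: dict, word: str) -> tuple:
    """Return (index, was_new), adding the word first if it is missing."""
    if word in table:
        return table[word], False
    table[word] = len(table) + 1
    return table[word], True


def process_row(db, hre_word, vi_words, word_type, examples):
    """Apply steps 2 to 4.4 to one dictionary row read in step 1."""
    hre_idx, _ = index_of(db.hre, hre_word)                  # steps 2 and 3
    for vi_word in vi_words:                                 # step 4
        vi_idx, is_new = index_of(db.vietnamese, vi_word)    # steps 4.1 and 4.2
        if is_new:
            db.added.add(vi_word)
        # Step 4.3: keep examples that mention the word, flipped to Vietnamese : Hre.
        flipped = [f"{vi} : {hre}" for hre, vi in examples if vi_word in vi]
        # Step 4.4: create the triple if absent, then add only missing examples.
        stored = db.vi_hre.setdefault((vi_idx, hre_idx, word_type), [])
        for example in flipped:
            if example not in stored:
                stored.append(example)


# Placeholder row: (Hre word, Vietnamese words, word type, (Hre, Vietnamese) examples).
rows = [("hre_1", ["vi_a", "vi_b"], "N", [("hre sentence 1", "sentence using vi_a")])]

db = VocabDB(vietnamese={"vi_a": 1})
for _ in range(2):            # importing twice must not create duplicates
    for row in rows:          # step 5: repeat for every row of the file
        process_row(db, *row)
```

Running the import twice in the usage lines is a simple check that the existence tests in steps 2, 4.1 and 4.4 make the update repeatable.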
Co-Vietnamese interactive module
The input data sources are the Co-Vietnamese dictionary file, the Vietnamese vocabulary database and the Co vocabulary database. The output of this module is the Vietnamese vocabulary database, the Co vocabulary database, the Vietnamese-Co vocabulary database and the Co-Vietnamese vocabulary database with updated data.
Procedure of execution
Step 1: read the data (Co word, set of Vietnamese words, word type and Co : Vietnamese examples) on each row of the Co-Vietnamese dictionary file.
Step 2: check the Co word in the Co vocabulary database; if it is not there, add it.
Step 3: read the index of the Co word.
Step 4: separate each Vietnamese word from the set of Vietnamese words read from the second column of the row in the dictionary file. For each separated word in turn:
Step 4.1: check the separated Vietnamese word in the Vietnamese vocabulary database; if it is not there, add it and make a note identifying it as a new word added to the Vietnamese vocabulary database.
Step 4.2: read the index of the Vietnamese word.
Step 4.3: extract, from the example set read in step 1, the Co : Vietnamese examples corresponding to the Vietnamese word separated in step 4. Convert each example from Co : Vietnamese into Vietnamese : Co.
Example: "Mó kâi emn hih?: Ông bà khỏe không?" is converted into "Ông bà khỏe không?: Mó kâi emn hih?"
Step 4.4: check the triple (Vietnamese word index, Co word index, word class) in the Vietnamese-Co vocabulary database.
If it does not exist, add the triple and the Vietnamese : Co examples obtained in step 4.3 to the Vietnamese-Co and Co-Vietnamese vocabulary databases.
If it exists, check the examples extracted in step 4.3 against the set of examples stored for that triple, and add any that are missing.
Step 5: return to step 1 and read the rows of the Co-Vietnamese dictionary file one by one until the end of the file.
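Steps 1 and 4.3 depend on how a dictionary row is laid out in the text file, which the article does not specify. The sketch below assumes a hypothetical layout: four tab-separated columns, Vietnamese words separated by commas, and examples separated by semicolons with a colon between the Co sentence and the Vietnamese sentence. The names `Row`, `parse_row` and `flip` are illustrative only.

```python
from typing import NamedTuple


class Row(NamedTuple):
    word: str        # Co (or Hre) headword, first column
    vi_words: list   # Vietnamese equivalents, second column
    word_type: str   # N, V, A or O
    examples: list   # (source sentence, Vietnamese sentence) pairs


def parse_row(line: str) -> Row:
    """Step 1: parse one row, assuming tab-separated columns (hypothetical layout)."""
    word, vi_column, word_type, example_column = line.rstrip("\n").split("\t")
    vi_words = [w.strip() for w in vi_column.split(",") if w.strip()]
    examples = []
    for item in example_column.split(";"):
        if not item.strip():
            continue
        # Split at the first colon: source sentence on the left, Vietnamese on the right.
        source, _, vietnamese = item.partition(":")
        examples.append((source.strip(), vietnamese.strip()))
    return Row(word.strip(), vi_words, word_type.strip(), examples)


def flip(example: tuple) -> str:
    """Step 4.3: turn a (Co, Vietnamese) pair into 'Vietnamese : Co'."""
    source, vietnamese = example
    return f"{vietnamese} : {source}"


row = parse_row("co_word\tvi_a, vi_b\tN\tco sentence?: vi sentence with vi_a?")
flipped = flip(row.examples[0])
```

A real dictionary file would need a separator that cannot occur inside a sentence; the semicolon and colon used here are placeholders for whatever convention the formatted files follow.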
Build an interactive environment
The MVC (Model – View – Controller) design pattern was chosen to build and manage the bilingual vocabulary database.
The MVC model is a software architecture model used in software engineering. The MVC model divides the application into three design parts, with the goal of separating the interface and code for ease of management, development and maintenance. The design parts of the MVC model include:
The Model is responsible for manipulating the database. It contains all the functions and data query methods, such as select, insert, update and delete. The Controller uses those functions and methods to get data and send it to the View.
The View is responsible for displaying information to the user through the interface. The data it displays is received from the Controller.
The Controller receives requests from the user, retrieves the corresponding data from the Model, and sends the data to the View, which presents the results to the user.
The MVC pattern in action is illustrated in Figure 2.
Figure 2: MVC design model
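The division of responsibilities described above can be illustrated with a small lookup example. This is a sketch under stated assumptions, not the article's application: the class names, the single `vocabulary` table and its columns, and the placeholder words are all hypothetical.

```python
import sqlite3


class VocabularyModel:
    """Model: owns the database and exposes query and update methods."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vocabulary ("
            "source TEXT NOT NULL, target TEXT NOT NULL, "
            "UNIQUE (source, target))"
        )

    def insert(self, source, target):
        self.conn.execute(
            "INSERT OR IGNORE INTO vocabulary (source, target) VALUES (?, ?)",
            (source, target),
        )

    def select(self, source):
        rows = self.conn.execute(
            "SELECT target FROM vocabulary WHERE source = ? ORDER BY target",
            (source,),
        )
        return [target for (target,) in rows]


class LookupView:
    """View: formats the data handed over by the Controller."""

    def render(self, source, targets):
        if not targets:
            return f"{source}: no entry found"
        return f"{source}: {', '.join(targets)}"


class LookupController:
    """Controller: takes a request, queries the Model, passes data to the View."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def lookup(self, source):
        return self.view.render(source, self.model.select(source))


model = VocabularyModel(sqlite3.connect(":memory:"))
model.insert("vi_word", "hre_word_b")
model.insert("vi_word", "hre_word_a")
controller = LookupController(model, LookupView())
result = controller.lookup("vi_word")
```

Because only the Model touches SQL, the storage layer could be swapped without changing the View or the Controller, which is the separation the pattern is chosen for.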
The interactive environment is implemented with two interactive modules to import data from Hre-Vietnamese dictionaries and Co-Vietnamese dictionaries to update into vocabulary databases.
Results achieved
Through the interactive environment, the results of updating the Vietnamese-Hre and Hre-Vietnamese bilingual vocabulary database are listed in Table 6.
Table 6: Statistics on the number of word entries in the vocabulary database.
Vocabulary database
Hre-Vietnamese interaction
Vietnamese-Hre
Through the interactive environment, the results of updating the Vietnamese-Co and Co-Vietnamese bilingual vocabulary database are listed in Table 7.
Table 7: Statistics on the number of word entries in the vocabulary database
Vocabulary database
Co-Vietnamese interaction
Vietnamese-Co
In the context of processing ethnic minority languages in general, and the Hre and Co languages in particular, the article proposes building Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases based on the Hre-Vietnamese and Co-Vietnamese interaction models, achieving the following results:
Unified use of Unicode in the bilingual vocabulary databases.
A contribution to the development of infrastructure for processing the Hre and Co languages in particular and Vietnamese ethnic minority languages in general.
Sharing of skills for research activities related to processing the Hre and Co languages.
The Hre-Vietnamese and Co-Vietnamese interaction models can be extended to the development of other Vietnamese-ethnic minority bilingual vocabulary databases.
The proposed solution is practical because it helps overcome limitations in the Vietnamese-Hre, Hre-Vietnamese, Vietnamese-Co and Co-Vietnamese bilingual vocabulary databases that previous research has not addressed.
This research is supported by the Provincial Technology Development and Application Research Project, code 03/2023/HD-TKHCN.
Thanks to all authors of the papers cited.
[1] Hoang Thi My Le, Phan Huy Khanh, The solution to build a Vietnamese-Ede bilingual vocabulary database based on the Vietnamese-Ede interaction model, No 5 (2), pp. 3640, 2017.
[2] Ho Tu Bao, VLSP topic – Document processing branch [Online],
[3] Le Hoang Thi My, Khanh Phan Huy, Deploying environment for processing Ede ethnic minority language in Vietnam, IEEE International Conference on System Science and Engineering (ICSSE), 2017.
[4] Nguyen Minh Tri, Co language lessons, 2016.
[5] People's Committee of Quang Ngai province, Hre language training and fostering materials, 2019.
[6] Saigon Linguistic Research Institute, Hre language lessons, Saigon Publishing Ministry, 1974.