- Open Access
- Authors : Amir Hamzah
- Paper ID : IJERTV6IS110226
- Volume & Issue : Volume 06, Issue 11 (November 2017)
- Published (First Online): 28-11-2017
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Opinion Mining and Sentiment Analysis Application for Opinion Classification from Education Questionnaire
Amir Hamzah 1)
1) Department of Informatics Engineering, Institut Sains & Teknologi AKPRIND, Yogyakarta, Indonesia
Abstract: Questionnaires that measure academic services with multiple-choice answers generally also provide columns for comments and advice. In the analysis of the results, the comments and suggestions written by thousands of students cannot be utilized because analysis tools are lacking, even though they may contain student opinions on various matters such as facilities, faculty, the library and others. Opinion mining and sentiment analysis, as new tools in text mining, can be applied to make use of these comments and suggestions. This research applied an HMM-POS Tagger, which automatically assigns POS tags to a sentence using a Hidden Markov Model trained on POS-tagged data. By applying POS tag patterns, a comment can be classified as an opinion or not. If it is an opinion, its target and its orientation (positive or negative) can then be determined. The data used were 1,000 comments tagged manually with POS tags and 1,000 comments as test data. Sentiment analysis was applied using four classification methods, namely SVM, NBC, ME and KM-Clustering. The results showed that the accuracy of the POS Tagger was 0.95 and the average accuracy of the four classification methods was 0.85.
Keywords: HMM POS Tagger; opinion classification
INTRODUCTION
Opinions and their orientation are among the most important inputs to policy decisions. A sound decision is strongly influenced by the analysis of opinions from the various sources related to it. In business, for example, a production manager who considers adding products needs an analysis of product reviews from the market. In the management of educational services at universities, satisfaction with learning services can be measured from students' opinions about the learning process. Opinions arise in a variety of situations: they may be deliberately requested, for example through a suggestions field in a questionnaire, or appear naturally in an online forum provided by the college's official site. The volume of online opinion in the form of free text keeps growing, and it is generally not utilized because it is unstructured.
The internet and other online information sources are growing very rapidly. Online data and information from companies and organizations are generally unstructured, and about 80% of it is in the form of text [5]. The emergence of social media such as Facebook (2004) and Twitter (2006) has encouraged activities such as reviews, discussion forums, blogs, micro-blogs, comments and posts, which multiply the number of text documents on the internet, because social media are used by both individuals and organizations to share information for various interests. This information explosion further complicates the process of data mining, as has been predicted [10]. Research in opinion mining has therefore become a very important topic alongside the earlier topics of data mining and text mining.
One branch of research that evolved from the information explosion on the internet is sentiment analysis and opinion mining. Opinion mining is challenging because it accumulates research challenges from Information Retrieval (IR), such as information extraction, information summarization and document classification, and from Natural Language Processing (NLP), such as Named Entity Recognition (NER) and document subjectivity analysis [9]. This branch of research examines how opinions are extracted from online media and how those opinions are analyzed. Sentiment analysis, or opinion mining, is the computational study of people's opinions, appraisals and emotions toward entities, events and their attributes [6].
Applying sentiment analysis and opinion mining to policy evaluation and decision making promises a more practical and economical approach than the classical questionnaire method. Critics of questionnaires point out that the method is time-consuming and expensive, and that its results sometimes fail to capture the real problem. Questionnaires and interviews are considered weak because people generally do not like to answer survey questions, which are sometimes long-winded. Opinion mining instead extracts opinions by listening rather than by asking, and thus reflects reality more accurately [13]. Opinion mining can go further and capture the emotions of the opinion holders [7].
The Institute of Science and Technology AKPRIND, as a higher education institution, constantly seeks to improve its services in learning management. For that purpose, at the end of each semester the academic administration evaluates the learning service using a questionnaire with predefined items. So far, one part of the questionnaire data has not been used or analyzed, namely the students' suggestions. This data amounts to thousands of suggestions, or more precisely opinions, from all participants of all courses. The suggestions or opinions may concern the academic atmosphere, lecturers, classrooms, air conditioners, overhead projectors or other campus facilities. Over the years this data has piled up unused.
The problem addressed in this research is how to build a software prototype that can extract opinions from a collection of comment text documents and then determine the target and the orientation of each opinion.
The purpose of this study is to review the application of opinion mining and sentiment analysis techniques to the analysis of students' suggestions and opinions. The research is also designed to create a prototype of opinion mining and sentiment analysis software that can extract opinions, analyze them, map their targets and set their orientation.
RELATED WORKS
A study of opinion mining applications was conducted by [4] on the English National Health Service website, which captured 6,412 free-text comments from treated patients. The analysis of comments related to hygiene, hospital services and various aspects of hospital responsibilities agreed in 81% to 89% of cases with the quantitative ratings provided through the questionnaire.
In opinion mining, opinions can be detected by applying an HMM-POS Tagger. This method is applied to a collection of training data consisting of opinion and non-opinion sentences in which each word has been given a POS-TAG (part-of-speech tag). The program is expected to extract opinions from the comment text of the questionnaire and to find the object of each opinion.
There are several approaches to automated POS tagging: rule-based, probabilistic and transformation-based. A rule-based POS tagger assigns a tag to a word based on manually created linguistic rules, for example a word is tagged NOUN if it follows an ADJECTIVE. The probabilistic approach determines the tag of a token from context probabilities estimated on a manually tagged corpus. The transformation-based approach combines the rule-based and probabilistic approaches to derive symbolic rules automatically from the corpus [11]. The use of the Hidden Markov Model for POS tagging of Bahasa Indonesia was examined by [14], with an accuracy of 96.2%, and by [15], with an accuracy of 92.2%.
THEORETICAL BACKGROUND
HMM POS-TAGGER
HMM-POS Tagger is a method for automatically POS-tagging a sentence based on the characteristics of the POS-tag data in a training collection. Part-of-Speech (POS) tagging, also known as grammatical tagging, is the process of assigning a POS-tag to each word in a sentence. A part of speech is a grammatical category of words in a sentence, e.g. verb (VERB), noun (NOUN), adjective (ADJECTIVE) and others. POS tagging is an important tool in many natural language processing applications such as disambiguation, parsing, question answering and machine translation systems. Because assigning part-of-speech tags manually to the words in a sentence is costly, exhausting and time-consuming, there has been wide interest in automating the POS tagging process [3].
If a sentence consists of n words ($w_i$: i = 1, .., n) and a POS-tag is to be assigned to each of these words ($t_i$: i = 1, .., n), then the problem can be formulated as finding the tag sequence that maximizes:
$$\hat{t}_1^n = \arg\max_{t_1^n} P(t_1^n \mid w_1^n) \tag{1}$$
By applying Bayes' theorem for conditional probability, (1) can be written as:
$$\hat{t}_1^n = \arg\max_{t_1^n} \frac{P(w_1^n \mid t_1^n)\, P(t_1^n)}{P(w_1^n)} \tag{2}$$
Since the denominator is the same for every candidate tag sequence of a given sentence, (2) can be written as:
$$\hat{t}_1^n = \arg\max_{t_1^n} P(w_1^n \mid t_1^n)\, P(t_1^n) \tag{3}$$
Equation (3) can be simplified by making two assumptions. First, the probability of a word depends only on its POS-tag:
$$P(w_1^n \mid t_1^n) \approx \prod_{i=1}^{n} P(w_i \mid t_i) \tag{4}$$
Second, the probability of a POS-tag depends only on the previous POS-tag:
$$P(t_1^n) \approx \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \tag{5}$$
Applying (4) and (5) to (3) gives:
$$\hat{t}_1^n \approx \arg\max_{t_1^n} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1}) \tag{6}$$
The HMM-POS Tagger thus produces sentences annotated with POS-tags. In the next step, these POS-tag patterns are used to determine whether a text is an opinion or not. The POS-tag patterns can also be used to determine the object of an opinion.
SENTIMENT ANALYSIS
Sentiment analysis of the comments that have been detected as opinions is done by applying opinion classification to determine the opinion orientation. Many methods are available for text classification. In this research, four methods were used, namely the Naive Bayes Classifier (NBC), Support Vector Machine (SVM), Maximum Entropy (ME) and K-Means Clustering (KMC).
NBC Method
The NBC method takes the collection of opinion documents as $D = \{d_1, d_2, \ldots, d_{|D|}\}$ and the collection of categories as $C = \{c_1, c_2, \ldots, c_{|C|}\}$. NBC classification is done by finding the probability $P(C = c_j \mid D = d_i)$, i.e. the probability of category $c_j$ given document $d_i$. The document is seen as a tuple of words $\langle w_1, w_2, \ldots, w_n \rangle$, whose frequency of occurrence is assumed to be a random variable with a Bernoulli probability distribution [8]. Classifying a document means finding the maximum value of:
$$V_{MAP} = \arg\max_{c_j \in C} P(c_j \mid w_1, w_2, \ldots, w_n) \tag{7}$$
Applying Bayes' theorem gives:
$$V_{MAP} = \arg\max_{c_j \in C} \frac{P(w_1, w_2, \ldots, w_n \mid c_j)\, P(c_j)}{P(w_1, w_2, \ldots, w_n)} \tag{8}$$
Since the denominator is constant for a given document, and assuming that the words are independent of each other, equation (8) can be written as:
$$V_{MAP} = \arg\max_{c_j \in C} P(c_j) \prod_{i=1}^{n} P(w_i \mid c_j) \tag{9}$$
In practice $P(c_j)$ is approximated by:
$$P(c_j) = \frac{|doc_j|}{|sample|} \tag{10}$$
where $|doc_j|$ is the number of documents in category j and $|sample|$ is the number of sample (training) documents.
In computation $P(w_i \mid c_j)$ is approximated by:
$$P(w_i \mid c_j) = \frac{n_i + 1}{n + |vocabulary|} \tag{11}$$
where $n_i$ is the frequency of the word $w_i$ in category $c_j$, $n$ is the total word frequency in the documents of category $c_j$ and $|vocabulary|$ is the number of distinct words in the sample document collection.
SVM Method
The Support Vector Machine (SVM) was first developed by [1] and described in more detail by [2]. The SVM concept can be explained as the search for the best hyperplane that separates two classes in the input space. Consider a collection of training data of the form:
$$\mathcal{D} = \{(x_i, y_i) \mid x_i \in \mathbb{R}^p,\ y_i \in \{-1, 1\}\},\quad i = 1, \ldots, n \tag{12}$$
where $y_i$ is 1 or -1 and indicates the class to which $x_i$ belongs. Each $x_i$ is a real p-dimensional vector. The goal is to find the maximum-margin hyperplane that divides the points with $y_i = 1$ from those with $y_i = -1$.
Any hyperplane can be written as the set of points x satisfying $w \cdot x - b = 0$, where $\cdot$ denotes the dot product. The vector w is a normal vector perpendicular to the hyperplane. The parameter $b / \|w\|$ determines the offset of the hyperplane from the origin along the normal vector w. The values of w and b are chosen to maximize the margin, that is, the distance between two parallel hyperplanes that are as far apart as possible while still separating the data.
These hyperplanes can be described by the equations:
$$w \cdot x - b = 1 \tag{13}$$
and
$$w \cdot x - b = -1 \tag{14}$$
The distance between the two hyperplanes is $2 / \|w\|$, so the aim is to minimize $\|w\|$. To prevent data points from falling into the margin, the following constraints must be added. For each i:
$$w \cdot x_i - b \geq 1 \quad \text{for } x_i \text{ of the first class} \tag{15}$$
or
$$w \cdot x_i - b \leq -1 \quad \text{for } x_i \text{ of the second class} \tag{16}$$
This can be rewritten as:
$$y_i (w \cdot x_i - b) \geq 1 \quad \text{for all } 1 \leq i \leq n \tag{17}$$
So the problem of finding the maximum-margin hyperplane is the optimization problem: minimize $\|w\|$ subject to the constraint, for every i = 1, …, n,
$$y_i (w \cdot x_i - b) \geq 1 \tag{18}$$
Maximum Entropy (ME) Method
Classification with Maximum Entropy (ME) applies information theory. Entropy is the average amount of information contained in a set of events $X = \{x_1, x_2, \ldots, x_n\}$, which can be expressed as:
$$H(p) = -\sum_{x \in X} p(x) \log_e p(x) \tag{19}$$
where H(p) is the amount of information in the set of events X and p(x) is the probability of the occurrence of x in X. The Maximum Entropy (ME) method maximizes the value of H(p). The maximum value of H(p) is obtained when the distribution over X is uniform, so that p(x) = 1 / |X|, where |X| is the cardinality of the set X.
The ME method is applied to document classification through the conditional probability of a document class given a document. Suppose that the set of document classes is A = {a1, a2, …, ac} and the collection of documents is D = {d1, d2, …, dn}. The class a of a document d is determined by finding the class with the highest conditional probability p(a | d) under the probability distribution with maximum entropy.
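Of the classifiers described above, the Naive Bayes estimates in equations (9) to (11) are the simplest to illustrate. The following is a minimal sketch, not the implementation used in this study: the function names and the four toy training comments are invented for illustration, the word counts follow the frequency-based estimate of equation (11), and log-probabilities are summed instead of multiplying probabilities to avoid numerical underflow.

```python
import math
from collections import Counter


def train_nbc(docs):
    """docs: list of (tokens, label). Returns priors, per-class word counts, vocabulary."""
    class_docs = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_docs}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    # Equation (10): share of training documents in each category.
    priors = {c: class_docs[c] / len(docs) for c in class_docs}
    return priors, word_counts, vocab


def classify_nbc(tokens, priors, word_counts, vocab):
    """Return the category maximizing equation (9), computed in log space."""
    best, best_score = None, float("-inf")
    for c in priors:
        n = sum(word_counts[c].values())
        score = math.log(priors[c])
        for w in tokens:
            # Equation (11): add-one smoothed word probability.
            score += math.log((word_counts[c][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best


# Toy training data (hypothetical, in the style of the questionnaire comments).
train = [
    ("ac kurang dingin".split(), "-"),
    ("kamar mandi kotor".split(), "-"),
    ("mengajarnya sudah oke".split(), "+"),
    ("dosen sangat bagus".split(), "+"),
]
model = train_nbc(train)
print(classify_nbc("ac kotor".split(), *model))
```

With these toy counts the comment "ac kotor" is assigned the negative class, because both of its words occur only in negative training comments.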
KM-C Method
K-means clustering groups objects by using cluster centers as the grouping criterion. A cluster center is the average of all objects that are members of the cluster. Suppose we have a collection of documents D = {di | i = 1, 2, …, |D|} = {d1, d2, …, d|D|} to be clustered into K clusters. Here di is a real-valued vector that represents a document. The vector has dimension n, the number of unique words in the document collection. The document collection can be represented by a matrix of size n x |D|, denoted by [Xij], whose element xij represents the Term Frequency (TF), i.e. the number of occurrences of term (word) i in document j. To obtain better accuracy, the term frequency matrix is converted into a real-valued matrix that also takes into account the number of documents containing the i-th word, through Inverse Document Frequency (IDF) weighting. In addition, each document vector is normalized so that its length is always 1. The weighting that combines TF with IDF is known as TF-IDF weighting and can be formulated as:
$$w_{ij} = \frac{(\ln(f_{ij}) + 1) \cdot \log\dfrac{N}{n_i}}{\sqrt{\sum_{k=1}^{n} \left[ (\ln(f_{kj}) + 1) \cdot \log\dfrac{N}{n_k} \right]^2}} \tag{20}$$
where $f_{ij}$ is the frequency of term i in document j, N is the number of documents in the collection and $n_i$ is the number of documents containing term i.
The K-means clustering algorithm starts by taking K initial vectors as the seeds of the cluster centers. The distance of every document vector to each cluster center is then calculated. The document vector closest to a cluster center becomes a member of that cluster. The cluster center is the average of all vectors in the cluster.
The K-means algorithm can be written as follows:
1. Take K objects as the seeds of the K cluster centers.
2. For all objects: locate the cluster with the closest center and assign the object to that cluster.
3. Recalculate the center of each cluster as the average of all objects in the cluster.
4. Calculate and evaluate the criterion function. If the criterion function does not change, the algorithm stops.
PROPOSED METHOD
Research Materials
The materials used in this research are the POS tags for the HMM-POS Tagger and the pairs of POS tags used as rules for opinion detection and target detection, which were taken from [14] and [12] with modification. These rules take the form of POS tag collections. The POS tags for the HMM POS Tagger are listed in TABLE I, and the POS Tagger training data that use them are shown in Fig. 1. The rules for opinion detection are based on POS tag pairs as shown in TABLE II. The rules for target detection are based on POS tag pairs as shown in TABLE III.
The comment data were text documents collected from the suggestion fields of the student questionnaires of the Institute of Science and Technology AKPRIND over 6 semesters:
- Semester 1 2014/2015: 4,128 comments
- Semester 2 2013/2014: 3,152 comments
- Semester 1 2013/2014: 3,801 comments
- Semester 2 2013/2014: 2,551 comments
- Semester 1 2012/2013: 3,663 comments
- Semester 2 2012/2013: 1,883 comments
From these collections, 1,000 documents were selected as training data and 1,000 documents as test data.
Data Analysis
The research is divided into two steps: pre-processing of the documents, followed by opinion detection and extraction, target detection and finally opinion classification.
Pre-processing Document and training
Student comments, which are written in irregular form, are first converted into standard sentences. Next, the collection of training data is parsed so that the frequency of occurrence of each POS-TAG can be determined, with the POS-TAG codes as listed in Table I. The POS-Tagger training data has the format shown in Fig. 1.
<DOC-0001>tolong/VBI gedungnya/NNG dibersihkan/VBP <*DOC>
<DOC-0002>birokrasi/NNU kampus/NNC yang/PR jelek/JJ membuat/VBT mahasiswa/NNC merasa/VBT dikhianati/VBP <*DOC>
<DOC-0003>kursi/NNC di/IN dalam/IN ruangan/NNU kurang/RB mengenakkan/JJ <*DOC>
<DOC-0004>lebih/RB ditingkatkan/VBP lagi/RB <*DOC>
<DOC-0005>acnya/NNG kurang/RB dingin/JJ <*DOC>
…
<DOC-0999>pasang/VBT ac/NNC pak/UH soalnya/NNG panas/JJ <*DOC>
Fig. 1. POS-Tagger data training format
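The word/TAG format of Fig. 1 is what the tagger is trained on, so it may help to see how such data could be turned into the counts behind P(w_i | t_i) and P(t_i | t_{i-1}). The following is a minimal sketch under stated assumptions: the function names and the `<s>` start symbol are hypothetical, and the closing marker is accepted as either `<*DOC>` (as printed in Fig. 1) or `</DOC>`.

```python
import re
from collections import Counter

# One document: <DOC-id> word/TAG word/TAG ... followed by a closing marker.
DOC_RE = re.compile(r"<DOC-(\d+)>(.*?)<[*/]DOC>", re.S)


def parse_tagged(text):
    """Return a list of sentences, each a list of (word, tag) pairs."""
    sentences = []
    for _, body in DOC_RE.findall(text):
        sentences.append([tuple(tok.rsplit("/", 1)) for tok in body.split()])
    return sentences


def count_hmm(sentences):
    """Count emissions (tag, word), transitions (prev_tag, tag) and tag totals."""
    emit, trans, tags = Counter(), Counter(), Counter()
    for sent in sentences:
        prev = "<s>"  # assumed start-of-sentence symbol
        for word, tag in sent:
            emit[(tag, word)] += 1
            trans[(prev, tag)] += 1
            tags[tag] += 1
            prev = tag
    return emit, trans, tags


sample = """<DOC-0001>tolong/VBI gedungnya/NNG dibersihkan/VBP <*DOC>
<DOC-0005>acnya/NNG kurang/RB dingin/JJ
<*DOC>"""
sents = parse_tagged(sample)
emit, trans, tags = count_hmm(sents)
print(sents[0])
print(trans[("NNG", "VBP")], tags["NNG"])
```

Dividing each emission count by its tag total, and each transition count by the total of the preceding tag, would give relative-frequency estimates of the two probability tables.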
The training data for opinion classification has the format shown in Fig. 2. In the training data, the orientation of each opinion document has been set manually as either positive or negative. The orientation is recorded in the document label as the code (+) or (-).
<DOC-0001(+)>mengajarnya sudah oke, perlu ditingkatkan </DOC>
<DOC-0002(-)> kamar mandi kotor sekali, tolong dibersihkan yang rutin </DOC>
<DOC-0003(+)>pak joko ngajarnya memang joss..</DOC>
<DOC-0004 (-)>kipas angin kurang baik, fasilitas wivi tolong lebih ditingkatkan lagi biar baik </DOC>
<DOC-0005(-)>lebih ditingkatkan lagi fasilitas perpustakannya </DOC>
<DOC-0006(+)>birokrasi kampus sangat jelek urusan terlalu bertele-tele </DOC>
Fig. 2. Classification data training format
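The labelled format of Fig. 2 can be read with a single regular expression. The following is a small sketch, assuming the layout shown in Fig. 2; the function name is hypothetical, and optional whitespace before the label is tolerated because one of the sample lines contains it.

```python
import re

# <DOC-id(+)> or <DOC-id (-)>, then the comment text, then </DOC>.
LABELLED_RE = re.compile(r"<DOC-(\d+)\s*\(([+-])\)>\s*(.*?)\s*</DOC>", re.S)


def parse_labelled(text):
    """Return a list of (doc_id, label, comment) tuples."""
    return LABELLED_RE.findall(text)


sample = """<DOC-0001(+)>mengajarnya sudah oke, perlu ditingkatkan </DOC>
<DOC-0004 (-)>kipas angin kurang baik </DOC>"""
rows = parse_labelled(sample)
for row in rows:
    print(row)
```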
After the POS Tagger training data has been processed, a POS-TAG can be assigned to each word in the sentence collection. The next steps, opinion extraction and opinion classification, follow the flowchart shown in Fig. 3.
TABLE I. POS TAGGER DATA
| No | POS | POS Name | Example |
|---|---|---|---|
| 1 | OP | Open Parenthesis | ( { [ |
| 2 | CP | Close Parenthesis | ) } ] |
| 3 | GM | Slash | / |
| 4 | ; | Semicolon | ; |
| 5 | : | Colon | : |
| 6 | " | Quotation | " and ' |
| 7 | . | Sentence terminator | . |
| 8 | , | Comma | , |
| 9 | – | Dash | – |
| 10 | ... | Ellipses | ... |
| 11 | JJ | Adjective | Baik, Bagus |
| 12 | RB | Adverb | Sekali, sangat |
| 13 | NNC | Countable Noun | Kursi, Kulkas |
| 14 | NNU | Uncountable Noun | Gula, hujan |
| 15 | NNP | Proper Noun | Toyota, Sony |
| 16 | NNG | Genitive Noun | Motornya |
| 17 | VBI | Intransitive Verb | Pergi |
| 18 | VBT | Transitive Verb | Membeli |
| 19 | VBP | Passive Verb | ditingkatkan, diperbaiki |
| 20 | IN | Preposition | Di, Dari, Ke |
| 21 | MD | Modal | Bisa |
| 22 | CC | Coordinating Conjunction | Dan, Atau, tetapi |
| 23 | SC | Subordinating Conjunction | Jika, Ketika |
| 24 | DT | Determiner | Para, Ini, Itu |
| 25 | UH | Interjection | Wah, Aduh, Oi |
| 26 | CDO | Ordinal Numerals | Pertama, Kedua, Ketiga |
| 27 | CDC | Collective Numerals | Berdua |
| 28 | CDP | Primary Numerals | Satu, Dua, Tiga |
| 29 | CDI | Irregular Numerals | Beberapa |
| 30 | PRP | Personal Pronouns | Saya, Mereka |
| 31 | WP | WH-Pronouns | Apa, Siapa, Dimana |
| 32 | PRN | Number Pronouns | Kedua-duanya |
| 33 | PRL | Locative Pronouns | Sini, Situ |
| 34 | NEG | Negation | Bukan, Tidak |
| 35 | SYM | Symbols | #,%,^,&,* |
| 36 | RP | Particles | Pun, Kah |
| 37 | FW | Foreign Words | Word |
TABLE II. RULES FOR OPINION DETECTION
| No | Rule | Examples |
|---|---|---|
| 1 | RB JJ | sangat buruk, dengan bagus, memang jelek |
| 2 | RB VB | semoga berjalan, jika memilih |
| 3 | NN JJ | LCDnya jelek, alatnya bagus |
| 4 | NN VB | Ngajarnya membosankan, perkataannya menjengkelkan |
| 5 | JJ VB | mudah difahami, cepat memahami |
| 6 | CK JJ | bagus atau baik, tetapi malas |
| 7 | JJ BB | sama bagus |
| 8 | VB VB | membuat pusing, membikin bosan |
| 9 | JJ RB | indah sekali, bagus sekali |
| 10 | VB JJ | membikin bingung |
| 11 | NEG JJ | tidak seindah, tidak semudah |
| 12 | NEG VB | tidak mengerti, tidak memahami, bukan mengajar |
| 13 | PRP VBI | saya menyukai, kita suka |
| 14 | PRP VBT | kita suka |
| 15 | VBT NN | memiliki kedekatan, memiliki kepekaan |
| 16 | MD VBT | Perlu mengambil referensi |
| 17 | MD VBI | Perlu dikembangkan |
| 18 | UH VBP | Tolong dicat, tolong diperbaiki |
| 19 | JJ VBP | Mudah diterima, sulit dipahami |
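The tag-pair rules of Table II can be applied by scanning adjacent tags in a tagged sentence. The following is a minimal sketch, not the study's implementation: it uses only a subset of the rules, the function name is hypothetical, and it assumes that a coarse rule tag such as NN or VB matches any finer tag that starts with it (NNC, NNG, VBP and so on), which the table itself does not state.

```python
# Subset of the Table II rules, written as (first tag, second tag).
OPINION_RULES = [
    ("RB", "JJ"), ("RB", "VB"), ("NN", "JJ"), ("NN", "VB"), ("JJ", "VB"),
    ("JJ", "RB"), ("VB", "JJ"), ("VB", "VB"), ("NEG", "JJ"), ("NEG", "VB"),
    ("UH", "VBP"),
]


def is_opinion(tagged):
    """tagged: list of (word, tag). True if any adjacent tag pair matches a rule."""
    tags = [tag for _, tag in tagged]
    for first, second in zip(tags, tags[1:]):
        # Assumed prefix matching: rule tag "NN" also covers NNC, NNU, NNG, NNP.
        if any(first.startswith(a) and second.startswith(b) for a, b in OPINION_RULES):
            return True
    return False


print(is_opinion([("acnya", "NNG"), ("kurang", "RB"), ("dingin", "JJ")]))
print(is_opinion([("kursi", "NNC"), ("di", "IN"), ("ruangan", "NNU")]))
```

The first sentence matches the RB JJ rule ("kurang dingin") and is flagged as an opinion; the second contains no matching pair.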
TABLE III. RULES FOR TARGET DETECTION
| No | Rule | Examples |
|---|---|---|
| 1 | NN | Ac, lcd, internet |
| 2 | NNG | Laboratoriumnya, lcdnya |
| 3 | NNP | Pak Joko, Bu Yuli, Pengok |
| 4 | NN NN | kantin kampus, ac pengok |
| 5 | NN CC NN | Kampus dan lab |
| 6 | NN IN NN | AC di Klas |
Fig. 3. Flowchart of opinion extraction and classification
Opinion Extraction and Target Detection
Opinion extraction was carried out after pre-processing and POS tagging had been completed. Using the opinion detection rules from TABLE II and the target detection rules from TABLE III, each document is checked for opinions and their targets, following the scheme in Fig. 3. After the opinions and their targets were obtained, the last step was to classify each opinion as positive or negative.
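Target detection with the patterns of Table III can be sketched as a left-to-right scan that prefers the longest matching pattern. This is an illustrative sketch only: the function name, the longest-match-first policy and the prefix matching of NN (so that it also covers NNC, NNU, NNG and NNP) are assumptions, and the first example sentence is adapted from Fig. 1.

```python
# Table III patterns, longest first; "NN" is matched as a prefix (assumption).
TARGET_RULES = [("NN", "CC", "NN"), ("NN", "IN", "NN"), ("NN", "NN"), ("NN",)]


def find_targets(tagged):
    """tagged: list of (word, tag). Returns the matched target phrases in order."""
    targets, i = [], 0
    while i < len(tagged):
        for rule in TARGET_RULES:
            window = tagged[i:i + len(rule)]
            if len(window) == len(rule) and all(
                tag.startswith(r) for (_, tag), r in zip(window, rule)
            ):
                targets.append(" ".join(word for word, _ in window))
                i += len(rule)  # skip past the matched phrase
                break
        else:
            i += 1  # no pattern starts here
    return targets


print(find_targets([("kursi", "NNC"), ("di", "IN"), ("ruangan", "NNU"),
                    ("kurang", "RB"), ("mengenakkan", "JJ")]))
print(find_targets([("acnya", "NNG"), ("kurang", "RB"), ("dingin", "JJ")]))
```

The first call yields the phrase "kursi di ruangan" through the NN IN NN pattern, and the second yields "acnya" through the single-noun pattern.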
RESULT AND DISCUSSION
The prototype for pre-processing the document collection for the HMM POS-Tagger has been developed with the interface shown in Fig. 4.
Fig. 4. The result of document pre-processing
The HMM POS-Tagger assigns POS tags to a document entered into the system, as demonstrated in Fig. 5. The sentence "tolong gedungnya dibersihkan" is tagged by the system as tolong/VBI gedungnya/NNG dibersihkan/VBP, with an accuracy of 100%.
Fig. 5. Result of HMM POS Tagger
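The tag sequence that maximizes equation (6) is usually found with the Viterbi algorithm rather than by enumerating all sequences. The source does not say which decoding procedure the prototype uses, so the following is only a sketch of one common choice. The probability tables are toy values invented for illustration, `floor` is an assumed small probability for unseen events, and `start_p` plays the role of P(t_1 | t_0).

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Most probable tag sequence under equation (6), computed in log space."""
    def lp(table, key):
        return math.log(table.get(key, floor))

    # best[i][t] = (log-probability of the best path ending in t at word i, previous tag)
    best = [{t: (lp(start_p, t) + lp(emit_p, (t, words[0])), None) for t in tags}]
    for w in words[1:]:
        column = {}
        for t in tags:
            score, prev = max((best[-1][p][0] + lp(trans_p, (p, t)), p) for p in tags)
            column[t] = (score + lp(emit_p, (t, w)), prev)
        best.append(column)

    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


# Toy parameters (hypothetical values, not estimated from the study's data).
tags = ["VBI", "NNG", "VBP"]
start_p = {"VBI": 0.6, "NNG": 0.3, "VBP": 0.1}
trans_p = {("VBI", "NNG"): 0.7, ("NNG", "VBP"): 0.6, ("NNG", "NNG"): 0.1}
emit_p = {("VBI", "tolong"): 0.5, ("NNG", "gedungnya"): 0.4, ("VBP", "dibersihkan"): 0.5}

result = viterbi("tolong gedungnya dibersihkan".split(), tags, start_p, trans_p, emit_p)
print(result)
```

With these toy tables the decoder returns VBI, NNG, VBP for the example sentence, the same tagging as in Fig. 5.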
Opinion extraction for various opinion targets is shown in Fig. 6, where the target can be selected from a combo box.
Fig. 6. Opinion extraction without classification
The classification of opinions into two classes, positive or negative, is demonstrated in Fig. 7. The classification method can be selected from four options (NBC, SVM, MaxEnt and KM-Clustering). The figure also shows that NBC classification for the target Perpustakaan (library) has an accuracy of 81.54%.
Fig. 7. Opinion extraction with classification
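Among the four options, K-means is the only one that is not a supervised classifier, so a sketch of how TF-IDF weighting (equation (20)) and the K-means steps fit together may be useful. This is an illustrative sketch in plain Python: the function names and toy comments are invented, the seeds are passed in explicitly, and the loop stops when cluster assignments no longer change, which is used here as a stand-in for the criterion function. How clusters are mapped to the positive and negative classes is not described in the source and is not shown.

```python
import math


def tfidf_vectors(docs):
    """docs: list of token lists. Returns unit-length TF-IDF vectors per equation (20)."""
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    df = {w: sum(w in d for d in docs) for w in vocab}  # documents containing w
    vectors = []
    for d in docs:
        raw = [(math.log(d.count(w)) + 1) * math.log(n_docs / df[w]) if w in d else 0.0
               for w in vocab]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0  # guard against zero vectors
        vectors.append([x / norm for x in raw])
    return vectors


def kmeans(vectors, seeds, max_iter=100):
    """Cluster vectors; seeds are indices of the initial centers. Returns cluster ids."""
    centers = [vectors[i][:] for i in seeds]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centers[k])))
            for v in vectors
        ]
        if new_assign == assign:  # nothing changed: stop
            break
        assign = new_assign
        for k in range(len(centers)):
            members = [v for v, a in zip(vectors, assign) if a == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign


docs = [s.split() for s in (
    "ac kurang dingin", "ac panas kurang dingin",
    "dosen bagus sekali", "dosen mengajar bagus",
)]
labels = kmeans(tfidf_vectors(docs), seeds=[0, 2])
print(labels)
```

On this toy input the two comments about the air conditioner fall into one cluster and the two comments about the lecturer into the other.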
Performance Testing Program
The performance of the HMM POS Tagger in opinion detection and target detection is measured through precision and recall. The test results using 1,000 training documents and 1,000 test documents are shown in Table IV.
TABLE IV. PERFORMANCE OF HMM POS TAGGER
| Performance Measurement | HMM-POS Tagger: Opinion Detection | HMM-POS Tagger: Target Detection |
|---|---|---|
| Precision | 0.95 | 0.91 |
| Recall | 0.92 | 0.89 |
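The source does not spell out how precision and recall were computed, so the following sketch uses the standard definitions: precision is the share of detected items that are correct, and recall is the share of true items that were detected. The function name and the six toy decisions are hypothetical and unrelated to the values in Table IV.

```python
def precision_recall(predicted, actual):
    """predicted, actual: parallel lists of booleans (e.g. 'is an opinion')."""
    tp = sum(p and a for p, a in zip(predicted, actual))      # true positives
    fp = sum(p and not a for p, a in zip(predicted, actual))  # false positives
    fn = sum(a and not p for p, a in zip(predicted, actual))  # false negatives
    return tp / (tp + fp), tp / (tp + fn)


predicted = [True, True, True, False, False, True]
actual = [True, True, False, True, False, True]
precision, recall = precision_recall(predicted, actual)
print(precision, recall)
```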
The performance of the classification program is measured by accuracy. The comparison of the four methods is shown in Table V. The results show that the KMC method performed best among the four methods.
TABLE V. PERFORMANCE OF CLASSIFICATION
| Performance Measurement | NBC | SVM | ME | KMC |
|---|---|---|---|---|
| Accuracy | 0.84 | 0.83 | 0.84 | 0.88 |
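As a check on the average accuracy of 0.85 quoted in the abstract, the four values in Table V can be averaged directly:

```latex
\bar{A} = \frac{0.84 + 0.83 + 0.84 + 0.88}{4} = \frac{3.39}{4} = 0.8475 \approx 0.85
```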
CONCLUSION
In this study a prototype for pre-processing the collection of comments has been developed, together with a prototype system for opinion extraction and classification. The program demonstrates how opinions can be extracted using an HMM POS-Tagger and then analyzed for sentiment using four opinion classification methods, namely NBC, SVM, ME and KM-Clustering. The precision and recall of opinion detection and opinion target detection averaged above 90%, and the classification of opinions with the four methods reached an average accuracy above 80%. The results also showed that the KMC method has the best accuracy of the four methods.
REFERENCES
[1] Boser, B.E., Guyon, I.M. and Vapnik, V.N., 1992, "A Training Algorithm for Optimal Margin Classifiers", Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, 1992, pp. 1171-1183.
[2] Cortes, C. and Vapnik, V., 1995, "Support-Vector Networks", Machine Learning, 20, pp. 273-297.
[3] Cutting, D., Kupiec, J., Pedersen, J. and Sibun, P., A Practical Part-of-Speech Tagger, Xerox Palo Alto Research Center, in Proceedings of the Third Conference on Applied Natural Language Processing, 1992, pp. 133-140.
[4] Greaves, F., D.R. Cano, C. Millet, A. Darzi and L. Donaldson, Use of Sentiment Analysis for Capturing Patient Experience From Free-Text Comments, Journal of Medical Internet Research, 2013, 15:11, e239. Online publication date: 1-Jan-2013.
[5] Grimes, S., 2013, Unstructured Data and the 80 Percent Rule, Clarabridge Bridgepoints.
[6] Liu, B., 2010, Sentiment Analysis: A Multifaceted Problem, IEEE Intelligent Systems, 25 (3), pp. 76-80.
[7] Loia, L. and Senatore, S., A fuzzy-oriented sentic analysis to capture the human emotion in Web-based content, Knowledge-Based Systems 58, 2014, pp. 75-85. Online publication date: 1-Mar-2014.
[8] McCallum, A. and Nigam, K., 1998, "A Comparison of Event Models for Naive Bayes Text Classification", AAAI/ICML-98 Workshop on Learning for Text Categorization, pp. 41-48.
[9] Pang, B., Lee, L. and Vaithyanathan, S., 2002, "Thumbs up?: sentiment classification using machine learning techniques", Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing (EMNLP '02), Volume 10, pp. 79-86.
[10] Putten, P.V.D., Kok, J. and Gupta, A., 2002, Why the Information Explosion can be Bad for Data Mining, and How Data Fusion Provides a Way Out, Proc. of the 2nd SIAM International Conference on Data Mining, pp. 11-13.
[11] Pisceldo, F., Manurung, R. and Adriani, M., Probabilistic Part-of-Speech Tagging for Bahasa Indonesia, Third International MALINDO Workshop, colocated event ACL-IJCNLP, Singapore, 2009.
[12] Rozi, I.F., Pramono, S.H. and Dahlan, E.A., 2012, Implementasi Opinion Mining (Analisis Sentimen) untuk Ekstraksi Data Opini Publik pada Perguruan Tinggi, Jurnal EECCIS, Vol. 6, No. 1, Juni 2012.
[13] Shelke, N.M., Deshpande, S. and Thakre, V., Survey of Techniques for Opinion Mining, International Journal of Computer Applications (0975 8887), Volume 57, No. 13, November 2012.
[14] Wicaksono, A.F. and Purwarianti, A., 2010, "HMM Based Part-of-Speech Tagger for Bahasa Indonesia", Proceedings of the 4th International MALINDO Workshop, Jakarta.
[15] Widhiyanti, K. and A. Harjoko, POS Tagging for Bahasa Indonesia dengan HMM dan Rule Based, INFORMATIKA Vol. 8, No. 2, November 2012, pp. 151-167.