- Open Access
- Total Downloads : 768
- Authors : P. Sarala, S. Jayaprada
- Paper ID : IJERTV1IS7457
- Volume & Issue : Volume 01, Issue 07 (September 2012)
- Published (First Online): 26-09-2012
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Mining Association Rules Using Ontologies
P. Sarala#1, S. Jayaprada*2
#Department Of Computer Science and Engineering
V.R.Siddhartha Engineering College, Vijayawada, Andhra Pradesh, India
*Department Of Computer Science and Engineering V.R.Siddhartha Engineering College, Vijayawada, Andhra Pradesh, India
Abstract
Association rule mining is considered one of the most important tasks in Knowledge Discovery in Databases. Among sets of items in transaction databases, it aims at discovering implicative tendencies that can be valuable information for the decision-maker. Existing methods, however, generate a very large number of rules. Several post-processing methods and other techniques have been developed to reduce the number of rules, but they are not effective. This paper develops a new framework, called Mining Interest Rules Using Ontologies, for extracting association rules based on user interest, and also implements a real-time web semantic engine using an extended robust framework.
Keywords- Association Rules, Association Rule Mining, Ontology, correlation measures, user constraints, Web Ontology Language.
Introduction
Data mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. One important topic in data mining is the discovery of interesting association rules. Association rule mining allows unsupervised discovery of implicative and interesting tendencies in databases. An association rule a → b states that the itemset b tends to be present when the itemset a occurs in a database transaction. Apriori [3] is the first algorithm proposed to extract all association rules satisfying minimum thresholds of support and confidence. With a low support threshold, more valuable information can be extracted, but the number of rules is then usually very large.
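As an illustration of the support and confidence thresholds mentioned above, the following minimal Python sketch computes both measures for a single rule. The function names and the toy transactions are assumptions made for this example and are not taken from the paper's dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): supp(a and b) / supp(a)."""
    a = frozenset(antecedent)
    b = frozenset(consequent)
    supp_a = support(transactions, a)
    if supp_a == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, a | b) / supp_a


# Hypothetical toy transactions (illustrative only).
transactions = [
    frozenset({"milk-cream", "bread"}),
    frozenset({"milk-cream", "bread", "biscuits"}),
    frozenset({"bread", "juice"}),
    frozenset({"milk-cream", "juice"}),
]

# Rule: milk-cream -> bread
supp_rule = support(transactions, {"milk-cream", "bread"})
conf_rule = confidence(transactions, {"milk-cream"}, {"bread"})
```

A rule is retained by an Apriori-style miner only when both values reach the user-chosen minimum thresholds, which is why lowering the support threshold increases the number of rules.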
To reduce the number of rules, several post-processing methods have been developed, based on nonredundant rules or on techniques such as pruning, summarizing, grouping, or visualization that rely on statistical information in the database. Methods such as the rule deductive method [4], Stream Mill Miner and Data Stream Management Systems [2], the partitioning around medoids clustering technique [5], and constraint-based multi-level association rules with ontology support [1] have also been developed, but they are not effective.
This paper implements a new framework that reduces the number of rules based on user interest. User constraints can be categorized as follows. Succinct constraints can be pushed into the initial data selection process at the start of mining. Monotonic constraints, once satisfied, need no further checking as the pattern grows. Anti-monotonic constraints can be pushed deep into the mining process to restrain pattern growth. In addition, instead of statistical measures, correlation measures such as lift, cosine, and overall confidence can be used. This work is extended to implement a real-time web semantic engine using the Association Rule Interactive post-Processing using Schemas and Ontologies framework.
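The correlation measures named above can be computed directly from support values. The sketch below uses the standard textbook definitions of lift and cosine. It also includes all-confidence, on the assumption that this is the measure the text calls "overall confidence". The function names and the sample support values are hypothetical.

```python
import math


def lift(supp_ab, supp_a, supp_b):
    """Ratio of observed co-occurrence to that expected under independence."""
    return supp_ab / (supp_a * supp_b)


def cosine(supp_ab, supp_a, supp_b):
    """Co-occurrence normalised by the geometric mean of the two supports."""
    return supp_ab / math.sqrt(supp_a * supp_b)


def all_confidence(supp_ab, supp_a, supp_b):
    """Smallest confidence among a -> b and b -> a (assumed reading of
    'overall confidence')."""
    return supp_ab / max(supp_a, supp_b)


# Hypothetical supports: supp(a) = supp(b) = 0.75, supp(a and b) = 0.5
rule_lift = lift(0.5, 0.75, 0.75)
rule_cosine = cosine(0.5, 0.75, 0.75)
rule_all_conf = all_confidence(0.5, 0.75, 0.75)
```

A lift below 1, as in this example, indicates that the two itemsets occur together less often than independence would predict, even though the confidence of the rule may look high.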
Methodology
Our proposed system, Mining Interest Rules Using Ontologies, consists of the following steps:
Define User Data and Search Ontology
Ontology Construction
Generate Candidate Itemsets
Visualize the Results
Figure 1: Mining Interest Rules Using Ontologies
Define User Data and Search Ontology: The user specifies the particular item in which he/she is interested, and the system searches for that item in the ontology. Here the user searches for the item milk-cream.
Input Ontology: An ontology is constructed for a supermarket dataset. To describe the ontology, the framework uses the Web semantic representation language OWL. The ontology representation of the supermarket dataset is given as input.
Figure2: Ontological representation of supermarket dataset
Figure3: List of items in dataset
Generate Candidate Itemsets: Candidate itemsets are generated from the dataset.
Visualize the Results: The results are visualized in the form of rules. The rules are generated according to the user-selected item, here milk-cream.
Figure4: Association Rules for Milk-Cream item
Because results are produced only for the item the user selected, this approach generates fewer association rules than the existing approach.
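The reduction described above can be pictured as a filter over the mined rules. The following sketch is an illustration under stated assumptions, not the authors' implementation: the ontology is modelled as a plain dictionary from concept to children, and the names ONTOLOGY, expand, and rules_of_interest are hypothetical.

```python
# Hypothetical toy ontology: concept -> direct sub-concepts or items.
ONTOLOGY = {
    "dairy": ["milk-cream", "cheese"],
    "bakery": ["bread", "biscuits"],
}


def expand(term, ontology):
    """Return the leaf items covered by `term` (the term itself if it is a leaf)."""
    children = ontology.get(term)
    if not children:
        return {term}
    items = set()
    for child in children:
        items |= expand(child, ontology)
    return items


def rules_of_interest(rules, term, ontology):
    """Keep only rules whose antecedent or consequent mentions the user's term."""
    targets = expand(term, ontology)
    return [(a, c) for a, c in rules if targets & (set(a) | set(c))]


# Hypothetical rules as (antecedent, consequent) pairs.
rules = [
    (("bread",), ("milk-cream",)),
    (("juice",), ("bread",)),
    (("cheese", "bread"), ("biscuits",)),
]

selected = rules_of_interest(rules, "milk-cream", ONTOLOGY)
dairy = rules_of_interest(rules, "dairy", ONTOLOGY)
```

Selecting a leaf item such as milk-cream keeps only the rules that mention it, while selecting a concept such as dairy keeps the rules that mention any item beneath that concept.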
The above approach handles a single dataset, so a real-time web semantic engine is implemented using Association Rule Interactive post-Processing using Schemas and Ontologies.
The framework implementation consists of the following steps:
Step1: We integrate user knowledge into association rule mining using two different types of formalism: ontologies and rule schemas.
Step2: We use ontologies to improve the integration of user knowledge in the post-processing task.
Step3: We propose the Rule Schema formalism, which extends the specification language proposed by Liu et al. for user expectations.
Step4: An interactive framework is designed to assist the user throughout the analysis task.
Interactive post mining Process Description:
Figure 5: Interactive post mining Process (components: Ontology Construction, Defining Rule Schemas, Applying Operators, Filters, Visualizing Results, Selection/Validation, Interactive Loop)
Step1: The user develops an ontology over the database items. To describe the ontology, the framework uses the Web semantic representation language OWL-DL.
Step2: The user expresses his/her local goals and expectations concerning the association rules he/she wants to find, in terms of rule schemas.
Step3: Operators are applied over the rule schemas created. The operators are pruning and filtering. The filters are the minimum improvement constraint filter and the item relatedness filter.
Step4: The user visualizes the filtered association rules.
Step5: The user validates the results obtained.
Step6: Filters can be applied over the rules whenever the user wants to reduce the number of rules.
Step7: The interactive loop permits the user to revise the information he/she proposed: the user can return to Step 2 to modify the rule schemas, or to Step 3 to change the operators.
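The way operators act on rule schemas in the steps above can be sketched as follows. This is a simplified reading, not the framework's actual specification language: a schema is assumed to be a pair of concept tuples, and a rule is assumed to conform when each schema concept is matched by at least one item on the corresponding side of the rule. All names (ONTOLOGY, covers, conforms, prune, keep_conforming) are hypothetical.

```python
# Hypothetical toy ontology: concept -> set of items it covers.
ONTOLOGY = {
    "dairy": {"milk-cream", "cheese"},
    "bakery": {"bread", "biscuits"},
}


def covers(concept, items):
    """True if any item belongs to the concept (or equals it, for plain items)."""
    members = ONTOLOGY.get(concept, {concept})
    return bool(members & set(items))


def conforms(rule, schema):
    """A rule conforms when every schema concept is matched on the same side."""
    antecedent, consequent = rule
    schema_lhs, schema_rhs = schema
    return (all(covers(c, antecedent) for c in schema_lhs)
            and all(covers(c, consequent) for c in schema_rhs))


def prune(rules, schema):
    """Pruning operator: drop the rules the user already expects."""
    return [r for r in rules if not conforms(r, schema)]


def keep_conforming(rules, schema):
    """Filtering operator: keep only the rules that match the schema."""
    return [r for r in rules if conforms(r, schema)]


rules = [
    (("bread",), ("milk-cream",)),
    (("juice",), ("bread",)),
    (("cheese",), ("juice",)),
]
schema = (("bakery",), ("dairy",))  # "bakery items imply dairy items"

kept = keep_conforming(rules, schema)
remaining = prune(rules, schema)
```

In the interactive loop, the user would inspect the output, then either edit the schema (Step 2) or switch between the two operators (Step 3) and run the selection again.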
Figure 6: Ontology Searching
Here, when the user searches the ontology for a word, the system collects all URLs related to that word, constructs an ontology for them, and then applies the filters specified in the framework. It then produces the filtered rules (filtered links) shown above.
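One of the filters named in the framework is the minimum improvement constraint filter. The sketch below assumes the usual definition of improvement, namely the smallest gain in confidence of a rule over any rule with the same consequent and a proper subset of its antecedent (including the empty antecedent). The function names, the threshold, and the toy transactions are hypothetical.

```python
from itertools import combinations


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing the antecedent that also contain the consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    covered = [t for t in transactions if a <= t]
    if not covered:
        return 0.0
    return sum(1 for t in covered if c <= t) / len(covered)


def improvement(transactions, antecedent, consequent):
    """Smallest confidence gain over every proper sub-antecedent."""
    antecedent = tuple(antecedent)
    conf = confidence(transactions, antecedent, consequent)
    gaps = [
        conf - confidence(transactions, sub, consequent)
        for size in range(len(antecedent))          # sizes 0 .. n-1: proper subsets
        for sub in combinations(antecedent, size)
    ]
    return min(gaps) if gaps else 0.0


def min_improvement_filter(transactions, rules, threshold):
    """Keep rules whose extra antecedent items actually add predictive value."""
    return [r for r in rules if improvement(transactions, r[0], r[1]) >= threshold]


# Hypothetical toy transactions (illustrative only).
T = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"butter", "milk"}),
    frozenset({"juice"}),
]
rules = [(("bread", "butter"), ("milk",)), (("bread",), ("milk",))]
kept = min_improvement_filter(T, rules, threshold=0.05)
```

In this toy data the longer rule is less confident than its simpler sub-rules, so its improvement is negative and the filter discards it.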
Two representations are used for visualizing the results: Tree and Visualization. Rule generation also takes user interest into account: the user selects his/her item of interest, and rules are then generated for the selected item.
Figure 7: Representation of Results using Tree
Figure8: Representation of Results using Visualization
Figure9: Rules for user selected item Encyclopedia
Results
The results of the implementation are as follows.
The first graph shows the number of association rules produced for each item based on user interest.
Figure: Rank per item (vertical axis: Rank, 0 to 160)
Item | Rank
Froz | 55
Bak | 42
Brea | 14
Bisc | 58
Milk | 70
Juic | 58
Fruit | 81
Part | 18
Che | 23
The second graph presents the comparative results between Apriori and Improved Apriori.
Figure: Number of association rules generated by Apriori and Improved Apriori (vertical axis: Number of Association Rules, 0 to 1200)
- Apriori, implemented with the support and confidence measures, generates 1048 association rules.
- Improved Apriori, implemented with the lift, confidence, leverage, and conviction measures, generates 148 association rules.
- Compared with Apriori, Improved Apriori reduces the number of association rules from 1048 to 148.
Real-time analysis: for the three URLs tested, the proposed approach ranks the related link the same as or higher than Google search.
Figure: ProposedRank and GoogleRank for three URLs (vertical axis: rank, 0 to 30)
URL | ProposedRank | GoogleRank
http://www.java.co… | 1 | 1
http://www.oracle.com | 3 | 4
http://www.lonelyplanet.com | 14 | 24
This graph presents the comparative results between Google search and our approach (Association Rule Interactive Post mining using Ontologies and Rule schemas).
When searching for oracle, the related link http://www.oracle.com appears in 4th position in Google search and in 3rd position with our approach.
When searching for lonely planet, the related link http://www.lonelyplanet.com appears in 24th position in Google search and in 14th position with our approach.
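Leverage and conviction, two of the measures reported for Improved Apriori, can be computed from supports as in the sketch below. The definitions are the standard ones. The acceptance thresholds in passes_measures are assumptions chosen for illustration, since the paper does not state the values it used.

```python
import math


def leverage(supp_ab, supp_a, supp_b):
    """Observed joint support minus the support expected under independence."""
    return supp_ab - supp_a * supp_b


def conviction(supp_ab, supp_a, supp_b):
    """How much more often a occurs without b under independence than observed."""
    conf = supp_ab / supp_a
    if conf >= 1.0:
        return math.inf  # the rule never fails in the data
    return (1.0 - supp_b) / (1.0 - conf)


def passes_measures(supp_ab, supp_a, supp_b, min_leverage=0.0, min_conviction=1.0):
    """Hypothetical acceptance test: positive correlation on both measures."""
    return (leverage(supp_ab, supp_a, supp_b) > min_leverage
            and conviction(supp_ab, supp_a, supp_b) > min_conviction)


# Hypothetical supports: supp(a) = 0.4, supp(b) = 0.5, supp(a and b) = 0.3
rule_leverage = leverage(0.3, 0.4, 0.5)
rule_conviction = conviction(0.3, 0.4, 0.5)
accepted = passes_measures(0.3, 0.4, 0.5)
```

Requiring several such measures at once is stricter than support and confidence alone, which is consistent with the smaller rule count reported for Improved Apriori.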
Conclusion
The post-processing of association rules is improved by a framework that generates association rules of user interest. In addition, a real-time web semantic engine is implemented using an extended robust framework called Association Rule Interactive Post mining Using Schemas and Ontologies.
References
- A. Bellandi, B. Furletti, V. Grossi, and A. Romei, Ontology-Driven Association Rule Extraction: A Case Study, Proc. Workshop Context and Ontologies: Representation and Reasoning, pp. 1-10, 2007.
- Hetal Thakkar, Barzan Mozafari, and Carlo Zaniolo, Continuous Post-Mining of Association Rules in a Data Stream Management System, Chapter VII in Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, Yanchang Zhao, Chengqi Zhang, and Longbing Cao (eds.), ISBN: 978-1-60566-404-0.
- Wenxiang Dou, Jinglu Hu, and Gengfeng Wu, Interesting Rules Mining with Deductive Method, ICROS-SICE International Joint Conference, 2009.
- Xuan-Hiep Huynh, Fabrice Guillet, and Henri Briand, Extracting representative measures for the post-processing of association rules, 2006.