- Open Access
- Authors : Vimalkumar B. Vaghela, Dr. Kalpesh H. Vandra, Dr. Nilesh K. Modi
- Paper ID : IJERTV1IS3070
- Volume & Issue : Volume 01, Issue 03 (May 2012)
- Published (First Online): 30-05-2012
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Multi-Relational Classification Using Inductive Logic Programming
Vimalkumar B. Vaghela1, Dr. Kalpesh H. Vandra2, Dr. Nilesh K. Modi3
1Ph.D. Scholar, Department of Computer Science & Engineering, Karpagam University, Coimbatore, Tamilnadu, India
2Director Academic, C. U. Shah College of Engineering & Technology, Surendranagar, Gujarat, India
3Professor & Head, Department of Computer Science, S V Institute of Computer Studies, Gujarat, India
Abstract
As the amount of relational data being collected grows, and because propositional problem definitions are limited in relational domains, multi-relational data mining has emerged as a way to extract patterns from relational data. Its popularity also reflects the tendency to store data in relational databases. Several relational knowledge discovery systems have been developed, employing various search strategies, heuristics, language pattern limitations and hypothesis evaluation criteria in order to cope with an intractably large search space and to generate high-quality patterns. Inductive logic programming (ILP) studies learning from examples within the framework provided by clausal logic. ILP has become a popular subject in the field of data mining due to its ability to discover patterns in relational domains. Several ILP-based concept discovery systems have been developed, which employ various search strategies, heuristics and language patterns. LINUS, GOLEM, CIGOL, MIS, FOIL, PROGOL, ALEPH and WARMR are well-known ILP-based systems.
Index Terms: Classification; Multi-relational data mining; Inductive Logic Programming
INTRODUCTION
Because single-table data representation is impractical, relational databases are needed to store complex data for real-life applications. This has led to the development of multi-relational learning systems that are applied directly to relational data. Relational upgrades of data mining and concept learning systems generally employ first-order predicate logic as the representation language for background knowledge and data structures. The learning systems that induce logical patterns valid for given background knowledge have been investigated under a research area called Inductive Logic Programming (ILP). Stephen Muggleton introduced the name Inductive Logic Programming for the intersection of machine learning and logic programming (Muggleton, 1999).
ILP techniques are widely used for classification and concept discovery in data mining. In classification, general rules are created from the data and then used to group unclassified data. In concept discovery, interesting rules, if they exist, are presented to the users of the system.
Several ILP-based systems have been developed, which employ various search strategies, heuristics and language pattern limitations. LINUS, GOLEM, CIGOL, MIS, FOIL, PROGOL, ALEPH and WARMR [1, 4] are well-known ILP systems.
INDUCTIVE LOGIC PROGRAMMING
Overview
ILP can use structured, complex and multi-relational data. An ILP system represents examples, background knowledge, hypotheses and target concepts in Horn clause logic. The core of ILP is the use of logic for representation and the search for syntactically legal hypotheses constructed from predicates provided by the background knowledge. In ILP systems, the training examples, background knowledge and induced hypothesis are all expressed in logic program form.
Two measures are used to test the quality of the induced theory. After learning, the theory together with the background knowledge should cover all positive examples (completeness) and should not cover any negative examples (consistency). Completeness and consistency together form correctness [3, 5, 13].
An ILP system often starts with an initial pre-processing phase and ends with a post-processing phase. In the pre-processing phase, errors (noise) in the given examples can be detected and eliminated. In the post-processing phase, redundant clauses in the induced theory are removed in order to improve its efficiency. There are two approaches for the search direction: top-down and bottom-up.
ILP is the study of learning methods for data and rules that are represented in first-order predicate logic. Predicate logic allows for quantified variables and relations and can represent concepts that are not expressible using examples described as feature vectors. A relational database can be easily translated into first-order logic and used as a source of data for ILP [4]. As an example, consider the following rules, written in Prolog syntax (where the conclusion appears first), that define the uncle relation:
uncle(X, Y) :- brother(X, Z), parent(Z, Y).
uncle(X, Y) :- husband(X, Z), sister(Z, W), parent(W, Y).
The goal of inductive logic programming (ILP) is to infer rules of this sort given a database of background facts and logical definitions of other relations [2, 13]. For example, an ILP system can learn the above rules for uncle (the target predicate) given a set of positive and negative examples of uncle relationships and a set of facts for the relations parent, brother, sister, and husband (the background predicates) for the members of a given extended family, such as:
uncle(tom, frank), uncle(bob, john), ¬uncle(tom, cindy), ¬uncle(bob, tom),
parent(bob, frank), parent(cindy, frank), parent(alice, john), parent(tom, john),
brother(tom, cindy), sister(cindy, tom),
husband(tom, alice), husband(bob, cindy).
Alternatively, rules that logically define the brother and sister relations could be supplied, and these relationships inferred from a more complete set of facts about only the basic predicates: parent, spouse, and gender. If-then rules in first-order logic are formally referred to as Horn clauses. A more formal definition of the ILP problem follows:
Given:
- Background knowledge, B, a set of Horn clauses.
- Positive examples, P, a set of Horn clauses (typically ground literals).
- Negative examples, N, a set of Horn clauses (typically ground literals).
Find: A hypothesis, H, a set of Horn clauses such that:
- $\forall p \in P: H \cup B \models p$ (completeness)
- $\forall n \in N: H \cup B \not\models n$ (consistency)
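The completeness and consistency conditions can be checked mechanically on the uncle example. The Python sketch below is illustrative only: it hard-codes the two uncle rules as joins over the background facts instead of using a logic engine, and it assumes that uncle(tom, cindy) and uncle(bob, tom) are the negative examples. The function name derive_uncle is hypothetical.

```python
from itertools import product

# Background facts B from the family example in the text.
parent = {("bob", "frank"), ("cindy", "frank"), ("alice", "john"), ("tom", "john")}
brother = {("tom", "cindy")}
sister = {("cindy", "tom")}
husband = {("tom", "alice"), ("bob", "cindy")}

def derive_uncle():
    """Return every uncle(X, Y) fact entailed by the two rules plus B."""
    derived = set()
    # uncle(X, Y) :- brother(X, Z), parent(Z, Y).
    for (x, z), (z2, y) in product(brother, parent):
        if z == z2:
            derived.add((x, y))
    # uncle(X, Y) :- husband(X, Z), sister(Z, W), parent(W, Y).
    for (x, z), (z2, w), (w2, y) in product(husband, sister, parent):
        if z == z2 and w == w2:
            derived.add((x, y))
    return derived

positives = {("tom", "frank"), ("bob", "john")}
negatives = {("tom", "cindy"), ("bob", "tom")}  # assumed negative examples

covered = derive_uncle()
complete = positives.issubset(covered)      # every p in P is entailed
consistent = negatives.isdisjoint(covered)  # no n in N is entailed
print(complete, consistent)  # True True
```

With these facts the hypothesis entails exactly the two positive examples, so it is both complete and consistent.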
A variety of algorithms for the ILP problem have been developed [3, 10, 13] and applied to a variety of important data mining problems [2, 10, 12]. Nevertheless, relational data mining remains an under-appreciated topic in the larger KDD community.
Search Strategies
Two basic steps in the search for a correct theory are specialization and generalization. If a theory covers negative examples, it is too strong and needs to be weakened; in other words, a more specific theory should be found. This process is called specialization. If a theory does not imply all positive examples, it is too weak and needs to be strengthened; in other words, a more general theory should be found. This process is called generalization. Specialization and generalization steps are repeated to adjust the induced theory throughout the learning process.
There are two approaches for the search direction: top-down and bottom-up. The top-down approach starts with an overly general theory and tries to specialize it until it no longer covers negative examples. Specialization (refinement) operators [4, 5, 11] employ two basic operations on a clause: applying a substitution to the clause and adding a literal to the body of the clause. The refinement graph is the most popular data structure used in specialization. The bottom-up approach starts with an overly specific theory and tries to generalize it until it cannot be generalized further without covering negative examples. Generalization operators perform two basic syntactic operations on a clause: applying an inverse substitution to the clause and removing a literal from the body of the clause. Relative least general generalization (rlgg) and inverse resolution are the basic generalization techniques, and both search the hypothesis space in a bottom-up manner; rlgg is used in GOLEM and inverse resolution in CIGOL.
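The syntactic operations named above can be illustrated on a small clause representation. The following Python sketch is an assumption-laden illustration, not any particular system's refinement operator: clauses are (head, body) tuples, literals are tuples of strings, and the function names apply_substitution, add_literal, remove_literal and show are hypothetical.

```python
# A clause is (head, body); each literal is a tuple (predicate, arg1, arg2, ...).
# Variables are strings starting with an upper-case letter, as in Prolog.

def apply_substitution(clause, theta):
    """Specialize a clause by replacing variables according to theta."""
    def sub(lit):
        return (lit[0],) + tuple(theta.get(a, a) for a in lit[1:])
    head, body = clause
    return (sub(head), tuple(sub(l) for l in body))

def add_literal(clause, literal):
    """Specialize a clause by adding one more condition to its body."""
    head, body = clause
    return (head, body + (literal,))

def remove_literal(clause, literal):
    """Generalize a clause by dropping a condition from its body."""
    head, body = clause
    return (head, tuple(l for l in body if l != literal))

def show(clause):
    """Render a clause in Prolog-like syntax."""
    fmt = lambda l: f"{l[0]}({', '.join(l[1:])})"
    head, body = clause
    return fmt(head) + (" :- " + ", ".join(fmt(l) for l in body) if body else "") + "."

# Top-down: start from the most general clause and add literals.
most_general = (("uncle", "X", "Y"), ())
step1 = add_literal(most_general, ("brother", "X", "Z"))
step2 = add_literal(step1, ("parent", "Z", "Y"))
print(show(step2))  # uncle(X, Y) :- brother(X, Z), parent(Z, Y).

# Applying a substitution specializes further, here down to a ground instance.
ground = apply_substitution(step2, {"X": "tom", "Y": "frank", "Z": "cindy"})
print(show(ground))  # uncle(tom, frank) :- brother(tom, cindy), parent(cindy, frank).
```

Removing the last literal from step2 gives back step1, which is the generalization direction used by bottom-up systems.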
Algorithm: Generic ILP-based concept learner
Require: TR: Target Relation, B: Background Knowledge
Ensure: H: hypothesis describing the target relation
Start with some initial theory H
repeat
    if H is too strong then
        Specialize H
    end if
    if H is too weak then
        Generalize H
    end if
until H is consistent and complete
return H
A generic algorithm for an ILP-based concept learner is presented in the Algorithm above. Target instances are labeled positive if they belong to the target concept and negative if they do not. A hypothesis is called too strong if it covers some negative examples and too weak if it does not cover all the positive examples. The induced hypothesis is called complete if it covers all the positive examples, and consistent if it covers none of the negative examples.
WELL-KNOWN ILP-BASED SYSTEMS
Bottom-up System
GOLEM
GOLEM is an inductive logic programming algorithm developed by Stephen Muggleton and Feng. It uses relative least general generalization (rlgg), a technique proposed by Gordon Plotkin. The search is therefore bottom-up and driven by positive examples only. Negative examples can be used to reduce the size of the hypothesis by deleting useless literals from the body of a clause.
To generate a single clause, GOLEM first randomly picks several pairs of positive examples, computes their rlggs and chooses the one with the greatest coverage. If the final clause does not cover all positives, the covering approach is applied: the covered positives are removed from the input and the algorithm is applied to the remaining positives (Lavrac & Dzeroski, 1994; Muggleton & Feng, 1990).
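The generalization step at the heart of this approach can be illustrated with Plotkin's least general generalization (lgg) of two atoms. The Python sketch below is a simplified illustration: it computes the plain lgg of two atoms by anti-unification and does not take background knowledge into account, so it is not GOLEM's full rlgg. The function names lgg_terms and lgg_atoms and the variable naming scheme V0, V1, ... are assumptions.

```python
def lgg_terms(s, t, table):
    """Least general generalization of two terms (anti-unification).

    Terms are constants (strings) or compound terms (functor, arg1, ...).
    `table` remembers which variable stands for each pair of differing terms.
    """
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg_terms(a, b, table) for a, b in zip(s[1:], t[1:]))
    # Differing terms: the same pair must always be replaced by the same variable.
    if (s, t) not in table:
        table[(s, t)] = f"V{len(table)}"
    return table[(s, t)]

def lgg_atoms(a, b):
    """lgg of two atoms; undefined (None) if predicate or arity differ."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    return lgg_terms(a, b, {})

print(lgg_atoms(("uncle", "tom", "frank"), ("uncle", "bob", "john")))
# ('uncle', 'V0', 'V1')
print(lgg_atoms(("uncle", "tom", "frank"), ("uncle", "tom", "john")))
# ('uncle', 'tom', 'V0')
```

Reusing one variable per pair of differing terms is what keeps the result least general: the lgg of p(a, a) and p(b, b) is p(V0, V0), not p(V0, V1).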
CIGOL
CIGOL ("logic" spelled backwards) is an interactive bottom-up relational ILP system based on inverse resolution. CIGOL employs three generalization operators, which are relational upgrades of the absorption, intra-construction and truncation operators. The basic idea is to invert the resolution rule of deductive inference using a generalization operator based on inverse substitution; CIGOL uses the absorption operator for this purpose. However, CIGOL also needs oracle knowledge to direct the induction process.
Top-Down System
LINUS
LINUS is one of the most popular attribute-value learning environments in the history of ILP. It is a framework that reduces the relational learning problem to a propositional one, employs an attribute-value learning method and transforms the solution hypothesis back into relational form. It is a non-interactive ILP system that integrates several attribute-value learning algorithms in a single environment. It can be viewed as a toolkit in which one or more of the algorithms can be selected in order to find the best solution for the input. The main algorithm behind LINUS [6, 7] consists of three steps. In the first step, the learning problem is transformed from relational to attribute-value form. In the second step, the transformed learning problem is solved by an attribute-value learning method. In the final step, the induced hypothesis is transformed back into relational form.
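The three steps can be sketched on a toy problem. The Python below is a minimal illustration under several assumptions: the daughter/parent/female data is hypothetical and not taken from this paper, the features are restricted to background literals over the target's own variables, and the attribute-value learner in step 2 is a deliberately naive stand-in for the real learners that LINUS integrates.

```python
# Hypothetical toy data: learn daughter(X, Y) from female/1 and parent/2.
female = {"ann", "sue", "eve"}
parent = {("eve", "sue"), ("ann", "tom"), ("pat", "ann"), ("tom", "sue")}
positives = [("sue", "eve"), ("ann", "pat")]
negatives = [("tom", "ann"), ("eve", "ann")]

# Step 1: relational -> attribute-value form. Each feature is a background
# literal over the variables X and Y of the target relation.
features = {
    "female(X)": lambda x, y: x in female,
    "female(Y)": lambda x, y: y in female,
    "parent(X, Y)": lambda x, y: (x, y) in parent,
    "parent(Y, X)": lambda x, y: (y, x) in parent,
}

def to_row(example):
    """Turn one example tuple into a row of boolean attribute values."""
    return {name: test(*example) for name, test in features.items()}

pos_rows = [to_row(e) for e in positives]
neg_rows = [to_row(e) for e in negatives]

# Step 2: naive attribute-value learner: one conjunctive rule made of every
# feature that is true in all positive rows.
rule = [f for f in features if all(row[f] for row in pos_rows)]
consistent = not any(all(row[f] for f in rule) for row in neg_rows)

# Step 3: attribute-value -> relational form.
clause = "daughter(X, Y) :- " + ", ".join(rule) + "."
print(clause)      # daughter(X, Y) :- female(X), parent(Y, X).
print(consistent)  # True
```

Because the features only mention X and Y, this transformation cannot express rules that need an extra variable, such as the uncle rules shown earlier.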
MIS
Model Inference System (MIS) is an interactive top-down relational ILP system that uses a refinement graph in the search process (Shapiro, 1983). At the beginning of the algorithm, the hypothesis is empty ($H = \emptyset$). It then reads the examples (either positive or negative) one by one. If the example is negative and covered by some clauses in the hypothesis set, the incorrect clauses are removed from the solution set. If the example is positive and not covered by any clause in the solution set, a clause c that covers the example [6] is developed by breadth-first search and added to the solution set. The process continues until the solution set H becomes complete and consistent (Lavrac & Dzeroski, 1994).
FOIL
First-Order Inductive Learner (FOIL) is a non-interactive top-down relational ILP system that, like MIS, uses a refinement graph in the search process. It uses the covering approach for solutions with more than one clause: FOIL is a sequential covering algorithm that builds rules one at a time. After building a rule, all positive target tuples satisfying that rule are removed, and FOIL focuses on the tuples that have not been covered by any rule. When building each rule, predicates are added one by one. At each step, every possible predicate is evaluated, and the best one is appended to the current rule. FOIL chooses the clause according to a weighted information gain criterion.
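The weighted information gain can be made concrete with a short calculation. The paper does not state the formula, so the Python sketch below uses the form commonly given in textbooks, gain = t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0))), where p0 and n0 count the positive and negative tuples covered before a literal is added, p1 and n1 count them afterwards, and t is the number of previously covered positives that remain covered. The function name foil_gain and the default t = p1 (valid when the literal introduces no new variables) are assumptions.

```python
import math

def foil_gain(p0, n0, p1, n1, t=None):
    """Weighted information gain of adding one literal to a rule.

    p0, n0: positive/negative tuples covered before adding the literal.
    p1, n1: positive/negative tuples covered after adding the literal.
    t: positives covered both before and after (defaults to p1).
    """
    if t is None:
        t = p1
    if p0 == 0 or p1 == 0:
        return 0.0  # nothing useful is covered, so there is no gain
    info_before = -math.log2(p0 / (p0 + n0))  # bits needed to signal "positive"
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)

# A literal that keeps 3 of 4 positives and excludes all 4 negatives ...
print(foil_gain(4, 4, 3, 0))  # 3.0
# ... is preferred over one that keeps all positives but 2 negatives.
print(foil_gain(4, 4, 3, 0) > foil_gain(4, 4, 4, 2))  # True
```

The factor t is what makes the criterion "weighted": a literal that purifies the rule but keeps only very few positives scores lower than one that stays useful for many.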
PROGOL
PROGOL is a top-down relational ILP system based on inverse entailment (Muggleton, 1995; Muggleton & Tamaddoni-Nezhad, 2008). It performs a search through the refinement graph. Besides a definite program B as background knowledge and a set of ground facts E as examples, PROGOL requires a set of mode declarations for reducing the hypothesis space.
ALEPH
A Learning Engine for Proposing Hypotheses (ALEPH) (Srinivasan, 1999) is a top-down relational ILP system based on inverse entailment, similar to PROGOL. The basic algorithm is the same as the PROGOL algorithm, whereas in advanced use of ALEPH it is possible to apply different search strategies, evaluation functions and refinement operators. It is also possible to define more settings in ALEPH, such as minimum confidence and support. In advanced use of ALEPH, contrary to PROGOL, it is possible to select more than one example as a sample at the beginning of the algorithm. The best clause obtained from reducing the corresponding bottom clause is then added to the theory. The basic PROGOL algorithm is a batch learner in the sense that all examples and background knowledge are expected to be in place before learning commences. An incremental mode allows ALEPH to acquire new examples and background information by interacting with the user. The basic PROGOL algorithm does a fairly standard general-to-specific search and constructs a theory one clause at a time. ALEPH allows the basic procedure for theory construction to be altered in a number of ways by using randomized search methods.
minpos and minacc are the two parameters representing the minimum support and confidence criteria in ALEPH. The default value for minpos is 1; it sets a lower bound on the number of positive examples to be covered by an acceptable clause. If the best clause covers fewer positive examples than this number, it is not added to the current theory. The default value for minacc is 0.0 (its domain is the set of floating-point numbers between 0 and 1); it sets a lower bound on the accuracy of an acceptable clause.
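The effect of the two thresholds can be sketched as a simple acceptance test. The Python below is not ALEPH code; it is an illustration that assumes clause accuracy is measured as p / (p + n), where p and n are the numbers of positive and negative examples the clause covers. The function name is_acceptable is hypothetical; only the parameter names minpos and minacc and their defaults come from the text.

```python
def is_acceptable(pos_covered, neg_covered, minpos=1, minacc=0.0):
    """Check a candidate clause against the minpos and minacc thresholds.

    pos_covered, neg_covered: numbers of positive and negative examples
    covered by the clause. Accuracy is assumed to be p / (p + n).
    """
    if minpos > pos_covered:
        return False  # covers too few positive examples (support)
    total = pos_covered + neg_covered
    accuracy = pos_covered / total if total else 0.0
    return accuracy >= minacc  # confidence-like criterion

print(is_acceptable(5, 1, minpos=2, minacc=0.8))  # True  (accuracy 5/6)
print(is_acceptable(1, 0, minpos=2))              # False (too few positives)
print(is_acceptable(3, 3, minacc=0.7))            # False (accuracy 0.5)
print(is_acceptable(1, 10))                       # True  (defaults accept it)
```

The last call shows why the defaults are permissive: with minpos = 1 and minacc = 0.0, any clause covering at least one positive example passes, however many negatives it also covers.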
WARMR
The design of algorithms for frequent pattern discovery has become a popular topic in data mining. Almost all such algorithms use the same level-wise search, known as the APRIORI algorithm (Agrawal, Mannila, Srikant, Toivonen, & Verkamo, 1996). The level-wise algorithm is based on a breadth-first search in the lattice spanned by a specialization relation between patterns (Dehaspe & Raedt, 1997; Dehaspe & Toivonen, 2001). The APRIORI method looks at one level of the lattice at a time, starting from the most general pattern. It iterates between candidate generation and candidate evaluation phases. In candidate generation, the lattice is used for pruning non-frequent patterns from the next level. In candidate evaluation, the frequencies of candidates are computed with respect to the database. Pruning is based on the monotonicity property with respect to frequency: if a pattern is not frequent, then none of its specializations is frequent.
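The level-wise search can be sketched for the simplest pattern language, itemsets, where a specialization of a pattern is a superset of it. The Python below is a minimal illustration under that assumption and starts at the level of single items; the function name apriori and the toy transactions are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise search for frequent itemsets; returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset.issubset(t))

    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]  # candidates of size 1
    frequent = {}
    while level:
        # Candidate evaluation: count each candidate against the database.
        counts = {c: support(c) for c in level}
        current = {c: s for c, s in counts.items() if s >= min_support}
        frequent.update(current)
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets and
        # prune any candidate that has an infrequent k-subset (monotonicity).
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                frozenset(sub) in current for sub in combinations(union, len(a))
            ):
                candidates.add(union)
        level = list(candidates)
    return frequent

transactions = [
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"},
    {"b", "c"}, {"a", "b", "c"}, {"a", "d"},
]
frequent = apriori(transactions, min_support=3)
print(frequent[frozenset({"a", "b"})])            # 3
print(frozenset({"a", "b", "c"}) in frequent)     # False
```

Item d is infrequent at the first level, so no candidate containing d is ever generated; the itemset {a, b, c} is generated because all of its pairs are frequent, but it is rejected during evaluation with support 2.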
WARMR (Dehaspe & Raedt, 1997; Dehaspe & Toivonen, 2001) is a non-interactive descriptive ILP system that employs the APRIORI rule as its search heuristic. It is therefore not a predictive system, i.e. it does not define the target relation. Instead, it finds the frequent queries that include the target relation. It is then possible to extract association rules having the target relation in the head according to confidence criteria. The target relation is defined as the key relation in WARMR. At the beginning of the WARMR algorithm there are three sets: candidate queries (Q), frequent queries (F) and infrequent queries (I). Q is initialized to contain the key predicate; F and I are initialized as empty sets. At the first level, the specializations of the item in Q are generated according to the language bias (warmode, which is similar to the mode declaration in PROGOL) and put into the current Q set. The frequency values of the items in Q are then evaluated; infrequent items are put into I and frequent items into F. At the next level, the Q set is generated according to the previous contents of the Q, F and I sets. The language bias (warmode) defines the types and modes of the parameters of the predicates. The user can define the warmode in the settings file in the input data.
LANGUAGE BIAS
ILP algorithms usually use one of the following relational languages:
- general clauses language,
- Horn clauses language.
Construction of the hypothesis in these language frameworks is not always possible, for the following reasons:
- the hypothesis space is huge and/or complex,
- the language used is not expressive enough.
To solve this problem, two types of bias are used, where a bias is a mechanism employed by a learner to constrain the search for hypotheses:
- language bias determines the search space itself,
- search bias determines how the hypothesis space is searched.
TABLE I: Language bias in ILP-based systems
Language bias columns: mode declarations (input/output) and types of the predicates' arguments; function-free clauses; ground clauses; ground literals; non-recursive clauses; Horn clauses
System | Restricted languages (non-empty cells, in column order)
LINUS | LE; LH; LH
GOLEM | LE, LB; LH
CIGOL | LE; LH; LH
MIS | LE, LH; LE; LB; LH
FOIL | LE, LH; LE; LB; LH
PROGOL | LE, LH; LE; LB; LH
WARMR | LE, LH; LE; LB; LH, LB, LE
TABLE II: Comparison of ILP-based systems
System | Search Direction | Basic Techniques | Use of mode | Use of negative | Allow recursive
LINUS | Top-down | Attribute-Value Learning | No | No | No
GOLEM | Bottom-up | Rlgg | Yes | Yes | No
CIGOL | Bottom-up | Inverse Resolution | No | No | No
MIS | Top-down | Refinement Graph | Yes | Yes | Yes
FOIL | Top-down | Refinement Graph | Yes | Yes | Yes
PROGOL | Top-down | Inverse Entailment | Yes | Yes | Yes
WARMR | Top-down | APRIORI | Yes | Yes | Yes
There are two categories of language bias:
- the syntactic restrictions of the selected logic formalism,
- the vocabulary of predicate, function, variable and constant symbols: function-free clauses, ground clauses (i.e. without variables), non-recursive clauses, mode declarations (input/output) of the predicates' arguments.
To represent examples, hypotheses and background knowledge (BK) in the learning task, an examples language (LE), a hypotheses language (LH) and a BK language (LB) are used.
Each of the language restrictions mentioned above can be applied to each of these languages independently, or to all of them together (Table I).
All ILP systems use some language bias. Mode declarations and the learning of non-recursive clauses are necessary for narrowing the search in the hypothesis space, but other language restrictions are imposed by the theory. For example, such a hypothesis does not exist in the general case when both the set of examples and the BK set consist of Horn clauses.
COMPARISON OF ILP-BASED SYSTEMS
In this section, we compare the aforementioned ILP systems in terms of the basic techniques they employ, the availability of basic features in concept discovery and their concept discovery performance in different domains. The comparison includes the search direction, the use of mode declarations, the use of negative data and the handling of recursive rules.
The search direction of a system is either top-down or bottom-up. In top-down systems, the search starts with a general clause and at each turn makes the clause more specific until it covers no negative examples. In bottom-up systems, on the other hand, the search starts with a specialized clause and at each turn generalizes it so that it covers more positive examples, until no more improvement is possible.
ACCURACY AND TIME CHARACTERISTICS
The following characteristics were measured in the classical chess endgame domain White King and Rook versus Black King. The results of the experiment are presented in Table III: the classification accuracy is given as the percentage of correctly classified test instances and by the standard deviation (sd), averaged over 5 experiments.
TABLE III: Experimental comparison of ILP-based systems
System | Accuracy (100 training examples) | Time (100 training examples) | Accuracy (1000 training examples) | Time (1000 training examples)
CIGOL | 77.2% | 21.5 h | N/A | N/A
FOIL | 90.8% | 31.6 s | 99.4% | 4.0 min
LINUS | 98.1% | 55.0 s | 99.7% | 9.6 min
PROGOL | 95.3% | 53.9 s | 99.6% | 8.3 min
Although LINUS is more accurate than the other algorithms on both the small and the large training sets, it has one major disadvantage: it does not provide features for handling BK. Of the remaining algorithms, PROGOL has the best accuracy, but it is slower than FOIL.
CONCLUSION
This work presents a review of ILP-based concept discovery systems through illustrations of their working mechanisms. The well-known systems LINUS, GOLEM, CIGOL, MIS, FOIL, PROGOL, ALEPH and WARMR are covered in this comparative study. The comparison is based on the basic characteristics of the systems. Although the search strategies of FOIL and its family of algorithms make them very efficient, they have a considerable disadvantage: during the search process, these algorithms can sometimes prune the hypotheses being sought.
If the target concept is not clear enough, using WARMR is a wise choice in order to see the various frequent clauses hidden in the data. If the user is familiar with mode declarations and with search and evaluation mechanisms, PROGOL and ALEPH are suitable choices. The user may try different mode declarations, search and evaluation mechanisms to find the settings and rule structures that best identify the target concept. Many inverse resolution algorithms extend the concept description language by constructing predictor descriptors (i.e., predicates), but are either limited to deduction or require an oracle to maintain reasonable efficiency.
ACKNOWLEDGMENT
We are thankful to the great GOD for making us able to do something.
This research is part of the Ph.D. programme in Computer Science & Engineering, Karpagam University, India.
REFERENCES
[1] Arno J. Knobbe, Arno Siebes, Hendrik Blockeel, Daniël van der Wallen, Multi-Relational Data Mining, using UML for ILP, PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, Springer-Verlag, London, UK, 2000.
[2] Yusuf Kavurucu, Pinar Senkul, Ismail Hakki Toroslu, Aggregation in Confidence-Based Concept Discovery for Multi-Relational Data Mining, IADIS European Conference Data Mining 2008.
[3] Raymond J. Mooney, Prem Melville, Lappoon Rupert Tang, Relational Data Mining with Inductive Logic Programming for Link Discovery, National Science Foundation Workshop on Next Generation Data Mining, Nov. 2002, Baltimore, MD.
[4] Yusuf Kavurucu, Pinar Senkul, Ismail Hakki Toroslu, Confidence-based Concept Discovery in Multi-Relational Data Mining, International MultiConference of Engineers and Computer Scientists 2008, Vol. I, IMECS 2008, 19-21 March, 2008, Hong Kong.
[5] Lappoon R. Tang, Raymond J. Mooney, and Prem Melville, Scaling Up ILP to Large Examples: Results on Link Discovery for Counter-Terrorism, KDD-2003 Workshop on Multi-Relational Data Mining (MRDM-2003), pp. 107-121, Washington DC, August 2003.
[6] Seda Daglar Toprak, Pinar Senkul, A New ILP-based Concept Discovery Method for Business Intelligence, IEEE, 2007.
[7] Alexessander Alves, Rui Camacho and Eugenio Oliveira, Discovery of Functional Relationships in Multi-relational Data using Inductive Logic Programming, Fourth IEEE International Conference on Data Mining (ICDM 04).
[8] Dzeroski, S., Lavrac, N., eds., Relational Data Mining, Berlin: Springer, 2001.
[9] Han, J., Kamber, M., Data Mining: Concepts and Techniques, 2nd Edition, Morgan Kaufmann, 2007.
[10] Wrobel, S., Inductive Logic Programming for Knowledge Discovery in Databases: Relational Data Mining, Berlin: Springer, pp. 74-101, 2001.
[11] I. Bratko, Prolog Programming for Artificial Intelligence, 3rd edition, Addison-Wesley, Harlow, England, 2001.
[12] W. Buntine, Generalized subsumption and its applications to induction and redundancy, Artificial Intelligence, 36(2): 149-176, 1988.
[13] L. De Raedt, editor, Advances in Inductive Logic Programming, IOS Press, Amsterdam, 1996.
AUTHORS PROFILE
Prof. Vimalkumar B Vaghela is currently pursuing his Ph.D. in Computer Science & Engineering at Karpagam University, India. He was named a Young Scientist by Who's Who in Science & Engineering 2010-2011, and his biography was included by the American Biographical Institute in 2011. His publications are also available in IEEE Xplore and in the Scopus online database. He is currently working as an Assistant Professor in the Computer Engineering Department at L. D. College of Engineering, Ahmedabad, Gujarat, India. He received the B.E. degree in Computer Engineering from C. U. Shah College of Engineering and Technology in 2002 and the M.E. degree in Computer Engineering from Dharmsinh Desai University in 2009. His research areas are Relational Data Mining, Ensemble Classifiers and Pattern Mining. He has published or presented more than 5 international papers and 5 national papers.
Dr. Kalpesh H Vandra received the B.E. degree in Electronics & Communication from North Gujarat University, Patan, in 1995, the M.E. degree in Microprocessor System Applications from M.S. University, Vadodara, in 1999, and the PhD from Saurashtra University, Rajkot, in 2009. He is working as Director Academic and Head of Computer Engineering & Information Technology at C. U. Shah College of Engineering & Technology, Wadhwan City, Gujarat, India. He has more than 15 years of UG and 6 years of PG (MCA and M.E.) teaching experience. He has written more than 10 books in computer and IT related areas. He has published or presented 10 international and 7 national research papers and has guided the dissertation work of more than 10 PG students and more than 155 UG students. Dr. K H. Wandra is Chairman of the Board of Studies, Computer Engineering, at Saurashtra University, Rajkot, and a Core Committee member for the UG and PG syllabus at Gujarat Technology University, Ahmedabad. He is a Section Managing Committee Member of the ISTE Gujarat Section, a life member of ISTE, and a member of IEEE and CSI. His areas of interest are Wireless Communication, Networks, Advanced Microprocessors and Data Mining.
Dr. Nilesh K Modi received the MCA degree from A.M.P. Institute of Computer Studies, Kherva, Gujarat, India, and the PhD from Bhavnagar University in 2006. He is working as Professor & Head of the MCA department at Sarva Vidyalaya's Institute of Computer Studies, S V Campus, Kadi, Gujarat, India. He is an Associate Life Member of the Computer Society of India (CSI), Mumbai; a Senior Associate Member of the International Association of Computer Science and Information Technology (IACSIT), Singapore; a Senior Member of the International Association of Engineers (IAEng), Hong Kong; a Senior Member of the Computer Security Institute, New York; and a Member of the Data Security Council of India (DSCI), a NASSCOM initiative, New Delhi. He has published or presented more than 18 international papers and more than 25 national papers. His areas of research interest are data mining, computer networks and information security.