Recommendation Framework using Pattern Searching Mechanism, HD Technique, Social Data

Saurabh R. Deshpande; Jyoti R. Yemul

doi:10.17577/IJERTV3IS110199

Volume 03, Issue 11 (November 2014)

Authors : Saurabh R. Deshpande, Jyoti R. Yemul
Paper ID : IJERTV3IS110199
Volume & Issue : Volume 03, Issue 11 (November 2014)
Published (First Online): 04-11-2014
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Mr. Saurabh R. Deshpande

Department of Information Technology, Sinhgad Technical Education Society's SKNCOE,

Pune, India

Prof. Jyoti R. Yemul

Department of Information Technology, Sinhgad Technical Education Society's SKNCOE,

Pune, India

Abstract: Data on the web is growing exponentially, and internet users depend on it for day-to-day activities such as internet banking and shopping. For E-Commerce businesses to grow, accurate recommendation of product suites is necessary to attract new customers and retain existing ones. Existing recommendation techniques are typically based on Collaborative Filtering, which depends on rating data that is unavailable in most cases. As a result, the recommendations generated are less accurate. In this paper, a new recommendation technique is used to increase the accuracy of the recommendations generated. Although the accuracy and the time taken to generate recommendations vary with the searched query, accuracy is improved by approximately 50-55% in the proposed system.

Keywords: Collaborative Filtering, Data Mining, E-Commerce, Recommendations

INTRODUCTION

We are living in the information age; data is generated at an exponential rate every day, and extracting knowledge from that data is becoming a challenge. Web data mining is a technology that aims to discover interesting patterns and knowledge in large amounts of data. Customers use E-commerce websites for online shopping, and the resulting data contains insight into customer behavior, likes, dislikes, preferences and priorities. Recommendations have gained immense importance for improving customer experience and growing business.

Currently available recommendation systems have reached their limits in terms of accuracy because of their dependency on user-item rating data, which is unavailable most of the time. The proposed methodology suggests a new recommendation technique that uses frequent pattern matching and the Heat-Diffusion algorithm, along with a social media dataset, to increase the accuracy of the recommendations generated.

In this paper, Section 1 contains basic information about the domain and introduces the problem and its solution. Section 2 contains an overview of existing systems. Section 3 describes the system architecture. Section 4 explains the proposed system. Section 5 describes the implementation. Section 6 contains the results. Section 7 presents the conclusion and future work.
RELATED WORK
1. Collaborative Filtering (CF)
  
  Collaborative Filtering is a technique used by many recommendation engines. Neighborhood-based and model-based are the two approaches to collaborative filtering. [2]
  Neighborhood-based approaches are applied for predictions and are widely used in commercial CF systems. Examples include the user-based approach and the item-based approach. The user-based approach analyzes the rating data of similar users and generates recommendations for the active user on that basis. In the item-based approach, items rated by the active user are analyzed and recommendations are generated from them. [1]
  In the model-based approach, a model is trained on datasets beforehand and predictions are made using the trained model. An example of the model-based approach is the clustering model of Google News: news categories such as business, sports and technology are predicted using the words in the news heading. The words in the heading are applied to the already trained model and the news item is placed in the respective cluster. As more and more content is added to a cluster, the model becomes smarter and recommends more accurately. [3][5]
  Recommendations based on CF techniques depend on a rating matrix containing user-specific ratings. However, rating data is unavailable most of the time, as information on the Web is less structured and more diverse. Also, for the rating data that is available, the question of veracity cannot be addressed. [8][9]
2. Query Suggestion
  
  Query suggestion is a valuable technique for recommending relevant queries to the user. Users search for information by providing queries to search engines; if the search engine is able to predict what the user may want to search for, it will save much time and predictions will be made accurately. [11]
  The model enriches itself as more and more queries are added to it, and it learns on its own from customer behavior. [4]
3. Image Recommendation
  
  Image recommendation is another interesting and widely used recommendation application on the web. Usually, such systems ask the active user to rate some images from different categories to find the user's likes and preferences, and based on the rating data they then display similar images that the user is more likely to like. [9]
  In general, no matter what the type of source data is, the proposed recommendation framework can be applied to most recommendation tasks on the web, giving more relevant results.
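The user-based neighborhood approach described under Collaborative Filtering can be illustrated with a minimal sketch. The toy ratings, the function names and the choice of cosine similarity over co-rated items are assumptions made for illustration; they are not taken from the cited CF systems. The sketch also makes the limitation discussed above visible: without rating data there is nothing to compute.

```python
from math import sqrt

# Hypothetical toy rating data (user -> {item: rating}); not from the paper.
ratings = {
    "alice": {"phone": 5, "laptop": 3, "camera": 4},
    "bob": {"phone": 4, "laptop": 3, "tablet": 5},
    "carol": {"laptop": 1, "camera": 2, "tablet": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def recommend(active, ratings, top_k=3):
    """Predict ratings for items the active user has not rated,
    as a similarity-weighted average of other users' ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == active:
            continue
        sim = cosine_similarity(ratings[active], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[active]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Calling `recommend("alice", ratings)` returns the unseen items ranked by predicted rating; here only `tablet` qualifies, since it is the one item rated by similar users but not by the active user.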
SYSTEM ARCHITECTURE

In the new system architecture, a novel scheme to generate recommendations is proposed. The DFS algorithm is used to traverse the web graph data, the Apriori algorithm is used to decrease time complexity, and social data is used to increase the accuracy of the system. [1][12]
Fig. 1. Architecture of proposed system
Figure 1 shows the architecture flow; the system is implemented accordingly.

PROPOSED SYSTEM

The proposed system for generating recommendations can be used as a general framework for recommendation. In this paper, it is demonstrated, and results are calculated, on a sample dataset.

The new system uses DFS and the Apriori algorithm to extract the subgraph related to the searched query, and heat values are calculated for each query-URL relationship.
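The DFS-based subgraph extraction can be sketched as follows. This is a minimal illustration, assuming an undirected query-URL adjacency structure; the edge list, `build_adjacency` and `dfs_subgraph` are hypothetical names and data, not the paper's implementation.

```python
from collections import defaultdict

# Hypothetical query-URL click edges; two queries share one URL.
edges = [
    ("iphone", "apple.com/iphone"),
    ("apple", "apple.com/iphone"),
    ("apple", "apple.com"),
    ("google", "google.com"),
    ("google", "gmail.com"),
]


def build_adjacency(edges):
    """Undirected adjacency: each query links to its URLs and vice versa."""
    adjacency = defaultdict(set)
    for query, url in edges:
        adjacency[query].add(url)
        adjacency[url].add(query)
    return adjacency


def dfs_subgraph(adjacency, start):
    """Return every node reachable from `start` using an iterative depth-first search."""
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        stack.extend(adjacency.get(node, ()))
    return visited


subgraph_nodes = dfs_subgraph(build_adjacency(edges), "iphone")
```

Starting from the query `iphone`, the search reaches the query `apple` through the shared URL, while the unrelated `google` component is left out. Only this smaller subgraph then needs heat values.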

Heat Diffusion

Heat diffusion is a physical phenomenon: in a physical medium, heat always flows from a position with higher temperature to one with lower temperature. In the same way, queries are interrelated with each other in some way or the other. [1]
Graph Diffusion

A graph is the data structure best suited to web data, as relationships between nodes are strongly established using graphs. The heat diffusion technique can be applied to the nodes of such a graph, modeling information propagation on web graphs. [12]
Random Jump

According to the heat diffusion technique, heat can only propagate through the links that connect nodes in a graph. In practice, however, random relations do exist even between nodes that are not directly connected. The random jump technique is used to capture such relations. [1]
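Heat diffusion with a random jump can be illustrated on a tiny directed graph. The sketch below is written under stated assumptions, since the paper does not spell out the diffusion matrix: each node with out-links loses heat at unit rate and splits it equally among its successors (matrix `H`), the random jump mixes in a uniform term with weight `1 - gamma`, and the result is `e^(alpha * R)` applied to the initial heat vector. The parameters `alpha`, `gamma` and `terms`, and all function names, are hypothetical. The matrix exponential is approximated with a truncated Taylor series so that no external library is needed.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def diffusion_matrix(n, edges, gamma=0.85):
    """Build R = gamma * H + (1 - gamma) / n for a directed graph.

    H[dst][src] = 1 / out_degree(src) for each edge src -> dst, and
    H[i][i] = -1 when node i has out-links (assumed formulation).
    """
    out_degree = [0] * n
    for src, _ in edges:
        out_degree[src] += 1
    H = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        H[dst][src] += 1.0 / out_degree[src]
    for i in range(n):
        if out_degree[i] > 0:
            H[i][i] -= 1.0
    jump = (1.0 - gamma) / n  # uniform random-jump term
    return [[gamma * H[i][j] + jump for j in range(n)] for i in range(n)]


def diffuse(R, f0, alpha=1.0, terms=30):
    """Approximate f(1) = exp(alpha * R) f(0) with a truncated Taylor series."""
    result = list(f0)
    term = list(f0)
    for k in range(1, terms + 1):
        term = [alpha / k * x for x in mat_vec(R, term)]
        result = [r + t for r, t in zip(result, term)]
    return result


# Chain 0 -> 1 -> 2, with node 3 disconnected; all heat starts at node 0.
edges = [(0, 1), (1, 2)]
f0 = [1.0, 0.0, 0.0, 0.0]
no_jump = diffuse(diffusion_matrix(4, edges, gamma=1.0), f0)
with_jump = diffuse(diffusion_matrix(4, edges, gamma=0.85), f0)
```

With `gamma=1.0` heat travels only along links, so the disconnected node 3 stays at zero. With `gamma=0.85` node 3 receives a small amount of heat, which is the effect the random jump is meant to capture.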

Building Graph for Recommendation

To build the generalised recommendation framework, a sample query-URL dataset is used. A relationship is established between searched queries and clicked URLs. The sample dataset is shown in Table I.

TABLE I. SAMPLE DATASET

ID	QUERY	URL	RANK
368	TWITTER	HTTP://WWW.FACEBOOK.COM	3
368	TWITTER	HTTP://EN.WIKIPEDIA.ORG/WIKI/TWITTER	1
1248	IPHONE	HTTP://WWW.APPLE.COM/IPHONE	4
1248	IPHONE	HTTP://WWW.YOUTUBE.COM/WATCH?P=OFXXG	3
2598	GOOGLE	HTTP://WWW.GOOGLE.COM	6
2598	GOOGLE	HTTP://WWW.GMAIL.COM	8
2598	GOOGLE	HTTP://WWW.YOUTUBE.COM	7
543	MOBILE	WWW.SNAPDEAL.COM/PRODUCTS/MOBILESPHONES	2
543	MOBILE	GADGETS.NDTV.COM/MOBILES/ALL-BRANDS	6

This sample dataset is now represented as a graph data structure. Queries and URLs are the nodes of a bipartite graph, and an edge from query q to URL u exists if a user has clicked URL u after issuing query q.

Bipartite graph $B_{ql} = (V_{ql}, E_{ql})$, where $V_{ql} = Q \cup L$, $Q = \{q_1, q_2, \ldots, q_n\}$, $L = \{l_1, l_2, \ldots, l_m\}$, and $E_{ql} = \{(q_i, l_j) \mid \text{there is an edge from } q_i \text{ to } l_j\}$.
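As an illustration, the rows of the sample dataset can be turned into the query set Q, the URL set L and the edge set of the bipartite graph. This is a minimal sketch; the tuple layout and the function name `build_bipartite` are assumptions made for the example.

```python
# Rows of the sample dataset as (id, query, url, rank) tuples.
rows = [
    (368, "TWITTER", "HTTP://WWW.FACEBOOK.COM", 3),
    (368, "TWITTER", "HTTP://EN.WIKIPEDIA.ORG/WIKI/TWITTER", 1),
    (1248, "IPHONE", "HTTP://WWW.APPLE.COM/IPHONE", 4),
    (1248, "IPHONE", "HTTP://WWW.YOUTUBE.COM/WATCH?P=OFXXG", 3),
    (2598, "GOOGLE", "HTTP://WWW.GOOGLE.COM", 6),
    (2598, "GOOGLE", "HTTP://WWW.GMAIL.COM", 8),
    (2598, "GOOGLE", "HTTP://WWW.YOUTUBE.COM", 7),
    (543, "MOBILE", "WWW.SNAPDEAL.COM/PRODUCTS/MOBILESPHONES", 2),
    (543, "MOBILE", "GADGETS.NDTV.COM/MOBILES/ALL-BRANDS", 6),
]


def build_bipartite(rows):
    """Return (queries, urls, edges): the two node sets and the query-URL edge set."""
    queries = sorted({query for _, query, _, _ in rows})
    urls = sorted({url for _, _, url, _ in rows})
    edges = {(query, url) for _, query, url, _ in rows}
    return queries, urls, edges


queries, urls, edges = build_bipartite(rows)
```

Every edge joins one element of `queries` to one element of `urls`, which is what makes the graph bipartite.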

Fig. 2. Undirected query-URL bipartite graph

Undirected graphs cannot be processed directly with heat diffusion, as the results cannot be interpreted accurately. Hence the undirected bipartite graph is converted into a directed bipartite graph.

Fig. 3. Directed query-URL bipartite graph

The weight on a directed query-URL edge is normalized by the number of times that query is issued, while the weight on a directed URL-query edge is normalized by the number of times that URL is clicked. After the conversion of the graph, the query suggestion algorithm is designed. [1][12]

Query Suggestion Algorithm

1. A converted bipartite graph $G = (V^+ \cup V^*, E)$ consists of the query set $V^+$ and the URL set $V^*$.
2. Given a query q in $V^+$, a subgraph is constructed by using depth-first search in G.
3. The subgraph is given to the Apriori algorithm to retrieve frequently appearing queries and clicked URLs.
4. Start the diffusion process using $f(1) = e^{R} f(0)$.
5. The Top-K queries with the largest values in f(1) are displayed as suggestions. [12]
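The edge weighting and steps 3 and 5 above can be sketched in a few lines. This is an illustration under assumptions, not the paper's implementation: the click log is hypothetical, each click record is counted as one issue of its query, the frequent-pattern step is reduced to the support counting that underlies Apriori (a single pass over query-URL pairs), and the function names are invented for the example.

```python
from collections import Counter

# Hypothetical click log of (query, clicked URL) pairs; not the paper's dataset.
clicks = [
    ("iphone", "apple.com/iphone"),
    ("iphone", "apple.com/iphone"),
    ("iphone", "youtube.com"),
    ("apple", "apple.com/iphone"),
    ("apple", "apple.com"),
]


def frequent_pairs(clicks, min_support=2):
    """Keep query-URL pairs that occur at least `min_support` times."""
    counts = Counter(clicks)
    return {pair: c for pair, c in counts.items() if c >= min_support}


def normalized_weights(pair_counts):
    """Directed edge weights: query->URL normalized by the query's total count,
    URL->query normalized by the URL's total click count."""
    query_total, url_total = Counter(), Counter()
    for (query, url), c in pair_counts.items():
        query_total[query] += c
        url_total[url] += c
    query_to_url = {(q, u): c / query_total[q] for (q, u), c in pair_counts.items()}
    url_to_query = {(u, q): c / url_total[u] for (q, u), c in pair_counts.items()}
    return query_to_url, url_to_query


def top_k(heat, k, exclude=()):
    """Return the k nodes with the largest heat values, skipping `exclude`."""
    ranked = sorted(heat.items(), key=lambda kv: kv[1], reverse=True)
    return [node for node, _ in ranked if node not in exclude][:k]


query_to_url, url_to_query = normalized_weights(Counter(clicks))
```

Here `exclude` would typically hold the issued query itself, so that a query is not suggested back to the user who typed it.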

IMPLEMENTATION

The proposed system is implemented as a browser-based application with a username and password authentication mechanism.
1. Procedural Steps
  1. U is the set of users, U = {u1, u2, u3}
  2. D is the set of data, D = {d1, d2}
    
    d1 = {q, u}, q = query, u = clicked URL for the query
    
    d2 = {u1, u2, i}, u1 = user, u2 = user related to u1 & i = related entity of u1
  3. Q is the main set of entered queries, Q = {q1, q2, q3}
  4. SYS = {DX, DF, AP, BG, HD, HV}
    
    DX = Data Extractor, which extracts the data from the dataset
    
    DF = searches for the query in the database using DFS
    
    AP = filters the results of DFS using the Apriori algorithm
    
    BG = generates the bipartite graph by treating each query and URL as a node
    
    HD = Heat Diffusion, which finds the H-D matrix for the query and the H-D with Random Jump matrix
    
    HV = Heat Vector, which suggests the final recommendations for the given query in a particular order
  5. P is the set of processes, P = {P1, P2, P3, P4}
    
    P1 = {e1, e2}
    
    where,
    
    e1 = {i | i designs the database from the dataset}
    
    e2 = {j | j shows all click-through data from the database}
    
    P2 = {e1, e2, e3, e4}
    
    where,
    
    e1 = {i | i takes the query from the user}
    
    e2 = {i | i searches for the query using DFS}
    
    e3 = {j | j filters the results of DFS using Apriori}
    
    e4 = {j | j generates the directed bipartite graph}
  6. Graph G = {E, V}, where V = {v1, v2, v3} is the set of vertices and E = {(v1, v2), (v2, v3)} is the set of edges
  7. P3 = {e1, e2}
    
    where,
    
    e1 = {i | i finds the similarity using information propagation on Web graphs}
    
    e2 = {j | j finds the H-D matrix for the query and the H-D with Random Jump matrix}
    
    Fi(t) = heat at node Vi at time t
  8. P4 = {e1, e2}
    
    where,
    
    e1 = {i | i finds the Heat Vector}
    
    e2 = {j | j suggests the final recommendations on the basis of the Heat Vector, in the given order}
    
    Fi(t) = heat at node Vi at time t
RESULTS & DISCUSSIONS

Experimental evaluation of the recommendation framework is given in this section. In the proposed system, evaluation is done on a sample dataset, and results are shown by issuing a query aa. Time and accuracy are the parameters targeted for comparison.

Fig. 4. Graph comparing results of query suggestions with and without the Apriori algorithm

Fig. 5. Graph comparing results of social recommendation with and without the Apriori algorithm

Fig. 6. Graph comparing results of query suggestions on the sample social media dataset without the Apriori algorithm

Fig. 7. Graph comparing results of query suggestions on the sample social media dataset with the Apriori algorithm

Fig. 8. Graph comparing results of query suggestions on the query-URL dataset and the sample social media dataset, with and without the Apriori algorithm

As the results in the above graphs show, the inclusion of the Apriori algorithm significantly reduces the time taken to generate query recommendations. Also, by considering the social media dataset, recommendations include various sources that previously existed only in the form of text. The overall performance of the system is improved by 60-65% in terms of time and accuracy.
CONCLUSION AND FUTURE SCOPE

The existing system uses the collaborative filtering technique to generate recommendations. This technique uses user-item rating data, which is unavailable most of the time, and this affects the accuracy of the system. In this paper, a general framework for recommendation using DFS, the Apriori algorithm, the HD technique and social media data is proposed that reduces the time taken to generate recommendations and increases their accuracy. Due to this, the overall performance of the system is improved by 60-65%.

In future, the proposed system can be built using new-generation big data technology, adding more features to the UI such as query suggestion and event processing.

ACKNOWLEDGEMENT

The authors would like to thank all the unknown reviewers for their valuable comments and suggestions.

REFERENCES

[1] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, Item-Based Collaborative Filtering Recommendation Algorithm, ACM, May 2001
[2] H. Cao, D. Jiang, J. Pei, Q. He, Z. Liao, E. Chen, and H. Li, Context-Aware Query Suggestion by Mining Click-Through and Session Data, ACM KDD, 2008
[3] H. Cui, Ji-Rong Wen, Jian-Yun Nie, and Wei-Ying Ma, Query Expansion by using user logs, IEEE, July/August 2003
[4] H. Ma, I. King, and M. Rung-Tsong Lyu, Mining Web Graphs for Recommendations, IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, June 2012
[5] H. Mase and H. Ohwada, A Collaborative Filtering Incorporating Hybrid-Clustering Technology, ICSAI 2012
[6] H. Ma, H. Yang, M. R. Lyu, and I. King, Mining Social Networks Using Heat Diffusion Processes for Marketing Candidates Selection, ACM, October 2008
[7] H. Ma, H. Yang, I. King, and M. Rung-Tsong Lyu, SoRec: Social Recommendation Using Probabilistic Matrix Factorization
[8] J. Yu, K. Xie, H. Zhao, and F. Liu, Prediction of User Interest Based on Collaborative Filtering for Personalized Academic Recommendation, IEEE, 2012
[9] N. Abdullah and Y. Shlomo Geva, Integrating Collaborative Filtering and Search-based Techniques for Personalized Online Product Recommendation, 11th IEEE International Conference on Data Mining, 2011
[10] N. Craswell and M. Szummer, Random Walks on the Click Graph, ACM, 2007
[11] S. Hui, Lu. Pengyu, and Z. Kai, Improving Item-Based Collaborative Filtering Recommendation System with Tag, IEEE, 2012
[12] Saurabh R. Deshpande and Jyoti R. Yemul, Web Graph Recommendation Technique using Heat Diffusion Method and Apriori Algorithm, Cyber Times International Journal of Technology & Management, vol. 7, Issue 1, October 2013-March 2014
[13] W. Xia, L. He, J. Gu, K. He, and L. Ren, Boosting Collaborative Filtering Based on Missing Data Imputation Using Items Genre Information, IEEE, 2009
