Enhancing User Experience using Machine Learning

DOI : 10.17577/IJERTV7IS020132

Sumaiya PK

Associate Developer,

SAP Labs Whitefield, Bangalore, Karnataka, India

Abstract: Many business solutions are available, but what differentiates the best products is customer satisfaction, the key factor in any business. By performing User Activity Tracking, we ensure that honest feedback from customers is passed on to the developers, which helps them analyze the difficulties that customers face. Today, feedback is collected through intermediaries: customers share feedback with the consultants, consultants share their understanding with the lead, and the lead passes it to the product owner, who then shares it with the developers. In such a process, the customer feedback is prone to modification based on each stakeholder's understanding, which sometimes leads to an issue radically different from what the customer had stated. Maintaining the sanctity of customer feedback is imperative, as it allows the developers to identify the features that would enhance and enrich the User Experience. We can achieve a lossless capture of user feedback with the help of Clickstream. Clickstream will capture all the events triggered along with the time of each event. The time difference between events will be plotted, and a peak in the graph would indicate a potential pain point for the customer. This instant data retrieval can also help us reduce the number of customer tickets that we encounter, as we would know exactly where there is an issue and can start working on it even before the incident is raised. The huge amount of user activity tracking data can be fed into a Machine Learning algorithm, which would help us analyze patterns of normal user behavior and abnormal user behavior (when there is a User Experience issue). We can't rely on a single set of rules to ascertain that the user is facing an issue: every application is different, features are different, customers are from different locations, etc. Hence a simple rule-based algorithm will not be helpful. We need a Machine Learning algorithm to find a lasting solution to the problem.

Index Terms: Clickstream, Machine Learning, MongoDB, User Activity Tracking.

  1. INTRODUCTION

    Any business solution is considered good when it solves the exact problem it targets. However, there are innumerable solutions available in the market. What differentiates the best solution from the rest is the way the user feels about it. The end user is the one who uses the entire solution, which is available in the form of websites, web applications or on-premise solutions. Hence how the user feels about the solution, which in simple terms can be called User Experience, is the key factor to be considered. Any business solution, be it a web application, a website or a mobile app, always depends on ongoing user inputs. If the user finds any difficulty in understanding and using the solution itself, then the productivity of the user is hampered. A task which can be completed in 5 hours might end up taking up to 7 hours. This causes the end user to be frustrated and unhappy while using the application. When websites, web applications and mobile applications provide an easy and self-explanatory User Interface, they provide a better user experience, leading to happy customers. Customer satisfaction is what determines which company's products are best.

  2. COMMON USABILITY ISSUES

    1. Lack of appropriate name for UI Element

      This is one of the biggest issues faced in terms of usability. Many times the name given to a UI element is not self-explanatory. From a development point of view, there is a tendency to make the UI fancy and use fancy vocabulary. But the problem is that we deal with a variety of customers spread over different regions, consisting of users of different age groups and different knowledge levels. Thus, analyzing which naming format is appropriate is quite difficult. Machine learning can help us analyze and decide on a better naming convention [8].

    2. Self-Understanding

      The UI elements should be clearly visible, well spaced and self-explanatory. The icons, colors, fonts, etc. should be connected to the real world so that they are easy to grasp [10]. In many cases, the UI is made complicated, thereby demanding that users invest a lot of time to understand the flow and find the UI elements on the screen.

    3. Varying perceptions of developer and customer

      It is an extremely well-known fact that the understanding level of a beginner and an experienced person is different. We are familiar with things that we use very often. Similarly, when developers work on an app, they are familiar with the terms they use in the UI, but relying on such knowledge introduces a bias into the application that works against the user. Customers use different apps at different times. E.g., the HR department of any company will use products for different purposes, like payroll, leave requests, etc. In such a case, one user is using different applications, and hence the terminology will be different. The UI of a Finance app will look entirely different compared to Payroll. The challenge here will be to develop both in such a way that they are self-explanatory. Hence, we need to know the customers' level of understanding to develop user-friendly apps.

    4. Quick Turnaround time

      We all understand the quote "Time is money". A user expects issues to be resolved on time, but with multiple stakeholders involved, such as the Project Manager, User Representative(s), Developer(s) and Support team, the time available for response is depleted [9]. A significant amount of time is spent by the intermediaries in identifying the issue through mails and calls before it is fixed.

    The amount of data available

    Note: The data volume is calculated based on certain assumptions.

    Assuming 10,000 users of one application, where one application requires around 200 clicks (user activities) per user in one day:

      10,000 * 200 = 2,000,000 data points in one day
      2,000,000 * 30 = 60,000,000 data points in one month

    Just in ERP there are around 12 major fields; considering each field as one app:

      60,000,000 * 12 = 720,000,000 data points in one month
      720,000,000 * 12 = 8,640,000,000 data points in one year
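The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the constant names are illustrative and the inputs are the assumptions stated in the note (10,000 users, 200 clicks per user per day, 30 days per month, 12 apps).

```python
# Assumed inputs, taken from the estimate in the text (constant names are illustrative).
USERS_PER_APP = 10_000
CLICKS_PER_USER_PER_DAY = 200
DAYS_PER_MONTH = 30
APPS_IN_ERP = 12
MONTHS_PER_YEAR = 12

per_day = USERS_PER_APP * CLICKS_PER_USER_PER_DAY    # one app, one day
per_month = per_day * DAYS_PER_MONTH                 # one app, one month
per_month_all_apps = per_month * APPS_IN_ERP         # all 12 apps, one month
per_year_all_apps = per_month_all_apps * MONTHS_PER_YEAR

print(f"{per_day:,} data points per day")
print(f"{per_month:,} data points per month (one app)")
print(f"{per_month_all_apps:,} data points per month (12 apps)")
print(f"{per_year_all_apps:,} data points per year (12 apps)")
```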

    The process of user feedback involves many mediators:

    Customer → Customer Lead → Customer Manager → Consultant → Consultant Lead → Consultant Manager → Product Owner → Developer Team Lead → Developer

    In this process of feedback everyone explains their understanding to the next level. In the process, the actual feedback might get corrupted or modified.

    Hence the suggestion here is to improve the user experience based on data collected at the source. While the customer is using the app, all the activities are recorded in the background and fed to a Machine Learning algorithm. The machine learns from the data fed to it and helps developers identify any unusual behavior in the data. E.g., consider a user working through a list of tasks in an application, where each task needs a certain amount of time to complete. When there is an issue, the time taken will differ markedly from the normal pattern. The Machine Learning algorithm will help us identify this outlier and show its cause.

    Note: Any observation that appears to be inconsistent when compared to the remaining data set available is termed as an Outlier.

  3. INDEX TERMS ELABORATED

    1. Clickstream

      The process of recording and analyzing the user actions or mouse movements in any web browser or software application is called Clickstream. The cumulative data will contain details of the path the user takes in the browser/application and in what order. When the user clicks or types anywhere in the web browser or software application, the user's activity will be recorded. The activity log will record which browser/application is loaded, the time required for loading, the page/view in the browser/application, the number of pages viewed, the frequency of the pages viewed, the different features used, user details, etc. This data will provide hints on the user's ease in using the browser/application and on where the user finds difficulty while using the app.

    2. Machine Learning

      Machine Learning (ML) involves the process of training the machine. In this process, we try to convert experience into knowledge. The machine is trained using large amounts of data, which act as the input. The input is fed to a training algorithm, which analyzes the data and recognizes patterns. The outcome is the knowledge [2].

    3. Why machine learning

      We have many users providing a huge amount of data. Every user exhibits different behavior in using the app. This User Experience is influenced by a wide range of behavioral parameters, such as emotional, experiential, affective and aesthetic ones [11]. It's not easy to identify the behavior of each user and analyze it only when the user is facing a difficulty in using the app. Users are spread all over the world, working in different time zones, with different levels of experience, different knowledge levels, etc. [1]

      Users are spread over a wide range. Even within the same continent, users belonging to different countries exhibit different behavior. This applies to different states in the same country and different cities in the same state. All these combinations are based on just one parameter, geographic location; we can include many other parameters such as experience level, requirement, type of database needed, etc. Hence we need a framework that is capable of adjusting to the ever-changing customer scenario. Machine learning algorithms can learn continuously and provide analysis.

    4. MongoDB

      MongoDB is a document-oriented database. The data is stored in JSON-style documents, and documents come together to form a Collection. Each document is independent of the others in terms of fields, content, size, etc. MongoDB is easy to scale, which makes it extensively used in Big Data use cases [6].
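As a rough illustration of JSON-style documents with differing fields, the sketch below models a collection as a plain Python list of dictionaries and round-trips it through the standard `json` module. It does not talk to a MongoDB server; all field names and values (`user_id`, `new_value_length`, `DemoApp`, the timestamps) are hypothetical.

```python
import json

# Two hypothetical clickstream documents. They share some fields, but the
# second carries an extra one: documents in a collection need not match.
press_event = {
    "user_id": "u-001",
    "app_name": "DemoApp",
    "control_id": "button0",
    "event": "press",
    "time": "2024-01-01T09:00:00",
}
change_event = {
    "user_id": "u-001",
    "app_name": "DemoApp",
    "control_id": "name_input",
    "event": "change",
    "time": "2024-01-01T09:00:02",
    "new_value_length": 6,  # store the length, not the text the user typed
}

# A list stands in for the collection in this sketch.
collection = [press_event, change_event]

# Each document serializes to JSON text and back without loss.
serialized = [json.dumps(doc) for doc in collection]
restored = [json.loads(text) for text in serialized]
print(serialized[1])
```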

    5. User activity tracking

      With the help of Clickstream, we can record the activities performed by the user. The data captured will include the URL, the application name, the pages viewed, the amount of time spent on different pages, and the text entered, edited or deleted in text boxes, checkboxes, etc. Any item in the UI can be captured.

      NOTE: The data will be captured adhering to the Data Privacy Rules as agreed in the Non-Disclosure Agreement (NDA).

      Fig. 1: Basic Architecture Diagram

      TABLE I: List of User Data that can be recorded

      User ID | App Name           | Page/View Name | Control ID               | Event      | Time
      --------|--------------------|----------------|--------------------------|------------|---------------------
      001     | CloudReportingDemo | page 0         | button0                  | press      | 2017-09-13 :14:07.19
      001     | CloudReportingDemo | page 0         | xmlview0–userid_input_id | liveChange | 2017-09-13 :14:08.21
      001     | CloudReportingDemo | page 0         | xmlview0–userid_input_id | Change     | 2017-09-13 :14:09.42
      001     | CloudReportingDemo | page 0         | button1                  | press      | 2017-09-13 :14:19.37
      001     | CloudReportingDemo | page 0         | xmlview1–userid_input_id | Change     | 2017-09-13 :14:20.33

      In this step, we input all the existing experience data, which is the training data for the machine. Making use of all the existing experience, the machine will be trained to analyze different patterns and perform a set of groupings.

      The scenario here demands the user activity data as the training dataset to perform the grouping. With the help of the existing user activity data, grouping will be performed by using time as one of the parameters. By analyzing all the data, major groupings can be formed which depict the approximate time consumed to perform different sets of activities.

      Consider the user activities as $A_1, A_2, A_3, A_4, \ldots, A_n$.

      In Fig. 2 the user activity data with respect to time consumption is shown. The graph shows different users performing activities that consume a certain amount of time.

      The activities which consume approximately similar time to complete will be grouped into one. Hence, Group 1 ($G_1$) will have the set of activities that fall into a certain time range:

      $G_1 = \{A_1, A_2, A_5, A_8, A_{67}, A_{102}, \ldots\}$

      Similarly, we will have $G_1, G_2, G_3, \ldots, G_n$.

      In Fig. 3 the grouping of different activities is shown by using color coding. Each color depicts a different group.

      E.g., the red color depicts the activities which consume much more time when compared to the other colors.
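One simple way to picture the grouping described above is to bucket activities by fixed-width time ranges. This is only a sketch under that assumption: the paper does not specify how the ranges are chosen, and the function names, the `range_width` parameter and the sample durations are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_by_time_range(durations, range_width):
    """Assign each activity to a group index based on the time range
    (of width range_width) that its duration falls into."""
    groups = defaultdict(list)
    for activity, seconds in durations.items():
        groups[int(seconds // range_width)].append(activity)
    return dict(groups)


def approx_time(durations, members):
    """Approximate time for a group: the mean duration of its members."""
    return mean(durations[a] for a in members)


# Hypothetical durations in seconds for a handful of activities.
durations = {"A1": 1.0, "A2": 1.2, "A3": 9.9, "A4": 5.1, "A5": 1.4, "A8": 0.9}

groups = group_by_time_range(durations, range_width=5.0)
g1_approx = approx_time(durations, groups[0])
print(groups)
```

With these sample values the first range collects A1, A2, A5 and A8, and the second collects A3 and A4. A clustering algorithm could replace the fixed ranges without changing the rest of the idea.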

      In the process of tracking the user activity, we need to analyze what data is needed to achieve our outcome. Table I shows the sample data that can be collected and used as the training dataset for the Machine Learning algorithm.

  4. MACHINE LEARNING ALGORITHM

    Of the various algorithms available, the scenario determines which one is to be used. In the process of analyzing user behavior, we use some concepts of the distance-based algorithm for detecting outliers [18].

    There are 2 major datasets in the ML process.

    1. Training Dataset

    2. Test Dataset

    A) Training Dataset

    Fig. 2. Training dataset of User Activity (event) against Time

    Fig. 3. Grouping of Training dataset of User Activity (event) against Time

    For Activity 2 of Group 1, the check value is the difference between the approximate time of the group and the time consumed for the activity: $G_1C_2 = G_1T_{Approx} \sim G_1TA_2$. The cluster is then made of all the data for $G_1C_1$ to $G_nC_n$.

    Any data which has more than the allotted time difference will be considered as an Outlier.

    Fig. 4. Outlier detection of unusual user behavior
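The check described above can be sketched in a few lines. The snippet assumes that the "~" in the check denotes an absolute difference and that the allotted time difference is a fixed threshold per group; the function names, the threshold and the sample times are hypothetical.

```python
def check_values(approx_time, activity_times):
    """Check value per activity: absolute difference between the group's
    approximate time and the time consumed for that activity."""
    return {a: abs(approx_time - t) for a, t in activity_times.items()}


def find_outliers(approx_time, activity_times, allotted_difference):
    """Activities whose check value exceeds the allotted time difference."""
    checks = check_values(approx_time, activity_times)
    return [a for a, c in checks.items() if c > allotted_difference]


# Hypothetical Group 1 data, in seconds.
g1_times = {"A1": 58.0, "A2": 61.0, "A5": 60.0, "A8": 780.0}
g1_approx = 60.0

checks = check_values(g1_approx, g1_times)
outliers = find_outliers(g1_approx, g1_times, allotted_difference=30.0)
print(outliers)  # ['A8']
```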

    B) Test Dataset

    Once the analysis of the training dataset is accomplished, we input the actual dataset (test data) to obtain results.

  5. IMPLEMENTATION PROCESS

    Key steps involved:

      • Implement custom event listener

      • Store event data in a repository

      • Feed the event data to our ML Algorithm

      • Analysis of output produced by ML

    These steps are described in detail below.

    Consider the user activities represented as $A_1$ to $A_n$. The time recorded for each activity can be represented as $T_1$ to $T_n$. An initial mapping of each activity to its respective group will be performed.

    1. Implement custom event listener

      This step involves creating a custom event listener for the purpose of capturing all the data related to a user activity. Table I provides a probable list of interesting data points, which can be extended to include more data points based on requirement.

      Considering that activities $A_1$ to $A_{10}$ belong to $G_1$, the time consumed for each activity will be

      $$G_nTA_n = G_n(T_{n+1} - T_n)$$

      e.g. $G_1TA_1 = G_1(T_2 - T_1)$: time consumed for completing Activity 1, which belongs to Group 1.

      $G_1TA_2 = G_1(T_3 - T_2)$: time consumed for completing Activity 2, which belongs to Group 1.

      The approximate time for each group is provided by the analysis and can be represented as $G_1T_{Approx}$.
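A minimal sketch of such a listener is shown below in Python. It is not a UI-framework listener: `EventListener`, `on_event` and `time_consumed` are hypothetical names, the fields follow the columns of Table I, and the clock is injectable so the example can run with fixed times. The time consumed per activity is taken as the difference between consecutive recorded times.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    user_id: str
    app_name: str
    page: str
    control_id: str
    event: str
    timestamp: float  # seconds, taken from the listener's clock


@dataclass
class EventListener:
    """Hypothetical listener: records every UI event with the time it occurred."""
    clock: Callable[[], float] = time.time
    events: List[Event] = field(default_factory=list)

    def on_event(self, user_id, app_name, page, control_id, event):
        self.events.append(
            Event(user_id, app_name, page, control_id, event, self.clock())
        )


def time_consumed(events):
    """Difference between each pair of consecutive event times."""
    return [later.timestamp - earlier.timestamp
            for earlier, later in zip(events, events[1:])]


# A fake clock with fixed ticks keeps the example deterministic.
ticks = iter([0.0, 1.0, 2.0, 12.0])
listener = EventListener(clock=lambda: next(ticks))
for control, kind in [("button0", "press"), ("input0", "liveChange"),
                      ("input0", "change"), ("button1", "press")]:
    listener.on_event("001", "DemoApp", "page0", control, kind)

durations = time_consumed(listener.events)
print(durations)  # [1.0, 1.0, 10.0]
```

The 10-second gap before the last event is the kind of peak that the later analysis steps would examine.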

    2. Store event data in a repository

      Event data captured by the event listener should be stored in a repository. Accumulating this data leads to more accurate results from the ML algorithm. The repository should be interchangeable to allow for quick changes in database technology, e.g. switching the repository from standard SQL to MongoDB.
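One way to keep the repository interchangeable is to have the tracking code depend only on a small interface. The sketch below assumes a two-method interface (`save`, `all`) and shows an in-memory and an SQLite-backed implementation from the Python standard library; the class and function names are hypothetical, and a MongoDB-backed class (not shown) would implement the same two methods.

```python
import json
import sqlite3
from typing import Protocol


class EventRepository(Protocol):
    """Hypothetical interface that the tracking code depends on."""
    def save(self, event: dict) -> None: ...
    def all(self) -> list: ...


class InMemoryRepository:
    def __init__(self):
        self._events = []

    def save(self, event):
        self._events.append(dict(event))

    def all(self):
        return list(self._events)


class SqliteRepository:
    """Stores each event as a JSON string in a single-column SQL table."""
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute("CREATE TABLE IF NOT EXISTS events (doc TEXT NOT NULL)")

    def save(self, event):
        self._conn.execute("INSERT INTO events (doc) VALUES (?)", (json.dumps(event),))
        self._conn.commit()

    def all(self):
        rows = self._conn.execute("SELECT doc FROM events ORDER BY rowid").fetchall()
        return [json.loads(doc) for (doc,) in rows]


def record_all(repo: EventRepository, events):
    """Tracking code sees only the interface, never the concrete database."""
    for event in events:
        repo.save(event)


sample = [{"control_id": "button0", "event": "press"},
          {"control_id": "input0", "event": "change"}]
memory_repo, sql_repo = InMemoryRepository(), SqliteRepository()
record_all(memory_repo, sample)
record_all(sql_repo, sample)
print(memory_repo.all() == sql_repo.all())  # True
```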

    3. Feed event data to our ML Algorithm

      The Machine Learning algorithm will process the data stored in the repository. We will make use of the regression analysis of ML and perform outlier detection, which will help us identify the event where the user is facing an issue.

      The difference of $G_1T_{Approx}$ with the data $G_1TA_1$ to $G_nTA_n$ will be calculated, which can be represented as the check $C$:

      $$G_nC_n = G_nT_{Approx} \sim G_nTA_n$$

      e.g. $G_1C_1 = G_1T_{Approx} \sim G_1TA_1$

    4. Analysis of output produced by ML

      The outcome obtained from the ML can be used as needed. The time series captured along with the event will point to the exact issue faced by the user (described in detail in Scenario 1). The analysis of the number of users with respect to the event/controller will help us identify the most used features in an application, which will be used for future enhancements (described in detail in Scenario 2).

  6. REAL TIME CUSTOMER ISSUE HANDLING

    The test data from the customers (users) is uploaded to the ML algorithm, and the outcome is described in detail in the two scenarios below.

    A) Scenario 1: Help fixing the issue

    Machine Learning has the ability to find patterns in large data sets, which provides valuable analysis. Our scenario requires identifying an issue. The machine is trained with previous experience (the dataset), so any data that departs from the usual pattern will be identified. We make use of Novelty Detection, where the machine determines unusual patterns given a set of past experience data. The term unusual is very subjective; hence the approach here can be that each observation is given a rating based on its degree of novelty [3]. Pattern recognition, which focuses on recognizing different patterns in a given set of data, is one of the applications of Novelty Detection. Using these concepts, the analysis based on assumed data is described below.

    The chart in Fig. 5 shows the relation between the activities a user performs in one app and the time taken. The data is plotted for User1 performing activity 1 to activity 10. User1 finishes activity 1 after 1 minute, activity 2 after 2 minutes and so on, which means that on average User1 takes 1 minute per activity. But when it comes to activity 8, User1 takes 13 minutes to complete just that one activity, which is clearly visible in the chart. When we consolidate the data, this behavior is present for other users as well. Fig. 6 shows the user behavior with respect to the different controls used in the UI. With such a graph, it is easy to analyze other parameters as well, which are shown on hovering.

    This consolidated data will be sent directly to the developer. The developer can now easily figure out which part of the UI the customer finds difficult, fix it promptly and ship the fix in the immediate next release. This will help optimize human performance and improve user satisfaction [12]. We might even be able to fix the issue before the customer creates a ticket. This will save money and, more importantly, time.
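The consolidation step in this scenario can be illustrated with assumed data. The sketch below does not use a trained model; it swaps in a simple rule (an activity is slow when it takes more than a chosen multiple of the user's own median time) purely to show how per-user peaks are counted across users. The `factor` parameter, the function name and the minutes listed are all hypothetical.

```python
from collections import Counter
from statistics import median


def slow_activities(minutes_per_activity, factor=3.0):
    """Return the 1-based numbers of activities that took more than
    `factor` times the user's median time per activity."""
    typical = median(minutes_per_activity)
    return [number
            for number, minutes in enumerate(minutes_per_activity, start=1)
            if minutes > factor * typical]


# Assumed minutes spent on activities 1..10 by three users.
users = {
    "User1": [1, 1, 1, 1, 1, 1, 1, 13, 1, 1],
    "User2": [2, 1, 2, 2, 1, 2, 2, 15, 2, 1],
    "User3": [1, 2, 1, 1, 2, 1, 1, 2, 1, 1],
}

# Consolidate: how many users were slowed down by each activity?
affected = Counter(activity
                   for minutes in users.values()
                   for activity in slow_activities(minutes))
print(affected.most_common(1))  # [(8, 2)]
```

Here activity 8 stands out for two of the three users, which is the kind of consolidated signal that would be passed on to the developer.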

    Fig. 5. Plot of User Activity(event) against Time

    Fig. 6. Plot of User Activity (Control ID) against Time

    B) Scenario 2: Future Enhancement

    Customers use many applications and features across the world. We can use the past behavior of customers to obtain a pattern of the most used features. This analysis will help predict the features that can be planned for enhancement. The patterns are obtained based on the behavior of similar users; hence the analysis is collaborative in nature and is termed collaborative filtering [4]. With this honest feedback, the development team will also know the needs and choices of the customers. By plotting the customers against the apps used, we can identify which app is used by most customers, allowing the development team to plan enhancements for it. In Fig. 7, it's very clear that App1 has the largest number of users, so the development team can plan more enhancements for it. Similarly, App3 and App4 have a smaller percentage of users, which indicates that there might be an issue.
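The count behind a plot like Fig. 7 can be sketched as follows. This only counts distinct users per application from assumed (user, app) records; it is not a collaborative filtering implementation, and the function name and sample records are hypothetical.

```python
from collections import defaultdict


def users_per_app(records):
    """Count distinct users for each application from (user_id, app) records."""
    seen = defaultdict(set)
    for user_id, app in records:
        seen[app].add(user_id)
    return {app: len(users) for app, users in seen.items()}


# Assumed usage records; a repeated (user, app) pair counts once.
records = [
    ("u1", "App1"), ("u2", "App1"), ("u3", "App1"),
    ("u1", "App2"), ("u2", "App2"),
    ("u1", "App1"), ("u3", "App3"),
]

counts = users_per_app(records)
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(ranked)  # [('App1', 3), ('App2', 2), ('App3', 1)]
```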

    Fig. 7: Plot showing number of users for different applications

  7. CHALLENGES

    • Performance: The performance of the customer system will be affected when the tracking data hits the backend very often. Hence, the interval at which data is captured should be decided based on the performance needed.

    • Scaling: In the process of tracking the user activity, the controllers developed should be scalable so that any number of features can be tracked.

    More scaling → better performance for ML analysis
    More scaling → reduced system performance
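One possible way to reduce how often tracking data hits the backend is to buffer events and send them in batches. The sketch below assumes a count-based batch size; a time-based interval would work similarly. `BatchingTracker`, `send` and `batch_size` are hypothetical names, and `send` stands in for whatever call delivers data to the backend.

```python
class BatchingTracker:
    """Buffers events and hands them to `send` in batches of `batch_size`."""

    def __init__(self, send, batch_size):
        self._send = send
        self._batch_size = batch_size
        self._buffer = []

    def track(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call once more when the session ends."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


# `sent` collects the batches that would otherwise go to the backend.
sent = []
tracker = BatchingTracker(send=sent.append, batch_size=3)
for seq in range(7):
    tracker.track({"event": "press", "seq": seq})
tracker.flush()

print([len(batch) for batch in sent])  # [3, 3, 1]
```

Seven events result in three backend calls instead of seven; a larger batch size trades freshness of the data for lower load on the customer system.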

  8. FURTHER ENHANCEMENT

The analysis obtained from the ML will help us solve various usability issues. The solution suggested in this paper needs manual effort from the development team to analyze the ML outcome and decide which issues need to be fixed. In future, this process can be automated as well. The ML analysis will exhibit different patterns that reveal any unusual behavior of the user while using a website, web application, on-premise product, etc. We can integrate an automated voice-recording system or a pop-up help message on the user's screen, which will be triggered when any such pattern is recorded. With this, the user can type or speak out the issue faced, and with the help of AI we can show a set of possible solutions and important links to refer to. Customer incidents will then come up only when the user faces issues beyond the help of the AI system.

ABBREVIATION

AI - Artificial Intelligence

DB - Database

JSON - JavaScript Object Notation

ML - Machine Learning

NDA - Non-Disclosure Agreement

SQL - Structured Query Language

UI - User Interface

REFERENCES

  1. Shai Shalev-Shwartz and Shai Ben-David, "When Do We Need Machine Learning?" in Understanding Machine Learning: From Theory to Algorithms, New York: Cambridge University Press, 2014, pp. 20-24.

  2. Shai Shalev-Shwartz and Shai Ben-David, "Introduction" in Understanding Machine Learning: From Theory to Algorithms, New York: Cambridge University Press, 2014, p. 19.

  3. Alex Smola and S.V.N. Vishwanathan, "A Taste of Machine Learning" in Introduction to Machine Learning, Cambridge: Cambridge University Press, 2008, pp. 7-12.

  4. Alex Smola and S.V.N. Vishwanathan, "A Taste of Machine Learning" in Introduction to Machine Learning, Cambridge: Cambridge University Press, 2008, p. 4.

  5. SAP, "SAPUI5: UI Development Toolkit for HTML5, Extending Applications." [Online]. Available: https://help.sap.com/saphelp_uiaddon10/helpdata/en/40/7feaf830c94e4c9de48ce08adabd1c/frameset.htm

  6. Tutorials Point, "MongoDB." [Online]. Available: https://www.tutorialspoint.com/mongodb/index.htm

  7. Ben Shneiderman and Catherine Plaisant, "Design Issues" in Designing the User Interface, New York: Pearson Education, 2005.

  8. Ben Shneiderman, Designing the User Interface, 3rd ed., Addison-Wesley, 1990.

  9. "Context of Use Analysis," Usability Body of Knowledge. [Online]. Available: http://www.usabilitybok.org/cognitive-walkthrough

  10. Bhaskar N U, Prathap Naidu P, Ravi Chandra Babu S R and Govindarajulu P, "General Principles of User Interface Design and Websites," International Journal of Software Engineering (IJSE), vol. 2, issue 3, 2011.

  11. Law, E. L.-C., Roto, V., Hassenzahl, M., Vermeeren, A. P. O. S. and Kort, J. (2009). Understanding, scoping and defining user experience. Proceedings of the 27th International Conference on Human Factors in Computing Systems, CHI '09, 719. http://doi.org/10.1145/1518701.1518813

  12. Nigel Bevan, "What is the difference between the purpose of Usability and User Experience Evaluation Methods?"

  13. Yao L. et al. (2014). Using Physiological Measures to Evaluate User Experience of Mobile Applications. In: Harris D. (ed.) Engineering Psychology and Cognitive Ergonomics. EPCE 2014. Lecture Notes in Computer Science, vol. 8532. Springer, Cham.

  14. Graziotin, Daniel, Wang, Xiaofeng and Abrahamsson, Pekka. (2015). The Affect of Software Developers: Common Misconceptions and Measurements. 10.1109/CHASE.2015.23.

  15. E. Fox and C. Guestrin, "Machine Learning: Regression," University of Washington. [Online]. Coursera. Available: https://www.coursera.org/learn/ml-regression

  16. Y. S. Kim, B. J. Yum, J. Song and S. M. Kim, "Development of a recommender system based on navigational and behavioral patterns of customers in e-commerce sites," in Expert Systems with Applications, 2005, pp. 381-393.

  17. J. Lee, M. Podlaseck, E. Schonberg and R. Hoch, "Visualization and analysis of clickstream data of online stores for understanding web merchandising," in Data Mining and Knowledge Discovery, 2001, pp. 58-80.

  18. Edwin M. Knorr and Raymond T. Ng, "Algorithms for mining distance-based outliers in large datasets," in Proc. 24th VLDB, pp. 390-400, 1998.
