Evaluating Student's Performance using K-Means Clustering

Shashikant Pradip Borgavakar; Mr. Amit Shrivastava

doi:10.17577/IJERTV6IS050070

Volume 06, Issue 05 (May 2017)


Open Access
Authors : Shashikant Pradip Borgavakar, Mr. Amit Shrivastava
Paper ID : IJERTV6IS050070
Volume & Issue : Volume 06, Issue 05 (May 2017)
DOI : http://dx.doi.org/10.17577/IJERTV6IS050070
Published (First Online): 02-05-2017
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Mr. Shashikant Pradip Borgavakar

Research Scholar: C.S.E

Swami Vivekanand College of Engineering Indore, India

Mr. Amit Shrivastava

Asst. Professor: Computer Science & Engineering Swami Vivekanand College of Engineering Indore, India

Abstract: Data clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. In this paper, k-means clustering is used to evaluate student performance on the basis of class test, mid test and final test marks. Clustering students by their marks can help to reduce the proportion of students who fail, because it lets a professor estimate a student's chance of failing before the final exam.

Keywords: k-means, database, academic performance.

INTRODUCTION

Data clustering is a process of extracting previously unknown, valid, potentially useful and hidden patterns from large data sets (Connolly, 1999). The amount of data stored in educational databases is increasing rapidly. Clustering is one of the most widely used techniques for future prediction. The main goal of clustering here is to partition students into homogeneous groups according to their characteristics and abilities (Kifaya, 2009). Such applications can help both instructors and students to enhance the quality of education. This study uses cluster analysis to segment students into groups according to their characteristics.

LITERATURE SURVEY

Research Paper	Improving the Accuracy and Efficiency of the k-means Clustering Algorithm	An Iterative Improved k-means Clustering	Refining Initial Points for K-Means Clustering	Comparison of various clustering algorithms
Problem being addressed	Lower accuracy and efficiency	Number of iterations is less	Estimate is fairly unstable due to elements of the tails appearing in the sample	Which clustering algorithm is best
Importance of the problem	Algorithm requires a time complexity	Total number of iterations required by k-means and improved k-means is much larger	Importance of having good initial points	Way of process
Gap in the prior work	Accuracy and efficiency are the most complicated to address	Check multiple iterations	Finding initial points	Finding the algorithm
Specific research questions or research objective	To overcome the problem of accuracy and efficiency	This paper presented an iterative improved k-means clustering algorithm that makes k-means more efficient and produces good quality clusters	A fast and efficient algorithm for refining an initial starting point for a general class of clustering algorithms has been presented	Data mining aims to discover data and patterns and store them in an understandable form
Broad outline of how the author solved the problem	Using the k-means clustering algorithm and the enhanced method	Iterative improved k-means clustering algorithm	Using Clustering Cluster	Applied DBSCAN and OPTICS algorithms
Details of implementation of procedure	Phase 1 of the enhanced algorithm requires a time complexity of O(n^2) for finding the initial centroids, as the maximum time required here is for computing the distances between each data point and all other data points in the set	Dividing into a number of parts, then calculating centres and deciding membership of patterns, then repeating the same steps	Results on real-world data	All clustering algorithm process and find
Key contribution of the paper claimed by the author	Define k centroids, one for each cluster	Iterative improved k-means clustering algorithm	Clustering Clusters	K-means clustering algorithm

DATA CLUSTERING

Data clustering is an unsupervised, statistical data analysis technique. It is used to group similar data into homogeneous groups. Applied to a large data set, it discovers hidden patterns and relationships, which helps decisions to be made quickly and efficiently. In short, cluster analysis segments a large set of data into subsets called clusters. Each cluster is a collection of data objects that are similar to one another but dissimilar to objects in other clusters.

CLUSTERING IN HIGHER EDUCATION

Education is an essential element for the progress and betterment of a country. Education enables people to participate in progressive work for the country, and it makes a country civilized and well-mannered. Clustering in higher education means classifying students by their academic performance. A lack of deep and sufficient knowledge of the higher education system may prevent its management from achieving quality objectives; data clustering can help to bridge these knowledge gaps.

PROPOSED MODEL

In a university, academic performance is measured by internal and external assessment. Internal assessment comprises class test marks, lab performance, assignments, quizzes and attendance. External assessment comprises the previous semester grade and the final semester grade. By taking the internal assessment and the previous exam grade and applying a data clustering technique, we can predict a student's final grade:
1. If prev-grade=high, quiz=good, assignment=complete, lab-performance=good, class-test=good and attendance=regular, then final-grade=good.
2. If prev-grade=average, quiz=good, assignment=incomplete, lab-performance=good, class-test=average and attendance=regular, then final-grade=average.
3. If prev-grade=low, quiz=average, assignment=incomplete, lab-performance=poor, mid-term=low and attendance=irregular, then final-grade=low.
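The three rules above can be written down as a small lookup, which makes it easy to see what they do and do not cover. The following is only an illustrative sketch: the names `RULES` and `predict_final_grade` and the dictionary keys are hypothetical, and the sketch assumes that a student who matches none of the three listed rules gets no prediction.

```python
# Each rule pairs the attribute values it requires with the predicted final grade.
# The attribute names mirror the rules in the text; the key spelling is an assumption.
RULES = [
    ({"prev_grade": "high", "quiz": "good", "assignment": "complete",
      "lab_performance": "good", "class_test": "good",
      "attendance": "regular"}, "good"),
    ({"prev_grade": "average", "quiz": "good", "assignment": "incomplete",
      "lab_performance": "good", "class_test": "average",
      "attendance": "regular"}, "average"),
    ({"prev_grade": "low", "quiz": "average", "assignment": "incomplete",
      "lab_performance": "poor", "mid_term": "low",
      "attendance": "irregular"}, "low"),
]


def predict_final_grade(student):
    """Return the final grade of the first rule whose conditions all match.

    Returns None when no rule matches (an assumption: the text lists
    only three rules and does not say how other cases are handled).
    """
    for conditions, final_grade in RULES:
        if all(student.get(key) == value for key, value in conditions.items()):
            return final_grade
    return None


weak_student = {"prev_grade": "low", "quiz": "average",
                "assignment": "incomplete", "lab_performance": "poor",
                "mid_term": "low", "attendance": "irregular"}
predicted = predict_final_grade(weak_student)
```

Most combinations of attribute values fall outside these three rules, which is one reason a clustering step is useful for grouping the remaining students.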
The proposed model tries to identify weak students before the final exam in order to save them from failing. Teachers can take appropriate steps at the right time to improve students' performance in the final exam.

K-MEANS CLUSTERING ALGORITHM

K-means is a long-established and widely used clustering technique. Here, k-means is applied to the processed data to obtain valuable information. The pseudo-code of k-means clustering is given below.

Step 1: Accept as input the number of clusters k and the dataset to cluster.

Step 2: Initialize the first k clusters, either by taking the first k instances or by taking a random sample of k elements.

Step 3: Calculate the arithmetic mean of each cluster formed in the dataset.

Step 4: Assign each record in the dataset to exactly one of the initial clusters. Each record is assigned to the nearest cluster using a measure of distance (e.g. Euclidean distance).

Step 5: Re-assign each record in the dataset to the most similar cluster and re-calculate the arithmetic mean of all the clusters in the dataset. Repeat this step until the assignments no longer change.

Fig. Generalized Pseudocode of Traditional k-means.
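The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `k_means`, the `max_iter` cap and the sample marks are hypothetical, the "first k instances" option of Step 2 is used so that the result is reproducible, and Euclidean distance is computed with the standard-library `math.dist` (Python 3.8 or later).

```python
import math


def k_means(records, k, max_iter=100):
    """Cluster `records` (equal-length tuples of numbers) into k groups.

    Returns (assignment, centroids), where assignment[i] is the cluster
    index of records[i].
    """
    # Step 2: initialise the clusters with the first k instances.
    centroids = [tuple(record) for record in records[:k]]
    assignment = None
    for _ in range(max_iter):
        # Steps 4 and 5: assign every record to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(record, centroids[c]))
            for record in records
        ]
        # Stop once no record changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Steps 3 and 5: recompute each centroid as the arithmetic mean
        # of its members; an empty cluster keeps its previous centroid.
        for c in range(k):
            members = [r for r, a in zip(records, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignment, centroids


# Hypothetical (class test, mid test, final test) marks for six students.
marks = [(20, 25, 30), (55, 60, 58), (85, 90, 88),
         (22, 28, 27), (58, 57, 62), (88, 86, 91)]
assignment, centroids = k_means(marks, k=3)
```

With first-k initialisation the result depends on the order of the records, so the random sampling option of Step 2, repeated over several runs, is usually the safer choice on real data.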
RESULT AND DISCUSSION

The model produced the following results:

Graph 1: Shows the relationship between GPA and attendance ratio.

A. Data Arrangement in tables

We grouped the students by their final grades in several ways, three of which are:
1. Assign as many labels as there are possible grades.
2. Group the students into three classes: High, Medium and Low.
3. Categorize the students with one of two class labels: Passed for a grade above 2.20 and Failed for a grade less than or equal to 2.20.
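The two-class and three-class labellings can be expressed as simple threshold functions. The sketch below is illustrative only: the function names are hypothetical, the 3.50 cut-off for High is taken from the three-class table later in this section, and a GPA of exactly 2.20 is assumed to fall in Low so that it agrees with the Failed label.

```python
def pass_fail(gpa):
    """Two-class label: Passed above 2.20, Failed at or below 2.20."""
    return "Passed" if gpa > 2.20 else "Failed"


def three_class(gpa):
    """Three-class label using the High / Medium / Low cut-offs."""
    if gpa >= 3.50:
        return "High"
    if gpa > 2.20:
        return "Medium"
    # Assumption: exactly 2.20 counts as Low, matching the Failed label.
    return "Low"


labels = [(gpa, pass_fail(gpa), three_class(gpa))
          for gpa in (2.10, 2.20, 3.00, 3.75)]
```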

Table 1

Class	GPA	No. of students	Percentage
1	2.00-2.20	5	8.33
2	2.20-3.00	10	16.67
3	3.00-3.32	17	28.33
4	3.32-3.56	15	25
5	3.56-4.00	13	21.67

Here the students are clustered by GPA: 8.33% of the students have a GPA of 2.00-2.20, 16.67% have 2.20-3.00, 28.33% have 3.00-3.32, 25% have 3.32-3.56 and 21.67% have 3.56-4.00. The graphical representation of GPA and the percentage of students in each band is given below.
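The percentages in Table 1 follow directly from the student counts, which sum to 60. As a quick check, the short sketch below recomputes them; the helper name `percentages` is hypothetical, and the counts are copied from Table 1.

```python
def percentages(counts):
    """Convert per-band student counts into percentages of the total."""
    total = sum(counts.values())
    return {band: round(100 * n / total, 2) for band, n in counts.items()}


# Number of students per GPA band, as listed in Table 1.
table1_counts = {
    "2.00-2.20": 5,
    "2.20-3.00": 10,
    "3.00-3.32": 17,
    "3.32-3.56": 15,
    "3.56-4.00": 13,
}
shares = percentages(table1_counts)
```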

Graph 2: Number and percentage of students by GPA

Class	GPA	No. of students	Percentage
High	>=3.50	28	46.67
Medium	2.20<GPA<3.50	27	45
Low	<=2.20	5	8.33

After clustering, we group the students into three categories: High, Medium and Low. A graphical representation of these three categories is given below:

Graph 3: Shows the percentage of students getting high, medium and low GPA

REFERENCES

Alaa el-Halees (2009). Mining Students Data to Analyze e-Learning Behavior: A Case Study.
Behrouz et al. (2003). Predicting Student Performance: An Application of Data Mining Methods with the Educational Web-Based System LON-CAPA. © 2003 IEEE, Boulder, CO.
Connolly, T., Begg, C. and Strachan, A. (1999). Database Systems: A Practical Approach to Design, Implementation, and Management (3rd Ed.). Harlow: Addison-Wesley, 687.
Erdogan and Timor (2005). A data mining application in a student database. Journal of Aeronautic and Space Technologies, July 2005, Volume 2, Number 2 (53-57).
Galit et al. (2007). Examining online learning processes based on log files analysis: a case study. Research, Reflection and Innovations in Integrating ICT in Education.
Han, J. and Kamber, M. (2006). "Data Mining: Concepts and Techniques", 2nd edition. The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor.
Kifaya (2009). Mining student evaluation using associative classification and clustering. Communications of the IBIMA, vol. 11, ISSN 1943-7765.
ZhaoHui and Maclennan, J. (2005). Data Mining with SQL Server 2005. Wiley Publishing, Inc.
