- Open Access
- Authors : S. Peerbasha , M. Mohamed Surputheen
- Paper ID : IJERTV10IS100122
- Volume & Issue : Volume 10, Issue 10 (October 2021)
- Published (First Online): 27-10-2021
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Prediction of Academic Performance in College Students with Bipolar Disorder Using Deep Featured Spectral Scaling Classifier (DFSSC)
-
S. Peerbasha#1, M. Mohamed Surputheen#2
1 Research Scholar, PG & Research Department of Computer Science, Jamal Mohamed College (Autonomous), Affiliated to Bharathidasan University, Tiruchirappalli, Tamilnadu, India.
2 Associate Professor, PG & Research Department of Computer Science, Jamal Mohamed College (Autonomous), Affiliated to Bharathidasan University, Tiruchirappalli, Tamilnadu, India.
Abstract:- In recent years, analyzing students' performance has become an important and challenging problem in all academic institutions. The main idea of this paper is to analyze and evaluate the academic performance of college students with bipolar disorder by applying data mining classification algorithms in Jupyter Notebook, a Python tool widely used for decision making on students' academic performance. When unrelated features are evaluated, classification accuracy is low. To resolve this problem, we propose the Deep Featured Spectral Scaling Classifier (DFSSC), which scales the mutual features that contribute to classification performance in order to predict the result from the performance of the bipolar student. The implementation first applies the Mutual Intensive Feature Selection (MIFS) model to select features and reduce dimensionality. The selected features are then trained with a Softmax Decision Tree Classifier and optimized with a Deep Multi-Objective Propagation Network (DMOPN) to categorize the student's performance. The classification model is evaluated with 14 measures: Accuracy, Precision, Recall, F1 Measure, Sensitivity, Specificity, R Squared, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, TPR, TNR, FPR, and FNR. The results lead to the conclusion that the Decision Tree Classifier performs better than the other algorithms.
Keywords: Bipolar disorder data prediction, student data analysis, deep learning, feature selection, neural network, intensive mutual rate.
-
INTRODUCTION
Data mining is a promising and prospering frontier in data analysis, and its outcomes have many applications. Mining bipolar students' information is also referred to as Knowledge Discovery from Data (KDD): the machine-driven or assisted extraction of patterns representing knowledge implicitly stored in large databases, here student datasets. Bipolar disorder-based student performance prediction is a developing strategy in data prediction for improving student learning performance.
Data mining is widely used in educational systems for analyzing student learning categories. Feature analysis and classification are emerging data mining techniques that can be applied effectively in education. Educational data mining uses several concepts, such as association rule mining, classification, and clustering. The knowledge that emerges helps to better understand students' advancement rate, retention rate, progress rate, and overall success. A data mining framework is therefore crucial for gauging improvement in student performance, and classification algorithms can characterize and investigate the student dataset in a precise manner. The attributes that may be used for a student's academic performance include: Name, College, Class Tutor Name, Gender, Age, Address, Family Size, Parents' Status, Mother's Education, Father's Education, Mother's Job, Father's Job, Reason to Choose this College, Student's Guardian, Travel Time, Weekly Study Time, Number of Subjects Failed so far, College Support (Extracurricular), Family Educational Support, Paid Courses Attended, Extracurricular Activities Involved, Nursery School Attended, Higher Education, Internet Access at Home, Love Relationships, Family Relationships, Free Time after College, Going out with Friends, Weekday Alcohol/Smoking, Weekend Alcohol/Smoking, Current Health Status, Number of Days Absent, CGPA, Number of Psychological Motivation Sessions Attended, and CGPA Grade.
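Before mining, symbolic attributes like those listed above must be encoded numerically. A minimal sketch follows; the mappings and field names are our own illustration, not the paper's published encoding.

```python
# Illustrative ordinal encodings for a few of the symbolic attributes.
# These mappings are assumptions for demonstration purposes only.
ORDINAL_MAPS = {
    "study_time": {"<2h": 0, "2-5h": 1, "5-10h": 2, ">10h": 3},
    "parent_education": {"None": 0, "Primary": 1, "5th-9th": 2,
                         "10th-12th": 3, "UG/PG": 4},
    "family_relationship": {"very bad": 0, "bad": 1, "good": 2,
                            "very good": 3, "excellent": 4},
}

def encode_record(record):
    """Map the symbolic fields of one student record to integer codes."""
    encoded = dict(record)
    for field, mapping in ORDINAL_MAPS.items():
        if field in encoded:
            encoded[field] = mapping[encoded[field]]
    return encoded

student = {"name": "A", "study_time": "2-5h",
           "parent_education": "UG/PG", "family_relationship": "good"}
print(encode_record(student))
```

Yes/No attributes can be mapped to 1/0 in the same way; the ordinal maps preserve the natural ordering of scales such as study time.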
The main idea of this research paper is to use data mining classification algorithms to study and analyze the academic performance of college students with bipolar disorder. This research paper comprises seven sections. Section 1 gives the introduction; Section 2 reviews related work; Section 3 presents the idea and aspects of the different classifiers; Section 4 elaborates the data pre-processing; Section 5 explains the implementation of the model construction; Section 6 presents the results and discussion; and Section 7 concludes.
Students' academic performance in higher education (HE) is investigated broadly to address academic underachievement, increased college dropout rates, graduation delays, and other persistent difficulties [1]. In simple terms, student performance refers to the degree to which short-term and long-term educational goals are achieved [2]. Nonetheless, academics measure students' achievements from different perspectives, ranging from grade point average (GPA) to future job prospects [3]. The literature offers an abundance of computational efforts to improve student performance in schools and colleges, most notably those driven by data mining and learning analytics techniques [4]. Reliable prediction of student performance enables the identification of low-performing students, thereby empowering instructors to intervene early in the learning cycle and carry out the necessary interventions. However, productive interventions are not limited to student advising, performance progress monitoring, intelligent tutoring system development, and policymaking [5]. This undertaking is supported by computational advances in data mining and learning analytics [6]. A recent comprehensive survey highlights that around 70% of the reviewed work explored student performance prediction using student grades and GPAs, while only 10% of the studies examined the prediction of student achievement using learning outcomes [3]. This gap impelled us to investigate work where learning outcomes are used as a proxy for student academic performance.
Outcome-based education is a paradigm of schooling that focuses on implementing and achieving the so-called learning outcomes [7]. Student learning outcomes are objectives that measure the degree to which students achieve the proposed competencies, specifically knowledge, skills, and attitudes, by the end of a particular learning process. In our view, student outcomes represent a more comprehensive measure for determining students' academic achievements than assessment grades alone. This view agrees with the position that learning outcomes represent basic elements of student academic achievement [8]. Moreover, eminent HE accreditation bodies, such as ABET and ACBSP, use learning outcomes as the building blocks for assessing the quality of educational programs [9]. Such significance calls for more research efforts to predict the fulfillment of learning outcomes at both the course and program levels. The absence of systematic reviews exploring the prediction of student performance using student outcomes has motivated the objectives of this research. In a systematic literature review (SLR), a step-by-step protocol is executed to identify, select, and evaluate the included studies to answer specific research questions [10, 11].
-
RELATED WORKS
M. Silva Guerra et al. (2018) present an application of data mining to an academic database that aims to identify the reasons for student dropouts and thereby prevent failures.
Polyzou et al. (2019) note that developing tools to support students and learning, in a traditional or online setting, is a significant task in today's educational environment. The initial steps towards enabling such technologies using machine learning focused on predicting students' performance in terms of achieved grades. However, these approaches do not perform as well at predicting poor-performing students.
Y. Yao et al. (2019) focus on the prediction of problems and performance rather than on revealing the underlying causal relationships. Based on unique exam data, they extracted examinees' abilities from the HSCE (High School College Exam) using the knowledge of educational experts, which can help educational authorities.
Azar, G. et al. (2015) introduce a semi-automated practice that aids the preliminary diagnosis of a patient's psychological disorder, accomplished by matching the description of the patient's mental health status against known mental illnesses. Bang, S. et al. (2017) developed a quad-phased data mining model in which significant diagnostic criteria, effective for diagnostics, are selected for feature observation.
S. Lai et al. (2020) explain that adaptive e-learning can personalize students' learning environments to meet their demands. Numerous studies have indicated that understanding the role of personality in the learning process can facilitate learning; hence, personality identification in e-learning is a critical issue in education.
Z. Xu et al. (2021) observe that blended learning offers the possibility of individualized teaching, and that combining flipped classrooms with SPOCs is a good way to implement it; still, few studies have verified the predictability of learning performance in such a scenario. Castaldo, R. et al. (2016) demonstrate that mental stress may cause cognitive dysfunction, cardiovascular disorders, and depression; HRV features were extracted and analyzed according to the literature using validated software tools, but statistical and data mining analyses were not performed on the extracted HRV features.
In the investigation of Q. Liu et al. (2020), an experimental methodology was adopted to generate a database. The raw data were preprocessed to fill in missing values, transform values from one form into another, and select relevant attributes/variables; however, the approach has disadvantages in analysis. Q. Liu et al. (2021) present a holistic study of student performance prediction and, to directly achieve the primary goal of performance prediction, propose a general Exercise-Enhanced Recurrent Neural Network (EERNN) framework that exploits both students' exercising records and the text content of the corresponding exercises.
-
DEEP FEATURED SPECTRAL SCALING CLASSIFIER
The proposed system scales the mutual features that contribute to classification performance, using the Deep Featured Spectral Scaling Classifier (DFSSC) to predict the result from the performance of the bipolar student. The implementation first applies the Mutual Intensive Feature Selection (MIFS) model to select features and reduce dimensionality. The selected features are then trained with a Softmax Decision Tree Classifier and optimized with a Deep Multi-Objective Propagation Network (DMOPN) to categorize the student's performance. Classification is the process of assigning data to predefined categorical class labels.
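The stages just described can be sketched as a chain of functions. The function bodies below are stand-ins for illustration only (the actual MIFS, softmax decision tree, and DMOPN algorithms are detailed in later sections), and the threshold and field names are our own assumptions.

```python
# High-level sketch of the DFSSC pipeline: preprocess -> select features ->
# classify. Each stage here is a simplified placeholder, not the paper's code.

def preprocess(records):
    """De-noise: drop records with any missing field."""
    return [r for r in records if None not in r.values()]

def select_features(records, keep):
    """Stand-in for MIFS dimension reduction: keep only chosen fields."""
    return [{k: r[k] for k in keep} for r in records]

def classify(record):
    """Stand-in for the DMOPN classifier (toy CGPA threshold)."""
    return "pass" if record["cgpa"] >= 5.0 else "fail"

raw = [{"cgpa": 8.2, "absences": 3, "name": "A"},
       {"cgpa": 4.1, "absences": None, "name": "B"},
       {"cgpa": 4.9, "absences": 20, "name": "C"}]

clean = preprocess(raw)
feats = select_features(clean, keep=["cgpa"])
print([classify(f) for f in feats])   # one label per retained student
```

Record B is dropped at the preprocessing stage because of its missing field, so only two labels are produced.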
[Figure: DFSSC pipeline. Student logs (input dataset) -> preprocessing -> Mutual Intensive Feature Selection -> extracted features and feature weights -> softmax segmented decision tree -> Deep Multi-Objective Propagation Network (classification, optimization) -> categorized classes -> predicted class]
Figure 1 Deep Multi-Objective Propagation Network (DMOPN)
Classification is a two-step process consisting of training and testing. In the first step, a model is constructed by analyzing the data tuples of the training data with their attributes. Figure 1 shows the Deep Multi-Objective Propagation Network (DMOPN). For each tuple in the training data, the value of the class-label attribute is known, and the categorization rule is applied to the training data to build the model. In the second step, test data are used to examine the accuracy of the model. If the accuracy of the model is acceptable, the model can be used to classify unknown data tuples.
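The two-step process can be illustrated with a toy one-feature decision stump: fit a CGPA threshold on training tuples whose labels are known, then measure accuracy on held-out test tuples. All data values and names below are invented for illustration.

```python
# Step 1 (training): choose the CGPA threshold that best separates the labels.
# Step 2 (testing): measure the fitted stump's accuracy on unseen tuples.

def fit_stump(train):
    """Return the threshold with the highest training accuracy."""
    best_acc, best_threshold = 0.0, None
    for threshold in sorted({cgpa for cgpa, _ in train}):
        acc = sum((cgpa >= threshold) == label
                  for cgpa, label in train) / len(train)
        if acc > best_acc:
            best_acc, best_threshold = acc, threshold
    return best_threshold

train = [(8.5, True), (7.9, True), (6.0, True), (4.2, False), (3.8, False)]
test = [(9.0, True), (4.0, False)]

t = fit_stump(train)
test_acc = sum((cgpa >= t) == label for cgpa, label in test) / len(test)
print(t, test_acc)
```

Here the training data are perfectly separable at CGPA 6.0, so the stump also classifies both test tuples correctly.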
-
Data Pre-processing
Initially, the college student dataset is preprocessed, and several factors are considered as attribute features. These candidate feature labels are treated as non-redundant data, with filled and unfilled fields uncovered in the raw datasets. For this work, recent real-world data were collected from educational institutions and processed with a de-noise filter algorithm to clean and prepare the dataset. The attributes collected from the college students' database are given below.
Table-1: Symbolic Attribute Description

| Attribute | Description | Possible Values |
|---|---|---|
| Name | Name of the student | Text |
| College | Name of the college | Jamal Mohamed College, Holy Cross College, Bishop Heber College, St. Josephs College, EVR College, National College, Cauvery College for Women, Indira Gandhi College, SRC, MIET, Others |
| CTN | Name of the class teacher | Text |
| Gender | Gender of the student | Male, Female, Transgender |
| Age | Age of the student | 18 to 30 |
| Address | Whether the student is from a city or a village | Urban, Rural |
| F.S. | Family size | Less than 3, Greater than 3 |
| P.S. | Parents' status (whether they are together or not) | Joined, Apart |
| M.E. | Educational qualification of mother | 0 (None), 1 (Primary education, up to 4th standard), 2 (5th to 9th standard), 3 (10th to 12th standard), 4 (UG/PG) |
| F.E. | Educational qualification of father | 0 (None), 1 (Primary education, up to 4th standard), 2 (5th to 9th standard), 3 (10th to 12th standard), 4 (UG/PG) |
| M.J. | Job of the mother | Teacher, Healthcare-related, Jobless, Others |
| F.J. | Job of the father | Teacher, Healthcare-related, Jobless, Others |
| RTCC | Reason for joining the college | Close to home, College reputation, Course preference, Others |
| S.G. | Guardian of the student | Father, Mother, Others |
| T.T. | Travel time | Less than 15 minutes, 15 to 30 minutes, 30 minutes to 1 hour, More than 1 hour |
| S.T. | Weekly study time | Less than 2 hours, 2 to 5 hours, 5 to 10 hours, More than 10 hours |
| NSFS | Arrears so far in the subjects studied | Nil, 1, 2, 3, More than 3 |
| CSIEC | Importance given by the college to extracurricular activities | Yes, No |
| FES | Educational support given by the family | Yes, No |
| PCA | Paid courses attended | Yes, No |
| ECA | Involved in extracurricular activities | Yes, No |
| Nursery | Attended nursery school | Yes, No |
| HE | Wants to pursue higher education | Yes, No |
| YEAH | Internet access at home | Yes, No |
| AYL | Love relationships | Yes, No |
| F.R. | Quality of family relationships | Very bad, Bad, Good, Very good, Excellent |
| FTAC | Free time after class | Very bad, Bad, Good, Very good, Excellent |
| GOLF | Time spent going out with friends | Very bad, Bad, Good, Very good, Excellent |
| WDAC | Alcohol consumption from Monday to Friday | Very bad, Bad, Good, Very good, Excellent |
| WEAC | Alcohol consumption during Saturday and Sunday | Very bad, Bad, Good, Very good, Excellent |
| CHS | Current health status | Very bad, Bad, Good, Very good, Excellent |
| NDA | Total number of days absent from classes | 0 to 90 |
| CGPA | Cumulative Grade Point Average | 0.0 to 10.0 |
| NPMSA | Number of psychological motivation sessions attended | 0 to 10 |
| GCP | Grade for the CGPA | Distinction, First Class, Second Class, Pass, Fail |
Datasets used in the classification algorithm should be clean and pre-processed to handle missing or redundant attributes.
Pseudocode: Preprocessing Dataset
Input: Student academic dataset (Sa) and student bipolar disorder dataset (StBd)
Output: Preprocessed dataset
Start
Step 1: Initialize the collective dataset Sa and StBd.
Step 2: Read all dataset records Rd from Sa1, Sa2, Sa3, ... and StBd1, StBd2, ...;
        create the index list Dx over Sa + StBd (Index + 1).
Step 3: For each record in Rd (combining attributes as features Fi in Dx):
          If a field is null:
            terminate the record and update the record set Rcs;
          Else:
            return Rcs over Sa + StBd;
          End if
        End for
Step 4: Compute duplicate and unfilled cases in Rcs;
          remove duplicates if Rcs repeats a row (Rw) and update Rcs;
          if Rcs is not empty, fill gaps with the nearest average.
Step 5: Get the entire feature index of Rcs;
          return Fdx as the real index, where n is the total number of dominant features with k observed features;
          return the redundant feature set Rds = Rcs(Fdx).
Stop
The above pseudocode prepares the original data by reducing noise, and combines the bipolar disorder and academic performance student datasets into features, ordering the feature index and marking the redundant features.
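A runnable sketch of the preprocessing pseudocode above, assuming list-of-dict records with illustrative field names: the two datasets are merged, rows with empty fields are dropped, and duplicate rows are removed. (The pseudocode alternatively allows filling missing numeric fields with an average rather than dropping the row.)

```python
# Merge the academic (Sa) and bipolar-disorder (StBd) records, drop rows
# containing null fields, and remove exact duplicate rows.

def preprocess(sa, stbd):
    merged = sa + stbd                                    # Steps 1-2: combine
    rows = [r for r in merged
            if all(v is not None for v in r.values())]    # Step 3: null check
    seen, unique = set(), []                              # Step 4: de-duplicate
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

sa = [{"id": 1, "cgpa": 7.5}, {"id": 2, "cgpa": None}, {"id": 1, "cgpa": 7.5}]
stbd = [{"id": 3, "cgpa": 6.0}]
print(preprocess(sa, stbd))   # the null row and the duplicate are gone
```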
-
Mutual Intensive Feature Selection (MIFS)
The intensive form of relational feature selection is based on Mutual Intensive Feature Selection (MIFS). It estimates the mutual response of differential features related to academic and disorder feature dependencies. Accuracy is improved by extracting disorder-related features from the student attributes and grouping them, by similarity of learning strategies, using the average mean rate of the threshold value. To improve the clustering performance, an efficient nested integrated feature is presented. MIFS groups the relational features by searching the scaling values based on marginal weights.
Pseudocode: Mutual Intensive Feature Selection (MIFS)
Input: Preprocessed Rds, bipolar rate range (Br), learning rate (Lr)
Output: Mutual features (MIFS)
Start
Step 1: Prepare the feature selection process from Rds.
Step 2: Compute the Sa learning-class intensive rate of Rds.
        For all academic dataset records Rds(Sa in Dx(Lr)):
          extract the Lr margin class from Dx;
          create the Lr index margin classes Mc1, Mc2, ...;
        End for
        Return Lr(Fi(Mc)).
Step 3: Compute the Br bipolar-disorder-class intensive rate of Rds.
        For all records Rds(StBd in Dx(Br)):
          extract the Br margin class from Dx;
          create the Br index margin classes BMc1, BMc2, ...;
        End for
        Return Br(Fi(BMc)).
Step 4: Process the intensive cooperative features from both classes: Cif(Lr).
Step 5: Assimilate the Bayesian decision tree intensification over the range of values (Cif); mutual intensive rate Mir = Nodes(Btree).
        Generate a cluster index to group the constraint weights Cif(Mir).
        For each decision for F features among i mean-relevance features:
          run RnF(i) on each decision with marginal weights and features;
          compute the Br-rate and Lr-rate mean depth Cif(Mir);
          extract the correspondence feature weight Cfs from the Btree node depth;
          return the Cfs closest to F(i) at the sum of mean-depth relevance features;
        End for
Step 6: For all Cfs feature relevances (Frc):
          if Frc-Ids attain the relative margin cluster:
            update Frc-Ids(Cfs) into the relative index feature list;
            return the Frc-Ids relative cluster;
          End if
        End for
Stop
The above pseudocode characterizes a response function that discriminates the sustainable dimensions and groups them into clustering structures, predicting data in each iteration for use in the next iteration of semantic relation. This relevantly measures the semantic similarity, i.e., the closeness within each cluster group.
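The MIFS procedure itself is specific to this paper. As a sketch of the underlying idea of mutual feature dependence, features can be ranked by their mutual information with the class label and the top-ranked ones kept; the pure-Python estimate below for discrete features is our illustration, not the paper's algorithm.

```python
# Rank discrete features by mutual information with the class label.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """I(X;Y) = sum over (x,y) of p(x,y) * log2(p(x,y) / (p(x)p(y)))."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

labels = ["pass", "pass", "fail", "fail"]
features = {
    "study_time": [2, 2, 0, 0],   # perfectly informative about the label
    "gender":     [0, 1, 0, 1],   # carries no information here
}
ranked = sorted(features,
                key=lambda f: mutual_information(features[f], labels),
                reverse=True)
print(ranked)
```

With these toy values, "study_time" has mutual information 1 bit and "gender" has 0, so "study_time" ranks first.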
-
Soft Max Segmented Decision Tree
The decision nodes are modulated by logical conditions that evaluate the Frc-Ids features marginalized as cluster groups. This creates logical neurons for training the conditional feature weights, which spread according to the selective importance of each feature. The softmax function generalizes the sigmoid function to a multi-class setting, with the feature limits set by threshold margins. It is popularly used in the final layer of multi-class classification and is here adapted to decision trees.
The activation function trained is the softmax f(z)i = exp(zi) / sum over j = 1..k of exp(zj), where f(z) remains within the integrative margins on Sa and StBd. The logistic activation of each neuron is trained with the intra-class logistic transformation of the feature weights, and the training function iterates until the neurons converge to the closest mean value.
It takes a vector of k real numbers (Frc-Ids) and normalizes it into a probability distribution of k probabilities proportional to the exponentials of the input numbers within the feature limits. Based on the feature limits, the DNN is constructed to train the features in multi-propagation and fix the weights.
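The softmax normalization just described can be written directly; the max-subtraction below is a standard numerical-stability trick, not part of the paper's formulation.

```python
# Softmax: map a vector of k real scores to a probability distribution
# over k classes, proportional to the exponentials of the scores.
from math import exp

def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)   # sums to 1; the largest score gets the largest probability
```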
-
Deep Multi-Objective Propagation Network (DMOPN)
The marginalized Frc-Ids features are fed into the classifier based on marginal weightage; the hidden layer applies the softmax neural layer to predict the result and accuracy, reducing the computational complexity of the traditional algorithms. The Deep Multi-Objective Propagation Network (DMOPN) finds the weightage accuracy of the closest data points to predict relational categories, using the high dependence of the bipolar-disorder feature weight with higher-end support values to categorize the class. Students predicted to rank low can then be motivated to improve their performance.
Pseudocode: DMOPN
Input: Processed feature limits Frc-Ids
Output: Predicted class by student performance category
Start
Step 1: Build the feed-forward perceptron network FF-NN (each node n).
Step 2: Compute the cluster feature climates (Fcl) from Frc-Ids as Fcl1, Fcl2, ...;
        process the ordered feature list Ord{L1, L2, ...} into the input layer.
Step 3: Compute the decision-classifier neural counter on the hidden layer.
        Initialize i = 1 and j = 1 as the asset count.
        For i = 0 to n, fixing the marginal-weightage class as the threshold class:
          compute the coefficient value of (i, j) as a random point p;
          spread the cluster feature value Scv from i to j, initialized to the definite threshold margin P;
          set the controlling fitness value P for each trained node.
        End for
Step 4: Evaluate the closest feature weights and apply softmax to each logical node over the i and j iterations.
Step 5: While the termination criteria are not satisfied and the gain G from fitness satisfies G < MaxGeneration, do:
Step 6:   For each condition, evaluate the feature weight and check whether the gain value has reached the maximum generated feature threshold.
          If G < Gmax:
            For i = 1 to n (all n tasks with spread features):
              For j = 1 to n:
                if Xi < Xj, move task i towards j and return the class;
              End for
              check the weightage correlation along the neurons;
              compute the marginal weight of the class to split the category by the neighbor class;
            End for
          End if
          Return the threshold-margined class.
Step 7: End while.
Step 8: Return the categorized class.
Stop
Finally, the predicted classes are projected onto a multi-data-point closeness measure, and the weighted data points are centralized to their neighbor clusters, which improves the classification accuracy. The multi-data-point perceptron model is a feed-forward classifier that maps a high-quality dataset to the appropriate outputs.
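A minimal forward pass of such a feed-forward network with a softmax output layer is sketched below. The weights are fixed toy values for illustration, not trained DMOPN parameters.

```python
# Forward pass: input features -> tanh hidden layer -> softmax output layer.
from math import exp, tanh

def forward(x, w_hidden, w_out):
    hidden = [tanh(sum(wi * xi for wi, xi in zip(row, x)))
              for row in w_hidden]
    scores = [sum(wi * hi for wi, hi in zip(row, hidden))
              for row in w_out]
    m = max(scores)                          # stable softmax
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

x = [0.8, 0.2]                        # two input features
w_hidden = [[1.0, -1.0], [0.5, 0.5]]  # two hidden neurons
w_out = [[1.0, 0.0], [-1.0, 1.0]]     # two output classes
probs = forward(x, w_hidden, w_out)
print(probs.index(max(probs)))        # index of the predicted class
```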
-
-
RESULTS AND DISCUSSION
-
The results are processed in Jupyter Notebook, an open-source platform that permits us to create and share documents containing live code, equations, visualizations, and narrative text. It is used for data cleaning, transformation, numerical simulation, statistical modeling, visualization, machine learning, etc. From the attributes given above, the student analysis.arff file was created and loaded into Jupyter Notebook. The specified attributes influence the academic performance of the students. The results of the classification model are reported with measures such as Accuracy, Precision, Recall, F1 Measure, Sensitivity, and Specificity, estimated with a confusion matrix.
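The measures listed above all derive from the confusion matrix. A sketch for the binary case follows; the counts are invented for illustration.

```python
# Derive the standard evaluation measures from binary confusion-matrix counts:
# tp = true positives, tn = true negatives, fp = false positives, fn = false negatives.

def metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # also called sensitivity / TPR
    specificity = tn / (tn + fp)          # also called TNR
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

print(metrics(tp=40, tn=45, fp=5, fn=10))
```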
Table 2 Extraction of required features from the dataset
The results for CGPA Grade, as a function of psychological motivation sessions attended, show that rural students perform better than urban students.
Figure 2 Analysis of CGPA concerning Psychological motivation session attended
Figure 3 Analysis of Drug usage in Weekdays, Weekends and the Leave Logs
Figure 3 shows that most of the students do not use drugs; a few students use drugs once or twice a week, which also leads them to take leave.
Figure 4 Analysis of the Grades obtained by the students
From the results obtained, it is clear that more students obtained first class than the other grade classes.
Figure 5 Performance Analysis of Rural and Urban Students
Figure 6 Analysis of Study Hours of the Students
The present study confirms that an average of five hours of study helps students score good grades. It also reveals that study hours alone do not improve grades; full focus during study matters more than the number of hours.
Figure 7 Analysis of the Extra-curricular activities of the students
Another promising finding is that students with extracurricular activities perform better than those without.
Figure 8 Analysis of Psychological Motivation Sessions attended
The results above reveal that psychological motivation sessions improve students' performance in terms of CGPA. The performance evaluation is carried out in a Windows environment, which hosts Jupyter. The classification results use precision rate, recall rate, classification accuracy, and time complexity as parameters to establish the resulting factors.
[Figure: classification accuracy (%) of MFPG, quad technique, GA, SVM, BPPAM, TDP-HIF, PLF-ISRDNN, HSDNN, and DFSSC over 500 to 2500 records]
Figure 9 Performance of classification accuracy
Classification accuracy is assessed by evaluating the sensitivity and specificity of the test records produced by the different methods. The proposed DFSSC system shows the best performance, as seen in figure 9. The estimation is based on the positive and negative values of the true/false optimized categories.
Table 3 Performance of classification accuracy (impact of classification accuracy in %)

| No. of records | MFPG | Quad | GA | SVM | BPPAM | TDP-HIF | PLF-ISRDNN | HSDNN | DFSSC |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 86.1 | 87.8 | 89.5 | 91.2 | 92.1 | 93.1 | 94.3 | 94.9 | 95.2 |
| 1000 | 87.4 | 88.2 | 91.4 | 92.6 | 93.4 | 94.2 | 95.2 | 95.4 | 95.8 |
| 1500 | 88.9 | 89.4 | 91.8 | 93.1 | 95.2 | 96.1 | 96.3 | 96.7 | 97.1 |
| 2000 | 89.2 | 90.8 | 92.5 | 93.8 | 94.2 | 95.3 | 96.4 | 96.9 | 97.3 |
| 2500 | 90.6 | 91.4 | 93.1 | 94.1 | 94.8 | 95.6 | 96.7 | 97.1 | 97.9 |
Table 3 shows the classification accuracy produced by the different methods: the Genetic Analysis (GA) technique reaches 90.6%, the quad technique 91.4%, and SVM a lower 94.1%. The proposed DFSSC system produces higher performance than the previous systems, up to 97.9%.
Figure 10 Performance of sensitivity analysis
The sensitivity analysis is shown in figure 10. The proposed DFSSC system produces higher results than any other method: the existing SVM method reaches only 93.4%, while the proposed system achieves a 96.8% sensitivity rate.
Table 4 Performance of sensitivity analysis (impact of sensitivity in %)

| No. of records | MFPG | Quad | GA | SVM | BPPAM | TDP-HIF | PLF-ISRDNN | HSDNN | DFSSC |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 84.4 | 87.1 | 89.2 | 89.8 | 91.5 | 92.7 | 92.8 | 93.2 | 93.6 |
| 1000 | 85.8 | 86.5 | 90.2 | 90.6 | 92.1 | 93.2 | 93.8 | 94.1 | 94.5 |
| 1500 | 86.2 | 87.6 | 91.2 | 91.8 | 92.8 | 93.4 | 94.2 | 94.6 | 94.5 |
| 2000 | 86.9 | 88.9 | 91.7 | 92.1 | 93.1 | 94.1 | 95.3 | 95.3 | 95.7 |
| 2500 | 87.4 | 89.6 | 92.6 | 93.4 | 94.3 | 95.6 | 96.1 | 96.4 | 96.8 |
Table 4 shows the sensitivity rates produced by the different methods; the proposed DFSSC system produces a higher sensitivity rate than the other methods. By definition, sensitivity is derived from the confusion matrix as the true positives divided by the sum of true positives and false negatives.
Figure 11 Performance of Specificity
Figure 11 contrasts the specificity achieved by the different approaches; the proposed DFSSC method produces higher performance than the other methods.
Table 5 Performance of specificity (impact of specificity in %)

| No. of records | MFPG | Quad | GA | SVM | BPPAM | TDP-HIF | PLF-ISRDNN | HSDNN | DFSSC |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 82.3 | 87.3 | 89.3 | 90.5 | 91.3 | 92.1 | 92.8 | 92.9 | 93.2 |
| 1000 | 83.8 | 87.6 | 91.2 | 91.6 | 91.8 | 92.6 | 93.8 | 94.1 | 94.8 |
| 1500 | 84.2 | 88.5 | 92.6 | 92.7 | 92.8 | 93.1 | 94.2 | 94.7 | 95.3 |
| 2000 | 85.3 | 88.9 | 92.8 | 92.9 | 93.2 | 94.2 | 95.3 | 95.6 | 95.9 |
| 2500 | 86.3 | 90.2 | 93.5 | 93.8 | 94.5 | 95.1 | 96.1 | 96.7 | 97.2 |
Specificity depends on the confusion matrix: it is the rate of true negatives among all actual negatives, that is, true negatives divided by the sum of true negatives and false positives. Table 5 contrasts the specificity rates produced by the different methods.
Figure 12 Performance of F-measure
The F-measure here reflects the false-classification rate derived from the sensitivity and specificity rates at the mean absolute error value. Figure 12 shows the proposed DFSSC false rate as the F-measure value, which is lower than that of SVM and the previous systems.
Table 6 Performance of F-measure (comparison of F-measure in %)

| No. of records | MFPG | Quad | GA | SVM | BPPAM | TDP-HIF | PLF-ISRDNN | HSDNN | DFSSC |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 7.2 | 6.9 | 6.2 | 5.8 | 5.1 | 4.8 | 4.3 | 4.1 | 3.9 |
| 1000 | 7.8 | 7.1 | 6.6 | 6.2 | 5.3 | 5.1 | 4.5 | 4.3 | 4.1 |
| 1500 | 8.3 | 7.5 | 7.1 | 6.8 | 6.4 | 5.3 | 5.2 | 4.8 | 4.4 |
| 2000 | 8.6 | 8.2 | 7.5 | 6.9 | 6.6 | 5.7 | 5.4 | 5.1 | 4.8 |
| 2500 | 9.2 | 8.7 | 8.1 | 7.6 | 7.1 | 5.9 | 5.5 | 5.3 | 5.1 |
Table 6 contrasts the DFSSC false-classification ratio with the other methods and shows that the suggested method yields a lower F-measure value.
Figure 13 Performance of time complexity
Time complexity here refers to the execution rate, O(n), over the number of records classified in the recommended setting. Figure 13 shows the time taken by DFSSC, in seconds, to classify the result.
Table 7 Performance of time complexity (impact of time complexity in seconds)

| No. of records | MFPG | Quad | GA | SVM | BPPAM | TDP-HIF | PLF-ISRDNN | HSDNN | DFSSC |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 4.6 | 4.2 | 3.9 | 3.7 | 3.6 | 3.1 | 2.9 | 2.8 | 2.7 |
| 1000 | 5.2 | 5.1 | 4.8 | 4.7 | 4.5 | 3.8 | 3.4 | 3.3 | 3.1 |
| 1500 | 6.5 | 5.8 | 5.3 | 5.1 | 4.9 | 4.2 | 3.8 | 3.6 | 3.4 |
| 2000 | 7.2 | 6.7 | 6.2 | 5.7 | 5.1 | 4.9 | 4.3 | 4.1 | 3.8 |
| 2500 | 7.6 | 7.2 | 6.9 | 6.3 | 5.8 | 5.1 | 4.6 | 4.3 | 4.1 |
Table 7 reports the execution time of the different methods. The proposed DFSSC classifies 2500 records in 4.1 seconds, lower than the conventional methods; SVM, for example, takes 6.3 seconds on the same workload.
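The SVM and DFSSC columns of Table 7 can be compared directly; the sketch below (values transcribed from the table) prints the speed-up at each record count:

```python
# SVM and DFSSC columns of Table 7, transcribed from the paper (seconds).
records = [500, 1000, 1500, 2000, 2500]
svm = [3.7, 4.7, 5.1, 5.7, 6.3]
dfssc = [2.7, 3.1, 3.4, 3.8, 4.1]

for n, s, d in zip(records, svm, dfssc):
    print(f"{n} records: SVM {s}s, DFSSC {d}s, speed-up {s / d:.2f}x")
```

DFSSC is faster at every batch size, with the margin widening as the number of records grows.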
5. CONCLUSION
This paper presented prediction models for the academic performance of students with bipolar disorder based on attributes such as address, study_hours, extra_curriculum, internet_access, family_interaction, drug_use_weekdays, drug_use_weekends, leave_log, CGPA, psychological motivation sessions attended, and CGPA_Grade. The study is limited to students of Jamal Mohamed College and a few other colleges. The observations show that the decision tree classifier performs better than the other algorithms applied in this research. The performance of the Deep Multi-Objective Propagation Network (DMOPN) was compared using evaluation measures such as accuracy, precision, recall, F1 measure, sensitivity, and specificity, along with the true positive, true negative, false positive, and false negative rates. The proposed Deep Featured Spectral Scaling Classifier (DFSSC) produces high performance and is recommended to educational institutions for predicting the academic performance of college students affected with bipolar disorder.
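All of the evaluation measures listed above derive from the four confusion-matrix counts. The following sketch shows the relationships; the counts used are made-up illustrative values, not the paper's experimental results:

```python
def evaluation_measures(tp, tn, fp, fn):
    # Every listed metric is a ratio of the four confusion-matrix counts.
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),           # sensitivity / TPR
        "f1": 2 * tp / (2 * tp + fp + fn),  # harmonic mean of precision, recall
        "specificity": tn / (tn + fp),      # TNR
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }

m = evaluation_measures(tp=80, tn=90, fp=10, fn=20)
print(round(m["accuracy"], 2), round(m["f1"], 3))  # → 0.85 0.842
```

Note that FPR = 1 - specificity and FNR = 1 - recall, so the four rates are pairwise complementary.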
REFERENCES
[1] M. S. Mythili and A. R. Mohamed Shanavas, "An Analysis of Students' Performance using Classification Algorithms", IOSR Journal of Computer Engineering, vol. 16, issue 1, ver. III, pp. 63-69.
[2] Q. Al-Radaideh, E. Al-Shawakfa and M. Al-Najjar, "Mining Student Data Using Decision Trees", International Arab Conference on Information Technology (ACIT'2006) Conference Proceedings, 2006.
[3] S. Ayesha, T. Mustafa, A. Sattar and I. Khan, "Data Mining Model for Higher Education System", European Journal of Scientific Research, vol. 43, no. 1, pp. 24-29, 2010.
[4] B. Baradwaj and S. Pal, "Mining Educational Data to Analyze Students' Performance", International Journal of Advanced Computer Science and Applications, vol. 2, no. 6, pp. 63-69, 2011.
[5] E. Chandra and K. Nandhini, "Knowledge Mining from Student Data", European Journal of Scientific Research, vol. 47, no. 1, pp. 156-163, 2010.
[6] A. El-Halees, "Mining Students' Data to Analyze Learning Behavior: A Case Study", International Arab Conference on Information Technology (ACIT'2008) Conference Proceedings, University of Sfax, Tunisia, Dec. 15-18, 2008.
[7] J. Han and M. Kamber, Data Mining: Concepts and Techniques, 2nd edition, The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor, 2006.
[8] V. Kumar and A. Chadha, "An Empirical Study of the Applications of Data Mining Techniques in Higher Education", International Journal of Advanced Computer Science and Applications, vol. 2, no. 3, pp. 80-84, 2011.
[9] M. O. Mansur, M. Sap and M. Noor, "Outlier Detection Technique in Data Mining: A Research Perspective", Postgraduate Annual Research Seminar, 2005.
[10] C. Romero and S. Ventura, "Educational Data Mining: A Survey from 1995 to 2005", Expert Systems with Applications, vol. 33, pp. 135-146, 2007.
[11] Q. A. Al-Radaideh, E. W. Al-Shawakfa and M. I. Al-Najjar, "Mining Student Data Using Decision Trees", International Arab Conference on Information Technology (ACIT'2006), Yarmouk University, Jordan, 2006.
[12] U. K. Pandey and S. Pal, "A Data Mining View on Classroom Teaching Language", International Journal of Computer Science Issues (IJCSI), vol. 8, issue 2, pp. 277-282, ISSN 1694-0814, 2011.
[13] S. Ayesha, T. Mustafa, A. R. Sattar and M. I. Khan, "Data Mining Model for Higher Education System", European Journal of Scientific Research, vol. 43, no. 1, pp. 24-29, 2010.
[14] M. Bray, The Shadow Education System: Private Tutoring and Its Implications for Planners, 2nd ed., UNESCO, Paris, France, 2007.
[15] B. K. Bharadwaj and S. Pal, "Data Mining: A Prediction for Performance Improvement Using Classification", International Journal of Computer Science and Information Security (IJCSIS), vol. 9, no. 4, pp. 136-140, 2011.
[16] J. R. Quinlan, "Induction of Decision Trees", Machine Learning, vol. 1, pp. 81-106, 1986.
[17] S. Vashishta, "Efficient Retrieval of Text for Biomedical Domain using Data Mining Algorithm", International Journal of Advanced Computer Science and Applications (IJACSA), vol. 2, no. 4, pp. 77-80, 2011.
[18] M. Silva Guerra, H. Asses Neto and S. Azevedo Oliveira, "A Case Study of Applying the Classification Task for Students' Performance Prediction", IEEE Latin America Transactions, vol. 16, no. 1, pp. 172-177, 2018.
[19] A. Polyzou and G. Karypis, "Feature Extraction for Next-Term Prediction of Poor Student Performance", IEEE Transactions on Learning Technologies, vol. 12, no. 2, pp. 237-248, 2019.
[20] G. Azar et al., "Intelligent Data Mining and Machine Learning for Mental Health Diagnosis Using Genetic Algorithm", IEEE International Conference on Electro/Information Technology, pp. 201-206, 2015.
[21] Y. Yao et al., "Using Linguistic and Topic Analysis to Classify Subgroups of Online Depression Communities", Multimedia Tools and Applications, vol. 76, no. 8, pp. 10653-10676, 2019.
[22] S. Lai, B. Sun, F. Wu and R. Xiao, "Automatic Personality Identification Using Students' Online Learning Behavior", IEEE Transactions on Learning Technologies, vol. 13, no. 1, pp. 26-37, Jan.-March 2020, DOI: 10.1109/TLT.2019.2924223.
[23] R. Castaldo et al., "Detection of Mental Stress due to Oral Academic Examination via Ultra-Short-Term HRV Analysis", Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), pp. 3805-3808, 2016.
[24] Z. Xu, H. Yuan and Q. Liu, "Student Performance Prediction Based on Blended Learning", IEEE Transactions on Education, vol. 64, no. 1, pp. 66-73, Feb. 2021, DOI: 10.1109/TE.2020.3008751.
[25] Q. Liu et al., "EKT: Exercise-Aware Knowledge Tracing for Student Performance Prediction", IEEE Transactions on Knowledge and Data Engineering, vol. 33, no. 1, pp. 100-115, Jan. 2021, DOI: 10.1109/TKDE.2019.2924374.
[26] S. Peerbasha and M. Mohamed Surputheen, "Prediction of Academic Performance of College Students with Bipolar Disorder using different Deep Learning and Machine Learning Algorithms", International Journal of Computer Science and Network Security, vol. 21, no. 7, pp. 350-358.