Manuscript Title:

DESIGN AND DEVELOPMENT OF HYBRID PRINCIPAL COMPONENT ANALYSIS (HPCA) ALGORITHM FOR ACADEMIC PERFORMANCE PREDICTION

Author:

Chitra Mehra, Dr. Rashmi Agrawal

DOI Number:

DOI:10.17605/OSF.IO/SZUAD

Published : 2021-06-23

About the author(s)

1. Chitra Mehra - Assistant Professor, Manav Rachna International Institute of Research and Studies, Faridabad.
2. Dr. Rashmi Agrawal - Professor, Manav Rachna International Institute of Research and Studies, Faridabad.

Abstract

Data mining has been widely applied in business since its inception, and many fields use its techniques for knowledge discovery and strategic decision-making. More recently, emerging areas such as education have also used data mining successfully to discover meaningful patterns in large pools of data. Predicting students' academic performance is a primary concern of academic institutions, and educational data mining (EDM) is used to achieve this. EDM is gaining popularity among researchers worldwide because of its importance to society. Various information technologies are used to handle the complexity of the large volumes of data that educational institutions generate. Many researchers use machine learning to mine knowledge from educational databases in order to improve student and instructor performance. The most challenging task in building a prediction model is selecting an efficient technique that produces satisfactory results. This paper introduces a hybrid principal component analysis (HPCA) algorithm that combines principal component analysis with four machine learning (ML) algorithms, namely random forest (RF), support vector machine (SVM), naïve Bayes (NB) from the Bayesian network family, and C5.0 from the decision tree (DT) family, with the aim of improving classification performance. We evaluated the proposed model on three datasets taken from Kaggle. The assessment metrics are classification accuracy, root mean square error (RMSE), precision, and recall, and 10-fold cross-validation is applied to these datasets to evaluate predictive performance. The proposed algorithm produced satisfactory prediction results, which suggests that HPCA is an effective method for predicting academic performance.
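
The evaluation protocol named in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. It is not the authors' implementation: the abstract does not state which software was used or how RMSE was computed for class labels. The function names `k_fold_splits` and `evaluate`, the 0/1 label encoding, the contiguous (unshuffled) folds, and the toy labels are all assumptions made for illustration.

```python
import math


def k_fold_splits(n_samples, k=10):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Hypothetical helper: folds are not shuffled or stratified here.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, RMSE, precision, and recall for labels encoded as 0/1.

    Assumption: RMSE is taken over the numeric 0/1 labels.
    """
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / n,
        "rmse": math.sqrt(sum((t - p) ** 2 for t, p in pairs) / n),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
scores = evaluate(y_true, y_pred)
splits = list(k_fold_splits(25, k=10))
```

In a full pipeline, each classifier would be trained on the PCA-transformed `train_idx` rows of a fold and scored on its `test_idx` rows, with the four metrics averaged over the ten folds.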


Keywords

Student Performance, Machine Learning Algorithms, K-Fold Cross Validation, Principal Component Analysis, Support Vector Machine, Random Forest, Naïve Bayes