What is it about?

A comparative analysis of ensemble autoML machine learning prediction accuracy of STEM student grade prediction: a multi-class classification prospective

Why is it important?

This study addresses a research gap: how to select an appropriate AutoML model for multiclass classification of student grades, with the aim of predicting bachelor's degree outcomes. It evaluates the predictive accuracy of an ensemble AutoML approach for STEM students, using their academic records from high school through the internal assessments of the bachelor's degree. Of 78 recommended AutoML models, nine were fine-tuned and cross-validated to identify optimal hyperparameters and assess performance. The GBM_4_AutoML_1 model ranked first with a reported score of 0.28 (28%), followed by StackedEnsemble_BestOfFamily_5 (31%), DRF (28%), XRT (30%), DeepLearning_grid (56%), and GLM (35%). In the confusion matrices, the optimized GBM model's predictions matched 100% of the actual student grades. Feature importance analysis was also used to explain classification performance and to highlight the key factors influencing grade prediction.

Perspectives

In the early stages of research, statisticians aim to estimate the relationship between dependent and independent variables in order to improve classification. Each independent variable may correlate negatively or positively with the target feature, with varying degrees of reliability. This paper addresses the research gap in selecting an appropriate AutoML model for multiclass classification of student grades, with the goal of predicting bachelor's degree grades. It assesses the predictive accuracy of an ensemble AutoML (Automated Machine Learning) model for the letter grades of science, technology, engineering, and management students. The assessment draws on subject grades from high school through the internal evaluation of the bachelor's degree to predict final bachelor's degree outcomes, where the target variable is a multiclass letter grade under a modern grading system. The ensemble AutoML approach is used to forecast upcoming grades.

Nine of the 78 recommended AutoML models were fine-tuned and cross-validated, first to find the best hyperparameters and then to assess performance with those hyperparameters. The study compared the models on accuracy, prediction error rate, and misclassification between training and predicted values. The GBM_4_AutoML_1 model ranked first with a score of 0.28 (28%), followed by StackedEnsemble_BestOfFamily_5 at 0.31 (31%), DRF at 0.28 (28%), XRT at 0.30 (30%), DeepLearning_grid at 0.56 (56%), and GLM at 0.35 (35%). In the confusion matrices, the optimized GBM model showed a 100% match between true and predicted student grades. The scoring history of each model guided the tuning of the recommended hyperparameters toward the best model. Finally, the feature importance of the dependent and independent features was analyzed with respect to the true, predicted, and internal multiclass grades of STEM students, and the models were contrasted to give a detailed explanation of their respective performance.
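
As an illustration of how accuracy, misclassification rate, and a confusion matrix relate in a multiclass grading problem, the following is a minimal Python sketch. The grade labels and predictions are hypothetical and are not data from the study, and the helper names (`confusion_matrix`, `accuracy`, `per_class_error`) are illustrative rather than taken from the AutoML tooling used in the paper.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Build a matrix whose rows are true grades and columns are predicted grades."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(matrix):
    """Share of all students whose predicted grade equals the true grade."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_error(matrix, classes):
    """Misclassification rate for each true grade (off-diagonal share of its row)."""
    errors = {}
    for i, label in enumerate(classes):
        row_total = sum(matrix[i])
        errors[label] = (row_total - matrix[i][i]) / row_total if row_total else 0.0
    return errors


# Hypothetical letter grades for ten students (assumption, not study data).
classes = ["A", "B", "C"]
true_grades = ["A", "A", "B", "B", "B", "C", "C", "A", "C", "B"]
pred_grades = ["A", "B", "B", "B", "C", "C", "C", "A", "B", "B"]

matrix = confusion_matrix(true_grades, pred_grades, classes)
acc = accuracy(matrix)
error_rate = 1 - acc  # overall misclassification rate
class_errors = per_class_error(matrix, classes)

for label, row in zip(classes, matrix):
    print(label, row)
print("accuracy:", round(acc, 2), "error rate:", round(error_rate, 2))
print("per-class error:", class_errors)
```

A model whose predictions match every true grade would have all counts on the diagonal of this matrix, giving an accuracy of 1.0 and an error rate of 0.0; a lower error rate therefore indicates a better model, while a lower accuracy indicates a worse one.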

Dr. YAGYANATH RIMAL
Pokhara University

Read the Original

This page is a summary of: A comparative analysis of ensemble autoML machine learning prediction accuracy of STEM student grade prediction: a multi-class classification prospective, Multimedia Tools and Applications, March 2025, Springer Science + Business Media,
DOI: 10.1007/s11042-024-20554-8.
