Data driven decisions in education using a comprehensive machine learning framework for student performance prediction

  • Muhammad Nadeem Gul
  • Waseem Abbasi*
  • Muhammad Zeeshan Babar*
  • Abeer Aljohani
  • Muhammad Arif

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Accurately predicting student performance is essential for improving educational outcomes and guiding targeted interventions. This study applies eight machine learning models (Decision Trees, Random Forest, Lasso, K-Nearest Neighbors, XGBoost, CatBoost, AdaBoost, and Gradient Boosting) to predict student performance from demographic and academic features. Among these, CatBoost achieved the highest accuracy (87.46%) and the lowest error rates, outperforming Gradient Boosting (87.28%) and Decision Trees (82.42%). Models were evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE), demonstrating the robustness of the proposed approach. The results highlight the effectiveness of data-driven methods for the early identification of at-risk students, enabling educators to implement personalized learning strategies. This study underscores the potential of machine learning in education to support more adaptive and student-centered learning environments.
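
For readers unfamiliar with the three error metrics named in the abstract, the following is a minimal sketch of how MAE, MSE, and RMSE are commonly computed. It is not the authors' code: the function name `regression_errors` and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    # Residual = actual value minus predicted value
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error
    return mae, mse, rmse


# Hypothetical exam scores (illustrative only, not data from the study)
actual = [70.0, 85.0, 60.0, 90.0]
predicted = [72.0, 80.0, 65.0, 90.0]
mae, mse, rmse = regression_errors(actual, predicted)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}")
```

Lower values indicate better predictions on all three metrics. Because MSE and RMSE square the residuals, they penalize large individual errors more heavily than MAE does, and RMSE is expressed in the same units as the predicted score.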
Original language: English
Article number: 153
Journal: Discover Computing
Volume: 28
Publication status: Published - 18 Jul 2025

Keywords

  • Machine learning
  • XG-Boost
  • Gradient Boosting
  • Common regression metrics (MAE, MSE and RMSE)
  • Ada-Boost
  • Decision Trees
  • Random Forest
  • Cat-Boosting
  • K-Neighbours
  • Performance prediction
  • Lasso
