Abstract
Accurately predicting student performance is essential for improving educational outcomes and guiding targeted interventions. This study applies eight machine learning models (Decision Trees, Random Forest, Lasso, K-Nearest Neighbors, XGBoost, CatBoost, AdaBoost, and Gradient Boosting) to predict student performance from demographic and academic features. Among these, CatBoost achieved the highest accuracy (87.46%) and the lowest error rates, outperforming Gradient Boosting (87.28%) and Decision Trees (82.42%). Models were evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE), demonstrating the robustness of the proposed approach. The results highlight the effectiveness of data-driven methods for early identification of at-risk students, enabling educators to implement personalized learning strategies. This study underscores the transformative potential of machine learning in education, paving the way for more adaptive and student-centered learning environments.
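
The three error metrics named in the abstract can be illustrated with a minimal sketch. The function name `regression_errors` and the sample scores are hypothetical, invented here for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residual = actual value minus predicted value, one per student.
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)  # mean absolute error
    mse = sum(r * r for r in residuals) / len(residuals)   # mean squared error
    rmse = math.sqrt(mse)                                  # square root of MSE
    return mae, mse, rmse


# Hypothetical exam scores, made up for this example (not data from the study).
actual = [70.0, 85.0, 60.0, 90.0]
predicted = [72.0, 80.0, 65.0, 90.0]

mae, mse, rmse = regression_errors(actual, predicted)
print(f"MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.2f}")
```

Lower values indicate a closer fit on all three metrics. MSE and RMSE penalize large individual errors more heavily than MAE, which is one reason the three are often reported together.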
| Original language | English |
|---|---|
| Article number | 153 |
| Journal | Discover Computing |
| Volume | 28 |
| DOIs | |
| Publication status | Published - 18 Jul 2025 |
Keywords
- Machine learning
- XGBoost
- Gradient Boosting
- Common regression metrics (MAE, MSE and RMSE)
- AdaBoost
- Decision Trees
- Random Forest
- CatBoost
- K-Nearest Neighbors
- Performance prediction
- Lasso
Article title: Data driven decisions in education using a comprehensive machine learning framework for student performance prediction