STUDENT GRADUATION PREDICTION USING THE XGBOOST ALGORITHM

Authors

  • Syaiful Imron, Institut Teknologi Dan Bisnis PGRI Dewantara Jombang
  • Arbiati Faizah, Institut Teknologi Dan Bisnis PGRI Dewantara Jombang
  • Sugianto Sugianto, Institut Teknologi Dan Bisnis PGRI Dewantara Jombang

DOI:

https://doi.org/10.36080/skanika.v9i1.3647

Keywords:

classification, random forest, student graduation, XGBoost

Abstract

Student graduation times are often difficult to predict early, which is a major challenge for higher-education institutions. Manual evaluations frequently fail to identify at-risk students, so graduation delays go undetected, harming both students and institutions. This matters because study duration and on-time graduation are key criteria in assessing institutional accreditation and quality. To address this, this study developed a graduation prediction model using the XGBoost and Random Forest algorithms, with hyperparameters optimized through Grid Search Cross Validation. With default parameters, Random Forest outperformed XGBoost. After hyperparameter tuning, however, XGBoost surpassed Random Forest, with accuracy rising from 88.15% to 92.66% (precision 91.87%, recall 91.67%, F1-score 91.38%). This confirms that appropriate hyperparameter tuning is key to maximizing the effectiveness of classification models. The resulting model can therefore serve as a tool for institutions to monitor students and intervene early when graduation delays are likely.
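As a minimal sketch of the workflow described above, the Python snippet below tunes XGBoost and Random Forest classifiers with Grid Search Cross Validation and reports accuracy, precision, recall, and F1-score. The synthetic dataset (make_classification), the parameter grids, and the 80/20 split are illustrative assumptions only; they are not the authors' actual student data, features, or settings.

    # Illustrative sketch: Grid Search CV tuning of XGBoost vs. Random Forest.
    # Dataset and parameter grids are placeholders, not the paper's configuration.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
    from sklearn.model_selection import GridSearchCV, train_test_split
    from xgboost import XGBClassifier

    # Synthetic binary target as a stand-in: 1 = graduates on time, 0 = delayed.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    # Hypothetical search grids; the abstract does not list the exact grids used.
    xgb_grid = {
        "n_estimators": [100, 300],
        "max_depth": [3, 5, 7],
        "learning_rate": [0.05, 0.1, 0.3],
    }
    rf_grid = {
        "n_estimators": [100, 300],
        "max_depth": [None, 10, 20],
    }

    models = {
        "XGBoost": (XGBClassifier(eval_metric="logloss", random_state=42), xgb_grid),
        "Random Forest": (RandomForestClassifier(random_state=42), rf_grid),
    }

    for name, (estimator, grid) in models.items():
        # 5-fold Grid Search Cross Validation on the training split.
        search = GridSearchCV(estimator, grid, scoring="accuracy", cv=5, n_jobs=-1)
        search.fit(X_train, y_train)
        y_pred = search.best_estimator_.predict(X_test)
        print(
            f"{name}: best params={search.best_params_}, "
            f"accuracy={accuracy_score(y_test, y_pred):.4f}, "
            f"precision={precision_score(y_test, y_pred):.4f}, "
            f"recall={recall_score(y_test, y_pred):.4f}, "
            f"F1={f1_score(y_test, y_pred):.4f}"
        )

A real replication would substitute institutional records (for example, per-semester GPA and credits earned) for the synthetic features and adjust the grids accordingly.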




Published

2026-01-31

How to Cite

[1] S. Imron, A. Faizah, and S. Sugianto, “PREDIKSI KELULUSAN MAHASISWA MENGGUNAKAN ALGORITMA XGBOOST”, SKANIKA, vol. 9, no. 1, pp. 76–86, Jan. 2026.