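IMPLEMENTASI ALGORITMA MULTINOMIAL NAÏVE BAYES UNTUK MENDETEKSI TWEET UJARAN KEBENCIAN BAHASA INDONESIA TERHADAP PSSI
(Implementation of the Multinomial Naïve Bayes Algorithm to Detect Indonesian-Language Hate Speech Tweets Directed at PSSI)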

Authors

  • Riskiana Wulan, Universitas Budi Luhur
  • Indra Hertanto, Universitas Budi Luhur

DOI:

https://doi.org/10.36080/skanika.v8i1.3355

Keywords:

Hate speech, Text mining, PSSI, Multinomial Naïve Bayes, TF-IDF

Abstract

This study applies the Multinomial Naïve Bayes algorithm to detect hate speech in Indonesian tweets and tests its accuracy. According to the 2022 World Football Report, around 69% of Indonesia's population shows a high interest in football, creating a positive digital environment. The dataset consists of 2,210 tweets related to PSSI and politics taken from Twitter, manually labeled into three classes: non-HS (HS: hate speech), insults, and provocations. Before the dataset was divided into training and test data, an undersampling technique was applied to handle class imbalance, with the aim of ensuring a balanced distribution across the three classes. After undersampling, the training set consisted of 350 tweets and the test set of 88 tweets. Evaluation was carried out using the precision, recall, and F1-score metrics. The results indicate that the Multinomial Naïve Bayes algorithm obtained an accuracy of 62%. This result is expected to be useful for developing an effective and accurate hate speech detection model for social media platforms, especially Twitter, so that it can help raise the awareness of the Indonesian people about the dangers of the spread of hate speech.
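The steps named in the abstract (undersampling, Multinomial Naïve Bayes, per-class precision, recall, and F1-score) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy tweets, the label strings, the function names, the whitespace tokenizer, and the Laplace smoothing value are all assumptions made for illustration, and raw term counts stand in for the TF-IDF weighting listed in the keywords. As a consistency check on the reported figures: 350 + 88 = 438 tweets, which would correspond to 146 tweets per class if balancing was exact and to a test share of roughly 20%. Both are inferences from the abstract, not statements in it.

```python
import math
import random
from collections import Counter, defaultdict


def undersample(rows, seed=0):
    """Randomly cut every class down to the size of the smallest class."""
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    n_min = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], n_min))
    rng.shuffle(balanced)
    return balanced


def train_mnb(rows, alpha=1.0):
    """Fit Multinomial Naive Bayes on raw term counts (assumed alpha=1 smoothing)."""
    docs_per_class = Counter(label for _, label in rows)
    term_counts = defaultdict(Counter)
    for text, label in rows:
        term_counts[label].update(text.lower().split())
    vocab = {term for counts in term_counts.values() for term in counts}
    model = {}
    for label, n_docs in docs_per_class.items():
        denom = sum(term_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(rows))
        log_lik = {t: math.log((term_counts[label][t] + alpha) / denom) for t in vocab}
        model[label] = (log_prior, log_lik)
    return model


def predict(model, text):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    tokens = text.lower().split()
    scores = {}
    for label, (log_prior, log_lik) in model.items():
        scores[label] = log_prior + sum(log_lik[t] for t in tokens if t in log_lik)
    return max(scores, key=scores.get)


def per_class_metrics(y_true, y_pred):
    """Compute (precision, recall, F1) for each class from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = (precision, recall, f1)
    return report


# Invented toy data, NOT taken from the study's dataset.
toy_rows = [
    ("selamat timnas menang", "non_hs"),
    ("latihan timnas hari ini", "non_hs"),
    ("jadwal liga besok", "non_hs"),
    ("timnas main bagus", "non_hs"),
    ("pengurus bodoh", "insult"),
    ("pelatih bodoh payah", "insult"),
    ("ayo boikot liga", "provocation"),
    ("boikot federasi sekarang", "provocation"),
]

balanced_rows = undersample(toy_rows)      # two tweets per class remain
model = train_mnb(balanced_rows)
print(predict(model, "pelatih bodoh"))
print(per_class_metrics(["insult", "insult", "non_hs"], ["insult", "non_hs", "non_hs"]))
```

In practice the same pipeline is usually assembled from library components rather than written by hand; the hand-written version here only makes each step of the abstract's description visible.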

Published

2025-01-30

How to Cite

R. Wulan and I. Hertanto, “IMPLEMENTASI ALGORITMA MULTINOMIAL NAÏVE BAYES UNTUK MENDETEKSI TWEET UJARAN KEBENCIAN BAHASA INDONESIA TERHADAP PSSI”, SKANIKA, vol. 8, no. 1, pp. 193–203, Jan. 2025.