Performance Evaluation of BERT and RoBERTa in Analyzing Reviews of Corruption at Pertamina
DOI: https://doi.org/10.31849/nfrdk715

Keywords: Sentiment Analysis, BERT, RoBERTa, Pertamina Corruption, Public Opinion, Natural Language Processing

Abstract
Social media platforms such as Twitter are vital channels for public discourse on critical issues, including the Pertamina corruption case. Automated sentiment analysis is essential for gauging and understanding public opinion at scale. This study evaluates and compares the performance of two state-of-the-art Natural Language Processing (NLP) models, BERT and RoBERTa, on sentiment analysis of a dataset of Indonesian tweets concerning the Pertamina corruption case. The methodology comprises several stages: data collection from Twitter, text preprocessing for data cleaning, sentiment classification using fine-tuned BERT and RoBERTa models, and a comparative evaluation based on sentiment distribution and confidence scores. The results indicate that both models identified negative sentiment as the predominant public opinion. However, the RoBERTa model produced consistently higher confidence scores, suggesting stronger generalization and contextual understanding than BERT on this dataset. The study concludes that RoBERTa delivers more robust performance for analyzing the sentiment of complex and nuanced social media text.
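As an illustration of the classification-and-confidence stage described in the abstract, the sketch below shows how a fine-tuned Transformer checkpoint can be applied to Indonesian tweets using the Hugging Face transformers library, with the softmax probability of the predicted class taken as the confidence score. The checkpoint name, three-class label set, and example tweet are illustrative assumptions only; the paper's actual fine-tuned models and label scheme are not detailed in this abstract.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; the study's own fine-tuned BERT/RoBERTa weights would be used instead.
MODEL_NAME = "indobenchmark/indobert-base-p1"
LABELS = ["negative", "neutral", "positive"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))
model.eval()

def classify(texts):
    """Return (label, confidence) pairs, where confidence is the softmax probability."""
    enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    probs = torch.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return [(LABELS[i], float(c)) for i, c in zip(idx.tolist(), conf.tolist())]

# Example tweet (illustrative only).
print(classify(["Kasus korupsi Pertamina sangat mengecewakan publik."]))
```

Comparing BERT and RoBERTa under this scheme amounts to repeating the same inference loop with each fine-tuned checkpoint and comparing the resulting sentiment distributions and confidence-score distributions.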
