Hyperparameter Optimization of IndoBERT for DeepSeek User Sentiment Analysis

Authors

  • Andro Nicus Saragih, Universitas Lancang Kuning
  • Lisnawita*, Universitas Lancang Kuning

DOI:

https://doi.org/10.31849/digitalzone.v17i1.33024

Keywords:

IndoBERT, Sentiment Analysis, Hyperparameter Tuning, DeepSeek, NLP

Abstract

This study evaluates the performance of the IndoBERT model in classifying the sentiment of user reviews of the DeepSeek application on the Google Play Store. Reviews were categorized into three sentiment classes: positive, neutral, and negative. The dataset was collected by web scraping Indonesian-language reviews and passed through several preprocessing stages, including cleaning, stopword removal, and stemming. The study's contribution is a systematic comparison of two hyperparameter optimization methods, Grid Search and Random Search, under two data split schemes (60:20:20 and 80:20). In addition, oversampling and Focal Loss were applied to address class imbalance and improve classification of the neutral class. Experimental results show that the best performance was achieved using Grid Search with the 80:20 data split, yielding a testing accuracy of 80.40% and a macro F1-score of 70.85%. This configuration also produced a lower GAP value, indicating better model generalization and reduced overfitting. The findings demonstrate that appropriate hyperparameter optimization significantly improves IndoBERT performance on Indonesian sentiment analysis tasks.
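The difference between the two search strategies compared in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the search space values, the function names, and the `toy_score` stand-in (which replaces an actual IndoBERT fine-tuning run and its validation macro F1) are hypothetical and are not taken from the study.

```python
import itertools
import random

# Hypothetical search space: illustrative values only, not the authors' grid.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5],
    "batch_size": [16, 32],
    "epochs": [3, 5],
}


def grid_search_configs(space):
    """Grid Search: enumerate every combination of the listed values."""
    keys = list(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def random_search_configs(space, n_trials, seed=0):
    """Random Search: draw n_trials configurations, each value chosen independently."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_trials)]


def toy_score(config):
    """Stand-in for 'fine-tune the model, return a validation score' (assumption)."""
    lr_penalty = abs(config["learning_rate"] - 2e-5) * 1e5
    batch_penalty = abs(config["batch_size"] - 32) / 32
    return 0.01 * config["epochs"] - lr_penalty - batch_penalty


def select_best(configs, evaluate):
    """Return the configuration with the highest evaluation score."""
    return max(configs, key=evaluate)


grid = grid_search_configs(SEARCH_SPACE)          # 3 * 2 * 2 = 12 configurations
sampled = random_search_configs(SEARCH_SPACE, n_trials=5)
best_grid = select_best(grid, toy_score)
best_random = select_best(sampled, toy_score)
print(len(grid), best_grid)
print(len(sampled), best_random)
```

Grid Search pays for every combination and is guaranteed to find the best point on the grid, while Random Search evaluates a fixed budget of sampled points and may or may not hit that point. In a real experiment `toy_score` would be replaced by a training and evaluation run on the chosen data split.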

Published

2026-05-31

How to Cite

Saragih, A. N., & Lisnawita. (2026). Hyperparameter Optimization of IndoBERT for DeepSeek User Sentiment Analysis. Digital Zone: Jurnal Teknologi Informasi Dan Komunikasi, 17(1). https://doi.org/10.31849/digitalzone.v17i1.33024