APPLICATION OF THE LOGISTIC REGRESSION METHOD FOR SENTIMENT CLASSIFICATION ON A LIMITED TWITTER DATASET
DOI:
https://doi.org/10.31849/zn.v7i1.24804
Keywords:
Logistic Regression, Sentiment Classification, Twitter, TF-IDF
Abstract
Speed and accuracy are increasingly important in public sentiment analysis, especially on social media platforms such as Twitter, which are widely used to voice opinions on current issues. This study applies Logistic Regression to sentiment classification on a limited dataset of 300 samples labeled as positive, negative, or neutral. The case study examines public responses on Twitter to the appointment of Kaesang Pangarep as Chairman of the Partai Solidaritas Indonesia (PSI). External data on COVID-19 vaccination and on general (open) topics are added to improve classification. TF-IDF is used as the text representation, and Grid Search is used to tune the model's hyperparameters. Performance is evaluated with the F1-score, which combines precision and recall. The baseline reaches an F1-score of 40.83%, while the optimized model reaches 52.68% with an accuracy of 61.76% in the best experiment (C7). These results indicate that an optimized Logistic Regression model can classify sentiment from a limited dataset.
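As a rough illustration of the evaluation metric named in the abstract, the sketch below computes per-class precision, recall, and F1 for three sentiment labels, then averages the per-class F1 values. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the labels, toy predictions, and function names are hypothetical and not taken from the study.

```python
# Minimal sketch of F1-score evaluation for three-class sentiment labels.
# Assumption: macro averaging (unweighted mean of per-class F1); the
# abstract does not say which averaging the authors used.

def per_class_prf(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_prf(y_true, y_pred, lab)[2] for lab in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data, for illustration only (not the paper's dataset).
LABELS = ["positive", "negative", "neutral"]
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]

if __name__ == "__main__":
    for lab in LABELS:
        p, r, f = per_class_prf(y_true, y_pred, lab)
        print(f"{lab}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
    print(f"macro F1 = {macro_f1(y_true, y_pred, LABELS):.4f}")
    print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
```

On imbalanced classes, macro F1 and accuracy can diverge noticeably, which is one plausible reason a study would report both figures.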
License
CC BY-SA 4.0
Attribution-ShareAlike 4.0
