Comparative Analysis of KNN and Naïve Bayes Algorithms in Socio-Economic Data Classification in Indonesia
DOI:
https://doi.org/10.31849/digitalzone.v15i2.23337
Keywords:
Socio-economics, Data Mining, Classification, K-NN, Naïve Bayes, Python
Abstract
The global economy continues to recover as trade flows, employment, and incomes improve. However, the recovery is uneven across countries and business sectors, and it has brought structural changes: some sectors, jobs, technologies, and behaviors will not return to pre-pandemic trends. Future developments depend on local economic conditions. The economy is one of the most important aspects of a country, because it determines how well the country can meet its needs with limited resources. This study compares two data mining classification algorithms, Naïve Bayes and K-Nearest Neighbor (K-NN), to determine which is better suited to classifying socio-economic data in Indonesia. Both models were evaluated with a confusion matrix and K-Fold Cross Validation. On the confusion matrix evaluation, Naïve Bayes reached an accuracy of 98.25% and K-NN 97.78%; under K-Fold Cross Validation, Naïve Bayes reached 98% and K-NN 96%. Naïve Bayes is therefore superior to K-NN for socio-economic data classification in Indonesia in terms of accuracy: although K-NN shows good consistency, Naïve Bayes provides more accurate results.
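The evaluation procedure described in the abstract (two classifiers, a confusion matrix, and K-Fold Cross Validation) can be illustrated with a minimal Python sketch. The abstract does not state the dataset, the number of neighbors, the number of folds, or the library used, so everything below is an assumption: the data are synthetic, `k=5`, `n_folds=5`, the 70/30 hold-out split, the Gaussian form of Naïve Bayes, and all function names are hypothetical choices made only for illustration. It will not reproduce the reported accuracies.

```python
import math
import random
from collections import Counter, defaultdict


def make_data(n_per_class=60, seed=0):
    """Synthetic two-class data: a stand-in, NOT the study's dataset."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (4.0, 4.0)}
    rows = [([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label)
            for label, (cx, cy) in centers.items()
            for _ in range(n_per_class)]
    rng.shuffle(rows)
    return [x for x, _ in rows], [label for _, label in rows]


def knn_predict(X_train, y_train, X_test, k=5):
    """K-NN: majority vote among the k nearest training points (Euclidean)."""
    preds = []
    for x in X_test:
        nearest = sorted(zip(X_train, y_train),
                         key=lambda pair: math.dist(pair[0], x))[:k]
        preds.append(Counter(lbl for _, lbl in nearest).most_common(1)[0][0])
    return preds


def nb_predict(X_train, y_train, X_test):
    """Gaussian Naive Bayes: per-class prior plus per-feature mean/variance."""
    by_class = defaultdict(list)
    for x, label in zip(X_train, y_train):
        by_class[label].append(x)
    stats = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        stats[label] = (math.log(len(rows) / len(X_train)), means, variances)

    def log_posterior(x, label):
        log_prior, means, variances = stats[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            for v, m, s2 in zip(x, means, variances))

    return [max(stats, key=lambda lbl: log_posterior(x, lbl)) for x in X_test]


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def k_fold_accuracy(predict, X, y, n_folds=5):
    """Mean accuracy over n_folds interleaved held-out folds."""
    scores = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(X), n_folds))
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [t for i, t in enumerate(y) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        cm = confusion_matrix(y_te, predict(X_tr, y_tr, X_te))
        scores.append(accuracy(cm))
    return sum(scores) / len(scores)


X, y = make_data()
split = len(X) * 7 // 10  # assumed 70/30 hold-out split
cm_knn = confusion_matrix(y[split:], knn_predict(X[:split], y[:split], X[split:]))
cm_nb = confusion_matrix(y[split:], nb_predict(X[:split], y[:split], X[split:]))
print("K-NN hold-out confusion matrix:", cm_knn, "accuracy:", accuracy(cm_knn))
print("NB   hold-out confusion matrix:", cm_nb, "accuracy:", accuracy(cm_nb))
print("K-NN k-fold mean accuracy:", k_fold_accuracy(knn_predict, X, y))
print("NB   k-fold mean accuracy:", k_fold_accuracy(nb_predict, X, y))
```

Reporting both a single hold-out accuracy and a cross-validated mean, as the abstract does, helps separate how well a model scores on one split from how stable that score is across splits.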