Naïve Bayes Alpha Parameter Optimization with Ant Colony for Clinical Text Classification
DOI:
https://doi.org/10.31849/digitalzone.v16i1.24118
Keywords:
Classification, NLP, Naïve Bayes, ACO, medical text
Abstract
This study addresses text classification in domain-specific Natural Language Processing (NLP) for the medical field, which differs from general NLP because clinical documents mix complex medical jargon with informal language. The objective is to develop and evaluate a cancer-related text classification model that combines the Naïve Bayes algorithm with Laplacian smoothing and tunes the smoothing parameter alpha using Ant Colony Optimization (ACO). Specifically, the study examines whether ACO can identify an alpha value that improves the classification performance of the Naïve Bayes model. Experimental results show that, with an alpha value of 0.27, the proposed model achieves an accuracy of 81.05%. This indicates that combining ACO with Naïve Bayes improves classification efficiency and accuracy. The findings contribute to more accurate interpretation of clinical cancer-related texts, supporting better-informed decision-making in medical contexts.
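The idea of tuning the smoothing parameter with ACO can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the class labels, the candidate grid of alpha values (0.01 to 1.00), and the ACO settings (`n_ants`, `n_iter`, `rho`) are all assumptions made for illustration. Each candidate alpha is treated as a path an ant may choose, and pheromone is reinforced in proportion to the validation accuracy that alpha yields.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels, alpha):
    """Fit a multinomial Naive Bayes model with additive smoothing alpha."""
    vocab = {w for d in docs for w in d.split()}
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d.split())
    priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        # Smoothed denominator: class token count plus alpha per vocabulary word.
        total = sum(counts[c].values()) + alpha * len(vocab)
        loglik[c] = {w: math.log((counts[c][w] + alpha) / total) for w in vocab}
    return priors, loglik


def predict(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    priors, loglik = model
    scores = {
        c: priors[c] + sum(loglik[c][w] for w in doc.split() if w in loglik[c])
        for c in priors
    }
    return max(scores, key=scores.get)


def accuracy(alpha, train, val):
    """Validation accuracy of a model trained with the given alpha."""
    model = train_nb(train[0], train[1], alpha)
    docs, labels = val
    return sum(predict(model, d) == y for d, y in zip(docs, labels)) / len(docs)


def aco_alpha(fitness, candidates, n_ants=5, n_iter=20, rho=0.1, seed=0):
    """Simplified ACO over a discrete grid of alpha values (illustrative settings)."""
    rng = random.Random(seed)
    pheromone = [1.0] * len(candidates)
    best_alpha, best_fit = None, -1.0
    for _ in range(n_iter):
        # Each ant picks a candidate with probability proportional to pheromone.
        picks = rng.choices(range(len(candidates)), weights=pheromone, k=n_ants)
        # Evaporation, then deposit proportional to the fitness found.
        pheromone = [(1 - rho) * p for p in pheromone]
        for i in picks:
            fit = fitness(candidates[i])
            pheromone[i] += fit
            if fit > best_fit:
                best_alpha, best_fit = candidates[i], fit
    return best_alpha, best_fit


# Hypothetical toy data; the study's dataset and classes are not reproduced here.
train = (
    ["lung tumor cough smoking", "lung nodule biopsy cough",
     "breast lump mammogram", "breast tumor biopsy mammogram"],
    ["lung", "lung", "breast", "breast"],
)
val = (["cough smoking nodule", "mammogram lump"], ["lung", "breast"])

candidates = [round(0.01 * k, 2) for k in range(1, 101)]  # alpha must be > 0
best_alpha, best_fit = aco_alpha(lambda a: accuracy(a, train, val), candidates)
print(best_alpha, best_fit)
```

On a corpus this small every alpha classifies the validation texts equally well, so the search cannot discriminate between candidates; on real clinical text the fitness would vary with alpha, and a held-out split or cross-validation would be the more defensible fitness measure.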
License
Copyright (c) 2025 Digital Zone: Jurnal Teknologi Informasi dan Komunikasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.






