Clustering Of Library’s Patron Behavior Using Machine Learning
DOI:
https://doi.org/10.31849/digitalzone.v16i1.19680
Keywords:
Patron Behavior, Machine Learning, University Library, clustering
Abstract
Libraries collect large volumes of transaction data, but they rarely use it to improve how patrons interact with them. This work addresses that gap by applying machine learning to analyze and segment library patron behavior. Patrons were grouped by age range, checkouts, and renewals using the KMeans clustering technique. Dimensionality reduction methods (PCA and t-SNE) were used to visualize the resulting patterns. The analysis revealed three user groups: Rare Borrowers, who average 5.4 checkouts and 2.0 renewals; Occasional Borrowers, who average 20.8 checkouts and 7.1 renewals; and Frequent Borrowers, who average 50.3 checkouts and 15.4 renewals. The clustering model performed well, as evidenced by a Silhouette Score of 0.62, a Davies-Bouldin Index of 0.45, and a Calinski-Harabasz Index of 320.12. Beyond these metrics, the study's novelty lies in its practical implications: it offers libraries a data-driven framework to personalize services, optimize book recommendations, and enhance outreach efforts based on patron behavior. By segmenting users, libraries can allocate resources more effectively and improve user satisfaction. One limitation of this study is data bias: demographic differences across libraries may limit how well the results generalize. Additionally, KMeans requires the number of clusters to be fixed in advance, which may not fully capture nuanced behaviors.
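The segmentation step described in the abstract can be illustrated with a small, standard-library-only sketch of Lloyd's algorithm (the procedure behind KMeans) followed by a silhouette score. Everything below is an assumption made for illustration: the patrons are synthetic points scattered around the three cluster means quoted above, the spreads and group sizes are invented, only checkouts and renewals are used (age range and feature scaling are omitted), and the evenly spaced initialization is a simple stand-in rather than the study's configuration. The printed silhouette describes this toy data, not the reported 0.62.

```python
import math
import random

# Synthetic patrons (checkouts, renewals) scattered around the three cluster
# means quoted in the abstract. Spreads and group size are invented for
# illustration; this is not the study's data.
GROUP_MEANS = [(5.4, 2.0), (20.8, 7.1), (50.3, 15.4)]
GROUP_SPREADS = [(1.5, 0.7), (3.0, 1.5), (5.0, 2.5)]


def make_patrons(per_group=50, seed=42):
    rng = random.Random(seed)
    return [
        (max(0.0, rng.gauss(mc, sc)), max(0.0, rng.gauss(mr, sr)))
        for (mc, mr), (sc, sr) in zip(GROUP_MEANS, GROUP_SPREADS)
        for _ in range(per_group)
    ]


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; requires k >= 2. Returns (centroids, labels)."""
    # Assumed initialization: k evenly spaced points after sorting by checkouts.
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each patron joins the nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # leave an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


patrons = make_patrons()
centroids, labels = kmeans(patrons, k=3)
score = silhouette(patrons, labels)
for checkouts, renewals in sorted(centroids):
    print(f"centroid: {checkouts:.1f} checkouts, {renewals:.1f} renewals")
print(f"silhouette: {score:.2f}")
```

Sorting the recovered centroids by checkouts gives a natural reading as rare, occasional, and frequent borrowers. On real transaction data, features with different ranges would normally be standardized before clustering, and a library implementation with multiple restarts would be preferable to this fixed initialization.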