Aplikasi Pendeteksi Tingkat Kesamaan Dokumen Teks: Algoritma Rabin Karp Vs. Winnowing (Text Document Similarity Detection Application: Rabin-Karp vs. Winnowing Algorithms)

  • Sugiono Sugiono STMIK Amik Riau
  • Herwin Herwin STMIK Amik Riau
  • Hamdani Hamdani STMIK Amik Riau
  • Erlin Erlin STMIK Amik Riau
Keywords: Rabin-Karp, Winnowing, Plagiarism Detection, Text Similarity

Abstract

Copying and pasting text is common in scientific writing, often without giving credit to the owner of the original text. This breach of the code of ethics is made easy by the copy and paste facilities of word-processing applications. The purpose of this study is to build an application that detects the degree of similarity between text documents, after first comparing the reliability of two text similarity algorithms, Rabin-Karp and Winnowing. The comparison covers two variables: detection capability and processing time. The results show that the Winnowing algorithm outperforms the Rabin-Karp algorithm in both accuracy and processing time.
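As an illustration of the two approaches compared in the abstract, the following Python sketch fingerprints documents by k-gram hashing. It is not the authors' implementation, which this page does not show: the function names, the parameters k, w, base and mod, the preprocessing step, and the use of the Jaccard coefficient as the similarity measure are all assumptions made for the example. The Rabin-Karp variant compares the rolling hashes of all k-grams, while the Winnowing variant keeps only the minimum hash in each window of w consecutive hashes.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits (assumed preprocessing)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgram_hashes(text: str, k: int, base: int = 257, mod: int = 1_000_000_007) -> list[int]:
    """Return the Rabin-Karp style rolling hash of every k-gram in text."""
    if k <= 0 or len(text) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # drop the oldest character
        h = (h * base + ord(text[i])) % mod      # append the newest character
        hashes.append(h)
    return hashes


def winnow(hashes: list[int], w: int) -> set[int]:
    """Keep the minimum hash of each window of w consecutive hashes."""
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard coefficient of two fingerprint sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rabin_karp_similarity(doc_a: str, doc_b: str, k: int = 5) -> float:
    """Compare documents using the hashes of all their k-grams."""
    a = set(kgram_hashes(normalize(doc_a), k))
    b = set(kgram_hashes(normalize(doc_b), k))
    return jaccard(a, b)


def winnowing_similarity(doc_a: str, doc_b: str, k: int = 5, w: int = 4) -> float:
    """Compare documents using only the winnowed subset of k-gram hashes."""
    a = winnow(kgram_hashes(normalize(doc_a), k), w)
    b = winnow(kgram_hashes(normalize(doc_b), k), w)
    return jaccard(a, b)


if __name__ == "__main__":
    original = "Plagiarism is presenting the work of another person as your own."
    suspect = "Plagiarism means presenting the work of another person as one's own."
    print("Rabin-Karp:", rabin_karp_similarity(original, suspect))
    print("Winnowing: ", winnowing_similarity(original, suspect))
```

Because Winnowing stores fewer hashes per document, the set comparison handles less data, which is one plausible reason for a difference in processing time. The scores themselves depend heavily on the chosen k and w, so the values printed by this sketch are not comparable to the results reported in the paper.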

 

Published
2018-05-31
How to Cite
Sugiono, S., Herwin, H., Hamdani, H., & Erlin, E. (2018). Aplikasi Pendeteksi Tingkat Kesamaan Dokumen Teks: Algoritma Rabin Karp Vs. Winnowing. Digital Zone: Jurnal Teknologi Informasi Dan Komunikasi, 9(1), 82-93. https://doi.org/10.31849/digitalzone.v9i1.1242