A Brief Review on Plagiarism Detection Methods

  • Amir Namazi, Department of Computer Science, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
Keywords: Smart plagiarism, Plagiarism detection, Document similarity


Plagiarism, a form of scientific misconduct, has become a serious problem for researchers and publishers because web-based scientific resources are increasingly easy to access. Plagiarism means using the scientific content of other documents without citing the sources, either by copying the content directly or by copying it and then changing it. Most studies across different fields perform well in dealing with direct plagiarism; indirect plagiarism, however, is often challenging for them. It is useful to study earlier research and identify its strengths and weaknesses in order to find new ideas for future work. The present paper reviews some plagiarism detection methods and a number of studies conducted in the area so far.

