Handling Illusive Text in Document to Improve Accuracy of Plagiarism Detection Algorithm

  • Conference paper
Proceedings of the 11th International Conference on Robotics, Vision, Signal Processing and Power Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 829)

Abstract

Plagiarism detection is one of the challenging tasks in academic research, needed to ensure the integrity and authenticity of a document. Many efficient algorithms are currently available to detect plagiarism in a document. Pre-processing of the document remains key to achieving stable, accurate results. Before checking for plagiarism, all algorithms perform some pre-processing on documents and convert them into a particular format, for example by removing whitespace and all special characters. In this paper, we focus on two techniques that can be used to disguise plagiarism and that existing plagiarism detection algorithms overlook. The first replaces the white spaces between consecutive words with a hidden character in white (the background color), so that the words appear distinct to a reader, but the algorithm incorrectly treats them as a single word. As a result, even a 100% copied statement would not be identified as plagiarized content. The second hides spam text behind images to inflate the word count of a document. Because the text is hidden, the human eye cannot see it, but the algorithm counts it as words, which lowers the plagiarism percentage reported for the document. Our proposed pre-processing technique efficiently handles these two critical problems, which improves the accuracy and authenticity of plagiarism-checking algorithms. We compared the performance of our algorithm on these issues with other state-of-the-art algorithms (particularly Turnitin), and our algorithm handles these issues efficiently.
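
The two evasion tricks described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which this preview does not show. The `Run` model, the `behind_image` flag, the assumed white background, and the toy `similarity` metric are all hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """A span of text with rendering attributes (hypothetical model)."""
    text: str
    color: str = "black"        # font color of the run
    behind_image: bool = False  # True if an image covers this run


BACKGROUND = "white"  # assumption: the page background is white


def naive_words(runs):
    """Extract words while ignoring how the text is rendered."""
    return "".join(r.text for r in runs).split()


def visible_words(runs, background=BACKGROUND):
    """Extract words roughly as a human reader would see them.

    Runs covered by an image are dropped, and runs drawn in the
    background color are treated as gaps between words.
    """
    parts = []
    for r in runs:
        if r.behind_image or r.color == background:
            parts.append(" ")  # invisible text looks like empty space
        else:
            parts.append(r.text)
    return "".join(parts).split()


def similarity(words, source_words):
    """Toy metric: fraction of document words that occur in the source."""
    if not words:
        return 0.0
    source = set(source_words)
    return sum(w in source for w in words) / len(words)


source = "the quick brown fox".split()

# Trick 1: spaces replaced by a hidden character in the background color.
glued = [Run("the"), Run("x", "white"), Run("quick"), Run("x", "white"),
         Run("brown"), Run("x", "white"), Run("fox")]

# Trick 2: filler text hidden behind an image to inflate the word count.
padded = [Run("the quick brown fox"),
          Run(" lorem ipsum dolor sit", behind_image=True)]

print(similarity(naive_words(glued), source))     # 0.0 (one glued "word")
print(similarity(visible_words(glued), source))   # 1.0
print(similarity(naive_words(padded), source))    # 0.5 (score diluted)
print(similarity(visible_words(padded), source))  # 1.0
```

In this sketch, filtering on rendering attributes before tokenization restores the score in both cases. A real system would first have to recover color and layering information from the document format, which is outside the scope of this example.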

Notes

  1. After this paper was presented at the RoVISP2021 conference, Turnitin addressed the issues it describes. Turnitin can now identify illusive text hidden behind images and illusive text placed between words.

Author information

Corresponding author

Correspondence to Zahid Iqbal.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Iqbal, Z., Murtaza, S., Chan, H.Y., Ghori, M.R., Ahmed, N., Ayub, H. (2022). Handling Illusive Text in Document to Improve Accuracy of Plagiarism Detection Algorithm. In: Mahyuddin, N.M., Mat Noor, N.R., Mat Sakim, H.A. (eds) Proceedings of the 11th International Conference on Robotics, Vision, Signal Processing and Power Applications. Lecture Notes in Electrical Engineering, vol 829. Springer, Singapore. https://doi.org/10.1007/978-981-16-8129-5_9
