Parallel String Matching for Urdu Language Text | SpringerLink
Skip to main content

Parallel String Matching for Urdu Language Text

  • Conference paper
  • First Online:
Intelligent Technologies and Applications (INTAP 2018)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 932))

Included in the following conference series:

  • 1613 Accesses

Abstract

String matching is one of the essential problems in computer science. The language used in Pakistan is Urdu. For Urdu language texts, its characters are encoded by utf-8, and the utf-8 is a length-variable encoding. If we implement string matching algorithms for Urdu language texts by ASCII encoding, the correct matched positions may not be obtained. This paper analyzes the characteristics of Urdu language texts and studies the character encoding presentation for Urdu language texts and recognizes that the correct matched positions can be obtained when the wchar_t type and Unicode encoding is used to process Urdu language texts, then, this paper implements parallel algorithms for Boyer-Moore string matching, Knuth-Morris-Pratt string matching, and Sunday string matching for Urdu language texts and evaluate the execution performance of these four string matching algorithms on a large number of Urdu language patterns and text strings via experimental testing.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 11439
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 14299
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Al-Dabbagh, S.S., Barnouti, N.H., Naser, M.A., Ali, Z.G.: Parallel quick search algorithm for the exact string matching problem using openMP. J. Comput. Commun. 4(13), 1–11 (2016)

    Google Scholar 

  2. Ďurian, B., Holub, J., Peltola, H., Tarhio, J.: Improving practical exact string matching. Inf. Process. Lett. 110(4), 148–152 (2010)

    Google Scholar 

  3. Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)

    Google Scholar 

  4. Horspool, R.N.: Practical fast searching in strings. Softw. Pract. Exp. 10(6), 501–506 (1980)

    Google Scholar 

  5. Hume, A., Sunday, D.: Fast string searching. Softw. Pract. Exp. 21(11), 1221–1248 (1991)

    Google Scholar 

  6. Ajwani, D., Dementiev, R., Meyer, U.: A computational study of external-memory BFS algorithms. In: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm, pp. 601–610. Society for Industrial and Applied Mathematics (2006)

    Google Scholar 

  7. Korf, R.E.: Depth-first iterative-deepening: an optimal admissible tree search. Artif. Intell. 27(1), 97–109 (1985)

    Google Scholar 

  8. Sunday, D.M.: A very fast substring search algorithm. Commun. ACM 33(8), 132–142 (1990)

    Google Scholar 

  9. Knuth, D.E., Morris, J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977)

    Google Scholar 

  10. Hussain, S., Afzal, M.: Urdu computing standards: urdu zabta takhti (UZT) 1.01. In: Proceedings of IEEE International 2001 Multi Topic Conference, pp. 223–228. IEEE (2001)

    Google Scholar 

  11. ISO GENEVA: ISO/IEC 14651: Information Technology - International String Ordering and Comparison- Method for Comparing Character Strings and Description of Common Template Tailorable Ordering. ISO, Geneva, Switzerland, MuqtadraQaumizuban, Jadeed Urdu Lulghat (جديداردولغت), MuqtadraQaumiZuban, Islamabad, Pakistan (2001). www.iso.ch

  12. Afzal, M.: Urdu Software Industry: Prospects, Problems and Need for Standards. Science Vision Corp. (1999)

    Google Scholar 

  13. Hussain, S., Gul, S., Waseem, A.: Developing lexicographic sorting: an example for Urdu. ACM Trans. Asian Lang. Inf. Process. 6(3) (2007). Article no. 10

    Google Scholar 

  14. Claude, F., Navarro, G., Peltola, H., et al.: String matching with alphabet sampling. J. Discret. Algorithms 1, 37–50 (2012)

    Google Scholar 

  15. Erdem, O.: Tree-based string pattern matching on FPGAs. Comput. Electr. Eng. 49, 117–133 (2016)

    Google Scholar 

  16. Al-Ssulami, A.M., Mathkour, H.: Faster string matching based on hashing and bit-parallelism. Inf. Process. Lett. 123, 51–55 (2017)

    Google Scholar 

  17. Chung, K.-L.: O(1)-time parallel string-matching algorithm with VLDCs. Pattern Recogn. Lett. 17(5), 475–479 (1996)

    Google Scholar 

  18. Park, J.H., George, K.M.: Efficient parallel hardware algorithms for string matching. Microprocess. Microsyst. 23(3), 155–168 (1999)

    Google Scholar 

  19. Pungila, C.-P., Reja, M., Negru, V.: Efficient parallel automata construction for hybrid resource-impelled data-matching. Future Gener. Comput. Syst. 36, 31–41 (2014)

    Google Scholar 

  20. Qu, J., Zhang, G., Fang, Z., et al.: A parallel Aho-Corasick algorithm with non-deterministic finite automaton based on OpenMP. In: Proceedings of 2015 Seventh International Conference on Advanced Communication and Networking, pp. 52–55. IEEE (2015)

    Google Scholar 

  21. Trana, T.T., Liu, Y., Schmidta, B.: Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi. Parallel Comput. 54, 128–138 (2016)

    Google Scholar 

  22. Yoon, J.M., Choi, K.-I., Kim, H.J.: A memory accessing method for the parallel Aho-Corasick algorithm on GPU. In: Proceedings of 2016 International Conference on Information Science and Security, pp. 1–3. IEEE (2016)

    Google Scholar 

  23. Lin, C.-H., Li, J.-C., Liu, C.-H., et al.: Perfect hashing based parallel algorithms for multiple string matching on graphic processing units. IEEE Trans. Parallel Distrib. Syst. 28(9), 2639–2650 (2017)

    Google Scholar 

  24. Zengin, S., Schmidt, E.G.: A fast and accurate hardware string matching module with Bloom filters. IEEE Trans. Parallel Distrib. Syst. 28(2), 305–317 (2017)

    Google Scholar 

  25. Baúto, J., Canelas, A., Neves, R., Horta, N.: Parallel SAX/GA for financial pattern matching using NVIDIA’s GPU. Expert Syst. Appl. 105, 77–88 (2018)

    Google Scholar 

  26. Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20, 762–772 (1977)

    Google Scholar 

  27. Lee, C.Y.: An algorithm for path connections and its applications. IRE Trans. Electron. Comput. EC-10(3), 346–365 (1961)

    Google Scholar 

  28. Knuth, D.E., Morris Jr., J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(1), 323–350 (1977)

    Google Scholar 

Download references

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China (No. 61762010).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mirza Baber Baig .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Baig, M.B., Li, T.S. (2019). Parallel String Matching for Urdu Language Text. In: Bajwa, I., Kamareddine, F., Costa, A. (eds) Intelligent Technologies and Applications. INTAP 2018. Communications in Computer and Information Science, vol 932. Springer, Singapore. https://doi.org/10.1007/978-981-13-6052-7_32

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-6052-7_32

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6051-0

  • Online ISBN: 978-981-13-6052-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics