An Efficient Text Pattern Matching Algorithm for Retrieving Information from Desktop

Affiliations

  • Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India

Abstract


Objectives: To retrieve information from documents stored on a desktop computer by analyzing their contents with string matching algorithms.

Methods/Statistical Analysis: Pattern matching algorithms are used to find all occurrences of a limited set of patterns within an input text or document. This work uses four existing string matching algorithms: Brute Force, Knuth-Morris-Pratt (KMP), Boyer Moore and Rabin Karp. It also proposes three new string matching algorithms: Enhanced Boyer Moore, Enhanced Rabin Karp and Enhanced Knuth-Morris-Pratt.

Findings: The experiments use two document types, .txt and .docx, and three performance measures: search time, number of iterations and accuracy. The experimental results show that the Enhanced KMP algorithm gives better accuracy than the other string matching algorithms.

Application/Improvements: These algorithms are commonly used in text mining, document classification, content analysis and plagiarism detection. In future work, the algorithms are to be enhanced further to improve their performance, and more document types will be used for experimentation.
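To make the comparison between baselines concrete, the following is a minimal Python sketch of two of the four existing algorithms named in the abstract, the textbook Brute Force and Knuth-Morris-Pratt searches. It is not the paper's implementation and does not reproduce the proposed Enhanced variants, which the abstract does not describe. The function names are hypothetical, and counting character comparisons is only one assumed reading of the "number of iterations" measure.

```python
def brute_force_search(text, pattern):
    """Naive search: return (match positions, character comparisons)."""
    n, m = len(text), len(pattern)
    positions, comparisons = [], 0
    if m == 0:
        return positions, comparisons
    for i in range(n - m + 1):
        j = 0
        # Compare the pattern against the window starting at i.
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            positions.append(i)
    return positions, comparisons


def build_failure_table(pattern):
    """failure[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it (the standard KMP prefix function)."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Textbook KMP: return (match positions, character comparisons).
    Comparisons made while building the failure table are not counted."""
    m = len(pattern)
    positions, comparisons = [], 0
    if m == 0:
        return positions, comparisons
    failure = build_failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while True:
            comparisons += 1
            if ch == pattern[k]:
                k += 1
                break
            if k == 0:
                break
            k = failure[k - 1]  # fall back without re-reading the text
        if k == m:
            positions.append(i - m + 1)
            k = failure[k - 1]  # continue, allowing overlapping matches
    return positions, comparisons


if __name__ == "__main__":
    for search in (brute_force_search, kmp_search):
        print(search.__name__, search("abababc", "abab"))
```

Both functions report the same match positions, so the comparison counts can be set side by side on the same input. On text extracted from .txt or .docx files the relative counts will depend on the pattern and the text.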

Keywords

Brute Force, Boyer Moore, Information Retrieval, Knuth-Morris-Pratt, Pattern Matching, Rabin Karp.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.