Authorship Identification for Tamil Classical Poem (Mukkoodar Pallu) using Bayes Net Algorithm

Affiliations

  • Department of Computer Science and Engineering, SRM University, Kattankulathur, Chennai - 603203, Tamil Nadu, India

Abstract


Objective: To identify the authors of Tamil texts of unknown authorship, based on the works of known authors. Methods/Analysis: Text processing derives high-quality information from text, including its statistical patterns. This paper proposes a text-processing method that extracts features from the text and performs classification on those features. Findings: The classifier achieves an accuracy of 94.1%. Varying the classification algorithm (Bayes Net) improved classifier accuracy from 88.23% to 94.1%. Novelty/Improvement: The method can be extended to other regional languages. It can also be used to identify the authors of other poems in Tamil, which would be of value to society.
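
The abstract does not say which features or toolkit were used, so the following is only a minimal sketch of the general pipeline it describes (extract features from text, then classify by author). It assumes character bigrams as features and uses multinomial naive Bayes, which is the simplest Bayesian network structure (the author node is the sole parent of every feature); it is a stand-in for illustration, not the authors' Bayes Net configuration. The class name `NaiveBayesAuthor`, the helper names, and the toy training strings are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams; works on any Unicode script."""
    text = " ".join(text.split())  # normalise runs of whitespace
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesAuthor:
    """Multinomial naive Bayes with Laplace smoothing over character n-grams."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n                          # n-gram length (assumed, not from the paper)
        self.alpha = alpha                  # Laplace smoothing constant
        self.counts = defaultdict(Counter)  # author -> n-gram counts
        self.docs = Counter()               # author -> number of training texts
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, texts, authors):
        for text, author in zip(texts, authors):
            grams = char_ngrams(text, self.n)
            self.counts[author].update(grams)
            self.docs[author] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihoods of the text's n-grams, per author."""
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        grams = char_ngrams(text, self.n)
        scores = {}
        for author, counts in self.counts.items():
            total = sum(counts.values())
            score = math.log(self.docs[author] / total_docs)
            for g in grams:
                score += math.log(
                    (counts[g] + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[author] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def accuracy(model, texts, authors):
    """Fraction of texts whose predicted author matches the true author."""
    hits = sum(model.predict(t) == a for t, a in zip(texts, authors))
    return hits / len(texts)


# Toy placeholder data (hypothetical); real use would pass Tamil verses here.
train_texts = ["aaaa abab aaba", "zzzz zyzy zzyz"]
train_authors = ["author_A", "author_B"]
model = NaiveBayesAuthor(n=2).fit(train_texts, train_authors)
```

Calling `model.predict(verse)` on a held-out verse returns the most probable author label, and `accuracy(model, test_texts, test_authors)` gives the kind of figure reported in the Findings. Character n-grams are chosen here only because they need no language-specific tokenizer.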

Keywords

Authorship, Classification, Feature Selection, Tamil Articles.





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.