
Classification using Latent Dirichlet Allocation with Naive Bayes Classifier to detect Cyber Bullying in Twitter

Affiliations

  • Bharathiyar University, Coimbatore - 641046, Tamil Nadu, India
  • Panimalar Engineering College, Chennai - 600123, Tamil Nadu, India

Abstract


Objectives: Social networks are becoming a risk for minors, especially those who use them regularly, and regular use can also lead to cyber bullying. The unstructured text contained in this enormous amount of information cannot be processed directly by computers, so specific preprocessing methods and algorithms are needed to extract useful patterns. Methods/Analysis: Text classification is one of the important research issues in the field of text mining. A Twitter corpus is used as the training and test data to build a sentiment classifier. The positive or negative sentiment of a new tweet is then used to detect cyber bullying messages on Twitter, using Latent Dirichlet Allocation (LDA) with a Naive Bayes classifier. Findings: The results show that the model achieves precision, recall and F-measure of nearly 70%. Naive Bayes is the most appropriate algorithm when compared with J48 and k-nearest neighbours (kNN), and its CPU processing time is lower than that of the other two classification algorithms. Improvements: The performance of the system could be improved by adding extra features and using more data.
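
The abstract does not give implementation details, so the following is only a minimal sketch of the classification and evaluation steps it names: a multinomial Naive Bayes classifier over bag-of-words counts, scored with precision, recall and F-measure. The LDA topic-modelling stage is omitted here, and the add-one smoothing, the whitespace tokenizer, the function names and the toy tweets are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(labels)        # documents per class
    word_counts = defaultdict(Counter)    # token counts per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()     # assumed: plain whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok in vocab:  # tokens unseen in training are skipped
                # add-one (Laplace) smoothed log likelihood
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-measure for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy data, for illustration only.
train_docs = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "have a great day friend",
    "thanks for the lovely photo",
]
train_labels = ["bullying", "bullying", "normal", "normal"]
test_docs = ["you are a loser", "lovely day friend"]
test_labels = ["bullying", "normal"]

model = train_nb(train_docs, train_labels)
predictions = [predict_nb(model, doc) for doc in test_docs]
metrics = precision_recall_f1(test_labels, predictions, positive="bullying")
```

In a pipeline closer to the one described, the per-tweet topic proportions inferred by LDA would presumably replace or supplement the raw word counts as input features to the classifier.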

Keywords

Cyber Bullying, LDA, Naive Bayes, Text Mining, Twitter.





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.