
Evaluation of Unsupervised Learning based Extractive Text Summarization Technique for Large Scale Review and Feedback Data


  • Department of CE, Institute of Technology, Nirma University, Ahmedabad – 382481, Gujarat, India
  • CMPICA, CHARUSAT University, Changa – 388421, Gujarat, India


Background/Objectives: Supervised techniques use human-generated summaries to select the features and parameters for summarization. The main problem with this approach is the reliability of summaries that depend on human-generated parameters and features, and many studies have reported conflicts among the summaries produced. Because large-scale datasets are so diverse, summarization based on supervised techniques also fails to meet requirements. Big data analytics for text datasets likewise favors unsupervised over supervised techniques. Summarization systems based on unsupervised techniques find representative sentences in large amounts of text. Methods/Statistical Analysis: A co-selection-based evaluation measure is applied to evaluate the proposed work. Recall, precision, F-measure, and a similarity measure are computed to draw conclusions for each objective. Findings: The KMeans, MiniBatchKMeans, and graph-based summarization algorithms are discussed in technical detail. Applying graph-based text summarization to large-scale review and feedback data improved on previously published results based on sentence scoring using TF and TF-IDF. Graph-based sentence scoring was much more efficient than the other unsupervised learning techniques applied to extractive text summarization. Application/Improvements: Running the graph-based algorithm in Spark's GraphX programming environment is expected to save execution time for this type of large-scale review and feedback dataset, which is treated as a Big Data problem.
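As an illustration of the co-selection measures named in the abstract, the following minimal sketch computes precision, recall, and F-measure from the overlap between the sentences a system selects and the sentences in a reference summary. The function name `co_selection_scores` and the sentence indices are hypothetical, chosen only for this example; they are not taken from the paper, and the similarity measure mentioned in the abstract is not covered here.

```python
def co_selection_scores(system, reference):
    """Return (precision, recall, f_measure) for two sets of selected sentence indices."""
    system, reference = set(system), set(reference)
    overlap = len(system & reference)  # sentences chosen by both summaries

    # Precision: fraction of system-selected sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: fraction of reference sentences that the system also selected.
    recall = overlap / len(reference) if reference else 0.0

    # F-measure: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: indices of sentences picked from one review document.
system_summary = [0, 2, 5, 7]     # sentences chosen by the summarizer
reference_summary = [0, 2, 3]     # sentences chosen as the reference summary

p, r, f = co_selection_scores(system_summary, reference_summary)
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Treating each summary as a set of sentence indices keeps the measure independent of sentence order, which matches the usual definition of co-selection for extractive summaries.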


Keywords: Extractive Text Summarization, Sentence Scoring Methods, Unsupervised Learning






This work is licensed under a Creative Commons Attribution 3.0 License.