Document Clustering using a New Similarity Measure based on Energy of a Bipartite Graph
Objectives: This paper clusters documents using a new similarity measure based on the energy of a bipartite graph. Methods/Statistical Analysis: We represent the documents as a bipartite graph and cluster them using the energy-based similarity measure that we introduce. The proposed algorithm is illustrated on a small document set. Findings: The proposed algorithm gives better clustering quality than the k-means clustering algorithm. Application/Improvements: The proposed algorithm can be extended and applied to cluster large document sets.
Bipartite Graph, Cluster Quality, Document Clustering, Energy, Similarity Measure.
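The energy of a graph, on which the similarity measure in the abstract is built, is the sum of the absolute values of the eigenvalues of its adjacency matrix. The sketch below computes it for a document-term bipartite graph. It is only an illustration under stated assumptions: the function name `bipartite_energy`, the toy 0/1 document-term matrix, and the choice of NumPy are not from the paper, and the paper's similarity measure itself is not reproduced here because the abstract does not define it.

```python
import numpy as np


def bipartite_energy(biadjacency):
    """Return the energy of the bipartite graph with the given biadjacency matrix.

    Rows are one vertex class (here: documents), columns the other (here: terms).
    The function name and this row/column convention are illustrative assumptions.
    """
    b = np.asarray(biadjacency, dtype=float)
    n_rows, n_cols = b.shape
    # Full symmetric adjacency matrix of the bipartite graph: [[0, B], [B^T, 0]].
    adjacency = np.block([
        [np.zeros((n_rows, n_rows)), b],
        [b.T, np.zeros((n_cols, n_cols))],
    ])
    # Graph energy = sum of absolute values of the adjacency eigenvalues.
    eigenvalues = np.linalg.eigvalsh(adjacency)
    return float(np.abs(eigenvalues).sum())


# Hypothetical toy data: 3 documents (rows) by 4 terms (columns),
# with 1 meaning "the term occurs in the document".
doc_term = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])

energy = bipartite_energy(doc_term)

# Cross-check: for a bipartite graph the energy equals twice the sum of the
# singular values of the biadjacency matrix.
singular_values = np.linalg.svd(doc_term, compute_uv=False)
```

For a bipartite graph the nonzero adjacency eigenvalues come in plus/minus pairs equal to the singular values of the biadjacency matrix, so `energy` and `2 * singular_values.sum()` should agree up to floating-point error. For large document sets this makes the singular-value route cheaper than building the full adjacency matrix.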
This work is licensed under a Creative Commons Attribution 3.0 License.