H-D and Subspace Clustering of Paradoxical High Dimensional Clinical Datasets with Dimension Reduction Techniques – a Model

Affiliations

  • Bharathiyar University, Coimbatore - 641046, India
  • Dr. M.G.R. Educational and Research Institute, Chennai - 600095, India
  • Velammal Engineering College, Chennai - 600066, India

Abstract


Objectives: Heterogeneous high dimensional data clustering is the analysis of data with many dimensions. A large number of dimensions is hard to handle, because complexity increases exponentially with dimensionality. Dimensionality reduction converts high dimensional data into a meaningful representation of reduced dimensionality that corresponds to the intrinsic dimensionality of the data. To address this problem, we put forward a general framework for clustering high dimensional datasets. Methods: Clustering is the task of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. In our framework, a heterogeneous high dimensional clustering problem is partitioned into several one- or two-dimensional clustering phases. Findings: In this paper, a model is designed in which Hierarchical-Divisive (H-D) clustering and subspace clustering are used to form non-overlapping clusters, combined with dimension reduction techniques to reduce the dimensions of paradoxical high dimensional clinical datasets. Applications: The model offers a solution for processing heterogeneous high dimensional datasets with techniques such as PCA, LDA, and PSO.
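
The general idea of reducing dimensionality first and then clustering top-down can be illustrated with a minimal sketch. This is not the authors' model: it assumes synthetic data in place of clinical records, uses PCA (computed via SVD) as the only dimension reduction step, and uses repeated 2-means bisection as one common form of hierarchical-divisive clustering. Subspace clustering is not covered. The function names (pca_reduce, bisect, divisive_clustering) and all parameter values are hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def bisect(X, n_iter=20):
    """Split X into two groups with a small 2-means loop.

    Assumption for this sketch: a deterministic start, using the point
    farthest from the mean and the point farthest from that one.
    """
    first = X[np.linalg.norm(X - X.mean(axis=0), axis=1).argmax()]
    second = X[np.linalg.norm(X - first, axis=1).argmax()]
    centers = np.stack([first, second]).astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels


def sse(points):
    """Within-cluster sum of squared distances to the cluster mean."""
    if len(points) == 0:
        return 0.0
    return float(((points - points.mean(axis=0)) ** 2).sum())


def divisive_clustering(X, n_clusters):
    """Top-down clustering: repeatedly bisect the cluster with the largest SSE."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, n_clusters):
        scores = [sse(X[labels == c]) for c in range(new_label)]
        target = int(np.argmax(scores))
        idx = np.flatnonzero(labels == target)
        if len(idx) < 2:
            break  # nothing left to split
        split = bisect(X[idx])
        labels[idx[split == 1]] = new_label
    return labels


# Synthetic stand-in data: three groups of 30 points in 50 dimensions.
rng = np.random.default_rng(0)
group_means = [0.0, 4.0, 12.0]
X = np.vstack([m + rng.normal(size=(30, 50)) for m in group_means])

reduced = pca_reduce(X, n_components=2)            # 50 dimensions -> 2
labels = divisive_clustering(reduced, n_clusters=3)
print(np.bincount(labels))                         # cluster sizes
```

Clustering in the reduced space keeps each bisection cheap, and each split yields non-overlapping groups because every point carries exactly one label. A different reduction technique could be substituted for pca_reduce without changing the clustering step.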

Keywords

High Dimensional Data, Hierarchical-Divisive (H-D) Clustering, Subspace Clustering.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.