Privacy Preserving Data Mining for Ordinal Data using Correlation Based Transformation Strategy (CBTS)
Objectives: Preserving privacy is a significant aspect of data mining. The main objective of Privacy Preserving Data Mining (PPDM) is to hide sensitive information so that it is protected from unauthorized parties or intruders. Methods/Statistical Analysis: Although hiding sensitive or private data achieves privacy, it impairs the ability of data mining algorithms to extract knowledge, so an effective strategy is required that protects the data while preserving the quality of the mining results. Instead of removing or encrypting sensitive or private data, we use data transformation strategies that retain the statistical, semantic and heuristic nature of the data while protecting the sensitive or private values. Findings: In this paper we study the technical feasibility of realizing Privacy Preserving Data Mining. The proposed work applies a Correlation Based Transformation Strategy (CBTS) for Privacy Preserving Data Mining to ordinal data. We apply the method to four datasets: Soybean, Breast Cancer, Nursery and Car. We tabulate the results obtained on both the original and the transformed datasets and compare correlation difference, information entropy, classification accuracy with different machine learning algorithms, and clustering quality. Application/Improvements: As an improvement, the proposed work can be extended with vector marking techniques, which help increase efficiency by preventing unauthorised access to the information.
Keywords: Correlation Analysis, Nominal Data, Ordinal Data, Privacy Preserving Data Mining, Transformation Strategy.
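Two of the evaluation measures named in the abstract, correlation difference and information entropy, can be illustrated with a minimal Python sketch. The abstract does not specify the CBTS transformation itself, so the `recode` function below is a hypothetical stand-in (a simple affine recoding of ordinal codes) and not the authors' method; the attribute values are likewise made-up ordinal codes, not taken from the Car or any other dataset. Using Pearson correlation for the correlation difference is also an assumption.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_difference(orig_a, orig_b, trans_a, trans_b):
    """Absolute change in pairwise correlation caused by a transformation."""
    return abs(pearson(orig_a, orig_b) - pearson(trans_a, trans_b))


def recode(values, scale=3, shift=7):
    """Hypothetical stand-in transformation (NOT CBTS): affine recoding
    of ordinal codes, which keeps their order but changes their values."""
    return [scale * v + shift for v in values]


# Made-up ordinal codes for two attributes (assumption, for illustration only).
attr_a = [1, 2, 3, 4, 2, 3, 1, 4]
attr_b = [1, 1, 2, 3, 2, 3, 1, 3]

trans_a = recode(attr_a)
trans_b = recode(attr_b)

corr_diff = correlation_difference(attr_a, attr_b, trans_a, trans_b)
entropy_before = shannon_entropy(attr_a)
entropy_after = shannon_entropy(trans_a)

print("correlation difference:", corr_diff)
print("entropy before/after:", entropy_before, entropy_after)
```

Because this stand-in is a one-to-one recoding with a positive scale, it leaves both measures essentially unchanged, which is the kind of property the abstract asks of a transformation. It offers little real protection on its own, so it should be read only as a way to exercise the measures.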
This work is licensed under a Creative Commons Attribution 3.0 License.