Analyzing Diabetic Data using Classification Algorithms in Data Mining


  • SRM Arts and Science College, SRM Nagar, Kattankulathur - 603203, Tamil Nadu, India
  • PG and Research Department of Computer Science, D.G. Vaishnav College, 833, E.V.R. Periyar High Road, Arumbakkam, Chennai - 600106, Tamil Nadu, India


Background/Objectives: Large medical datasets are available in various data repositories and are used in real-world applications. Data Mining (DM) methods are widely used to extract the useful information stored in data warehouses. One such domain is medicine, where DM approaches support faster recovery by identifying diseases from their symptoms. Researchers have used a variety of DM methods to classify and predict symptoms in medical data. Classification is one of the main DM techniques; it assigns previously unseen data to classes in many areas, including medical diagnosis. Diabetes is a very dangerous disease that affects many people in populous countries such as India. Methods/Statistical Analysis: Classification plays an important role in real-world applications in all fields. Classification methods assign elements to a predefined set of classes according to their characteristics. This research work applies four popular classification algorithms to diabetic data: J48, Support Vector Machines (SVM), Classification and Regression Tree (CART) and k-Nearest Neighbor (kNN). Findings: Diabetic data is used as input to evaluate the performance of these classification methods. The work mainly compares the techniques by their classification accuracy on the diabetic data. Conclusion: The outcome of this research work is the selection of the best-performing classifier for the input data. Applications/Improvements: In future work, other algorithms will be analyzed on the same data set to obtain comparable results. Clustering algorithms will also be applied to the same data set to identify highly affected diabetic patients.
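The accuracy-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the records, the two features (plasma glucose and body-mass index), the train/test split and the names `knn_predict`, `fit_stump` and `accuracy` are all assumptions made for illustration. A one-split decision stump stands in, in heavily simplified form, for tree learners such as CART and J48, and SVM is omitted to keep the example dependency-free.

```python
from collections import Counter
import math

# Hypothetical toy records: (plasma glucose, body-mass index) -> label (1 = diabetic).
# The values are invented for illustration and are not from the paper's data set.
TRAIN = [
    ((85.0, 22.0), 0), ((90.0, 24.5), 0), ((99.0, 26.0), 0), ((105.0, 23.0), 0),
    ((150.0, 33.0), 1), ((165.0, 35.5), 1), ((140.0, 31.0), 1), ((180.0, 38.0), 1),
]
TEST = [((95.0, 25.0), 0), ((110.0, 27.0), 0), ((155.0, 34.0), 1), ((170.0, 30.0), 1)]


def knn_predict(train, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fit_stump(train):
    """Fit a one-split decision stump on the training records.

    Tries every observed feature value as a threshold and keeps the
    (feature, threshold) pair with the fewest training errors. This is a
    simplified stand-in for tree learners such as CART or J48.
    """
    best = None
    for f in range(len(train[0][0])):
        for x, _ in train:
            t = x[f]
            errors = sum((1 if xi[f] > t else 0) != y for xi, y in train)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda x: 1 if x[f] > t else 0


def accuracy(predict, data):
    """Fraction of records whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in data) / len(data)


stump = fit_stump(TRAIN)
results = {
    "kNN (k=3)": accuracy(lambda x: knn_predict(TRAIN, x), TEST),
    "decision stump": accuracy(stump, TEST),
}
best_name = max(results, key=results.get)
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
print("best classifier on this toy data:", best_name)
```

Which classifier scores higher here depends entirely on the invented records and says nothing about the paper's findings; the sketch only shows the shape of the comparison, in which each method is scored on held-out records and the highest-accuracy method is selected.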


Keywords: CART Algorithm, Classification, J48 Algorithm, kNN Algorithm, SVM Algorithm


This work is licensed under a Creative Commons Attribution 3.0 License.