
A Novel Hybrid Model for Diabetic Prediction using Hidden Markov Model, Fuzzy based Rule Approach and Neural Network


  • Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak - 124001, Haryana, India


Objectives: Data mining approaches are used for developing decision-making systems. The current study proposes a novel hybrid model for diabetic prediction using data mining techniques. The main objective of this study is to improve the accuracy rate by significantly reducing the size of the data under analysis at every stage.
Methods/Statistical Analysis: To achieve the objectives, the PIMA Female Diabetic dataset, extracted from the UCI repository, is used. The 10-fold cross-validation method is used for extracting the testing and training samples. Three rank-based selection techniques are used for attribute selection. The association between different attributes is identified, and then clustering is performed under criticality using HMM and a Fuzzy improved Neural Network.
Findings: The data size reduces significantly when appropriate selection methods are applied in the respective sequence. For categorical data, the gain ratio attribute selection method outperforms the other selection methods. Clustering is more effective when performed after identifying the exact associations among attributes. The proposed hybrid model achieved an overall accuracy of 92%. The blend of supervised and unsupervised techniques achieved better results than the same techniques applied individually to the same data, as shown by the comparative analysis. Earlier prediction models worked with either classification or clustering, whereas the present study uses both. Fuzzy improved Neural Networks are used for predicting diabetes from the data. The result analysis showed that prediction accuracy is poor when the classifiers are applied separately (Naïve Bayes: 76.30%, Neural Network: 75.13%, Support Vector Machine: 77.47%, K-Nearest Neighbor: 69.79%, Decision Tree (J48): 74.21%), but improves when they are combined.
Application/Improvements: The proposed hybrid model can be used as an expert system application, under the guidance of a diabetes expert, to assist physicians in making decisions about the early diagnosis of the disease. In future, the proposed model can be applied to a gender-independent dataset. Further, the accuracy rate of the model can be improved by replacing the missing values in the dataset with the most appropriate values.
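
The rank-based attribute selection mentioned in the abstract can be illustrated with a minimal sketch of gain ratio ranking for categorical attributes. This is not the authors' implementation: the function names (`entropy`, `information_gain`, `gain_ratio`, `rank_attributes`), the attribute names, and the toy records are all hypothetical and chosen only for illustration; the real PIMA attributes would first need to be discretized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(values, labels):
    """Information gain divided by the attribute's own entropy (split information)."""
    split_info = entropy(values)
    if split_info == 0:
        # An attribute with a single value carries no information.
        return 0.0
    return information_gain(values, labels) / split_info


def rank_attributes(columns, labels):
    """Return attribute names sorted by decreasing gain ratio."""
    scores = {name: gain_ratio(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data (not taken from the PIMA dataset).
toy_columns = {
    "glucose_level": ["high", "high", "low", "low", "high", "low"],
    "age_group": ["young", "old", "young", "old", "old", "young"],
}
toy_labels = [1, 1, 0, 0, 1, 0]  # 1 = diabetic, 0 = non-diabetic

ranking = rank_attributes(toy_columns, toy_labels)
print(ranking)
```

In this toy example `glucose_level` separates the two classes perfectly, so it ranks above `age_group`. A reduced attribute set would then be formed by keeping only the top-ranked attributes before the later stages of the model.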


Keywords: Associative Clustering, Diabetes, Fuzzy Improved NN, Hidden Markov Model, Information Gain






This work is licensed under a Creative Commons Attribution 3.0 License.