
Improved Rule Based Classifier Based on Decision Trees (IRBC-DT) for Gastric Cancer Data Classification

Affiliations

  • Department of Computer Science, Karpagam University, KAHE, Coimbatore – 641 021, Tamil Nadu, India
  • Department of Information Technology, Karpagam University, Coimbatore – 641 021, Tamil Nadu, India

Abstract


Objectives: To design and develop an improved rule-based classifier based on decision trees (IRBC-DT) for gastric cancer data classification, with higher accuracy and hit rate and a substantial reduction in elapsed time.

Methods/Analysis: In the initial stage, IRBC-DT combines two techniques, boosting and the random subspace method, to build classification rules. The next stage divides the dataset into two parts: the first is used for training and the second for pruning. A decision tree is then built to analyze the misclassified instances.

Findings: Each feature is tested and assigned a precise weight, and a k-nearest neighbor classifier is applied using the weighted features. Finally, the algorithm is updated with the instances that carry misclassified class labels. After these instances have been analyzed and updated, conflicting rules are identified and removed. Attribute bagging builds a set of classifiers, each operating on a subspace of the original feature space, and outputs the class that corresponds to the combined result of those individual classifiers. The random subspace scheme is an attractive option for classifying data with a large number of features, such as cancer data. Boosting, in turn, is designed specifically for classification: it turns weak classifiers into strong ones through an iterative process. The boosting mechanism selects the appropriate classification in order to combine the results of all classifiers.

Applications/Improvements: IRBC-DT is implemented in MATLAB and can be applied in the healthcare sector. The results indicate that the method performs better than existing algorithms for gastric cancer data classification.
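The random subspace scheme (attribute bagging) named in the abstract can be illustrated with a small sketch. This is not the authors' MATLAB implementation of IRBC-DT. It is a minimal Python example, assuming a plain k-nearest neighbor base classifier and a simple majority vote. The function names (`knn_predict`, `random_subspace_predict`), the parameter defaults, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Classify x by majority vote of its k nearest training rows.

    Distance is squared Euclidean, computed only on the given feature indices.
    """
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def random_subspace_predict(train_X, train_y, x,
                            n_models=15, subspace_size=2, k=3, seed=0):
    """Random subspace ensemble: each member sees a random subset of features.

    Every member is a k-NN classifier restricted to its own feature subset;
    the ensemble returns the class that receives the most member votes.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n_features = len(train_X[0])
    votes = Counter()
    for _ in range(n_models):
        features = rng.sample(range(n_features), subspace_size)
        votes[knn_predict(train_X, train_y, x, features, k)] += 1
    return votes.most_common(1)[0][0]


# Made-up toy values (NOT gastric cancer data): two well-separated classes.
train_X = [
    [1.0, 0.9, 1.2, 0.8],
    [0.8, 1.1, 0.9, 1.0],
    [1.2, 1.0, 1.1, 1.3],
    [5.0, 5.2, 4.9, 5.1],
    [5.3, 4.8, 5.1, 4.9],
    [4.9, 5.1, 5.2, 5.0],
]
train_y = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

print(random_subspace_predict(train_X, train_y, [1.1, 1.0, 1.0, 0.9]))
print(random_subspace_predict(train_X, train_y, [5.1, 5.0, 5.0, 5.0]))
```

Because each ensemble member works on only a few features, the approach scales to data with many attributes, which is the property the abstract highlights for cancer data. The sketch omits the boosting, pruning, feature weighting, and rule-conflict steps that the abstract also describes.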

Keywords

Accuracy, Decision Trees, Elapsed Time, Gastric Cancer, Hit Rate, IRBC-DT, Misclassified Instances





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.