
Biomedical Text Mining for Diagnosing Diseases - A Review


  • Research and Development Centre, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India
  • Department of Computer Applications, Bhaktavatsalam Memorial College for Women, Chennai - 600080, Tamil Nadu, India


Diagnosing diseases is a difficult task that must be carried out accurately, and text mining plays an important role in it. A huge volume of data is available in the biomedical field, and text mining techniques can use this data to help diagnose many diseases efficiently. Text mining methods retrieve useful knowledge from large collections of data.

Objective: The aim of this paper is to review several text mining methods used in the biomedical field. This survey is helpful for selecting the best text mining method for biomedical data.

Methods/Analysis: In this paper, classification methods are used to study biomedical text mining for diagnosing diseases. In the biomedical field, classification can be done on the basis of patients' disease patterns to separate patients into high-risk and low-risk groups. Classification comes in two forms: binary classification, which has two classes, and multilevel classification, which has more than two classes. Classification is widely used in biomedical text mining. The classification techniques that can be applied to categorize text are SVM (Support Vector Machine), NN (Neural Network), K-NN (K-Nearest Neighbor), the Bayesian method and DT (Decision Tree).

Findings: Different classification techniques were surveyed, and their merits and limitations are discussed. The various classification techniques were applied to medical data, from which useful patterns and knowledge were extracted. The important task is to select suitable data and a suitable classification method for disease diagnosis. The objective of this survey is to show how classification methods are applied in biomedical applications and to identify which method is suitable and efficient for diagnosing a particular disease.

Novelty/Improvement: The main advantage of the survey is that it can be applied to any kind of dataset, whether it is a description dataset or not. As a future improvement, we will implement our proposed methodology on some major chest disease datasets and measure its performance in terms of training time and diagnostic accuracy.
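To make the binary (high-risk versus low-risk) text classification described above more concrete, the following is a minimal sketch of one of the surveyed technique families, the Bayesian method, written as a small multinomial naive Bayes classifier with Laplace smoothing. It is not the authors' implementation: the function names, the whitespace tokenizer, the class labels and the four toy clinical notes are all assumptions invented for illustration, and a real study would use a proper biomedical corpus and evaluation protocol.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Collect per-class document counts, per-class word counts and the vocabulary."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior under Laplace smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        # Log prior: fraction of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothed likelihood of the token given the class.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy notes and labels, invented for illustration only.
train_docs = [
    "severe chest pain and shortness of breath",
    "persistent cough with high fever and chest pain",
    "mild headache after long screen time",
    "routine checkup no complaints",
]
train_labels = ["high_risk", "high_risk", "low_risk", "low_risk"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "chest pain with fever"))
print(predict(model, "routine checkup mild headache"))
```

The same training and prediction interface would extend to the multilevel case simply by supplying more than two distinct labels, since the classifier scores every class it saw during training.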


Biomedical Text Mining, Classification, Concept Linkage, IE (Information Extraction), Topic Tracking.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.