
Classification of DNA Sequence Using Soft Computing Techniques: A Survey


  • Department of Computer Science, Sri Padmavati Mahila University, Padmavathi Nagar, Near West Railway Station, Chittoor, Tirupati – 517502, Andhra Pradesh, India


Objectives: This survey identifies which soft computing methodologies are frequently used together to solve problems in Deoxyribonucleic acid (DNA) sequencing and provides an overview of the underlying concepts commonly used for DNA classification with soft computing techniques. Methods/Analysis: DNA sequence classification is a significant problem in computational biology. DNA sequences are used to identify differences and similarities between organisms within a species. Attribute selection is the primary criterion in DNA classification. DNA sequence classification techniques involve extracting particular characteristics from the sequences. Different species have distinct genetic structures. Findings: The distinctive strength of soft computing is its ability to learn from empirical data, which helps in DNA classification. The major components of soft computing are Fuzzy Sets (FS), Artificial Neural Networks (ANN), Genetic Algorithms (GAs), Evolutionary Strategies (ES), Support Vector Machines (SVM), Rough Sets (RS), Simulated Annealing (SA), biologically inspired Swarm Optimization (SO), Ant Colony Optimization (ACO) and Tabu Search (TS). Soft computing techniques are recognized as attractive alternatives to standard, conventional hard computing methods. Novelty/Improvement: This paper reviews the different classification approaches that various researchers have proposed for identifying DNA sequences.


Keywords: Classification, DNA Sequence, Soft Computing Techniques
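
The abstract notes that DNA sequence classification depends on selecting attributes and extracting characteristics from the sequences. As a minimal sketch of that idea, and not the method of any surveyed paper, the Python example below turns each sequence into normalized k-mer frequencies and assigns a query to the class with the closest mean feature vector. The function names `kmer_features` and `nearest_centroid`, the choice of k = 2, and the toy sequences are all assumptions made for illustration. A plain nearest-centroid rule stands in here for the soft computing classifiers (ANN, SVM, fuzzy rules and so on) that the survey covers.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence as a fixed-length list."""
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every sequence maps to the same attributes.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def nearest_centroid(train, query, k=2):
    """Classify `query` by the closest class centroid in k-mer feature space."""
    centroids = {}
    for label, seqs in train.items():
        vectors = [kmer_features(s, k) for s in seqs]
        # Centroid = per-attribute mean over the class's training sequences.
        centroids[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    q = kmer_features(query, k)

    def dist(label):
        # Squared Euclidean distance between the query and a class centroid.
        return sum((a - b) ** 2 for a, b in zip(q, centroids[label]))

    return min(centroids, key=dist)


# Toy, made-up sequences purely for illustration (not real biological data).
train = {
    "AT-rich": ["ATATATTAAT", "TTAATATAAT"],
    "GC-rich": ["GCGCGGCCGC", "CCGGCGCGGC"],
}
print(nearest_centroid(train, "ATTATAATTA"))
```

In practice the feature vector produced by `kmer_features` could be passed to any of the classifiers listed in the abstract; only the extraction step is fixed in this sketch.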






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.