Support vector machine ensembles using feature-subset selection for enhancing microarray data classification

Eman Ahmed*, Neamat El Gayar, Iman A. El Azab

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

Support Vector Machines (SVMs) are known to be robust tools for classification and regression in noisy and complex domains. SVM ensembles have been widely used to improve classification accuracy in complicated pattern recognition tasks. A good example is DNA microarray data for tumor classification, which is usually characterized by small sample size, high dimensionality, noise, and large biological variability. In this work we propose applying an ensemble of SVMs coupled with feature-subset selection methods to alleviate the curse of dimensionality associated with expression-based classification of DNA data, in order to achieve stable and reliable results. We compare a single SVM classifier with SVM ensembles that apply two different feature-subset selection techniques, namely random selection and k-means clustering, and that combine the base classifiers using either majority vote or SVM fusion. Two real-world datasets are used as benchmarks to evaluate and compare performance. Experimental results show that the ensemble that uses k-means clustering for feature-subset selection, SVM base classifiers, and an SVM combiner achieves the best classification accuracy, and that the choice of feature-subset selection method can have a considerable impact on classification accuracy.
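
The ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the random feature-subset variant combined by majority vote, and it makes several assumptions that the abstract does not state. The subsets are taken to be disjoint, the base learner is a simple linear SVM trained by subgradient descent on the hinge loss (the abstract does not specify a kernel or solver), the labels are coded as -1 and +1, and the toy data are invented for illustration rather than taken from a microarray dataset. All function names and parameter values are hypothetical.

```python
import random


def random_feature_subsets(n_features, n_subsets, seed=0):
    """Randomly partition feature indices into disjoint subsets (assumed scheme)."""
    rng = random.Random(seed)
    idx = list(range(n_features))
    rng.shuffle(idx)
    return [sorted(idx[i::n_subsets]) for i in range(n_subsets)]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularisation step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge-loss step: move towards classifying xi with margin >= 1.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict_one(w, b, x):
    """Sign of the decision function; a score of exactly 0 maps to +1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


def majority_vote(votes):
    """Combine -1/+1 votes; ties go to +1, so an odd ensemble size is safer."""
    return 1 if sum(votes) >= 0 else -1


def fit_ensemble(X, y, n_subsets=3, seed=0):
    """Train one base SVM per feature subset."""
    models = []
    for cols in random_feature_subsets(len(X[0]), n_subsets, seed):
        X_sub = [[row[c] for c in cols] for row in X]
        models.append((cols, train_linear_svm(X_sub, y)))
    return models


def predict_ensemble(models, X):
    """Each base SVM votes using only its own feature subset."""
    preds = []
    for row in X:
        votes = [predict_one(w, b, [row[c] for c in cols])
                 for cols, (w, b) in models]
        preds.append(majority_vote(votes))
    return preds


# Invented toy data: 4 samples, 6 features, labels in {-1, +1}.
X = [
    [1.0, 0.9, 1.1, 1.0, 0.8, 1.2],
    [-1.0, -1.1, -0.9, -1.2, -1.0, -0.8],
    [0.9, 1.0, 1.0, 1.1, 1.2, 0.9],
    [-0.9, -1.0, -1.2, -1.0, -0.9, -1.1],
]
y = [1, -1, 1, -1]

models = fit_ensemble(X, y, n_subsets=3, seed=0)
preds = predict_ensemble(models, X)
```

The other configurations named in the abstract would change two pieces of this sketch. For k-means feature-subset selection, `random_feature_subsets` would be replaced by a grouping of features into clusters. For SVM fusion, `majority_vote` would be replaced by a second-level SVM trained on the base classifiers' outputs.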

Original language: English
Pages (from-to): 1-11
Number of pages: 11
Journal: International Journal of Applied Mathematics and Statistics
Volume: 28
Issue number: 4
Publication status: Published - 2012

Keywords

  • Ensemble classification
  • Feature selection
  • Feature subsets
  • Microarray data
  • Support vector machines (SVM)
  • SVM fusion

ASJC Scopus subject areas

  • Applied Mathematics
