Comparing feature bias and feature selection strategies for many-attribute machine learning

Silang Luo, David Corne

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We describe the concept of feature bias (FB) strategies and compare them with traditional feature selection (FS) for predictive machine learning on a collection of datasets. FS is a common step in many classification and regression tasks, and it is necessary because machine learning tools often cannot cope when the data has thousands of attributes. However, the strategy used by FS techniques is essentially binary: each feature is either kept or removed. The hope is that most "irrelevant" features are removed before machine learning is applied, so that the subsequent learning stage is much faster (since there are fewer features to process) and also more successful (since the features removed by FS are those that seem unimportant for the classification task at hand). However, FS methods typically rely on standard statistical ideas and cannot guarantee that all and only the relevant features remain. A feature bias strategy is an alternative approach in which no feature is ever entirely removed from consideration. Experimental results reveal that FB can greatly improve upon FS for prediction tasks, particularly on poorly correlated datasets. We propose a tentative guideline for choosing an FS or FB strategy based on the simply calculated inherent correlation of the dataset.
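
The abstract does not say how the bias is computed or applied, so the following is only a minimal sketch of the contrast it draws, not the authors' method. It assumes that relevance is scored by absolute Pearson correlation with the target, that FS keeps the top-k features, and that FB converts the same scores into sampling probabilities with a small floor so no feature ever reaches zero. The function names, the `floor` parameter, and the toy data are all hypothetical.

```python
import math
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return abs(cov / (sx * sy))


def feature_scores(columns, target):
    """Assumed relevance score: one |correlation| value per feature column."""
    return [abs_pearson(col, target) for col in columns]


def select_features(scores, k):
    """FS (binary): keep the k best-scoring feature indices, drop the rest."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def bias_weights(scores, floor=0.05):
    """FB (graded): scores become probabilities; the floor keeps each one > 0."""
    raw = [s + floor for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def sample_features(weights, k, rng):
    """Draw k distinct feature indices, favouring higher-weighted features."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        pick = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


# Toy data (made up): three feature columns and a numeric target.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # strongly related to the target
    [2.0, 1.0, 4.0, 3.0, 5.0],  # moderately related
    [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly related
]
target = [1.1, 1.9, 3.2, 3.9, 5.1]

scores = feature_scores(columns, target)
kept = select_features(scores, k=2)        # the weak feature is discarded outright
weights = bias_weights(scores)             # the weak feature keeps a small weight
subset = sample_features(weights, k=2, rng=random.Random(0))
```

Under these assumptions, a learner that draws feature subsets with `sample_features` (for example, once per tree in an ensemble) would still see the weakly scored feature occasionally, whereas `select_features` removes it for good.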

Original language: English
Title of host publication: Proceedings of the 10th IASTED International Conference on Artificial Intelligence and Applications, AIA 2010
Pages: 50-57
Number of pages: 8
Publication status: Published - 2010
Event: 10th IASTED International Conference on Artificial Intelligence and Applications - Innsbruck, Austria
Duration: 15 Feb 2010 to 17 Feb 2010

Conference

Conference: 10th IASTED International Conference on Artificial Intelligence and Applications
Abbreviated title: AIA 2010
Country/Territory: Austria
City: Innsbruck
Period: 15/02/10 to 17/02/10

Keywords

  • Classification
  • Feature bias
  • Feature selection
  • Machine learning
  • Prediction tasks
  • Proteomics
