Literature Review on Feature Selection

Abu Zaher edited this page May 22, 2013 · 1 revision

Article Title: Feature Selection via Regularized Trees

Authors: Houtao Deng, George Runger

Article Link: http://arxiv.org/pdf/1201.1587

Published In: The 2012 International Joint Conference on Neural Networks (IJCNN), IEEE, 2012.

Publication Year: 2012

Summary

Article Title: A Review of Feature Selection Methods on Synthetic Data

Authors: Verónica Bolón-Canedo, Noelia Sánchez-Maroño, Amparo Alonso-Betanzos

Article Link: http://www.springerlink.com/content/h2673744135jm465/

Publication Year: 2012

Summary

Article Title: An Introduction to Variable and Feature Selection

Authors: Isabelle Guyon, André Elisseeff

Article Link: http://www.unbox.org/stuffed/export/98/doc/guyon2003.pdf

Published In: The Journal of Machine Learning Research

Publication Year: 2003

Summary

Article Title: A Review of Feature Selection Techniques in Bioinformatics

Authors: Yvan Saeys, Iñaki Inza, Pedro Larrañaga

Article Link: http://bioinformatics.oxfordjournals.org/content/23/19/2507.full

Publication Year: 2007

Summary

There are three types of feature selection techniques:

  • Filter Methods
  • Wrapper Methods
  • Embedded Methods

The following image shows the taxonomy of feature selection techniques.

[Figure: A taxonomy of feature selection techniques]

Filter Methods

Filter techniques assess the relevance of features by looking only at the intrinsic properties of the data. A feature relevance score is calculated, and low-scoring features are removed. Afterwards, this subset of features is presented as input to the classification algorithm.
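As a rough illustration of the filter idea, the sketch below scores each feature independently of any classifier and keeps the top-scoring ones. The scoring function (absolute Pearson correlation with the label), the helper names `pearson_abs` and `filter_select`, and the toy data are all assumptions made for this example; the reviewed article covers many other relevance scores.

```python
from statistics import mean

def pearson_abs(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / (var_x * var_y) ** 0.5

def filter_select(X, y, k):
    """Return the indices of the k highest-scoring features, in column order."""
    n_features = len(X[0])
    scores = [pearson_abs([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Hypothetical toy data: feature 0 tracks the label, feature 1 is constant,
# feature 2 is only weakly related to the label.
X = [[0.1, 5.0, 3.0], [0.2, 5.0, 1.0], [0.9, 5.0, 2.0], [1.0, 5.0, 2.5]]
y = [0, 0, 1, 1]
selected = filter_select(X, y, k=1)
print(selected)
```

Note that no classifier appears anywhere in the scoring step, which is what makes this a filter: the reduced feature set would only afterwards be handed to a learning algorithm.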

Wrapper Methods

Wrapper methods embed the model hypothesis search within the feature subset search. In this setup, a search procedure in the space of possible feature subsets is defined, and various subsets of features are generated and evaluated. The evaluation of a specific subset of features is obtained by training and testing a specific classification model, rendering this approach tailored to a specific classification algorithm. To search the space of all feature subsets, a search algorithm is then wrapped around the classification model. However, as the space of feature subsets grows exponentially with the number of features, heuristic search methods are used to guide the search for an optimal subset.
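The wrapper loop described above can be sketched as a greedy forward search wrapped around a classifier. Everything specific here is an assumption chosen to keep the example short: the nearest-centroid classifier, leave-one-out accuracy as the evaluation, the helper names, and the toy data. Forward selection is just one of the heuristic searches the article refers to.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of the classifier restricted to `subset`."""
    proj = [[row[j] for j in subset] for row in X]
    hits = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, proj[i]) == y[i]:
            hits += 1
    return hits / len(proj)

def forward_select(X, y):
    """Greedy forward search: keep adding the feature that most improves accuracy."""
    remaining = list(range(len(X[0])))
    chosen, best = [], 0.0
    while remaining:
        scored = [(loo_accuracy(X, y, chosen + [j]), j) for j in remaining]
        score, j = max(scored, key=lambda s: s[0])  # first maximum wins on ties
        if score <= best:
            break  # no candidate improves the current subset
        chosen.append(j)
        remaining.remove(j)
        best = score
    return chosen, best

# Hypothetical toy data: feature 1 separates the classes, feature 0 does not.
X = [[3.0, 0.1], [1.0, 0.2], [2.0, 0.15], [2.5, 0.9], [1.5, 1.0], [2.2, 0.95]]
y = [0, 0, 0, 1, 1, 1]
chosen, best = forward_select(X, y)
print(chosen, best)
```

Every candidate subset costs a full train-and-test cycle of the classifier, which illustrates why wrappers are tied to one learning algorithm and why exhaustive search over all subsets quickly becomes impractical.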

Embedded Methods

In embedded techniques, the search for an optimal subset of features is built into the classifier construction, and can be seen as a search in the combined space of feature subsets and hypotheses. Just like wrapper approaches, embedded approaches are thus specific to a given learning algorithm. Embedded methods have the advantage that they include the interaction with the classification model, while at the same time being far less computationally intensive than wrapper methods.
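One way to picture an embedded method is a decision tree: growing the tree chooses split features as a side effect of training, so the features it never splits on are effectively discarded. The sketch below is a minimal assumption-laden illustration (Gini impurity, growth until every leaf is pure, hypothetical helper names and toy data), not the regularized-tree method of any article listed on this page.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Return (impurity, feature, threshold) of the best binary split, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between two adjacent distinct values
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or impurity < best[0]:
                best = (impurity, j, t)
    return best

def used_features(X, y, used=None):
    """Grow a tree until every leaf is pure; collect the features it splits on."""
    used = set() if used is None else used
    if len(set(y)) <= 1:
        return used  # pure node, nothing left to split
    split = best_split(X, y)
    if split is None:
        return used  # no feature can separate the remaining rows
    _, j, t = split
    used.add(j)
    left = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    for part in (left, right):
        rows, labs = zip(*part)
        used_features(list(rows), list(labs), used)
    return used

# Hypothetical toy data: feature 1 separates the classes, feature 0 does not.
X = [[3.0, 0.1], [1.0, 0.2], [2.0, 0.15], [2.5, 0.9], [1.5, 1.0], [2.2, 0.95]]
y = [0, 0, 0, 1, 1, 1]
selected = sorted(used_features(X, y))
print(selected)
```

The model is trained only once, and the selected subset falls out of that single training run, which is where the cost advantage over the wrapper approach comes from.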

Feature selection for microarray analysis

The following image shows the key references for each type of feature selection technique in the microarray domain:

[Figure: Key references for each type of feature selection technique in the microarray domain]

Software for Feature Selection

The following image shows a list of popular software packages for Feature Selection.

[Figure: Software for feature selection]