An Unsupervised-based Dynamic Feature Selection for Classification tasks
Feature Selection, Classification, Clustering Algorithms
As the data sets used in classification systems grow in size, selecting
the most relevant attributes has become one of the main tasks in the pre-processing phase.
In a data set, all attributes are expected to be relevant; however, this does not always
hold. Selecting a subset of the most relevant attributes reduces the size of the data
without degrading classification performance, and can even improve it.
Existing feature selection methods choose
the best attributes for the data set as a whole, without considering the particularities of
each instance. The proposed method, Dynamic Feature Selection, selects the relevant
attributes for each instance individually, using clustering algorithms to group instances according
to their similarities. This work presents an experimental analysis of different
clustering techniques applied to this new feature selection approach. The clustering algorithms
k-Means, DBSCAN and Expectation-Maximization (EM) were used as the selection
method. Analyses are performed to verify which of these clustering algorithms best fits
Dynamic Feature Selection. Thus, the contribution of this study is to present a new
approach to attribute selection, Dynamic Feature Selection, and to determine which
of the clustering methods performs the best selection and leads to
more accurate classifiers.
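
As a rough illustration of the idea described above, the following Python sketch groups instances with a plain k-means and then gives every instance the attribute subset chosen for its cluster. The abstract does not state how attributes are ranked inside a cluster, so the scoring rule used here (distance between the cluster mean and the global mean, in units of the global standard deviation) is purely an assumption, as are the function names `kmeans_labels` and `dynamic_feature_selection` and the parameters `k` and `n_keep`. It should be read as a sketch under these assumptions, not as the authors' implementation.

```python
import random
from statistics import mean, pstdev


def kmeans_labels(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])))
            for row in rows
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


def dynamic_feature_selection(rows, k, n_keep, seed=0):
    """Return (labels, subsets): subsets[i] lists the attribute indices kept for row i.

    ASSUMPTION: attributes are scored per cluster by how far the cluster mean
    lies from the global mean, scaled by the global standard deviation.
    The source text does not specify the actual criterion.
    """
    labels = kmeans_labels(rows, k, seed=seed)
    columns = list(zip(*rows))
    global_mean = [mean(col) for col in columns]
    global_std = [pstdev(col) for col in columns]

    subset_for_cluster = {}
    for c in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == c]
        cluster_mean = [mean(col) for col in zip(*members)]
        scores = [
            abs(cm - gm) / (gs + 1e-12)  # small constant avoids division by zero
            for cm, gm, gs in zip(cluster_mean, global_mean, global_std)
        ]
        # Rank attributes by descending score and keep the top n_keep.
        ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
        subset_for_cluster[c] = sorted(ranked[:n_keep])

    # Every instance inherits the attribute subset of its own cluster.
    return labels, [subset_for_cluster[lab] for lab in labels]


if __name__ == "__main__":
    # Toy data: the third attribute is constant and carries no information.
    toy = [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 0.0, 1.0),
           (10.0, 10.0, 1.0), (10.0, 11.0, 1.0), (11.0, 10.0, 1.0)]
    toy_labels, toy_subsets = dynamic_feature_selection(toy, k=2, n_keep=2)
    for row, lab, kept in zip(toy, toy_labels, toy_subsets):
        print(row, "cluster", lab, "keeps attributes", kept)
```

DBSCAN or EM could be tried in place of k-means by replacing `kmeans_labels` with any function that returns one label per instance; the per-cluster selection step would stay unchanged.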