Dynamic Feature Selection for Ensemble Systems
Feature Selection, Classification, Ensembles, Diversity
In machine learning, data preprocessing aims to improve data quality by analyzing the data and identifying problems in it, so that the learning technique receives good-quality data. Feature selection is one of the most important preprocessing phases. Its main aim is to choose the subset of features that best represents the dataset, reducing dimensionality and increasing classifier performance. There are different feature selection approaches, one of which is Dynamic Feature Selection. Dynamic Feature Selection selects the best subset of attributes for each instance, instead of a single subset for the full dataset. Once a more compact data representation has been selected, the next step in classification is to choose the model that will classify the data. This model can consist of a single classifier or of a system with multiple classifiers, known as a classifier ensemble. Such systems combine the outputs of their components to obtain a final answer. For an ensemble to perform better than a single classifier, it is necessary to promote diversity among its components; that is, the base classifiers should not make mistakes on the same patterns. For this reason, diversity is considered one of the most important aspects of using ensembles. The aim of this work is to use Dynamic Feature Selection in ensemble systems. To this end, three versions were developed to adapt this feature selection method and to create diversity among the classifiers of the ensemble. The versions were compared using different selection rates in an ensemble with five classifiers. The best version was then tested with different ensemble sizes.