Performance Evaluation of Evolving Algorithms for Data Stream Clustering
Evolving systems, Data stream, Online learning, Clustering.
Applications that generate continuous data streams, such as TCP/IP traffic, e-commerce, and industrial monitoring, can make fully offline machine learning algorithms impractical: data generation grows without bound, while storage and memory are limited. This has motivated algorithms that learn totally or partially online. Among them, evolving algorithms have attracted interest because they can develop and update themselves in unknown environments and can detect concept drift and evolution in the input data over time. Given the wide applicability of these algorithms to real problems, this work compares the performance of evolving algorithms that perform density-based clustering, namely DenStream, SOStream, Macro-SOStream, and AutoCloud. The comparison uses three synthetic datasets and the Adjusted Rand Index (ARI) as the clustering performance metric.
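To make the evaluation metric concrete, the following is a minimal sketch of the Adjusted Rand Index in its standard pair-counting form. The text above does not state how the ARI was computed in this work, so the function name `adjusted_rand_index` and the handling of degenerate cases are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two flat partitions.

    Illustrative helper (not taken from the paper): it returns 1.0 for
    identical partitions (up to label renaming) and values near 0.0 for
    agreement no better than chance.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two samples: treated as perfect agreement (assumption).
        return 1.0

    # Contingency counts n_ij and the marginals a_i (true) and b_j (predicted).
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Both partitions are trivial (all-in-one or all singletons).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    found = [1, 1, 0, 0, 2, 2]  # same grouping, different label names
    print(adjusted_rand_index(truth, found))
```

Because the index compares pairs of samples rather than label values, the cluster identifiers produced by a stream clustering algorithm do not need to match the ground-truth class names.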