Online clustering with interpretable drift adaptation to mixed features
Flavio Corradini; Vincenzo Nucci; Marco Piangerelli; Barbara Re
In press
Abstract
In the era of big data, the rapid pace and variability of information have become increasingly evident, particularly in areas such as seasonal trends and manufacturing processes. Because the environments that produce these data are dynamic, the behavior of the data is time-dependent, and treating data streams as static entities is no longer effective. This has led to the concept of data drift, which refers to shifts in data distribution over time. Stream processing algorithms are designed to detect these changes promptly and adapt to the newly emerging data patterns. In this work, we introduce FURAKI, an online clustering algorithm that incorporates drift detection. It employs a binary tree structure and can handle both single-feature and mixed-feature data from unbounded streams. We tested FURAKI extensively against state-of-the-art algorithms on various datasets. Our findings show that FURAKI outperforms these algorithms on the datasets considered.


