Online clustering with interpretable drift adaptation to mixed features

Flavio Corradini; Vincenzo Nucci; Marco Piangerelli; Barbara Re
In press

Abstract

In the era of big data, the rapid pace and variability of information have become increasingly evident, particularly in areas like seasonal trends and manufacturing processes. The dynamic nature of the environments that produce these data means that their behavior is time-dependent. Consequently, treating data streams as static entities is no longer effective. This has led to the concept of data drift, which refers to shifts in data distribution over time. Stream processing algorithms are designed to detect these changes promptly and adjust to the newly emerging data patterns. In our research, we introduce FURAKI, an innovative online clustering algorithm that incorporates drift detection. It employs a binary tree structure and is capable of handling both single-feature and mixed-feature data from unbounded streams. We conducted extensive testing of FURAKI against state-of-the-art algorithms using various datasets. Our findings reveal that FURAKI outperforms the state-of-the-art algorithms on the considered datasets.
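The abstract defines data drift as a shift in data distribution over time, and the keywords of this record mention the G-test, but neither states how FURAKI applies it. Purely as a generic illustration, and not as the authors' method, the following minimal sketch compares a reference window and a recent window of one categorical feature with a G-test of homogeneity. The function name `g_statistic`, the window contents, and the 0.05 significance threshold are all assumptions made for this example.

```python
import math
from collections import Counter


def g_statistic(reference, recent):
    """G-test of homogeneity between two windows of categorical values.

    Hypothetical helper for illustration only; not taken from FURAKI.
    """
    ref_counts, new_counts = Counter(reference), Counter(recent)
    n_ref, n_new = len(reference), len(recent)
    total = n_ref + n_new
    g = 0.0
    for category in set(ref_counts) | set(new_counts):
        pooled = ref_counts[category] + new_counts[category]
        for observed, n in ((ref_counts[category], n_ref),
                            (new_counts[category], n_new)):
            # Expected count if both windows shared one distribution.
            expected = pooled * n / total
            if observed > 0:  # a zero count contributes nothing to G
                g += 2.0 * observed * math.log(observed / expected)
    return g


# Chi-square critical value for alpha = 0.05 and 1 degree of freedom
# (two categories, two windows). The threshold is an assumed choice.
CRITICAL_VALUE = 3.841

reference = ["a"] * 50 + ["b"] * 50   # assumed reference window
same = ["a"] * 48 + ["b"] * 52        # similar distribution
shifted = ["a"] * 80 + ["b"] * 20     # clearly shifted distribution

print("drift (similar window):", g_statistic(reference, same) > CRITICAL_VALUE)
print("drift (shifted window):", g_statistic(reference, shifted) > CRITICAL_VALUE)
```

With these toy windows, only the shifted window exceeds the threshold. In a streaming setting such a comparison would be repeated as new data arrive, and the window sizes and significance level would need tuning.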
Keywords: concept drift monitoring, unsupervised learning, G-test, clustering
262

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11581/491164