Survey of Data Stream Clustering Algorithms
Keywords:
Data Streams, Data Stream Clustering, Real-Time Clustering
Abstract
Data stream mining is an active research field because it discovers knowledge from large amounts of data that are continuously created and collected in real time. Clustering, an unsupervised learning task, is one of the most common tasks in data stream mining. In this research, we present the main concepts and common characteristics of data stream clustering algorithms, such as concept drift, data structures, time windows, and data processing methods. We also discuss some challenges faced by these algorithms, such as handling outliers, evolving data, limited memory and time, and processing multi-dimensional and multi-group data.
Additionally, we provide a sample of data stream clustering algorithms and use statistical graphics to illustrate the concepts and challenges discussed in this research, in order to clarify and compare the criteria these algorithms use.