Benabdellah, A.C. and Benghabrit, A. and Bouhaddou, I. (2019) A survey of clustering algorithms for an industrial context. In: UNSPECIFIED.

Full text not available from this repository.

Across a wide variety of fields, and especially in industrial companies, data are being collected and accumulated at a dramatic pace from many different sources and services. Hence, there is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information from the rapidly growing volumes of digital data. Clustering is a well-known, fundamental data mining task for extracting such information. However, as applications have diversified across domains, researchers have developed a large number of clustering algorithms, and this proliferation makes it difficult for researchers and practitioners to keep up with developments in the field. Finding an appropriate algorithm therefore helps significantly to organize information and to extract correct answers to database queries. In this respect, the aim of this paper is to find the appropriate clustering algorithm for sparse industrial datasets. To achieve this goal, we first present related work from the past twenty years that focuses on comparing different clustering algorithms. We then provide a categorization of the clustering algorithms found in the literature by matching their properties to the 4V challenges of Big Data, which allows us to select the candidate clustering algorithms. Finally, K-means, agglomerative hierarchical clustering, DBSCAN and SOM are implemented and compared on four datasets using internal validity indices. In addition, we highlight the best-performing clustering algorithm, that is, the one that yields the most efficient clusters, for each dataset. © 2019 The Authors.
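
The comparison step described in the abstract, ranking clusterings by an internal validity index, can be illustrated with a minimal sketch. The abstract does not say which indices, datasets or implementations the authors used, so everything here is an assumption: the silhouette coefficient stands in for the unspecified indices, the toy points stand in for a dataset, and the two hand-written labelings stand in for the outputs of two clustering algorithms. The helper names `euclidean` and `silhouette_score` are hypothetical and written in plain Python.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient, one common internal validity index.

    Assumption: chosen only for illustration; the paper's indices are not
    named in the abstract.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Hypothetical label vectors standing in for the outputs of two algorithms.
candidate_labelings = {
    "good split": [0, 0, 0, 1, 1, 1],
    "poor split": [0, 1, 0, 1, 0, 1],
}

scores = {name: silhouette_score(points, labels)
          for name, labels in candidate_labelings.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

A higher silhouette value indicates clusters that are compact and well separated, so the labeling with the highest score would be preferred under this index. In practice the label vectors would come from the algorithms being compared, and several indices would normally be reported side by side.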

Item Type: Conference or Workshop Item (UNSPECIFIED)
Uncontrolled Keywords: Aircraft; Computational complexity; Data mining; Fighter aircraft; Intelligent computing; Logistics; Query processing; Unsupervised learning, Agglomerative hierarchical; Algorithms development; Automotive; Computational theories; Extract informations; Industrial companies; Industrial datasets; Sparse dataset, Clustering algorithms
Subjects: Computer Science
Divisions: SCIENTIFIC PRODUCTION > Computer Science
Depositing User: Administrateur Eprints Administrateur Eprints
Last Modified: 31 Jan 2020 15:45
