Evolutionary and Swarm Algorithm Optimized Density-Based Clustering and Classification for Data Analytics



Guan, C
(2017) Evolutionary and Swarm Algorithm Optimized Density-Based Clustering and Classification for Data Analytics. PhD thesis, University of Liverpool.

Abstract

Clustering is one of the most widely used pattern recognition techniques in data analytics. Density-based clustering is a category of clustering methods that can find arbitrarily shaped clusters. A well-known density-based clustering algorithm is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). DBSCAN has three drawbacks: first, its parameters are hard to set; second, users cannot control the number of clusters; and third, it cannot be used directly as a classifier. To address these drawbacks, a novel framework, Evolutionary and Swarm Algorithm optimised Density-based Clustering and Classification (ESA-DCC), is proposed. Evolutionary and Swarm Algorithms (ESAs) have been applied to optimisation problems in many research fields, including data analytics. Numerous categories of ESAs have been proposed, such as Genetic Algorithms (GAs), Particle Swarm Optimization (PSO), Differential Evolution (DE) and Artificial Bee Colony (ABC). In this thesis, an ESA is used to search for the best parameters of density-based clustering and classification in the ESA-DCC framework, which addresses the first drawback of DBSCAN. To offset the second drawback, four types of fitness functions are defined that let users set the number of clusters as an input. To address the third drawback, a supervised fitness function is defined so that ESA-DCC can be used as a classifier. Four ESA-DCC methods, GA-DCC, PSO-DCC, DE-DCC and ABC-DCC, are developed. The performance of the ESA-DCC methods is compared with K-means and DBSCAN on ten datasets. The experimental results indicate that the proposed ESA-DCC methods can find optimised parameters in both supervised and unsupervised contexts. The proposed methods are applied to a product recommender system and to image segmentation.
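
The general idea in the abstract, an evolutionary search over DBSCAN's parameters guided by a fitness function that takes the desired number of clusters as input, can be illustrated with a small sketch. This is not the thesis's implementation: the function names (`dbscan`, `fitness`, `de_search`), the fitness formula (cluster-count mismatch plus noise fraction), the Differential Evolution settings, and the toy data are all assumptions chosen for illustration, and the thesis's four fitness functions are not reproduced here.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: -1 for noise, 0.. for clusters."""
    n = len(points)
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1              # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])   # only core points expand the cluster
    return labels

def fitness(params, points, target_k):
    """Hypothetical unsupervised fitness (lower is better), not taken from the thesis."""
    eps, min_pts = params[0], max(1, round(params[1]))
    labels = dbscan(points, eps, min_pts)
    k = len(set(labels) - {-1})
    noise_fraction = labels.count(-1) / len(labels)
    return abs(k - target_k) + noise_fraction

def de_search(points, target_k, bounds, pop_size=12, generations=25,
              f=0.6, cr=0.9, seed=0):
    """Basic DE/rand/1/bin search over (eps, min_pts); settings are illustrative."""
    rng = random.Random(seed)
    clip = lambda v, lo, hi: min(max(v, lo), hi)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(p, points, target_k) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            forced = rng.randrange(len(bounds))   # at least one mutated dimension
            trial = [
                clip(a[d] + f * (b[d] - c[d]), *bounds[d])
                if (rng.random() < cr or d == forced) else pop[i][d]
                for d in range(len(bounds))
            ]
            s = fitness(trial, points, target_k)
            if s <= scores[i]:                    # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Toy data (an assumption): three well-separated 2-D Gaussian blobs.
data_rng = random.Random(1)
centres = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
points = [(data_rng.gauss(cx, 0.4), data_rng.gauss(cy, 0.4))
          for cx, cy in centres for _ in range(20)]

# Search eps in [0.1, 2.0] and min_pts in [2, 10] for a user-specified 3 clusters.
best_params, best_score = de_search(points, target_k=3,
                                    bounds=[(0.1, 2.0), (2.0, 10.0)])
print("eps:", best_params[0], "min_pts:", round(best_params[1]),
      "fitness:", best_score)
```

A supervised variant in the same spirit would replace `fitness` with a score computed against known class labels, so the tuned clustering can act as a classifier; swapping `de_search` for a GA, PSO or ABC loop leaves the rest of the sketch unchanged.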

Item Type: Thesis (PhD)
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 23 Aug 2018 08:55
Last Modified: 19 Jan 2023 06:33
DOI: 10.17638/03021212
URI: https://livrepository.liverpool.ac.uk/id/eprint/3021212