Mining comprehensible clustering rules with an evolutionary algorithm

Ioannis Sarafis, Phil Trinder, Ali Zalzala

Research output: Chapter in Book/Report/Conference proceeding › Chapter (peer-reviewed)

Abstract

In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule includes d intervals, one for each feature. The encoding scheme is non-binary as the values for the boundaries of the intervals are drawn from discrete domains, which reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules forming the genuine clusters in data. NOCEA has been evaluated on challenging datasets and we present results showing that it meets many of the requirements for DM clustering, such as ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of input data and insensitive to the presence of outliers, and to initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results. © Springer-Verlag Berlin Heidelberg 2003.
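The rule representation described in the abstract can be illustrated with a minimal Python sketch. It is not the NOCEA implementation: the names `Rule`, `covers`, `overlaps`, and `is_non_overlapping`, the half-open interval convention, and the sample rules are all assumptions made for illustration. It only shows what "a rule with d intervals, one for each feature" and "non-overlapping" could mean in code; the evolutionary search, the quantization of the feature space, the fitness function, and the density-based merging operator are not modelled.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Half-open interval [low, high); this convention is an assumption of the sketch.
Interval = Tuple[float, float]


@dataclass(frozen=True)
class Rule:
    """A hyper-rectangular clustering rule: one interval per feature."""
    intervals: Tuple[Interval, ...]

    def covers(self, point: Sequence[float]) -> bool:
        # A point satisfies the rule only if every feature lies in its interval.
        return len(point) == len(self.intervals) and all(
            lo <= x < hi for x, (lo, hi) in zip(point, self.intervals)
        )


def overlaps(a: Rule, b: Rule) -> bool:
    # Two rules overlap only if their intervals intersect in every dimension.
    return all(
        max(lo_a, lo_b) < min(hi_a, hi_b)
        for (lo_a, hi_a), (lo_b, hi_b) in zip(a.intervals, b.intervals)
    )


def is_non_overlapping(rules: Sequence[Rule]) -> bool:
    # Check each unordered pair of rules exactly once.
    return not any(
        overlaps(rules[i], rules[j])
        for i in range(len(rules))
        for j in range(i + 1, len(rules))
    )


# Hypothetical two-feature rules (d = 2).
r1 = Rule(((0.0, 2.0), (0.0, 1.0)))
r2 = Rule(((2.0, 4.0), (0.0, 1.0)))  # shares a boundary with r1, no overlap
r3 = Rule(((1.0, 3.0), (0.5, 2.0)))  # intersects both r1 and r2
```

Under these assumptions, `r1` and `r2` form a valid non-overlapping set, while adding `r3` violates the constraint. Because each rule is a conjunction of per-feature ranges, a discovered cluster can be read directly as a condition on the features.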

Original language: English
Title of host publication: Genetic and Evolutionary Computation — GECCO 2003
Subtitle of host publication: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II
Pages: 2301-2312
Number of pages: 12
Volume: 2724
ISBN (Electronic): 978-3-540-45110-5
DOIs: https://doi.org/10.1007/3-540-45110-2_123
Publication status: Published - 2003

Publication series

Name: Lecture Notes in Computer Science
Volume: 2724
ISSN (Print): 0302-9743

Fingerprint

Evolutionary algorithms
Data mining
Merging

Cite this

Sarafis, I., Trinder, P., & Zalzala, A. (2003). Mining comprehensible clustering rules with an evolutionary algorithm. In Genetic and Evolutionary Computation — GECCO 2003: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II (Vol. 2724, pp. 2301-2312). (Lecture Notes in Computer Science; Vol. 2724). https://doi.org/10.1007/3-540-45110-2_123
Sarafis, Ioannis ; Trinder, Phil ; Zalzala, Ali. / Mining comprehensible clustering rules with an evolutionary algorithm. Genetic and Evolutionary Computation — GECCO 2003: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II. Vol. 2724 2003. pp. 2301-2312 (Lecture Notes in Computer Science).
@inbook{5df4ec0c11bc4717a0405d93fcce67ec,
title = "Mining comprehensible clustering rules with an evolutionary algorithm",
abstract = "In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule includes d intervals, one for each feature. The encoding scheme is non-binary as the values for the boundaries of the intervals are drawn from discrete domains, which reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules forming the genuine clusters in data. NOCEA has been evaluated on challenging datasets and we present results showing that it meets many of the requirements for DM clustering, such as ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of input data and insensitive to the presence of outliers, and to initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results. {\textcopyright} Springer-Verlag Berlin Heidelberg 2003.",
author = "Ioannis Sarafis and Phil Trinder and Ali Zalzala",
year = "2003",
doi = "10.1007/3-540-45110-2_123",
language = "English",
isbn = "978-3-540-40603-7",
volume = "2724",
series = "Lecture Notes in Computer Science",
pages = "2301--2312",
booktitle = "Genetic and Evolutionary Computation --- GECCO 2003",

}

Sarafis, I, Trinder, P & Zalzala, A 2003, Mining comprehensible clustering rules with an evolutionary algorithm. in Genetic and Evolutionary Computation — GECCO 2003: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II. vol. 2724, Lecture Notes in Computer Science, vol. 2724, pp. 2301-2312. https://doi.org/10.1007/3-540-45110-2_123

Mining comprehensible clustering rules with an evolutionary algorithm. / Sarafis, Ioannis; Trinder, Phil; Zalzala, Ali.

Genetic and Evolutionary Computation — GECCO 2003: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II. Vol. 2724 2003. p. 2301-2312 (Lecture Notes in Computer Science; Vol. 2724).


TY  - CHAP
T1  - Mining comprehensible clustering rules with an evolutionary algorithm
AU  - Sarafis, Ioannis
AU  - Trinder, Phil
AU  - Zalzala, Ali
PY  - 2003
Y1  - 2003
N2  - In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule includes d intervals, one for each feature. The encoding scheme is non-binary as the values for the boundaries of the intervals are drawn from discrete domains, which reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules forming the genuine clusters in data. NOCEA has been evaluated on challenging datasets and we present results showing that it meets many of the requirements for DM clustering, such as ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of input data and insensitive to the presence of outliers, and to initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results. © Springer-Verlag Berlin Heidelberg 2003.
AB  - In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule includes d intervals, one for each feature. The encoding scheme is non-binary as the values for the boundaries of the intervals are drawn from discrete domains, which reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules forming the genuine clusters in data. NOCEA has been evaluated on challenging datasets and we present results showing that it meets many of the requirements for DM clustering, such as ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of input data and insensitive to the presence of outliers, and to initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results. © Springer-Verlag Berlin Heidelberg 2003.
UR  - http://www.scopus.com/inward/record.url?scp=35248888035&partnerID=8YFLogxK
U2  - 10.1007/3-540-45110-2_123
DO  - 10.1007/3-540-45110-2_123
M3  - Chapter (peer-reviewed)
SN  - 978-3-540-40603-7
VL  - 2724
T3  - Lecture Notes in Computer Science
SP  - 2301
EP  - 2312
BT  - Genetic and Evolutionary Computation — GECCO 2003
ER  -

Sarafis I, Trinder P, Zalzala A. Mining comprehensible clustering rules with an evolutionary algorithm. In Genetic and Evolutionary Computation — GECCO 2003: Genetic and Evolutionary Computation Conference Chicago, IL, USA, July 12–16, 2003 Proceedings, Part II. Vol. 2724. 2003. p. 2301-2312. (Lecture Notes in Computer Science). https://doi.org/10.1007/3-540-45110-2_123