A Search-Based Approach to Multi-view Clustering of Software Systems

A. M. Saeidi, J. Hage, R. Khadka, Slinger Jansen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Citations (Scopus)

Abstract

Unsupervised software clustering is the problem of automatically decomposing a software system into meaningful units. Some approaches rely solely on the structure of the system, such as the module dependency graph, to decompose it into cohesive groups of modules. Other techniques focus on the informal knowledge embedded in the source code itself to retrieve the modular architecture of the system. However, for large systems, both kinds of technique fail to produce decompositions that correspond to the actual architecture of the system. To overcome this problem, we propose a novel approach to clustering software systems that incorporates knowledge from different viewpoints of the system, such as the knowledge embedded within the source code and the structural dependencies within the system. In this setting, we adopt a search-based approach to the encoding of multi-view clustering and investigate two approaches to tackle this problem: one based on a linear combination of objectives into a single objective, the other a multi-objective approach to clustering. We evaluate the two approaches on a dataset of 10 substantial open source Java projects. Finally, we propose two techniques, based on interpolation and hierarchical clustering, to combine the different results obtained into a single result for the single-objective and multi-objective encodings, respectively.
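To make the distinction between the two encodings mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate clustering has already been scored on two views (a structural score and a semantic score, both to be maximized). The function names, the weight `alpha`, and the numeric scores are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (structural_score, semantic_score); both are assumed to be maximized.
Scores = Tuple[float, float]


def weighted_sum(scores: Scores, alpha: float) -> float:
    """Single-objective view: collapse both scores into one number.

    alpha is a hypothetical weight on the structural score.
    """
    structural, semantic = scores
    return alpha * structural + (1.0 - alpha) * semantic


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Scores]) -> List[Scores]:
    """Multi-objective view: keep every candidate that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up scores for four candidate clusterings (illustration only).
candidates = [(0.9, 0.2), (0.6, 0.6), (0.3, 0.8), (0.5, 0.5)]

# The linear combination yields one winner for a given weight ...
best_single = max(candidates, key=lambda s: weighted_sum(s, alpha=0.5))

# ... whereas the multi-objective view yields a set of trade-off solutions.
front = pareto_front(candidates)
```

In this sketch the single-objective encoding returns one clustering per choice of `alpha`, while the multi-objective encoding returns several non-dominated clusterings. This is one way to see why the abstract proposes a separate combination step for each encoding.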
Original language: English
Title of host publication: 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2015)
Publisher: IEEE
ISBN (Electronic): 9781479984695
Publication status: Published - 9 Apr 2015
