A model-based clustering of expectation–maximization and K-means algorithms in crime hotspot analysis

Simon Kojo Appiah*, Kingsley Wirekoh, Eric Nimako Aidoo, Samuel Dua Oduro, Yarhands Dissou Arthur

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Hotspot analysis of spatial attributes is a persistent research field in data mining, and model-based clustering is becoming increasingly popular for identifying trends and patterns in datasets on crime events occurring in space. The distributions of potential crime hotspots are parameterized as arising from multivariate Gaussian distributions, whose parameters are estimated by the expectation–maximization (E-M) algorithm, an iterative process whose convergence is very sensitive to initialization. In this study, a model-based clustering algorithm is developed from the E-M algorithm, initialized by K-means clustering with geodesic distance classification to estimate the model parameters, and is compared with the classical E-M algorithm, initialized with hierarchical clustering, in identifying the distributional patterns of the incidence of criminal activities. Both model-based clustering algorithms were demonstrated on a large open-source dataset of violent crimes that occurred in West Midlands County. Training the data as a Gaussian process, the study identified 12 hotspots of Gaussian mixed models as clusters with ellipsoidal distributions varying in shape, volume, and orientation, mostly located in the central parts of the boroughs of the study area. The proposed model-based clustering, which combines the E-M algorithm with K-means clustering, proved efficient: it converged quickly and stably, had a low probability of classification uncertainty, and in some cases produced the same classification as the classical E-M and K-means algorithms. The combined model-based clustering techniques applied to the hotspot analysis of criminal activities in space will not only provide insight into crime prediction and resource allocation in crime-combating strategies but also guide researchers in adopting mechanisms for modeling large spatial attributes in data mining.
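
The initialization idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions made only for illustration: the haversine formula stands in for the geodesic distance, the K-means centroid update simply averages raw coordinates (reasonable only over a small region), the function names `haversine_km`, `kmeans_geodesic`, `gaussian_pdf`, and `em_gmm` are hypothetical, classification uncertainty is taken to be one minus the largest posterior membership probability, and the data are synthetic points around two made-up centres rather than the West Midlands dataset or its 12 hotspots.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(points, centers):
    """Great-circle distance in km between (lat, lon) points and centers, in degrees."""
    p = np.radians(points)[:, None, :]
    c = np.radians(centers)[None, :, :]
    dlat = p[..., 0] - c[..., 0]
    dlon = p[..., 1] - c[..., 1]
    a = np.sin(dlat / 2) ** 2 + np.cos(p[..., 0]) * np.cos(c[..., 0]) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def kmeans_geodesic(points, k, n_iter=50, seed=0):
    """K-means whose assignment step uses haversine distance (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = haversine_km(points, centers).argmin(axis=1)
        # Centroid update averages raw coordinates: adequate over a small region only.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return haversine_km(points, centers).argmin(axis=1)


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    diff = x - mean
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2 * np.pi) ** x.shape[1] * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm


def em_gmm(x, labels, k, n_iter=200, tol=1e-6, reg=1e-9):
    """E-M for a full-covariance Gaussian mixture, started from hard labels."""
    n, d = x.shape
    resp = np.eye(k)[labels]  # one-hot responsibilities from the initial partition
    prev_ll = -np.inf
    for _ in range(n_iter):
        # M-step: update mixing weights, means, and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        covs = np.empty((k, d, d))
        for j in range(k):
            diff = x - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        # E-step: posterior membership probabilities under current parameters.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(x, means[j], covs[j]) for j in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, covs, resp


# Synthetic (lat, lon) points around two made-up centres; not the study's data.
rng = np.random.default_rng(42)
true_centers = np.array([[52.48, -1.90], [52.41, -1.51]])
data = np.vstack([c + 0.02 * rng.standard_normal((200, 2)) for c in true_centers])

init_labels = kmeans_geodesic(data, k=2)
weights, means, covs, resp = em_gmm(data, init_labels, k=2)
uncertainty = 1.0 - resp.max(axis=1)  # assumed definition: 1 minus top posterior
print(means.round(3), uncertainty.max())
```

Passing labels from a hierarchical clustering step to `em_gmm` instead of those from `kmeans_geodesic` would give a rough analogue of the classical initialization that the study compares against.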

Original language: English
Article number: 2073662
Journal: Research in Mathematics
Volume: 9
Issue number: 1
Early online date: 15 May 2022
Publication status: Published - 31 Dec 2022

Keywords

  • clusters
  • crime hotspot
  • expectation–maximization (E-M) algorithm
  • Gaussian mixed models (GMMs)
  • initialization
  • K-means algorithm
  • model-based clustering

ASJC Scopus subject areas

  • General Mathematics
