Multi-Objective Topic Modelling

David Corne, Osama Khalifa, Mike Chantler, Fraser Halley

Research output: Contribution to conference, Paper, peer-review

14 Citations (Scopus)

Abstract

Topic Modeling (TM) is a rapidly growing area at the interface of text mining, artificial intelligence and statistical modeling, and is increasingly deployed to address the ‘information overload’ associated with extensive text repositories. The goal in TM is typically to infer a rich yet intuitive summary model of a large document collection: a set of topics that characterizes the collection, each topic being a probability distribution over words, along with the degree to which each individual document is concerned with each topic. The model then supports segmentation, clustering, profiling, browsing, and many other tasks. Current approaches to TM, dominated by Latent Dirichlet Allocation (LDA), assume a topic-driven document generation process and find a model that maximizes the likelihood of the data with respect to this process. This is clearly sensitive to any mismatch between the ‘true’ generating process and the statistical model, while it is also clear that the quality of a topic model is multi-faceted and complex. Individual topics should be intuitively meaningful, sensibly distinct, and free of noise. Here we investigate multi-objective approaches to TM, which attempt to infer coherent topic models by navigating the trade-offs between objectives that are oriented towards coherence as well as coverage of the corpus at hand. Comparisons with LDA show that adopting multi-objective evolutionary algorithm (MOEA) approaches yields significantly more coherent topics than LDA, consequently enhancing the use and interpretability of these models in a range of applications, without significant degradation in generalization ability.
Original language: English
Pages: 51-65
Number of pages: 15
Publication status: Published, 19 Mar 2013
Event: 7th International Conference on Evolutionary Multicriterion Optimization, Sheffield, United Kingdom
Duration: 19 Mar 2013 to 23 Mar 2013

Conference

Conference: 7th International Conference on Evolutionary Multicriterion Optimization
Country/Territory: United Kingdom
City: Sheffield
Period: 19/03/13 to 23/03/13
