A semi-supervised clustering approach for semantic slot labelling

Heriberto Cuayahuitl, Nina Dethlefs, Helen Hastie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Work on training semantic slot labellers for use in Natural Language Processing applications has typically either relied on large amounts of labelled input data, or has assumed entirely unlabelled inputs. The former technique tends to be costly to apply, while the latter is often not as accurate as its supervised counterpart. Here, we present a semi-supervised learning approach that automatically labels the semantic slots in a set of training data and aims to strike a balance between the dependence on labelled data and prediction accuracy. The essence of our algorithm is to cluster clauses based on a similarity function that combines lexical and semantic information. We present experiments that compare different similarity functions for both our semi-supervised setting and a fully unsupervised baseline. While semi-supervised learning expectedly outperforms unsupervised learning, our results show that (1) this effect can be observed based on very few training data instances and that increasing the size of the training data does not lead to better performance, and (2) that lexical and semantic information contribute differently in different domains so that clustering based on both types of information offers the best generalisation.

Original language: English
Title of host publication: Proceedings - 2014 13th International Conference on Machine Learning and Applications, ICMLA 2014
Publisher: IEEE
Pages: 500-505
Number of pages: 6
ISBN (Print): 9781479974153
Publication status: Published - 2014
Event: 2014 13th International Conference on Machine Learning and Applications - Detroit, United States
Duration: 3 Dec 2014 → 6 Dec 2014

Conference

Conference: 2014 13th International Conference on Machine Learning and Applications
Abbreviated title: ICMLA 2014
Country/Territory: United States
City: Detroit
Period: 3/12/14 → 6/12/14

Keywords

  • interactive systems
  • natural language processing
  • semantic slot labelling
  • semi-supervised learning
