Comparing attribute classifiers for interactive language grounding

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We address the problem of interactively learning perceptually grounded word meanings in a multimodal dialogue system. We design semantic and visual processing components to support this and illustrate how they can be integrated. We then focus on comparing the performance (Precision, Recall, F1, AUC) of three state-of-the-art attribute classifiers (MLKNN, DAP, and SVMs) for the purpose of interactive language grounding, on the aPascal-aYahoo datasets. In prior work, results were presented for object classification using these methods for attribute labelling, whereas we focus on their performance for attribute labelling itself. We find that while these methods can perform well for some of the attributes (e.g. head, ears, furry), none of these models performs well over the whole attribute set, and none supports incremental learning. This leads us to suggest directions for future work.
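
For concreteness, the evaluation setting the abstract describes treats each attribute (e.g. "furry") as an independent binary labelling task, scored with the four listed metrics on held-out data. The sketch below is not the paper's code: it is a minimal illustration of the SVM case using scikit-learn, with synthetic placeholder features and labels standing in for the aPascal-aYahoo image features and attribute annotations.

# Minimal sketch (not the authors' implementation): train one binary SVM
# per attribute and report Precision, Recall, F1, and AUC. X and y below
# are synthetic placeholders; in the paper's setting they would be image
# feature vectors and one attribute's binary labels from aPascal-aYahoo.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))                          # placeholder image features
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)    # placeholder attribute labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LinearSVC(C=1.0).fit(X_tr, y_tr)   # one binary classifier per attribute
y_pred = clf.predict(X_te)               # hard labels for P/R/F1
scores = clf.decision_function(X_te)     # continuous scores for AUC

p, r, f1, _ = precision_recall_fscore_support(y_te, y_pred, average="binary")
auc = roc_auc_score(y_te, scores)
print(f"P={p:.3f}  R={r:.3f}  F1={f1:.3f}  AUC={auc:.3f}")

Repeating this loop over every attribute in the set, and swapping in MLKNN or DAP in place of the SVM, gives the kind of per-attribute comparison reported in the paper.
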
Original language: English
Title of host publication: Proceedings of the 4th Workshop on Vision and Language (VL'15)
Pages: 60-69
Number of pages: 10
Publication status: Published - 2015
Event: 4th Workshop on Vision and Language, Lisbon, Portugal
Duration: 18 Sep 2015 → …

Conference

Conference: 4th Workshop on Vision and Language
Abbreviated title: VL'15
Country: Portugal
City: Lisbon
Period: 18/09/15 → …

Cite this

Yu, Y., Eshghi, A., & Lemon, O. (2015). Comparing attribute classifiers for interactive language grounding. In Proceedings of the 4th Workshop on Vision and Language (VL'15) (pp. 60-69).