Abstract
We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) we introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response; (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs; and (3) we train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity. Experiments on the OpenSubtitles corpus show a substantial improvement over competitive neural models in terms of BLEU score as well as metrics of coherence and diversity.
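The coherence measure and the filtering step in points (1) and (2) can be illustrated with a minimal sketch. The abstract only states that coherence is a GloVe embedding similarity; the sketch below assumes that each utterance is represented by the average of its word vectors and that similarity is cosine similarity. The toy three-dimensional vectors, the function names (`mean_vector`, `coherence`, `filter_pairs`), and the threshold value are all hypothetical and stand in for pretrained GloVe vectors and the authors' actual settings.

```python
import math

# Toy 3-dimensional vectors standing in for pretrained GloVe embeddings.
# These values are made up for illustration only.
TOY_EMBEDDINGS = {
    "coffee": [0.9, 0.1, 0.0],
    "tea": [0.8, 0.2, 0.0],
    "drink": [0.7, 0.3, 0.1],
    "train": [0.0, 0.1, 0.9],
    "station": [0.1, 0.0, 0.8],
}


def mean_vector(tokens, embeddings):
    """Average the vectors of all known tokens; return None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(context, response, embeddings=TOY_EMBEDDINGS):
    """Assumed coherence score: cosine similarity of averaged word vectors."""
    c = mean_vector(context.lower().split(), embeddings)
    r = mean_vector(response.lower().split(), embeddings)
    if c is None or r is None:
        return 0.0  # assumption: pairs with no known words score zero
    return cosine(c, r)


def filter_pairs(pairs, threshold, embeddings=TOY_EMBEDDINGS):
    """Keep only context-response pairs whose coherence reaches the threshold."""
    return [(c, r) for c, r in pairs if coherence(c, r, embeddings) >= threshold]


pairs = [
    ("do you drink coffee", "i prefer tea"),
    ("do you drink coffee", "the train station"),
]
kept = filter_pairs(pairs, threshold=0.5)  # hypothetical threshold
print(kept)
```

With these toy vectors, the on-topic reply scores close to 1 and is kept, while the off-topic reply scores well below the threshold and is dropped. Real GloVe vectors have far more dimensions, and a suitable threshold would have to be chosen empirically.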
Original language | English |
---|---|
Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing |
Publisher | Association for Computational Linguistics |
Pages | 3981–3991 |
Number of pages | 11 |
ISBN (Electronic) | 9781948087841 |
Publication status | Published - 31 Oct 2018 |
Event | 2018 Conference on Empirical Methods in Natural Language Processing - Brussels, Belgium. Duration: 31 Oct 2018 → 4 Nov 2018 |
Conference
Conference | 2018 Conference on Empirical Methods in Natural Language Processing |
---|---|
Abbreviated title | EMNLP 2018 |
Country/Territory | Belgium |
City | Brussels |
Period | 31/10/18 → 4/11/18 |