Converse-et-impera: Exploiting deep learning and hierarchical reinforcement learning for conversational recommender systems

Claudio Greco, Alessandro Suglia, Pierpaolo Basile*, Giovanni Semeraro

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

23 Citations (Scopus)

Abstract

In this paper, we propose a framework based on Hierarchical Reinforcement Learning for dialogue management in a Conversational Recommender System scenario. The framework splits the dialogue into more manageable tasks, each of which corresponds to a goal of the dialogue with the user. The framework consists of a meta-controller, which receives the user utterance and determines which goal to pursue, and a controller, which exploits a goal-specific representation to generate an answer composed of a sequence of tokens. The modules are trained using a two-stage strategy: a preliminary Supervised Learning stage followed by a Reinforcement Learning stage.
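
The division of labour between the two modules can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper's modules are learned models trained in two stages, whereas here a keyword rule stands in for the meta-controller and fixed templates stand in for the controller's goal-specific generation. The goal names, keywords, templates, and all class and function names are assumptions made purely for illustration.

```python
from typing import Dict, List


class MetaController:
    """Toy stand-in for the meta-controller: picks a goal for an utterance.

    Assumption: a keyword lookup replaces the learned goal-selection policy.
    """

    def __init__(self, keywords: Dict[str, List[str]], default_goal: str) -> None:
        self.keywords = keywords
        self.default_goal = default_goal

    def select_goal(self, utterance: str) -> str:
        tokens = utterance.lower().split()
        for goal, words in self.keywords.items():
            if any(word in tokens for word in words):
                return goal
        # No keyword matched: fall back to the default goal.
        return self.default_goal


class Controller:
    """Toy stand-in for the controller: emits a token sequence for a goal.

    Assumption: a fixed template per goal replaces the learned,
    goal-specific token generator.
    """

    def __init__(self, templates: Dict[str, List[str]]) -> None:
        self.templates = templates

    def generate(self, goal: str, utterance: str) -> List[str]:
        # The utterance is unused in this toy version; a learned controller
        # would condition on it through a goal-specific representation.
        return list(self.templates[goal])


def respond(meta: MetaController, controller: Controller, utterance: str) -> List[str]:
    """One dialogue turn: the meta-controller chooses a goal, the controller answers."""
    goal = meta.select_goal(utterance)
    return controller.generate(goal, utterance)
```

In this sketch, replacing either class with a trained model leaves `respond` unchanged, which is the practical appeal of splitting goal selection from answer generation.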

Original language: English
Title of host publication: AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017
Editors: Floriana Esposito, Stefano Ferilli, Francesca A. Lisi, Roberto Basili
Publisher: Springer
Pages: 372-386
Number of pages: 15
ISBN (Electronic): 9783319701691
ISBN (Print): 9783319701684
Publication status: Published - 7 Nov 2017
Event: 16th International Conference on Italian Association for Artificial Intelligence 2017 - Bari, Italy
Duration: 14 Nov 2017 - 17 Nov 2017

Publication series

Name: Lecture Notes in Computer Science
Volume: 10640
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Italian Association for Artificial Intelligence 2017
Abbreviated title: AI*IA 2017
Country/Territory: Italy
City: Bari
Period: 14/11/17 - 17/11/17

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
