Abstract
Current machine dialog systems are predominantly implemented using a sequential, utterance-based, two-party, speak-wait/speak-wait approach. Human-human dialog, by contrast, 1) is not sequential, featuring overlap, interruption and back channels; 2) involves processing utterances before they are complete; and 3) is often multi-party. The current approach is stifling innovation in social robots, where long delays (often several seconds) are the norm for dialog response time, leading to stilted and unnatural dialog flow. In this paper, with reference to a lightweight word-spotting speech recognition system, Chatty SDK, we present a practical engineering strategy for developing what we term a conversational listener, which would allow systems to mimic natural human turn-taking in dialog.
Original language | English |
---|---|
Title of host publication | ACM/IEEE International Conference on Human-Robot Interaction 2023 |
Publisher | Association for Computing Machinery |
Publication status | Published - 2023 |