Abstract
Spoken Language Understanding (SLU) infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. However, publicly available SLU resources are limited. In this paper, we release SLURP, a new SLU package containing the following: (1) a new, challenging dataset in English spanning 18 domains, which is substantially bigger and linguistically more diverse than existing datasets; (2) competitive baselines based on state-of-the-art NLU and ASR systems; (3) a new transparent metric for entity labelling, which enables a detailed error analysis for identifying potential areas of improvement. SLURP is available at https://github.com/pswietojanski/slurp.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) |
Publisher | Association for Computational Linguistics |
Pages | 7252-7262 |
Number of pages | 11 |
ISBN (Electronic) | 9781952148606 |
Publication status | Published - Nov 2020 |
Event | 2020 Conference on Empirical Methods in Natural Language Processing - Virtual, Online; Duration: 16 Nov 2020 → 20 Nov 2020 |
Conference
Conference | 2020 Conference on Empirical Methods in Natural Language Processing |
---|---|
Abbreviated title | EMNLP 2020 |
City | Virtual, Online |
Period | 16/11/20 → 20/11/20 |
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics