Abstract
The performance of Automatic Speech Recognition (ASR) systems has improved steadily as the state of the art has advanced. However, performance tends to degrade considerably in more challenging conditions (e.g., background noise, multi-speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers, or people with speech disorders). This suggests that general improvements do not necessarily transfer to applications that rely on ASR, such as educational software for younger students or language learners. In this study, we focus on the performance gap between recognition results for native and non-native, read and spontaneous Swedish utterances transcribed by different ASR services. We compare the recognition results using Word Error Rate and analyze the linguistic factors that may give rise to the observed transcription errors.
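For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard definition. The function name `word_error_rate`, the lowercasing and whitespace tokenization, and the Swedish example sentences are assumptions made for illustration; the abstract does not state which normalization or scoring tool the study used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace. The
    normalization actually used in the study is not given in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one dropped word out of four reference words.
wer = word_error_rate("jag bor i stockholm", "jag bor stockholm")
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1.0 when the hypothesis is much longer than the reference.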
Original language | English |
---|---|
Title of host publication | Proceedings of INTERSPEECH 2021 |
Publisher | ISCA |
Pages | 4463-4467 |
Number of pages | 5 |
ISBN (Electronic) | 9781713836902 |
Publication status | Published - 2021 |
Event | 22nd Annual Conference of the International Speech Communication Association 2021, Brno, Czech Republic. Duration: 30 Aug 2021 → 3 Sept 2021 |
Conference
Conference | 22nd Annual Conference of the International Speech Communication Association 2021 |
---|---|
Abbreviated title | INTERSPEECH 2021 |
Country/Territory | Czech Republic |
City | Brno |
Period | 30/08/21 → 3/09/21 |
Keywords
- Automatic speech recognition
- Language learning
- Non-native speech
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation