"You don't understand me!": Comparing ASR results for L1 and L2 speakers of Swedish

Ronald Cumbal, Birger Moell, José Lopes, Olov Engwall

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (SciVal)

Abstract

The performance of Automatic Speech Recognition (ASR) systems has constantly increased in state-of-the-art development. However, performance tends to decrease considerably in more challenging conditions (e.g., background noise, multiple speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers or people with speech disorders), which signifies that general improvements do not necessarily transfer to applications that rely on ASR, e.g., educational software for younger students or language learners. In this study, we focus on the gap in performance between recognition results for native and non-native, read and spontaneous, Swedish utterances transcribed by different ASR services. We compare the recognition results using Word Error Rate and analyze the linguistic factors that may generate the observed transcription errors.
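The abstract names Word Error Rate (WER) as the comparison metric, but this record does not say how the authors computed it. As a minimal sketch of the standard definition (substitutions, deletions and insertions divided by the number of reference words), the function below uses a word-level edit distance. The function name, the lowercasing and whitespace tokenisation, and the example sentences are illustrative assumptions, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: text is normalised by lowercasing and splitting on whitespace;
    the paper's own normalisation is not described on this page.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word and one dropped word
# against a six-word reference.
reference = "jag förstår inte vad du säger"
hypothesis = "jag förstå inte du säger"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.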

Original language: English
Title of host publication: Proceedings of INTERSPEECH 2021
Publisher: ISCA
Pages: 4463-4467
Number of pages: 5
ISBN (Electronic): 9781713836902
DOIs
Publication status: Published - 2021
Event: 22nd Annual Conference of the International Speech Communication Association 2021 - Brno, Czech Republic
Duration: 30 Aug 2021 – 3 Sep 2021

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association 2021
Abbreviated title: INTERSPEECH 2021
Country/Territory: Czech Republic
City: Brno
Period: 30/08/21 – 3/09/21

Keywords

  • Automatic speech recognition
  • Language learning
  • Non-native speech

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
