TY - GEN
T1 - All Together Now: The Living Audio Dataset
AU - Braude, David A.
AU - Aylett, Matthew P.
AU - Laoide-Kemp, Caoimhin
AU - Ashby, Simone
AU - Scott, Kristen M.
AU - Ó Raghallaigh, Brian
AU - Braudo, Anna
AU - Brouwer, Alex
AU - Stan, Adriana
PY - 2019
Y1 - 2019
N2 - The ongoing focus in speech technology research on machine-learning-based approaches leaves the community hungry for data. However, datasets tend to be recorded once and then released, sometimes behind registration requirements or paywalls. In this paper we describe our Living Audio Dataset. The aim is to provide audio data that is in the public domain, multilingual, and expandable by communities. We discuss the role of linguistic resources, given the success of systems such as Tacotron, which use direct text-to-speech mappings, and consider how data provenance could be built into such resources. So far the data has been collected for TTS purposes; however, it is also suitable for ASR. At the time of publication, audio resources already exist for Dutch, R.P. English, Irish, and Russian.
AB - The ongoing focus in speech technology research on machine-learning-based approaches leaves the community hungry for data. However, datasets tend to be recorded once and then released, sometimes behind registration requirements or paywalls. In this paper we describe our Living Audio Dataset. The aim is to provide audio data that is in the public domain, multilingual, and expandable by communities. We discuss the role of linguistic resources, given the success of systems such as Tacotron, which use direct text-to-speech mappings, and consider how data provenance could be built into such resources. So far the data has been collected for TTS purposes; however, it is also suitable for ASR. At the time of publication, audio resources already exist for Dutch, R.P. English, Irish, and Russian.
U2 - 10.21437/Interspeech.2019-2448
DO - 10.21437/Interspeech.2019-2448
M3 - Conference contribution
SP - 1521
EP - 1525
BT - Proceedings of Interspeech 2019
PB - ISCA
ER -