Abstract
When working with robots, it is essential that the robot understands the user, and this is harder when speech is the only channel of interaction: a robot should not order milk when the user asked it to call for help. In a lab environment, free of noise and reverberation that distort the instructions, a robot can obtain a clear understanding of the user, but in everyday settings this is rarely the case. We concentrate on speaker separation as a means to improve speech recognition, using non-negative matrix factorisation (NMF) and deep learning techniques. To train and test these techniques, we introduce a new corpus recorded with a microphone array. In this paper, we compare several NMF and deep learning techniques for speaker separation. We find that adding directional information improves the separation when there is no noise or reverberation; when reverberation is present, however, the NMF technique with the Itakura-Saito cost function outperforms the other techniques. Among the deep learning approaches, we find that a recurrent neural network is able to perform the separation of the speakers.
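To make the NMF approach concrete, below is a minimal sketch of supervised two-speaker separation under the Itakura-Saito (IS) cost, the variant the abstract reports as most robust to reverberation. The two-speaker setup, the per-speaker basis training, and all names and parameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: NMF speaker separation with the Itakura-Saito cost (beta = 0).
# Assumes clean training audio is available for each speaker; the mixture
# is separated with fixed, concatenated bases and Wiener-style masking.
import numpy as np
import librosa
from sklearn.decomposition import NMF

N_FFT, HOP = 1024, 256

def learn_basis(clean_wav, n_bases=40):
    """Learn a spectral basis (freq x bases) for one speaker from clean audio."""
    V = np.abs(librosa.stft(clean_wav, n_fft=N_FFT, hop_length=HOP)) ** 2 + 1e-10
    nmf = NMF(n_components=n_bases, beta_loss='itakura-saito',
              solver='mu', init='random', max_iter=400, random_state=0)
    # With V shaped (freq, time), fit_transform returns the (freq, bases) factor.
    return nmf.fit_transform(V)

def separate(mix_wav, W1, W2, n_iter=200, eps=1e-10):
    """Split a two-speaker mixture using the fixed stacked basis [W1 | W2]."""
    X = librosa.stft(mix_wav, n_fft=N_FFT, hop_length=HOP)
    V = np.abs(X) ** 2 + eps                      # power spectrogram of the mix
    W = np.hstack([W1, W2])
    H = np.abs(np.random.default_rng(0).standard_normal((W.shape[1], V.shape[1]))) + eps
    for _ in range(n_iter):                       # multiplicative updates for IS
        WH = W @ H + eps
        H *= (W.T @ (V * WH ** -2)) / (W.T @ WH ** -1)
    k = W1.shape[1]
    V1 = W[:, :k] @ H[:k] + eps                   # per-speaker power estimates
    V2 = W[:, k:] @ H[k:] + eps
    y1 = librosa.istft(X * V1 / (V1 + V2), hop_length=HOP)  # soft-mask and invert
    y2 = librosa.istft(X * V2 / (V1 + V2), hop_length=HOP)
    return y1, y2
```

The IS cost is scale-invariant in each time-frequency bin, which is one common motivation for preferring it over Euclidean or KL costs on power spectrograms, where low-energy (e.g. reverberant) components would otherwise be underweighted.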
| Original language | English |
|---|---|
| Title of host publication | AAAI Fall Symposium Series |
| Publisher | AAAI Press |
| Pages | 99-103 |
| Number of pages | 5 |
| ISBN (Print) | 9781577357940 |
| Publication status | Published - Dec 2017 |
| Event | AAAI 2017 Fall Symposium, Washington, DC, United States (9 Nov 2017 → 11 Nov 2017) |
Conference
| Conference | AAAI 2017 Fall Symposium |
|---|---|
| Country/Territory | United States |
| City | Washington, DC |
| Period | 9 Nov 2017 → 11 Nov 2017 |