TY - GEN
T1 - Deep Spatiotemporal Network Based Indian Sign Language Recognition from Videos
AU - Uddin, Md Azher
AU - Denny, Ryan
AU - Joolee, Joolekha Bibi
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024/3/18
Y1 - 2024/3/18
N2 - The deaf community faces substantial obstacles because of the communication barrier with hearing individuals. The traditional approach of relying on sign language interpreters is not a cost-effective solution to this issue. Existing systems for dynamic sign language recognition employ the CNN-LSTM framework, which has achieved reasonable performance; however, relying solely on spatial features extracted by a CNN is inadequate for accurate recognition of sign language words. In this study, we propose a novel end-to-end deep spatiotemporal network for recognizing Indian sign language from videos. Our framework combines deep spatial features extracted with Inception-ResNet-V2 and handcrafted spatiotemporal features obtained by applying the Volume Local Directional Number (VLDN). Furthermore, we introduce a new encoder-decoder network based on Long Short-Term Memory (LSTM) to learn the spatiotemporal features effectively. Lastly, we conduct comprehensive experiments to demonstrate the performance of the proposed method.
AB - The deaf community faces substantial obstacles because of the communication barrier with hearing individuals. The traditional approach of relying on sign language interpreters is not a cost-effective solution to this issue. Existing systems for dynamic sign language recognition employ the CNN-LSTM framework, which has achieved reasonable performance; however, relying solely on spatial features extracted by a CNN is inadequate for accurate recognition of sign language words. In this study, we propose a novel end-to-end deep spatiotemporal network for recognizing Indian sign language from videos. Our framework combines deep spatial features extracted with Inception-ResNet-V2 and handcrafted spatiotemporal features obtained by applying the Volume Local Directional Number (VLDN). Furthermore, we introduce a new encoder-decoder network based on Long Short-Term Memory (LSTM) to learn the spatiotemporal features effectively. Lastly, we conduct comprehensive experiments to demonstrate the performance of the proposed method.
KW - Inception-ResNet-V2
KW - Indian sign language recognition
KW - LSTM-based encoder-decoder network
KW - Volume local directional number
UR - http://www.scopus.com/inward/record.url?scp=85189538851&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-8324-7_16
DO - 10.1007/978-981-99-8324-7_16
M3 - Conference contribution
AN - SCOPUS:85189538851
SN - 9789819983230
T3 - Lecture Notes in Networks and Systems
SP - 171
EP - 181
BT - Proceedings of International Conference on Information Technology and Applications
A2 - Ullah, Abrar
A2 - Anwar, Sajid
A2 - Calandra, Davide
A2 - Di Fuccio, Raffaele
PB - Springer
T2 - 16th International Conference on Information Technology and Applications 2022
Y2 - 20 October 2022 through 22 October 2022
ER -