Abstract
Deaf people communicate naturally through sign languages and often face barriers when communicating with hearing people and accessing information in written languages. These difficulties are aggravated in the health domain, especially in hospital emergencies, when human sign language interpreters are unavailable. To mitigate this problem, this paper proposes a solution for automatically recognizing signs of Brazilian Sign Language (Libras) in the health context. The idea is that, in the future, the system could assist communication between a Deaf patient and their physician. Our solution is a multiple-stream architecture that combines convolutional and recurrent neural networks, handling the visual phonemes of sign languages in individual and specialized ways. The first stream uses optical flow as input to capture information about the “movement” of the sign; the second stream extracts kinematic and postural features, including “handshapes” and “facial expressions”; and the third stream processes the raw RGB images to capture additional attributes of the sign not covered by the previous streams. Thus, we can exploit more spatiotemporal features that discriminate the classes during the training stage. The computational results show that the solution can recognize signs in Libras in the health context, with an average accuracy, precision, recall, and F1-score of 99.80%, 99.81%, 99.80%, and 99.80%, respectively. Our system also outperformed other works in the literature, obtaining an average accuracy of 100% on an Argentine Sign Language (LSA) dataset commonly used for comparison.
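As a rough illustration of this design, the sketch below (in Keras) builds three input streams — optical flow, kinematic/postural features, and raw RGB frames — each followed by a recurrent layer, and fuses them before a softmax classifier. This is a minimal sketch of the general idea, not the network reported in the paper: the clip length, frame size, feature dimension, layer sizes, number of classes, and the choice of concatenative fusion are all illustrative assumptions.

```python
# Minimal sketch of a three-stream CNN+RNN sign recognizer.
# All shapes and layer sizes below are illustrative assumptions,
# not the paper's exact configuration.
from tensorflow.keras import layers, Model

NUM_FRAMES, H, W = 16, 112, 112    # hypothetical clip length and frame size
NUM_POSE_FEATURES = 274            # hypothetical pose/hand/face feature length
NUM_CLASSES = 50                   # hypothetical number of signs

def frame_encoder(channels, name):
    """Small per-frame CNN, applied to every frame via TimeDistributed."""
    inp = layers.Input(shape=(H, W, channels))
    x = layers.Conv2D(32, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    return Model(inp, x, name=name)

# Stream 1: optical flow (2 channels per frame) -> CNN -> LSTM
flow_in = layers.Input(shape=(NUM_FRAMES, H, W, 2), name="optical_flow")
flow = layers.TimeDistributed(frame_encoder(2, "flow_cnn"))(flow_in)
flow = layers.LSTM(128)(flow)

# Stream 2: kinematic/postural features (e.g. keypoints) -> LSTM
pose_in = layers.Input(shape=(NUM_FRAMES, NUM_POSE_FEATURES), name="pose")
pose = layers.LSTM(128)(pose_in)

# Stream 3: raw RGB frames -> CNN -> LSTM
rgb_in = layers.Input(shape=(NUM_FRAMES, H, W, 3), name="rgb")
rgb = layers.TimeDistributed(frame_encoder(3, "rgb_cnn"))(rgb_in)
rgb = layers.LSTM(128)(rgb)

# Late fusion by concatenation, then classification over sign classes
fused = layers.Concatenate()([flow, pose, rgb])
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = Model([flow_in, pose_in, rgb_in], out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```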
Availability of data and material
The data generated or used during this study are available from the corresponding author upon request.
Notes
Multi-stream architectures comprise multiple channels, each with its own data and processing, whose outputs are merged using concatenative, additive, subtractive, multiplicative, or statistical fusion operations, among others.
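For illustration, the snippet below shows how these fusion strategies map onto standard Keras merge layers; the stream names and feature sizes are placeholders chosen for this example.

```python
# Common multi-stream fusion strategies expressed as Keras merge layers.
# The two 128-dimensional stream outputs are placeholder inputs.
from tensorflow.keras import layers

stream_a = layers.Input(shape=(128,))
stream_b = layers.Input(shape=(128,))

concat_fusion = layers.Concatenate()([stream_a, stream_b])      # concatenative
additive_fusion = layers.Add()([stream_a, stream_b])            # additive
subtractive_fusion = layers.Subtract()([stream_a, stream_b])    # subtractive
multiplicative_fusion = layers.Multiply()([stream_a, stream_b]) # multiplicative
statistical_fusion = layers.Average()([stream_a, stream_b])     # statistical (mean)
```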
Acknowledgements
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. We gratefully acknowledge NVIDIA Corporation’s support with the donation of a Quadro P6000 used for this research.
Funding
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Author information
Contributions
Diego R. B. da Silva and Tiago Maritan U. de Araújo conceived and designed the approach. Thaís Gaudencio do Rêgo contributed to the experimental design of the study. Manuella Aschoff Cavalcanti Brandão helped with data collection. The first draft of the manuscript was written by Diego R. B. da Silva. Tiago Maritan U. de Araújo and Luiz Marcos G. Gonçalves revised and corrected the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval
Not applicable.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
da Silva, D.R.B., de Araújo, T.M.U., do Rêgo, T.G. et al. A multiple stream architecture for the recognition of signs in Brazilian sign language in the context of health. Multimed Tools Appl 83, 19767–19785 (2024). https://doi.org/10.1007/s11042-023-16332-7