A Comprehensive Approach to Indian Sign Language Recognition: Leveraging LSTM and MediaPipe Holistic for Dynamic and Static Hand Gesture Recognition
DOI: https://doi.org/10.4108/airo.8693
Keywords: Sign Language Recognition, LSTM, Indian Sign Language, MediaPipe Holistic, Computer Vision, Deep Learning, Gesture Recognition
Abstract
Recognizing Indian Sign Language (ISL) gestures effectively is crucial for improving communication accessibility for the deaf community. This study introduces an approach that integrates a Sequential Long Short-Term Memory (LSTM) model with MediaPipe Holistic for accurate, real-time gesture recognition. The process, built on MediaPipe Holistic, is divided into three steps: extracting features from the data, cleaning and labelling it, and identifying gestures. The system tracks landmarks on the face, hands, and body across video frames, capturing the temporal and spatial features needed to interpret gestures. First, the data is cleaned by removing blurry images and null entries, and is labeled. The processed data is then passed to a Sequential LSTM model, which has two LSTM layers and a dense output layer. The model's performance is improved through early stopping during training and the use of a categorical cross-entropy loss. The model is trained and tested on a custom ISL dataset of 11 distinct gestures, on which it achieves an accuracy of 96.97%. The framework emphasizes the model's robustness across diverse lighting conditions and real-world scenarios, supporting its applicability in sectors such as healthcare, education, and public service. By enhancing communication for ISL users, it addresses existing gaps and improves accessibility in these domains.
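The data-handling steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The landmark counts (33 pose points with visibility, 468 face points, 21 points per hand) are assumed from MediaPipe Holistic's default output, the rule that a frame without any detected hand counts as a null entry is a hypothetical choice, and every function name and the patience value are illustrative only.

```python
import math

# Assumed landmark counts for MediaPipe Holistic's default output.
POSE, FACE, HAND = 33, 468, 21
FRAME_SIZE = POSE * 4 + FACE * 3 + 2 * HAND * 3  # 1662 values per frame
NUM_CLASSES = 11  # gesture count reported in the abstract


def flatten(points, count, dims):
    """Flatten one landmark group; an undetected group becomes zeros."""
    if points is None:
        return [0.0] * (count * dims)
    return [value for point in points for value in point]


def frame_features(pose, face, left_hand, right_hand):
    """Concatenate body, face, and hand landmarks into one vector."""
    return (flatten(pose, POSE, 4) + flatten(face, FACE, 3)
            + flatten(left_hand, HAND, 3) + flatten(right_hand, HAND, 3))


def is_null_frame(left_hand, right_hand):
    """Hypothetical cleaning rule: no detected hand means a null entry."""
    return left_hand is None and right_hand is None


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode a gesture index as a one-hot target vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Loss for one sample: -sum(t * log(p))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


class EarlyStopping:
    """Signal a stop after `patience` epochs without a lower validation loss."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = math.inf
        self.stale = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

In a full pipeline, a sequence of such per-frame vectors would form one input sample for the two-layer LSTM, and the one-hot vectors would serve as targets for the dense output layer.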
License
Copyright (c) 2025 Prachi Rawat, Papendra Kumar, Vivek Kumar Tamta, Anuj Kumar

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.