A Multi-modal Fusion Technique to Combine Manual and Non-Manual Cues for Amharic Sign Language Recognition: A Systematic Literature Review
DOI: https://doi.org/10.69660/jcsda.01022405

Keywords: ASL, Deep Learning, Faster R-CNN, LSTM, Machine Learning, Manual, Multimodal, Non-Manual, SSD

Abstract
Amharic Sign Language (ASL) is a vital form of communication for the hearing-impaired community in Ethiopia. Recognizing and understanding ASL is crucial for facilitating communication and accessibility for Amharic-speaking hearing-impaired individuals. ASL relies on manual gestures, including hand shapes and movements, and on non-manual cues such as facial expressions and body postures. The objective of this review is to investigate the methodologies employed to combine manual and non-manual cues in ASL recognition systems. We applied a set of inclusion and exclusion criteria to select relevant research papers; after primary selection, 46 papers focusing on sign language were included in the analysis. A data extraction form was used to systematically collect information from the selected papers. The review shows that combining manual and non-manual cues enhances the accuracy and robustness of sign language recognition systems. These techniques leverage computer vision and machine learning to interpret manual gestures while also capturing the nuanced information conveyed through facial expressions and body language. Improving hand gesture recognition typically involves detecting key points or poses. Despite advances in ASL recognition, this review underscores a significant challenge: resources, reputable publications, annotated data, and annotation tools specific to Amharic Sign Language remain scarce. This shortage hampers the development and evaluation of ASL recognition systems and slows progress in the field.
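A common way the reviewed fusion approaches are realized is feature-level (early) fusion: per-frame manual keypoint features (hand shape and movement) are concatenated with non-manual features (face and body posture) and the joint sequence is modelled with an LSTM. The sketch below is a minimal illustration of this idea under stated assumptions; the feature dimensions, class count, and class name `EarlyFusionSignClassifier` are illustrative placeholders and are not taken from any specific paper in the review.

```python
import torch
import torch.nn as nn

class EarlyFusionSignClassifier(nn.Module):
    """Illustrative early-fusion model: manual and non-manual keypoint
    features are concatenated per frame and modelled jointly with an LSTM.
    Dimensions and the number of sign classes are placeholder assumptions."""

    def __init__(self, manual_dim=63, non_manual_dim=140,
                 hidden_dim=128, num_classes=34):
        super().__init__()
        # One LSTM over the fused (manual + non-manual) per-frame features
        self.lstm = nn.LSTM(manual_dim + non_manual_dim, hidden_dim,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, manual_seq, non_manual_seq):
        # manual_seq:     (batch, frames, manual_dim)     e.g. hand keypoints
        # non_manual_seq: (batch, frames, non_manual_dim) e.g. face/body keypoints
        fused = torch.cat([manual_seq, non_manual_seq], dim=-1)  # per-frame fusion
        _, (h_n, _) = self.lstm(fused)          # use the final hidden state
        return self.classifier(h_n[-1])         # logits over sign classes

# Toy usage with random tensors standing in for extracted keypoint features
model = EarlyFusionSignClassifier()
manual = torch.randn(2, 30, 63)        # e.g. 21 hand landmarks x (x, y, z)
non_manual = torch.randn(2, 30, 140)   # e.g. facial / body landmark features
logits = model(manual, non_manual)
print(logits.shape)                    # torch.Size([2, 34])
```

In practice the per-frame features would come from a keypoint or pose detector applied to the video, and alternatives such as late (decision-level) fusion of separate manual and non-manual classifiers are also reported in the surveyed literature.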