Exploring Sign Language Detection on Smartphones: A Systematic Review of Machine and Deep Learning Approaches



Introduction
A speech disorder, also known as a speech disability, is a condition in which an individual has difficulty communicating verbally with others. One of the primary challenges for individuals with speech disorders is their inability to convey messages directly through spoken language. Furthermore, some individuals with speech disorders also experience hearing loss, a prevalent issue worldwide. The prevalence of speech disorders and hearing loss is steadily rising, with more individuals affected by these conditions each day. According to the World Health Organization (WHO), an estimated 430 million people, about 5% of the world's population, have disabling hearing loss, and nearly one in four people are projected to have some degree of hearing loss by 2050. The impacts of hearing loss are serious. For example, people with speech disabilities may be unable to communicate with others, which can lead to social isolation, loneliness, and frustration. These conditions significantly affect individuals' lifestyles and academic performance, often resulting in employment challenges. In many developing countries, very few specialized schools cater to the needs of students with speech disabilities and hearing impairments [1].
Sign language is a means of communication among people with speech disorders and/or hearing loss. It allows speech-disordered people to communicate with others and convey their messages. Sign alphabets rely on static hand poses to symbolize individual letters of the alphabet, employing gestures as a form of nonverbal communication. Progress in computer vision has opened doors to sophisticated models capable of recognizing these signs, interpreting hand configurations, and seamlessly translating them into both text and voice [2]. For instance, Raziq and Latif [3] proposed a gesture-based approach for Pakistan Sign Language (PSL) recognition, focusing on training and communication modules to detect sign language and convert it to text.
There is no universal sign language, and most people rely on region-specific sign languages. Today, there are 138-300 varieties of sign language across the world [4]. Moreover, a persistent communication gap exists between hearing-disabled people and the hearing population, because the former rely on sign language while most hearing people understand little of it. Typically, sign language recognition through gadgets entails a two-step process: first, the detection of hand gestures within the image, followed by their classification into the corresponding alphabet. Numerous methodologies incorporate hand-tracking devices such as Leap Motion and Intel RealSense, accompanied by machine learning algorithms such as support vector machines (SVMs) to classify these gestures [5]. Hardware devices, such as Microsoft's Kinect sensor, can construct a three-dimensional (3D) model of the hand while tracking hand movements and orientations [6]. Although hardware-based techniques can offer relatively high accuracy, their widespread adoption is impeded by significant initial setup costs.
Numerous information and communication technologies (ICTs) are used for the detection and translation of the different sign languages used by speech-disordered people. However, some of these technologies are either expensive or socially unacceptable to many people with speech disabilities. Computer-based techniques have been widely used; however, a computer is not portable and hence cannot be used by most people on the go, and a specialized environment is necessary. Furthermore, it is crucial to employ socially accepted devices to address these challenges.
The ubiquitous presence of smartphones is undeniable. These devices can efficiently execute a wide range of machine and deep learning applications. Notable examples include convolutional neural networks (CNNs), K-nearest neighbors (KNN), deep convolutional generative adversarial networks (DCGANs), deep neural networks (DNNs), support vector machines (SVMs), recurrent neural networks (RNNs), and 3D convolutional neural networks. A smartphone can translate a sign language gesture to speech and vice versa in real time to convey a proper message to other people. Some prototype-level applications also exist; however, they are either region-specific or inaccurate and hence rarely used. This problem highlights the need for a universal sign language with no geographical boundaries or specifications.
The smartphone's processor and camera can be used for the detection of sign language. As mobile hardware technology grows more sophisticated and moves towards cloud infrastructure, maintaining a user-friendly interface while keeping cloud-processing latency low remains a major issue [7]. Smartphones equipped with an increasing number of cameras have prompted researchers to explore their potential in vision-based sign language recognition applications. In the vision-based approach, a smartphone's camera is employed to capture images or videos of hand gestures. These frames then undergo processing to recognize the signs and generate text or speech output. It is important to note that vision-based approaches may entail a trade-off in accuracy compared to sensor-based methods. This is among various challenges in image processing, including variations in lighting conditions, sensitivity to the user's skin color, and the presence of complex backgrounds within the image [8].
Numerous review articles have been written on accessibility for speech disorders, regional and global sign languages, sensor-based approaches, and gesture-based recognition systems. The following paragraphs summarize these surveys and reviews and their contributions, along with a discussion of the research gap.
In a study by Ardiansyah et al. [9], a review of studies published between 2015 and 2020 was performed. They selected the 22 studies most relevant to their research questions. In that study, the most popular method of obtaining data was through a camera. Different techniques were compared, and CNN was the most popular as it was more accurate and was used by 11 of the 22 researchers. Similarly, a brief review of recent trends in sign language recognition by Nimisha and Jacob [10] discussed the two main approaches: the vision-based approach (VBA) and the gesture-based approach (GBA). The review mainly discusses image- or vision-based studies and their pipeline of feature extraction and classification. Moreover, a comparative analysis of the techniques and achievements (in terms of accuracy) of nine different studies on VBA and three studies on GBA is also available in that study.
A review of smart gloves for converting signs to speech for the mute community was presented in [11]. That study lacked comparisons across research papers and concentrated primarily on a single approach, namely the glove-based approach to gesture recognition. Similarly, the perspective and evolution of gesture recognition for sign language are presented in [12]. The authors analyzed different gesture recognition devices through a timeline, with important features and achieved recognition rates. They concluded that Leap Motion is a good option for sign language as it is cheap, easy to use, and accurately recognizes the hands. Work on vision-based sign language recognition systems is also presented by Sharma and Singh [8], in which different vision-based methods are analyzed along with the datasets used.
A comprehensive review of wearable sensor-based sign language recognition is provided by Kudrinko et al. [13]. They reviewed 72 different research efforts published between 1991 and 2019. The review aimed to discern prevailing trends, best practices, and existing challenges within the field. Various attributes, such as sign language variation, sensor configuration, classification methods, study designs, and performance metrics, were systematically analyzed and compared.
It is important to note that this particular study exclusively examined the sensor-based approach. Additionally, a review specifically centered on hand gestures and sign language recognition techniques was presented in [14]. The authors focused on a comprehensive exploration of the challenges, diverse approaches, and application domains of gesture recognition. Furthermore, they studied the various techniques and technologies utilized in sensor-based gesture recognition, providing valuable insights into this area of research.
A technical approach to Chinese Sign Language processing is discussed in the study by Kamal et al. [15]. They provided an overview of Chinese Sign Language Recognition (CSLR) and discussed numerous issues related to Chinese Sign Language. Similarly, a review of sensory-glove-based systems for sign language recognition and the state of the art between 2007 and 2017 was presented by Ahmed et al. [16]. The authors explored and investigated SLR using the glove-sensor approach. The articles are divided into four categories: framework, review and study, development, and hand gesture types. Numerous recommendations put forth by the researchers aim to address both current and anticipated challenges, offering a wealth of opportunities for further research in this field.
A review of automatic translation from Arabic to Arabic Sign Language (ArSL) is presented in the study by Ayadi et al. [17]. The authors discussed the classical machine translation approaches (direct, transfer-based, and interlingua) and the corpus-based approaches (memory-based, example-based, and statistical). They also described language challenges, such as morphology, syntax, and structure, and provided an extensive list of important works on ArSL machine translation. Additionally, a comprehensive review of feature extraction methods in sign language recognition systems is offered by Suharjito et al. [18], who analyzed studies published between 2009 and 2018 and presented the progress of feature extraction in sign language recognition. The authors concluded that active sensors have brought considerable improvement in tracking hand regions, but there is still room for improvement in vision-based approaches.
A review of gesture recognition focusing on sign language in a mobile context is presented in the study by Neiva and Zanchettin [19]. It covers studies published between 2009 and 2017, analyzing and comparing a total of 43 papers. The authors covered static and dynamic gestures, simple and complex backgrounds, facial and gaze expressions, and the use of special mobile hardware. Similarly, a review of vision-based American Sign Language (ASL) recognition, its techniques, and outcomes is given in the study by Shivashankara and Srinath [20], which highlights and compares the work of several researchers on vision-based sign language recognition.
A comprehensive survey on sign language recognition using smartphones is presented in the study by Ghanem et al. [7]. The authors explored the latest advancements in mobile-based sign language recognition, categorizing existing solutions into sensor-based and vision-based approaches and highlighting their respective advantages and disadvantages. Their primary focus was on feature detection and sign classification algorithms. Similarly, a survey of automatic sign language recognition was conducted in [21], reviewing studies published between 2008 and 2017. The authors discussed the advancement of sign language recognition and provided an overview of the state-of-the-art building blocks of automatic sign language recognition, such as feature extraction, classification, and sign language databases.
A study by Suharjito et al. [22] reviewed sign language recognition application systems for individuals with hearing loss or speech disorders, employing an input-process-output framework. They evaluated various sign language recognition approaches and identified the most effective one. Additionally, the study examined different acquisition methods and classification techniques, presenting their respective advantages and disadvantages. This comprehensive analysis offers valuable insights for researchers seeking to develop improved sign language recognition systems.
In summary, the discussion above has covered selected systematic literature reviews (SLRs) and survey papers on diverse topics of interest, while also highlighting notable contributions in these areas. Certain reviews are specifically tailored to region-based sign languages, such as Chinese and American Sign Language. Others have become obsolete, offering minimal relevance to contemporary approaches. To address this research gap, this paper conducts a comprehensive analysis and review of publications focused on sign language detection and interpretation techniques, particularly those employing machine and deep learning approaches. The review encompasses publications from esteemed journals and prestigious conferences spanning the past decade, from 2012 to July 2023. The insights derived from this review hold significant implications for a wide spectrum of stakeholders, including practitioners, researchers, developers, and industries engaged in accessibility solutions, software and hardware development, and the creation of smart devices tailored to individuals with speech disorders. The major contributions of this paper include (i) a complete, up-to-date analysis of the publications published from 2012 to July 2023 through a rigorous search and standard selection criteria; (ii) a detailed yet comprehensive discussion of current trends in the field of disabilities, specifically for people with speech disorders; and (iii) a discussion of different machine learning approaches for smart gadgets (smartphones in particular) along with the sensor-based approaches used in smart gloves.
This paper organizes and categorizes the available literature from the different perspectives and points of view discussed in the Materials and Methods section. A compact and concise review of sign language recognition is presented. This study may help practitioners to better understand the area, specifically mobile-based sign language detection and recognition systems, and may help researchers become fully aware of the different approaches and research progress in this field. This work falls under the category of accessibility for people with hearing loss or speech disorders.
The remainder of the paper is structured as follows. Section 2, "Materials and Methods," outlines the approach used for examining the existing literature. Section 3, "Findings and Discussion," addresses the seven research questions. Section 4, "Meta-Analysis," provides a comprehensive overview of the paper's analysis, and Section 5, "Open Research Questions," touches upon potential avenues for future research. Finally, Section 6 serves as the conclusion, and the references are listed at the end of the paper.

Materials and Methods
This study presents a systematic literature review (SLR) on sign language detection and interpretation via smartphone-based machine or deep learning approaches. The study is mapped and conducted based on the guidelines presented by Kitchenham et al. [23] and Moher et al. [24]. The research questions are designed to identify the research gap and are framed in Table 1.
2.1. Search Strategy. This section discusses the strategy for searching and mapping the relevant literature. We adhered to the PRISMA framework [24] for structuring our search and selection methodology, as illustrated in Figure 1. PRISMA is a widely recognized and established methodology for conducting systematic literature reviews. It offers a set of guiding principles and a flowchart (see Figure 1) that helps researchers adopt a systematic approach and ensure that reporting is accurate, comprehensive, and transparent. This, in turn, forms the foundation for well-founded, evidence-based decisions when selecting relevant literature. Figure 1 illustrates the initial search results, which amounted to 233,860 records. After screening and removing duplicates, 281 studies remained, of which the 163 most relevant studies were included for analysis.
The criteria for inclusion/exclusion of publications are defined in Table 2. The literature has been tabulated, analyzed, and mapped based on these criteria.

2.2. Time Frame and Digital Repositories. The time frame for searching the relevant literature is from 2012 to July 2023 (both years included), as shown in Table 2. The use of smartphones for sign language detection and identification has evolved over the years due to the widespread adoption of smartphones and their growing role in assisting individuals with disabilities, including speech disorders, visual impairments, and related challenges. A reasonable amount of literature is therefore available and is mapped in this paper. We selected IEEE Xplore, ScienceDirect, ACM Digital Library, and Google Scholar for searching the literature. These repositories were selected because they provide relevant publications, results, and analytics. Academic search engines, such as Google Scholar, are also used for meaningful searches and insights.

2.3. Theoretical Framework and Initial Results. Table 3 shows the list of strings used for searching and mapping the literature. The search strings were applied in the selected digital repositories using the web search engines discussed above, and the results are recorded in Table 3.
The publications are categorized as journal papers and conference papers. Only prestigious conferences, i.e., those supported by ACM, IEEE, or Springer, are considered. The ratio is shown in Figure 2.
Similarly, the year-wise frequency of the selected publications is shown in Figure 3. We selected papers from 2012 to July 2023 and observed healthy growth in publications on accessibility, sign language, and smartphones as tools for speech-disordered people.
Table 4 presents a summary of the most relevant publications along with their years, types, and publishers. We selected only well-reputed journals and conferences.

Findings and Discussion
This section addresses the research questions raised in Table 1. Additionally, it provides an exhaustive review of the selected publications from a pool of 163 research papers. It covers a wide range of aspects within the research on smartphones as assistive devices: the application of machine and deep learning approaches for individuals with speech disorders, the datasets used in research, region-specific sign languages, and the evaluation metrics employed in experiments, each discussed in a dedicated subsection. Moreover, this section discusses the findings, the research gap, and possible directions for future research.

RQ1: What Is the Current Status of Smartphone-Based Sign Language?
In a study by Ghanem et al. [7], the authors presented a detailed survey of existing techniques used for smartphone-based sign languages. Moreover, the authors developed an interactive Android mobile application centered around machine learning, aimed at bridging the communication gap between individuals with hearing loss and the general population. In this connection, they introduced the PSL dataset [141]. The approach involved training the data with an SVM model, enabling automatic recognition of captured signs using the static symbols stored in the database. Numerous machine and deep learning approaches are used in various applications; Table 5 lists several of them.
Table 5 shows a range of techniques organized by year of study and evaluation metric. Notably, the CNN deep learning model has gained widespread acceptance among recent researchers for sign language detection and/or recognition. Furthermore, the major evaluation metric employed across the studies is accuracy, as indicated in Table 5.
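The SVM-based recognition of static signs described above can be sketched as follows. This is a hedged, minimal illustration: the 42-dimensional "hand landmark" feature vectors, the three sign classes, and all data are synthetic stand-ins, not the PSL dataset of [141].

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in: each static sign is a 42-dim feature vector
# (e.g., 21 hand-landmark (x, y) pairs). Three sign classes, each
# clustered tightly around its own prototype pose.
prototypes = rng.normal(size=(3, 42))
X = np.vstack([p + 0.05 * rng.normal(size=(50, 42)) for p in prototypes])
y = np.repeat([0, 1, 2], 50)

# Train an SVM on the sign features, as in the study's pipeline.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# A new capture close to prototype 1 should be recognized as class 1.
query = prototypes[1] + 0.05 * rng.normal(size=42)
pred = int(clf.predict(query.reshape(1, -1))[0])
```

In a real pipeline, the feature vectors would come from a hand detector or landmark extractor running on the camera frames rather than from random prototypes.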

RQ2: How Are Machine Learning, Deep Learning, and Lightweight Deep Learning Techniques Used for the Detection and Interpretation of Sign Languages?
Over time, numerous techniques have been investigated for efficient recognition of sign and gesture languages. The majority of sign language recognition systems rely on machine learning, deep learning, or lightweight deep learning approaches. Table 6 presents a compilation of selected studies and their respective approaches for detecting sign languages through deep learning methods. The table shows that CNN is the most dominant technique. These techniques are general and not tied to specific hardware, such as smartphones. Moreover, most of the studies use hand gestures as input and recognize them via devices such as custom-built gloves. It is also observed that CNN remains widely used in recent years.

Table 2: Inclusion and exclusion criteria.
Inclusion criteria:
(i) The searched string appeared in the title, abstract, or keywords of the study
(ii) The publication is written in the English language
(iii) Studies in journals, conferences, and book chapters from 2012 to July 2023
Exclusion criteria:
(i) Blogs, keynotes, and weak reference sources, such as Wikipedia, dictionaries, and thesauri
(ii) Duplicate studies, i.e., studies published in more than one publisher's database

Any sign recognition system typically involves several key steps. First, input data are acquired, often through sources such as smartphone cameras or sensors. The next step is feature extraction from the acquired input data. Finally, the signs are classified using algorithms well suited to the extracted features. The accuracy of the detection and extraction stages significantly influences the quality of the recognition results. Various approaches have been employed in sign recognition systems, including CNN, KNN, ANN, and SVM, among others. Among these, CNN stands out as the leading approach compared to the other methods listed in Table 6, which also depicts the studies and their associated information.
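The acquisition, feature-extraction, and classification steps just described can be illustrated schematically. The sketch below is a deliberately minimal stand-in, not the CNN of any cited study: the 16x16 "camera frames" are synthetic, the single convolution-plus-pooling pass imitates a CNN's first layers, and a nearest-centroid rule substitutes for a learned classifier.

```python
import numpy as np

rng = np.random.default_rng(1)

def extract_features(frame, kernel):
    """Step 2: convolve the frame with an edge kernel and 4x4 mean-pool,
    a minimal stand-in for a CNN's convolution + pooling layers."""
    h, w = frame.shape
    kh, kw = kernel.shape
    conv = np.empty((h - kh + 1, w - kw + 1))
    for i in range(conv.shape[0]):
        for j in range(conv.shape[1]):
            conv[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    crop = conv[: conv.shape[0] // 4 * 4, : conv.shape[1] // 4 * 4]
    pooled = crop.reshape(-1, 4, crop.shape[1] // 4, 4).mean(axis=(1, 3))
    return pooled.ravel()

# Step 1: acquire input frames (synthetic "hand images", two sign classes).
kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # vertical-edge detector
def make_frame(cls):
    frame = 0.1 * rng.normal(size=(16, 16))
    frame[:, 4:8] += 1.0 if cls == 0 else 0.0   # class 0: vertical bar
    frame[4:8, :] += 1.0 if cls == 1 else 0.0   # class 1: horizontal bar
    return frame

train = [(extract_features(make_frame(c), kernel), c)
         for c in (0, 1) for _ in range(20)]

# Step 3: classify by distance to per-class feature centroids.
centroids = {c: np.mean([f for f, lbl in train if lbl == c], axis=0)
             for c in (0, 1)}
def classify(frame):
    f = extract_features(frame, kernel)
    return min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))

pred = classify(make_frame(0))
```

A production system would replace each stage with its real counterpart (camera capture, a trained CNN, a softmax classifier), but the data flow between the three stages is the same.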

RQ3: What Are the Types of Datasets Used for Sign Language Recognition?
Table 7(a) provides a comprehensive discussion of the various types of datasets and their utilization across studies, and Table 7(b) provides links to publicly available datasets. Analysis of these tables shows that most studies developed their own custom datasets. Additionally, many of these datasets are language-dependent, such as PSL, American Sign Language (ASL), Malaysian Sign Language, Taiwan Sign Language (TSL), and Chinese Sign Language (CSL), among others. Table 7 lists the studies along with their respective years, datasets used, and remarks. Numerous publicly available datasets are used by different articles; some can be accessed via the links shown in Table 7(b), while other datasets are custom-made and not publicly available.

RQ4: What Are the Most Popular Approaches for Recognizing Sign Language?
Sign language recognition commonly utilizes sensor-based and vision-based techniques to observe hand motion and posture [7]. The sensor-based approach involves the use of sensors, such as those embedded in gloves or smartphones, to track hand movements. These sensors, whether external or internal to the mobile device, capture data related to hand gestures. For example, glove-based approaches utilize multiple sensors within the gloves to monitor the position and movement of the fingers and palm, providing coordinates for subsequent processing. These devices may be connected wirelessly via Bluetooth; the glove in [39], for instance, contains ten flex sensors for tracking finger posture. In one sensor-based approach, a combination of sensors, including a G-sensor and a gyroscope, is employed to monitor hand orientation and motion. These sensors continuously capture hand-related signals, which are then wirelessly transmitted to a mobile device for hand-state estimation. The choice of recognition method depends on the input data and the dataset utilized; in that particular case, the authors used template matching as the classification method, covering five dynamic sign classes. In the vision-based approach, hand gestures are observed through the mobile camera, and a series of processing steps are applied to identify the signs within the video stream.
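The template-matching classification mentioned for the sensor-based approach can be sketched as follows. This is an illustrative assumption, not the method of [39]: the one-dimensional motion traces and the five dynamic sign classes are synthetic sinusoids standing in for G-sensor/gyroscope signals.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 100)

# Stored templates: one 1-D motion trace per dynamic sign class
# (synthetic stand-ins for gyroscope signals at distinct frequencies).
templates = {f"sign_{k}": np.sin(2 * np.pi * (k + 1) * t) for k in range(5)}

def classify(signal):
    """Match an incoming trace against each stored template using
    normalized cross-correlation and return the best-matching class."""
    def ncc(a, b):
        a = (a - a.mean()) / a.std()
        b = (b - b.mean()) / b.std()
        return float(np.mean(a * b))
    return max(templates, key=lambda name: ncc(signal, templates[name]))

# A noisy capture of the third sign is matched to its stored template.
capture = np.sin(2 * np.pi * 3 * t) + 0.2 * rng.normal(size=t.size)
pred = classify(capture)
```

Real sensor traces vary in duration and speed, so practical systems often add resampling or dynamic time warping before matching.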

RQ5: Which Sign Languages Are Targeted?
Different countries use their regional sign languages for research and contribute to the accessibility domain for people with speech disorders. American Sign Language is the dominant sign language in the research, as shown in Table 8.

RQ6: What Evaluation Metrics Are Used in the Experiments?
Systems that use sign language dataset(s) are usually evaluated using standard metrics such as accuracy, precision, recall, and F1 score. In the literature, most systems were evaluated on detecting and interpreting sign languages, and hence accuracy is the most frequently used metric, as shown in Figure 4. Precision and recall were also used.
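The four standard metrics named above can be computed directly from a model's predictions. The labels below are made up purely for illustration (1 = target sign detected, 0 = not detected).

```python
from collections import Counter

# Made-up ground-truth signs and a classifier's predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1, 0, 0]

# Tally the confusion-matrix cells.
counts = Counter(zip(y_true, y_pred))
tp, fp = counts[(1, 1)], counts[(0, 1)]
fn, tn = counts[(1, 0)], counts[(0, 0)]

accuracy  = (tp + tn) / len(y_true)          # fraction of correct decisions
precision = tp / (tp + fp)                   # how many detections were right
recall    = tp / (tp + fn)                   # how many true signs were found
f1        = 2 * precision * recall / (precision + recall)
```

For this toy example all four metrics come out to 0.8; on real sign language datasets the per-class versions of precision, recall, and F1 are usually macro-averaged across signs.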

RQ7: Which Models Have Demonstrated Better Performance for Specific Sign Languages?
Numerous machine and deep learning models have been employed for detecting and recognizing diverse sets of sign languages. This process encompasses the training and testing of data using specific sign language datasets, which can range from hand gestures to video frames, as well as data collected from wearable sensors. As previously discussed, gestures are captured using mobile cameras, while data from wearable sensors are collected through gloves. Table 9 provides an overview of studies centered on various sign languages, offering insights into their respective accomplishments, primarily evaluated in terms of accuracy. (Table 9, flattened during extraction, lists per-study details such as: an ASL fingerspelling dataset of 1.25 million samples from 57 participants, excluding the four nonstatic gestures J, P, Y, and Z; Indian Sign Language sets covering digits 0-9 and alphabets a-z; a custom 18-sign ISL set recorded by 10 different signers; a multi-dataset study applying SVM, KNN, CNN, and ANN to Indian, American, British, and Turkish Sign Language alphabet sets; South African and Malaysian Sign Language sets limited to the letters A, B, C and the digits 1, 2, 3; the 51 fundamental postures of Taiwan Sign Language; and alphabet sets for Indonesian, Greek, and American Sign Language.)

Meta-Analysis
This section offers a multilayered examination of the collected literature, exploring various dimensions including publisher contributions, contributions by country, and citation analysis. Numerous approaches have been thoroughly tested and validated on specific sign languages, as extensively discussed earlier. For instance, Figure 5 provides a comparative analysis of various studies on American Sign Language (ASL) along with the achieved accuracy levels. It is important to note that the accuracy of these approaches and models is contingent upon the complexity and variability of signs within a specific sign language.
The contribution of publishers has been analyzed based on the selected publications. While each publisher has made substantial contributions to research on accessibility for individuals with speech disorders, the majority of the papers selected for this review were published in IEEE journals and conferences, as illustrated in Figure 6.
Moreover, the most highly cited paper among the selected publications has been identified. The paper with the highest number of citations was authored by Cheok et al. in 2019, titled "A review of hand gesture and sign language recognition techniques" [14]. As of the latest available data, it has accumulated 456 citations, as illustrated in Figure 7.
Similarly, the selected literature has been analyzed with a focus on country-wise contributions. India stands out as a significant contributor to publications related to speech disorders, as depicted in Figure 8, with the United States the second most prominent contributing country.

Open Research Questions
This section explores the open research questions and challenges that currently exist. While the advancing hardware and software capabilities of smartphones are no longer a computational constraint, the multifaceted nature of sign languages, each with its diverse set of gestures, continues to present significant challenges. These challenges also include social acceptability and pervasiveness at low cost. Besides, the reliance on sign language(s) and its translation for individuals with speech disorders poses unique challenges that need proper investigation, for example, compatibility issues, multilingual translation, education level, and real-time gesture generation and translation. The following subsections elaborate on the most salient issues and challenges identified in the existing literature.
5.1. Accuracy, Robustness, and Real-Time Detection. The accuracy of real-time sign language translation is challenging due to various factors, such as lighting conditions, power consumption, social acceptability, and privacy constraints. The question is "How can we improve the accuracy and robustness of sign language detection and interpretation on smartphones to ensure reliable and real-time communication for users?" This is because translation involves real-time image processing under resource constraints, such as processing power and storage [148]. Processing delays and false positive responses may further increase frustration for speech-disabled people. While smartphones are portable, entering gestures on a smartphone may require specific tools or the presence of another individual to operate the smartphone's camera on behalf of users with disabilities. Without these provisions, there is a risk of improper gesture input and, consequently, an increased chance of errors.
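As a rough illustration of the real-time constraint discussed above, the per-frame latency of a recognizer can be tracked against the frame budget implied by a target frame rate. The following Python sketch is illustrative only; the `LatencyMonitor` class, the 15 FPS target, and the 40 ms latency figure are our assumptions, not values taken from the surveyed studies.

```python
import time  # a real pipeline would time.monotonic() around each inference

class LatencyMonitor:
    """Tracks per-frame processing latency with an exponential moving
    average and reports whether a target frame rate is being met."""

    def __init__(self, target_fps=15.0, alpha=0.2):
        self.budget = 1.0 / target_fps   # seconds allowed per frame
        self.alpha = alpha               # EMA smoothing factor
        self.ema = None                  # smoothed latency estimate

    def record(self, seconds):
        if self.ema is None:
            self.ema = seconds
        else:
            self.ema = self.alpha * seconds + (1 - self.alpha) * self.ema
        return self.ema

    def meets_realtime(self):
        return self.ema is not None and self.ema <= self.budget

# Simulate 30 frames whose (hypothetical) recognition takes ~40 ms each.
mon = LatencyMonitor(target_fps=15.0)    # ~66.7 ms budget per frame
for _ in range(30):
    mon.record(0.040)
print(mon.meets_realtime())              # True: 40 ms fits the budget
```

A monitor like this could drive graceful degradation, for example skipping frames or switching to a smaller model when the smoothed latency exceeds the budget.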

5.2. Multilingual Support.
Every region of the world has its own sign language for its speech-disabled people. This makes it difficult to translate one sign language into another, and hence the scope of any single system becomes narrow [148]. The question "What techniques can be developed to support multiple sign languages on smartphones, accommodating diverse user needs?" remains open. Furthermore, there is a pressing need to establish a universal standard for sign language. Such a standardized language could facilitate the development of universal smart devices, ultimately reducing the overall cost of equipment designed for these purposes.
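One technique often proposed for this problem is to route translation through a shared intermediate ("gloss") vocabulary rather than building a translator for every language pair, so that adding a new sign language requires only one mapping to the interlingua. The sketch below is a minimal illustration; the label sets and the ASL/PSL entries are hypothetical placeholders, not real lexicons.

```python
# Hypothetical label sets: a tiny shared "gloss" vocabulary acting as an
# interlingua between sign languages (all entries are illustrative).
ASL_TO_GLOSS = {"HELLO": "GREETING", "THANKS": "GRATITUDE"}
GLOSS_TO_PSL = {"GREETING": "SALAM", "GRATITUDE": "SHUKRIYA"}

def translate(label, src_to_gloss, gloss_to_dst):
    """Translate a recognized sign label via the shared gloss vocabulary.
    Returns None when no mapping exists rather than guessing."""
    gloss = src_to_gloss.get(label)
    if gloss is None:
        return None
    return gloss_to_dst.get(gloss)

print(translate("HELLO", ASL_TO_GLOSS, GLOSS_TO_PSL))  # -> SALAM
```

With N sign languages, an interlingua needs N mappings instead of the N*(N-1) direct pairs, which is the cost argument usually made for a universal standard.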

5.3. Gesture Recognition.
As mentioned, sign languages are detected either via sensors (the hardware approach) or via the vision-based approach. The sensor-based approaches, i.e., gloves or other wearable devices, are not socially acceptable and hence are rarely used by speech-disordered people. The vision-based approach relies on image processing, which itself requires considerable energy, power, and storage [167]. The question "How can machine learning algorithms be optimized to recognize a wide range of sign language gestures and expressions accurately?" is yet to be answered. One reason may be that machine and deep learning algorithms are resource-intensive, and hence little attention has been given to smartphones. Therefore, existing machine and deep learning algorithms require proper optimization for smartphones.
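A common optimization for running such models on smartphones is post-training quantization, which stores weights as 8-bit integers instead of 32-bit floats, cutting model size (and often memory bandwidth) by roughly 4x. The NumPy sketch below illustrates the core idea of symmetric int8 quantization; the function names and the toy weight matrix are ours, and production systems would rely on a framework such as TensorFlow Lite rather than hand-rolled code.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric post-training quantization of float32 weights to int8.
    Returns the int8 tensor and the scale needed to dequantize."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)  # toy weight matrix
q, scale = quantize_int8(w)

print(q.nbytes, w.nbytes)   # 4096 vs 16384 bytes: a 4x size reduction
err = np.abs(dequantize(q, scale) - w).max()
print(err <= scale)         # rounding error stays within one quantization step
```

The accuracy cost of this rounding is usually small for over-parameterized networks, which is why quantization is a standard first step when porting recognition models to phones.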

5.4. Data Privacy and Security.
Privacy is everyone's right, including people with special needs such as the visually impaired [168, 169] and people suffering from speech disorders. Sign language communication patterns are vulnerable when processed by a machine [170]. Moreover, signing in public may lead to privacy breaches. Therefore, the following question arises: "What measures can be implemented to ensure the privacy and security of sign language data transmitted and processed on smartphones?" This question needs proper attention. Messages in digital form face numerous security issues, such as chat leakage and hacking, among others. As a case study, some attempts have been made at Michigan State University (https://msutoday.msu.edu/news/2019/newtechnology-breaks-through-sign-language-barriers) to address several of these pressing issues. However, more work is needed in this domain to ensure that sign language interpretation is risk-free. Proper encryption/decryption on the device used for translation could also mitigate privacy issues.
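As a minimal sketch of one such measure, the Python standard library's hmac module can attach an integrity tag to a translated message so that tampering in transit is detectable. A real deployment would additionally encrypt the payload (e.g., with AES-GCM via a vetted cryptography library), which we omit here to stay within the standard library; all names below are illustrative.

```python
import hmac
import hashlib
import secrets

TAG_LEN = 32  # HMAC-SHA256 produces a 32-byte tag

def protect(message: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so tampering in transit is detectable.
    (A real deployment would also encrypt the message itself.)"""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag

def verify(blob: bytes, key: bytes) -> bytes:
    """Check the tag in constant time; raise on any modification."""
    message, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message was tampered with")
    return message

key = secrets.token_bytes(32)                 # shared per-session key
blob = protect(b"HELLO -> text translation", key)
print(verify(blob, key))                      # round-trips unchanged
```

The constant-time comparison (hmac.compare_digest) matters on a phone just as on a server, since timing side channels do not depend on device class.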

5.5. Low-Light and Noisy Environments.
Image processing in low light generates false positives, which directly affect performance and results [171, 172]. The question "How can sign language detection systems on smartphones perform effectively in low-light conditions and noisy environments?" still stands. Moreover, smartphones have limited battery life, which tends to deplete rapidly during image processing under low-light conditions; machine and deep learning applications may further contribute to battery depletion. These research questions cover various aspects of sign language detection on smartphones and offer opportunities to advance this field to better serve the needs of individuals with hearing and speech disabilities. Researchers, academia, and practitioners can focus on one or more of these questions to contribute to the development of innovative, low-cost, socially acceptable, and effective solutions.
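A simple preprocessing step often applied before recognition in dark scenes is power-law (gamma) correction, which lifts low pixel intensities without saturating bright ones. The NumPy sketch below is illustrative only; the 4x4 toy frame and the gamma value are arbitrary choices, and real pipelines would combine this with denoising.

```python
import numpy as np

def gamma_correct(frame, gamma=0.5):
    """Brighten a dark frame with power-law (gamma) correction, a common
    preprocessing step before running a recognizer on low-light video.
    `frame` holds 8-bit pixel intensities (0-255); gamma < 1 brightens."""
    normalized = frame.astype(np.float32) / 255.0
    corrected = 255.0 * normalized ** gamma
    return np.clip(corrected, 0, 255).astype(np.uint8)

dark = np.full((4, 4), 25, dtype=np.uint8)   # a uniformly dark toy frame
bright = gamma_correct(dark, gamma=0.5)
print(int(bright[0, 0]))                     # pixels lifted well above 25
```

Because the operation is a per-pixel lookup, it is cheap enough for real-time use on a phone, unlike heavier learned enhancement models that would worsen the battery drain noted above.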

Conclusion
The detection and interpretation of sign language for people with speech disorders, using cost-effective off-the-shelf devices, particularly smartphones, has gained substantial attention within the research and academic communities. The smartphone is a natural platform for accessibility solutions given its growing capabilities in terms of processing, mobility, storage capacity, and social acceptability. This paper presented a systematic literature review (SLR) on sign language detection and interpretation using pervasive and ubiquitous computing devices, such as smartphones.
The objective is to comprehensively analyze the progress achieved thus far in machine and deep learning approaches on smartphones. Moreover, to analyze the approaches employed in enhancing accessibility for individuals with speech disorders, it is important to gather insights into recent machine and deep learning approaches, available datasets, evaluation metrics, and current research and emerging trends. In this connection, this paper is intended to provide valuable insights for researchers and practitioners engaged in accessibility initiatives, particularly in the domain of speech disorders. This study highlighted the most valuable literature published from 2012 to July 2023, along with a detailed yet comprehensive review of that literature, the datasets, and the numerous machine and deep learning approaches used on smartphones. The paper specifically focuses on the detection and interpretation of sign languages via smartphones. This study suggests that the development of a universal sign language could greatly benefit both practitioners and developers in this field, since it may mitigate the overhead costs associated with learning, detecting, and translating multiple sign languages. Moreover, the focus should be on socially acceptable devices instead of expensive or complex wearable devices. This review may serve as a valuable contribution to the existing body of knowledge and is expected to offer a roadmap for future research in the domain of accessibility, specifically for speech-disabled individuals. Future work can be carried out in different areas, such as real-time accurate translation on smartphones, privacy-preserving translation, and accurate gesture recognition in low-light conditions.

Data Availability
The collected data (in an Excel sheet) will be provided upon request. Most of the basic statistics regarding the systematic literature review are discussed within the paper.

Disclosure
This study was conducted at the Department of Computer Science, City University of Science and Information Technology, Peshawar, Pakistan.

Advances in Human-Computer Interaction
RQ1: What is the current status of smartphone-based sign language? Objective: to study and map the current status of overall sign languages using smartphones as a device for detection and interpretation, especially in 2023.
RQ2: How are machine learning, deep learning, and lightweight deep learning techniques used for the detection and interpretation of sign languages? Objective: to study the deep learning and lightweight deep learning techniques used for the detection and interpretation of sign languages.
RQ3: What are the types of datasets used for sign language recognition? Objective: to specify the different datasets used for the detection and interpretation of sign languages.
RQ4: What are the most popular approaches for recognizing sign language? Objective: to study and map the most popular approaches to sign language detection and interpretation.
RQ5: Which sign languages are targeted? Objective: to study the sign languages which are detected and interpreted.
RQ6: What evaluation metrics are used in the experiments?

Figure 2: Studies published in conferences and journals.

Table 6:
Techniques of sign language recognition using deep learning.

Figure 4: Evaluation metrics by frequency used in different research.

Figure 5: Comparative analysis of various studies in terms of accuracy.
Figure 6: Publisher-wise contribution among the selected publications.

Table 3:
Studies found in the selected repositories.

Table 4:
Summary of the included literature.

Table 5:
Techniques of sign language recognition using smartphones.

Table 7:
(a) Datasets used in sign language recognition. (b) Links to publicly available datasets.

Table 9:
Models and their evaluation performance on specifc sign languages.