Usability Evaluation of Educational Games: An Analysis of Culture as a Factor Affecting Children’s Educational Attainment

Educational games are used in Omani schools, but those adopted locally are imported and mostly designed for western contexts. For Omani children, these games may be culturally inappropriate and difficult to comprehend and follow, which impedes learning. Three questionnaires and one observational checklist were used to gather data from 40 respondents (observers), and SPSS was used for data analysis. Through experiments, the behavior of Omani students towards imported educational games was examined. Five main factors, namely, efficiency, learnability, memorability, errors, and satisfaction, were measured for the target users using the Hybrid User Evaluation Methodology for Remote Evaluation (HUEMRE), the Training Framework for Untrained Observer (TFUO), and the Framework on Educational Games Behavior Intention (EGsBI), which were designed specifically for this purpose. The results show that Omani children face difficulties in using imported educational games. They also show that culture, language, animation, and interaction contribute heavily to the benefit gained from educational games and should therefore be given careful consideration in the design of educational games to facilitate children's learning. Finally, the findings enrich the understanding of how these factors positively affect the behavioral intention of Omani students to use educational games and how that behavioral intention can be improved.


Introduction
Education systems have undergone transformation with the inclusion of digital elements, allowing the incorporation of an innovative teaching-learning ecosystem known as e-learning [1]. As a blanket term, "e-learning" generally refers to digital learning environments, as exemplified by online and virtual learning environments and social learning technologies [2]. E-learning systems provide an appropriate flow of knowledge within an organization, facilitate learning, ease the delivery of knowledge and information, and improve performance [3,4]. E-learning is commonly used in academic institutions today, mainly to ease learning for learners at all levels, and it is highly useful in assisting young learners and those with special needs [5,6].
As e-learning has expanded and evolved, it has become a fitting conduit for educational games. For children especially, educational games have proven to be effective in facilitating the learning process, and, for this reason, educational institutions increasingly rely on them. Educational games (or serious games) are games designed explicitly to teach learners a certain subject, expound concepts, strengthen development, or facilitate skill training or learning during play [7].
Learning through the use of software involves the integration of science, technology, and society in educational games. A design model is used in structuring the learning contents and activities in educational games [8]. In developing serious games, game designers are often challenged when dealing with fantasy content and context integration owing to the lack of workable design and evaluation models for them [9], but the best educational software will still fail if learners are unmotivated towards learning. Educational software designers thus need to develop a context that will motivate learners and enrich their learning process [10].
Software games have demonstrated their potential in facilitating learning. At the same time, there are also criticisms towards such games. As such, the benefits and drawbacks of software games in facilitating education need to be examined more deeply, and the outcomes could become a guide in designing effective educational software games. It is indeed important that educational software games could educationally benefit all students, including those who are not used to playing computer games [11].
Educational games need to be evaluated to determine their effectiveness in facilitating learning. For that purpose, usability evaluation could be employed to affirm the acceptance and usage of software by a certain group. The outcomes of the evaluation could inform users whether the use of the software is advantageous and whether the software is sufficient or needs certain improvements. The rest of this paper is organized as follows. The second section discusses the related background and presents the literature review that helped to identify the gap for this study. The third section describes in detail the methodology used to conduct this study. The fourth section describes the experiments performed and discusses the results obtained. The fifth section describes the research contributions. The sixth section discusses the research limitations and recommendations, and, finally, the seventh section concludes the paper.

Educational Games.
Video games provide players with simulated worlds that may embody certain social practices, and so, well simulated worlds provided by video games are more than just about facts or remote skills. Games incorporate the manners of understanding, executing, caring, surviving, and becoming adept in certain environment. Here, the skills and knowledge are dispersed between the human player and the virtual actor, and so, during play, the values, skills, and practices are dispersed, making the game world a powerful one as it could create situated understanding [12].
Inclusion and commercial success could be achieved through managing the variability in knowledge, know-how, or cultural environment of budding gamers [13]. A person recognizes culture at a very young age [8], and it has been suggested that the perception towards culture (cultural perception) differs from one child to another. A person's cultural perceptions form and dictate how he/she perceives and reasons [14] and also how he/she learns [15]. Cultural identities add to the group dynamic forces and the development of institutions or communities [16].
Mobile applications have progressed considerably. Today, they support gaming in addition to other tasks. Mobile games are mostly played for entertainment and enjoyment. However, their functions have expanded, and mobile games are now applied in areas such as education, business, and healthcare.
Students could be motivated to learn through the use of educational games, making them an efficient learning tool [17,18]. Educational games also could improve students' performance [19,20]. Educational games can be played using mobile devices, and so, learning can occur at any time or place. Nonetheless, it is not easy to ascertain the efficacy of mobile educational games, because mobile devices do have limitations which can impair usability, like inconvenient screen size, impractical control or input interface, and disruptions [21]. Also, the determination of usability of educational games needs the combination of educational and gaming requirements, and, therefore, it is not easy to ascertain [22].
Cultural context and symbolism are elements of educational games which can capture user's attention. Culture and learning perceptions vary, and, thus, evaluation on them may generate varying outcomes. Also, observers may have varied competency level, and it is important that observer is sufficiently competent in performing the task so that the input provided in usability evaluations is reliable [23].
Culture is associated with systems from which meaning is generated and regenerated [24]. Equally, culture could be associated with systems that provide established sets of meanings inside a given community [24]. Additionally, theories of cultural learning add new viewpoint on what impacts learning [25,26]. From the viewpoint of culture, cultural subgroups utilize language and other artefacts in forming the viewpoints of people concerning reality, which has linkage to the needs of that culture.
In the traditional educational system, cultural identities of learner are generally stable or permanent and are resistant to change [8]. However, educational computer games can affect the cultural identities of children by way of educational contexts [8]. Equally, a culture-enriched educational game could be used in teaching students cultural identities.

Usability Evaluation for Children's Mobile Learning Games.
Usability concerns the possibility of a given product being used by people with a minimal level of ability. Usability can be projected within the scope of the product's creation and assessed through inspections or tests carried out by experts and prospective users. In this regard, heuristic evaluation has been a commonly used usability inspection method for finding structural and/or heuristic problems through an interface review, with user experience being considered [27].
Usability and HCI have been comprehensively explored in the literature. For instance, one study [28] looked at how students' use of case studies relates to the usability life cycle. Another study [29] described an evaluation tool created by students to describe the use of heuristic evaluation. A method allowing students to use specific techniques to address usability through testing and analysis of results was elaborated in [30]. A usability study among students who used web pages to answer usability questions was performed by [31]. The four key usability quality attributes are efficiency, effectiveness, learnability, and user satisfaction; these attributes were examined in [32], and the results showed the need to improve children's mobile learning apps in terms of usability and design.
Usually, usability evaluations are performed in a usability evaluation lab, a special room equipped with complex audio and visual recording and analysis devices, and the evaluation is executed autonomously by the study participants [33]. Remote usability testing, on the other hand, is carried out with evaluators and participants at different locations. The evaluation method can be either synchronous or asynchronous, and each type has its own tools. An evaluator who employs the synchronous method gathers information and performs the evaluation remotely in real time with a participant who might be in a different location, such as their workplace or home. The tools used for this method include video conferencing or remote viewing software, with which the evaluator and participant can share computer screens so that the evaluator can view the participant's screen [34][35][36].
Remote usability evaluation is a commonly used HCI evaluation method in academic studies and in real-life settings [37]. In this regard, user-centred assessment methods have also been classed as usability evaluation methods. They are still in use, but the outbreak of the COVID-19 pandemic has raised criticisms regarding how these methods are affected, especially in their assessment of usability.
Involving Omani children, [37] examined the effectiveness of educational games in their learning process. A methodology that allows the remote execution of educational games, in addition to a framework which could gauge the cultural behaviour intention of Omani children towards educational games usage, was employed, as shown in Figure 1.
A framework for training the observer (see Figure 2) was proposed by [23]. The training stipulated by the framework is a requirement for using the methodology. The use of HUEMRE eases the execution of remote usability evaluation by untrained observers (an untrained observer is one with no UE background). The proposed framework provides users with the arrangement needed for teaching and training untrained observers to carry out their respective tasks in gathering observations from children.
The findings and recommendations of [23,37] were adapted in this study in performing the actual evaluation, with the purpose of measuring and understanding the present situation in Oman, particularly with respect to the application of imported educational games, which appropriately served the study's purpose.

Methodology
The Hybrid User Evaluation Methodology for Remote Evaluation (HUEMRE) was the method used in this study, and the Training Framework for Untrained Observer (TFUO) was used for observer training; the trained observers were required to participate in HUEMRE under Part one of Test one. HUEMRE comprises a chosen population, three questionnaires, and an observational checklist, all of which were used for gathering information. The trained observers gathered the information relating to students' behavior and intention in using the imported foreign educational games. The games in question were designed for children aged 6 to 8 years.
The following section presents the details of the experimental evaluation, including the processes and the results obtained.

Experimental Evaluation
The evaluation of the experiment covers measurement model validity, scale reliability, and descriptive analysis.

Validity of Measurement Model.
The validity of the measurement model can be evaluated by determining its internal consistency and discriminant validity. The study examined the characteristics of the study sample, particularly in terms of gender, age, relationship with the children, and grade of the children. For that purpose, the attendant information and demographic variables of the children were obtained (see Table 1).

The Reliability of the Scale Used.
The factor analysis results showed that all factors have construct validity. The reliability of the scale was determined by examining and reexamining the sample using a reliability test, and Cronbach's alpha was applied to test the internal consistency of all factors. Scale reliability was ascertained using an iterative process, whereby an item would be removed if doing so made the scale more reliable, and the analysis would be repeated following the removal. However, some scholars [38][39][40][41] opined that item removal is unnecessary if the resulting increase in reliability is negligible. Accordingly, no items were eliminated, since all variables scored an alpha value higher than 0.7. Table 2 presents the details.
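The reliability procedure described above can be illustrated with a minimal Python sketch. It computes Cronbach's alpha from item scores and the alpha obtained when each item is removed in turn, which is the quantity the iterative process compares. The function names and the response data are hypothetical and are not taken from the study, which used SPSS; only the 0.7 threshold comes from the text.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of per-item score lists; every inner list holds the scores
    of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    item_variance_sum = sum(variance(scores) for scores in items)
    totals = [sum(answers) for answers in zip(*items)]  # one total per respondent
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha of the remaining items when each item is dropped in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical five-point Likert responses (4 items, 6 respondents).
responses = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [3, 5, 2, 4, 1, 4],
    [5, 4, 3, 4, 2, 5],
]

ALPHA_THRESHOLD = 0.7  # acceptance level stated in the text

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, acceptable = {alpha > ALPHA_THRESHOLD}")
for index, reduced in enumerate(alpha_if_item_deleted(responses), start=1):
    print(f"alpha without item {index}: {reduced:.3f}")
```

If removing an item raises alpha only negligibly, the item would be kept, in line with the position of the scholars cited above.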

Descriptive Analysis.
Descriptive analysis was carried out in this study. This involved the computation of the mean and standard deviation (see Table 3). Essentially, a high mean value denotes a high level of agreement with a given statement, while a small standard deviation denotes that the data cluster closely around the mean [42]. Also, the five-point Likert scale was split into three classes of comparable size as follows: scores lower than 2.33 (lowest value 1 + 4/3) are interpreted as low, scores higher than 3.67 (highest value 5 - 4/3) are interpreted as high, and scores in between are interpreted as moderate.
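The three-class interpretation of the five-point scale can be expressed as a short Python sketch. The function names are hypothetical, and treating scores that fall exactly on a cutoff as moderate is an assumption, since the text only specifies scores below 2.33 and above 3.67. The example values are the first set of mean scores reported in the measurements analysis of this paper.

```python
def likert_cutoffs(lowest=1, highest=5):
    """Return the (low, high) cutoffs that split the scale range into three equal classes."""
    width = (highest - lowest) / 3  # (5 - 1) / 3 = 4/3 for a five-point scale
    return lowest + width, highest - width


def interpret_mean(score, lowest=1, highest=5):
    """Label a mean score as low, moderate, or high."""
    low_cutoff, high_cutoff = likert_cutoffs(lowest, highest)
    if score < low_cutoff:
        return "low"
    if score > high_cutoff:
        return "high"
    return "moderate"  # assumption: boundary values count as moderate


# Mean scores reported in the paper's measurements analysis.
reported_means = {
    "efficiency": 4.3271,
    "learnability": 3.789,
    "memorability": 3.567,
    "errors": 3.741,
    "satisfaction": 3.8684,
}

for dimension, mean in reported_means.items():
    print(f"{dimension}: {mean} -> {interpret_mean(mean)}")
```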

Experiment Test II.
Five factors have been found to directly affect the evaluation of a child's cultural behavior intention, namely, efficiency, learnability, memorability, errors, and satisfaction. Hence, they were all used during evaluation to obtain reliable and accurate results. A framework was accordingly constructed to facilitate the examination of the impact of these five factors on Omani children's behavior intention (see Figure 3). The framework was included in the study methodology. It was built mainly to evaluate educational games (EGs) in order to understand the cultural behavior intention of Omani children in using educational games. Efficiency, learnability, memorability, errors, and satisfaction are the independent variables in the framework, while the Omani child's cultural behaviour intention is the dependent variable.
Experiment test II involved the use of a survey to collect the data. Questionnaires were circulated to teachers and parents via an online survey. Out of 30 questionnaires distributed, 20 were collected. Section 4.2.1 accordingly provides the details of the responses.

Measurements of Major Variables.
In the descriptive statistics analysis, the mean and standard deviation were calculated for the study constructs to ascertain their characteristics. Generally speaking, a high mean value indicates high agreement with the statements, and a low standard deviation indicates close clustering of the data around the mean [42]. The Likert scale used for the measurement items has five points, and, to ease interpretation, these points were classified into three categories: low (scores of less than 2.33, or lowest value 1 + 4/3), moderate (scores in between), and high (scores of 3.67 and above, or highest value 5 - 4/3) [43].
(1) Cronbach's Alpha Scores for Language. A number of sequential reliability testing treatments were performed on the five multivariable factors (efficiency, learnability, memorability, errors, and satisfaction). As displayed in Table 4, the highest Cronbach's alpha score is 0.899 (efficiency), while the lowest score is 0.728 (errors). Hence, the dimensions or variables of the study have adequate reliability.
(2) Cronbach's Alpha Scores for Animations. A number of sequential reliability testing treatments were performed on the five multivariable factors (efficiency, learnability, memorability, errors, and satisfaction). As displayed in Table 5, the highest Cronbach's alpha score is 0.874 (satisfaction), while the lowest score is 0.723 (learnability). Hence, the dimensions or variables of the study have adequate reliability.
(3) Cronbach's Alpha Scores for Interaction. A number of sequential reliability testing treatments were performed on the five multivariable factors (efficiency, learnability, memorability, errors, and satisfaction). As displayed in Table 6, the highest Cronbach's alpha score is 0.869 (memorability), while the lowest score is 0.725 (errors). Hence, the dimensions or variables of the study have adequate reliability.
Usability of educational games was tested in this study, specifically focusing on the aspects of language, animations, and interaction. These aspects were measured in terms of efficiency, learnability, memorability, errors, and satisfaction. Usability and effectiveness of the games were analyzed, and the potential cultural specific aspects of game design and usability were explored. Specifically, how cultural aspects of game design contribute to the learning process of children was examined, and the cultural aspects should be included in the development of educational games. Accordingly, the elements of efficiency, learnability, memorability, errors, and satisfaction could facilitate the understanding of usability in educational games. Figure 4 accordingly illustrates the comparison of mean (minimum and maximum) of the aforementioned elements.

Reliability of questionnaire items (excerpt)
Item                                                          N     Cronbach's alpha
1. Using the game to learn was easy                           40    0.812
2. Using the game to learn was relaxing                       40
3. Using the game to learn was simple                         40
4. Using the game to learn was satisfying                     40
5. Using the game to learn was interesting                    40
Section 3. Postevaluation questionnaire
1. Which game would the child like to play again, and why?    40    0.832
2. Which game helps the child to learn, and why?              40
3. Which game is considered easy to play, and why?            40

Descriptive statistics of questionnaire items (excerpt)
Item                                                          Mean    Standard deviation
1. Using the game to learn was easy                           3.82    0.813
2. Using the game to learn was relaxing                       4.14    0.941
3. Using the game to learn was simple                         3.96    0.892

Measurements Analysis of Independent Variable (IV) and Dependent Variable (DV).
The correlation of the Omani child's cultural behavior intention, as the dependent variable (DV), is based on the educational games (EGs). The educational games (EGs) analysis model shows the mean scores of the dimensions of the Omani child's cultural behavior intention as follows: 4.3271 (efficiency), 3.789 (learnability), 3.567 (memorability), 3.741 (errors), and 3.8684 (satisfaction). The overall satisfaction score shows a significant relationship between satisfaction and the Omani child's cultural behavior intention. The language element of educational games was shown to be the most effective in improving efficiency, learnability, memorability, errors, and satisfaction.
For the animation element of educational games and its impact on cultural behavior intention of Omani children, the mean scores are as follows: 3.5376 (efficiency), 3.1579 (learnability), 3.453 (memorability), 3.325 (errors) and 3.745 and 3.254 (satisfaction). The relationship between satisfaction and Omani child's cultural behavior intention was hence significant. Educational games (animation) were found to be effective in improving efficiency, learnability, memorability, errors, and satisfaction of Omani children.
For the interaction element of educational games and its impact on cultural behavior intention of Omani children, the mean scores are as follows: 3.159 (efficiency), 3.135 (learnability), 3.357 (memorability), 3.259 (errors) and 3.267 and 3.347 (satisfaction). The relationship between satisfaction and Omani child's cultural behavior intention was hence significant. Educational games (interaction) were found to be effective in improving efficiency, learnability, memorability, errors, and satisfaction of Omani children. Figure 5 displays the comparison of correlations for language, animations, and interaction of educational games.
The results demonstrated language as the element with the most significant direct impact on the design of educational games, followed by the element of animation and then the element of interaction. Accordingly, the evaluation results demonstrate the importance of involving these three elements in the development of educational games for children.

Analysis of Pretest.
Question one (Q1) (how good is the children's interaction with the games?) was to understand the interaction of children with the games at first introduction. Accordingly, an observation checklist was provided to the teachers and parents, and the teachers and parents were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response. The details are displayed in Figure 6.
Question two (Q2) (how good is the children's ability to communicate with the presented interfaces?) was to determine if the children were able to understand the language of the written instructions of the games. Accordingly, an observation checklist was provided to the teachers and parents, and the teachers and parents were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response. The details are displayed in the form of histogram in Figure 7.

Question three (Q3) (how good is the children's ability to understand the contents of the presented interfaces?) was to see if the children could comprehend and interact with the animations in interfaces. Accordingly, an observation checklist was provided to the teachers and parents, and the teachers and parents were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response. The details are displayed in Figure 8 in histogram form.
Question four (Q4) (how confident did the children feel using the educational games?) was to understand the feelings of children towards the use of educational games. An observation checklist was furnished to the teachers and parents, and the teachers and parents were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response, as illustrated by the histogram in Figure 9.
Question five (Q5) (how high is the children's hesitation towards the games?) was to determine if children felt hesitation before the utilization of the games. An observation checklist was accordingly provided to the teachers and parents, and the teachers and parents were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response. Figure 10 presents the details in histogram form.
Question six (Q6) (how good is the overall behavior of the children?) was to determine the behavior of children prior to using the games. Teachers and parents as observers were provided with an observation checklist, and they were to indicate "Good" or "Bad" to this question. The results show the response "Bad" as the most recorded response. The details can be viewed in Figure 11.
The observation checklist comprising six questions (Q1-Q6) was used by teachers and parents during the pretest to observe the children prior to the use of the games. The results demonstrated the lack of ability of these children in interacting with the games, the hesitation towards using the games, lack of knowledge of the content of the games, and misunderstanding of the general elements of the games. The analysis of observation checklist in Test III evaluated the behavior of Omani children towards educational games which were designed based on western culture and in foreign language (English).

Research Contribution
The main contributions of this study in terms of theory and practice are discussed in the next subsections.

Practical Contribution.
The use of HUEMRE was demonstrated in this study, which offers practical contributions to schools in Oman especially. This study found poor levels of success in the application of educational games to learning among Omani children. In view of that, some valuable guidelines were proposed to explain the factors that could improve the use of educational games in Omani schools. Hence, factors affecting the use of games-learning services could be identified, and appropriate steps could be taken to improve these services.
Additionally, the awareness and requirements of students regarding games-learning services within the school setting should be investigated. Other factors, including external factors that may affect students' acceptance of games-learning services, need to be investigated as well. Also, the relationship among usability testing evaluation, preevaluation, and postevaluation tests should be ascertained.

Academic Contribution.
The acceptance of games-learning services in educational institutions in Oman has not been extensively examined, and this study is among the few that have done so. In this study, challenges associated with game-based learning were reviewed, the related proposed models were compared, and the weaknesses and strengths of the models, as well as the gaps in the literature, were identified. The identified strengths and weaknesses of the reviewed models became the basis of this study's proposed model.
Clearly, several academic contributions are offered by this study. Firstly, this study provided a deep comprehension of the factors that affect games-learning services. As concluded from the literature review, empirical research works on the topic at hand are still inadequate. Relevantly, this study combines and expands the findings of past research works to address the issues related to games-learning services, with a specific focus on behavior intention to use, involving students in schools in Oman.
This explanatory empirical study is among the few that cover the topic of games-learning services in Omani schools. Accordingly, the factors impacting the quality of games-learning services in schools in Oman were identified and elaborated in this study, and this promotes and enriches the knowledge in this area. The identified gaps in the literature were also addressed.
Also, the scrutiny of games learning in schools from the perspective of Omani culture is a significant contribution of this study, as this aspect had not been examined by previous studies.
This study is indeed a valuable addition to the knowledge of the practices related to game-learning services in general.

Limitations and Recommendations for Future Research
During the execution of this study, the researchers faced several issues, and some of the prominent ones are discussed below:
(1) Notably, only the services that were available were used, and this limited the administration of games-learning services. Education-related services should therefore be expanded, including those provided for in-class and off-campus learning.
(2) Since the participants in this study were all from Omani schools, the results have limited generalizability. Including students of open or online learning, and expanding the scope to other education types, regular or open, could increase the generalizability of the study.
(3) As the model was tested using SPSS, the model's accuracy may be questionable. In this regard, the Structural Equation Modelling (SEM) technique would be more appropriate for increasing the accuracy of the model.
(4) Data were obtained using quantitative and qualitative approaches, which increases the depth of the information, as more factors with a potential impact on students' acceptance could be discovered. The students' acceptance of the new technology application could thus be understood more deeply.
(5) As for future work, the population sample can be increased for better data presentation and validation, and the validity and accuracy of the model can be tested and verified using SEM.

Conclusion
Usability is an increasingly interesting concept, and it has expanded over the years. This concept has been defined differently by different scholars. Several subattributes (hypothetical constructs) of usability in defining system success have been proposed in the literature. For instance, usability of software is greatly determined by user involvement. Based on past related studies, this study has determined the subattributes of usability in analyzing a software system's usability. Hence, those involved in the software engineering domain, such as students and researchers, could benefit from this study. However, the most appropriate measurement technique to evaluate software usability is challenging to determine, and this should be addressed in future studies.
Learning through games has become an important new platform in schools, and this has sparked interest in examining the prerequisites for the adoption of games-learning services in schools. The study's findings show that schools in Oman have an appropriate environment and infrastructure to incorporate games-learning into the curriculum. Equally, students' current awareness of games-learning should be taken into account to understand their level of readiness for this mode of learning. The results show that Omani children appear to have ample knowledge of the technology needed to run these games. Also, students were found to understand the limitations of games-learning in education. Equally, problems that could impair the actual application of games-learning have to be addressed. In education, it is important to be aware of students' perceptions, specifically of the use of games in their learning, and combining education channels with alternative learning modes such as educational games eases students' learning at a time and place convenient to them.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.