Motions for the Leap Motion controller are not standardized, even though its use in media content is spreading. Each piece of content defines its own motions, which confuses users. To alleviate this inconvenience, this study categorized the motions commonly used in Amusement and Functional Contents and, based on that classification, defined a Structural Motion Grammar that can be used universally. To this end, the Motion Lexicon, a fundamental motion vocabulary, was defined, and an algorithm that recognizes the Structural Motion Grammar in real time was developed. The proposed method was verified by user evaluation and quantitative comparison tests.
Interface technology, which supports the interaction between content and users, is continuously being developed. Recently, the technology has been shifting toward natural user interface (NUI) methods, which give users a greater sense of reality than conventional methods that focus on the mouse and keyboard. NUI is a recent means of interacting with computers that has gradually drawn more interest in human-computer interaction (HCI). NUI comprises voice, sensory, touch, and gesture interfaces. Leap Motion is a finger gesture interface-supported device [
For these benefits, the Leap Motion controller is widely used in various applications such as games [
In particular, Leap Motion gesture recognition in Amusement (game) Contents plays a crucial role in keeping the player engrossed in the game. It also increases the immersive sense of the Amusement Content because Leap Motion uses the player’s gestures without any controllers in real time as the player interacts with the content. Games that use gesture recognition can capture the player’s attention easily through the progress of the game [
Research on the recognition of Leap Motion has been carried out in technical studies. Some studies on the use of SVM were reported in [
As described above, the use of Leap Motion in content is expanding, while recognition technology involves cumbersome preprocessing tasks. Although many studies have investigated movement recognition through Leap Motion and its application to content, the authors have not found any literature on a standardized motion grammar. This study targets Leap Motion gestures that have been used in games, since game users are inconvenienced by having to learn different motions for each piece of content. A preliminary conference paper is shown in [
To this end, this study defined the Motion Lexicon (ML) that can be universally used in Amusement and Functional Contents and designed the Structural Motion Grammar (SMG) as combinations of ML. The SMG tree was then recognized in real time through coupling with a motion API, without complex procedures such as the feature extraction and training required by machine learning algorithms. Finally, the defined motions were tested for verification.
Researchers have studied the accuracy and robustness of Leap Motion [
The Leap Motion’s movement recognition has also been investigated [
The use of Leap Motion on sign language is also being investigated [
Researchers also investigated content using Leap Motion [
Studies on various contents and techniques using Leap Motion, mentioned in the Introduction, have also been reported [
To accomplish the proposed method, the common motions used in Amusement and Functional Contents were first classified. Based on the classification, the ML that can be used universally was defined. SMG was then defined through combinations of ML, and the recognition steps are also provided.
Figure
Schematic of the proposed method.
To define the universal motions that use Leap Motion, representative motions need to be extracted for each content classification. The digital contents to which Leap Motion is applicable can be classified into Amusement Content and Functional Content based on their purposes. Both types of content have subgenres, and commonly used motions were extracted through the classification and analysis of these genres.
Amusement Content is also known as game content. It can be classified into the following subgenres based on the motions: Action, FPS (First Person Shooting), Simulation-Racing/Flight, Arcade, Sports, and Role-playing. Of the six genres, Sports and Role-playing were excluded because they did not fit the current study. Sports games are not suited to Leap Motion because multiple players need to be controlled simultaneously.
For Role-playing games, which have a high level of freedom, defining the motion has limitations because its interface and the number of possibilities are very complex and diverse.
To this end, the four genres, namely, FPS, Action, Simulation-Racing/Flight and Arcade, were analysed and common motions were extracted. Table
Motion category by Amusement Content genres.
Game genre | Motion | Game example |
---|---|---|
FPS | Move, Jump, Run, Sit, Shot, Reload, Weapon Change | |
Action | Move, Jump, Run, Attack (Skill) | |
Arcade | Move, Jump, Function | |
The motions can be broadly categorized into movement and action. In this study, ML is defined on the framework that the left hand performs movement while the right hand performs actions.
Functional Content was classified into Experience and Creation Content and Teaching and Learning Content. With the recent expansion of the virtual reality market, numerous Experience contents or disaster reaction training contents use NUI. A representative example of lecture content is e-Learning, which is a form of Teaching and Learning Content that provides lecture videos online to overcome the drawbacks of offline education, such as being closed and collective. Table
Motion category by Functional Content genres.
Functional Content | Motion | Example |
---|---|---|
Experience and Creation | Zoom In, Zoom Out, Using Tool (Drawing, Attaching, Cutting, and so forth), Rotation, and so forth | |
Teaching and Learning | Play, Fast Play, Stop, Rewind | |
The Motion Lexicon (ML) consists of the motions that were analysed within the Amusement Content and Functional Content using the hand and finger API. To define ML, the hand and finger APIs reflecting the features of the genres were analysed. Tables
Amusement Content Motion Lexicon.
Hand | ML | Image | Motion principle |
---|---|---|---|
Left Hand | Go (G) | | (i) Dynamic and static classification condition |
Left Hand | Stop (ST) | | (i) Dynamic and static classification condition |
Left Hand | Left Direction (LD) | | (i) Dynamic and static classification condition |
Left Hand | Right Direction (RD) | | (i) Dynamic and static classification condition |
Left Hand | Jump (J) | | (i) Dynamic and static classification condition |
Left Hand | Sit (S) | | (i) Dynamic and static classification condition |
Left Hand | Roll (R) | | (i) Dynamic and static classification condition |
Right Hand | Shot (sh) | | (i) Dynamic and static classification condition |
Right Hand | Reload (r) | | (i) Dynamic and static classification condition |
Right Hand | Weapon Change (ch) | | (i) Dynamic and static classification condition |
Right Hand | Kick (k) | | (i) Dynamic and static classification condition |
Right Hand | Punch (p) | | (i) Dynamic and static classification condition |
Right Hand | Function1 (F1) | | (i) Dynamic and static classification condition |
Right Hand | Function2 (F2) | | (i) Dynamic and static classification condition |
Right Hand | Drift (D) | | (i) Dynamic and static classification condition |
Right Hand | Booster (B) | | (i) Dynamic and static classification condition |
Functional Content Motion Lexicon.
Functional Content | ML | Image | Motion principle |
---|---|---|---|
Experience and Creation | Zoom In (ZI) | | (i) Dynamic and static classification condition |
Experience and Creation | Zoom Out (ZO) | | |
Experience and Creation | Rotation (RO) | | (i) Dynamic and static classification condition |
Teaching and Learning | Play (p) | | (i) Dynamic and static classification condition |
Teaching and Learning | Fast Play (fp) | | (i) Dynamic and static classification condition |
Teaching and Learning | Rewind (rw) | | (i) Dynamic and static classification condition |
Teaching and Learning | Pause (PA) | | (i) Dynamic and static classification condition |
For Functional Content, “Zoom In,” “Rotation,” “Play,” “Pause,” and “Rewind” were representative motions. Given a very wide range of motions, not all of them can be defined. Therefore, motions that were commonly used have been defined.
Table
Table
For “Fast Play” and “Rewind,” the movement was the same, but with different hands and directions. For “Fast Play,” the left hand was moved to the right side of the
Structural Motion Grammar is a combination and grammaticalization of the aforementioned ML that has been defined. It consists of ML (Motion Lexicon), AML (Adverb and ML), CML (Compound ML), and ACML (Adverb and Compound ML). Figure
Structure of the motion grammar tree.
An ML can constitute an SMG by itself, such as the “Rotation” motion of the Experience and Creation Content. In this case, SMG leads directly to ML on the schematic tree. The process of the “Rotation” motion is marked with arrows within the schematic tree.
AML is a combination of an ML and an Adverb, where the Adverb serves as a part of speech that supports the ML. For instance, when the left hand, which is responsible for movement, is recognized as the ML “Go” and is simultaneously coupled with the Adverb “Right Direction,” the SMG “Right Direction + Go” is expressed. Within the schematic tree, SMG leads to AML, which then leads to ML/Adverb. The process of the “Right Direction + Go” motion is marked with arrows on the schematic tree.
CML is used when two motions are executed as a combination of two MLs. For example, the left hand, which is responsible for movement, is recognized as the ML “Go,” and at the same time the right hand expresses the ML “Shot.” On the schematic tree, SMG leads to CML, which then leads to ML/ML. The process of the “Go + Shot” motion is marked with arrows on the schematic tree.
ACML is a combination of two MLs and an Adverb and is used when three motions are executed. For instance, the left hand, which is responsible for movement, is recognized as the ML “Go” and simultaneously as the Adverb “Left Direction,” while the right hand expresses the ML “Shot.” On the schematic tree, SMG leads to ACML, which then leads to ML/ML/Adverb. The process of “Left Direction + Go + Shot” is marked with arrows on the schematic tree. In this study, the vocabulary combinations based on this schematic tree were used to define the SMG. The red dotted arrows indicate the recognition procedures that satisfy SMG. For example in Figure
SMG can be formally represented as a context-free grammar (CFG), since it can be broken down into a set of production rules. SMG describes all possible motions that can be built from the given formal motions. SMG is formally defined as follows.
SMG := AML ∥ CML ∥ ACML ∥ ML,
AML := ML + Adverb,
CML := ML + ML,
ACML := ML + ML + Adverb,
ML := G ∥ ST ∥ LD ∥ RD ∥ J ∥ S ∥ R ∥ sh ∥ r ∥ ch ∥ k ∥ p ∥ F1 ∥ F2 ∥ D ∥ B ∥ ZI ∥ ZO ∥ RO ∥ p ∥ fp ∥ rw ∥ PA,
Adverb := LD ∥ RD.
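The production rules above can be illustrated with a minimal sketch that sorts a set of simultaneously recognized symbols into ML, AML, CML, or ACML. It is written in Python for brevity and is not the recognition module of this study. The function name `classify_smg` is hypothetical, and the tie-breaking rule (a direction symbol is read as an Adverb whenever it accompanies another motion) is an assumption, because the grammar as written lets LD and RD act as either an ML or an Adverb.

```python
# Terminal symbols copied from the ML and Adverb production rules.
ML = {"G", "ST", "LD", "RD", "J", "S", "R", "sh", "r", "ch", "k", "p",
      "F1", "F2", "D", "B", "ZI", "ZO", "RO", "fp", "rw", "PA"}
ADVERB = {"LD", "RD"}


def classify_smg(symbols):
    """Return 'ML', 'AML', 'CML', 'ACML', or None for a group of symbols."""
    symbols = list(symbols)
    if not symbols or any(s not in ML for s in symbols):
        return None  # empty input or a symbol outside the lexicon
    if len(symbols) == 1:
        return "ML"  # a single motion is a grammar item by itself
    # Assumption: LD/RD count as Adverbs when combined with other motions.
    adverb_count = sum(1 for s in symbols if s in ADVERB)
    if len(symbols) == 2:
        if adverb_count == 1:
            return "AML"   # ML + Adverb
        if adverb_count == 0:
            return "CML"   # ML + ML
    if len(symbols) == 3 and adverb_count == 1:
        return "ACML"      # ML + ML + Adverb
    return None            # no production rule matches


if __name__ == "__main__":
    for group in (["RO"], ["RD", "G"], ["G", "sh"], ["LD", "G", "sh"]):
        print(group, "->", classify_smg(group))
```

The four example groups correspond to the “Rotation,” “Right Direction + Go,” “Go + Shot,” and “Left Direction + Go + Shot” cases discussed for the schematic tree.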
Given that an SMG is a combination of MLs representing a motion made with one hand or two hands, the SMG is first decomposed into one of four children (ML, AML, CML, or ACML), and the recognition steps for each ML are then carried out. Recognition refers to the conditions, expressed through the API recognizable on the Leap Motion device, that define the motions. Leap Motion, which is a form of NUI, provides various APIs [
The algorithms SMG (mr_SMG), ML or ML_Adverb (mr_ML and mr_ML_Adverb), Hand Count (HC), Hand Feature (HF), Finger Count (FC), and Finger Feature (FF) are defined as shown in Figure
Motion recognition algorithm.
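As a rough illustration of how a rule-based recognizer can proceed from Hand Count to per-hand checks without any training step, consider the following Python sketch. It is not the algorithm of this study: the `Hand` record, the rule tables, and the speed threshold are all hypothetical placeholders, the real conditions are the motion principles of the lexicon, and the Finger Feature (FF) stage is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hand:
    """Hypothetical per-frame hand snapshot; not a Leap Motion SDK type."""
    is_left: bool
    extended_fingers: int                      # input to Finger Count (FC)
    palm_velocity: Tuple[float, float, float]  # input to Hand Feature (HF)


# Placeholder rule tables keyed by extended-finger count (assumed values).
LEFT_RULES = {5: "G", 0: "ST"}
RIGHT_RULES = {1: "sh", 0: "p"}
SPEED_THRESHOLD = 200.0  # assumed sideways speed that counts as dynamic


def recognize_ml(hand: Hand) -> Optional[str]:
    """FC step: map one hand to an ML symbol through a table lookup."""
    rules = LEFT_RULES if hand.is_left else RIGHT_RULES
    return rules.get(hand.extended_fingers)


def recognize_adverb(hand: Hand) -> Optional[str]:
    """HF step: read a direction Adverb from the left hand's sideways motion."""
    vx = hand.palm_velocity[0]
    if not hand.is_left or abs(vx) < SPEED_THRESHOLD:
        return None
    return "RD" if vx > 0 else "LD"


def recognize_frame(hands: List[Hand]) -> List[str]:
    """HC step first, then per-hand lookups; returns the recognized symbols."""
    if not 1 <= len(hands) <= 2:  # Hand Count: only one or two hands are valid
        return []
    symbols = []
    for hand in sorted(hands, key=lambda h: not h.is_left):  # left hand first
        adverb = recognize_adverb(hand)
        if adverb:
            symbols.append(adverb)
        ml = recognize_ml(hand)
        if ml:
            symbols.append(ml)
    return symbols
```

Because every step is a direct condition on values read from the tracking data, the symbols for a frame are available immediately, which is the property that allows real-time recognition without feature extraction or training.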
The following experimental environment was set up to evaluate the SMG proposed in this study. The desktop used for the simulation ran the Windows 7 64-bit OS with a GeForce GTX 770 graphics card. For software, Unity version 5.3.1f1 was installed, and a Leap Motion controller was used as the hardware. The motion recognition module was developed in C#.
For the test, the Amusement and Functional Content motions defined in this study and organized into the grammar (ours) were compared with the Leap Motion SVM [
A motion was also regarded as recognized when its correlation output value exceeded 0.7. The recognition rates are illustrated in a graph. Table
Comparison of motion recognition rate.
ML | Ours (%) | SVM (%) |
---|---|---|
Go (G) | 100 | 100 |
Right Direction (RD) | 80 | 100 |
Roll (R) | 80 | 70 |
Weapon Change (ch) | 90 | 85 |
Function1 (F1) | 85 | 95 |
Booster (B) | 90 | 80 |
Rotation (RO) | 80 | 70 |
Rewind (rw) | 85 | 80 |
Stop (ST) | 90 | 80 |
Jump (J) | 90 | 70 |
Shot (sh) | 80 | 85 |
Kick (k) | 90 | 70 |
Function2 (F2) | 95 | 100 |
Zoom In (ZI) | 85 | 80 |
Play (p) | 80 | 70 |
Pause (PA) | 90 | 80 |
Left Direction (LD) | 80 | 100 |
Sit (S) | 90 | 70 |
Reload (r) | 100 | 100 |
Punch (p) | 80 | 70 |
Drift (D) | 85 | 75 |
Zoom Out (ZO) | 85 | 80 |
Fast Play (fp) | 85 | 80 |
Compared to SVM, the recognition rate of ours for dynamic motions moving towards
Figure
Comparison of motion recognition rate between ours and SVM [
Figure
Figure
The results of ours and SVM show that the recognition rate depends on several factors: static motions that are distinguished by the number of fingers, dynamic motions that move in a specific direction, and combinations of motions. Combinations of two motions comprise static motion + static motion, static motion + dynamic motion, and dynamic motion + dynamic motion. Adding a further static or dynamic motion to these yields a combination of three motions. Overall, the results show that ours had a higher recognition rate than SVM across these diverse factors.
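The overall comparison can be checked directly against the single-motion recognition rates listed in the comparison table. The short Python sketch below copies those rates and computes the mean rate of each method together with a per-motion tally; the variable and function names are illustrative only, and the combined-motion results shown in the figures are not included.

```python
# Recognition rates (%) copied from the comparison table: (ours, SVM).
RATES = {
    "Go": (100, 100), "Right Direction": (80, 100), "Roll": (80, 70),
    "Weapon Change": (90, 85), "Function1": (85, 95), "Booster": (90, 80),
    "Rotation": (80, 70), "Rewind": (85, 80), "Stop": (90, 80),
    "Jump": (90, 70), "Shot": (80, 85), "Kick": (90, 70),
    "Function2": (95, 100), "Zoom In": (85, 80), "Play": (80, 70),
    "Pause": (90, 80), "Left Direction": (80, 100), "Sit": (90, 70),
    "Reload": (100, 100), "Punch": (80, 70), "Drift": (85, 75),
    "Zoom Out": (85, 80), "Fast Play": (85, 80),
}


def summarize(rates):
    """Return mean rates and a tally of motions where each method is higher."""
    ours = [o for o, _ in rates.values()]
    svm = [s for _, s in rates.values()]
    return {
        "mean_ours": sum(ours) / len(ours),
        "mean_svm": sum(svm) / len(svm),
        "ours_higher": sum(1 for o, s in rates.values() if o > s),
        "ties": sum(1 for o, s in rates.values() if o == s),
        "svm_higher": sum(1 for o, s in rates.values() if o < s),
    }


if __name__ == "__main__":
    for key, value in summarize(RATES).items():
        print(key, round(value, 1))
```

For the table as printed, ours is higher for 16 of the 23 motions, the two methods tie on 2, and SVM is higher on 5, including both direction motions.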
The defined grammar was applied to the Amusement Content to carry out the test. Table
Application of defined actions to content.
Grammar | Image | Content environment |
---|---|---|
ML | | |
AML | | |
CML | | |
ACML | | |
To verify the research results qualitatively, a survey of 104 people was conducted. The subjects were given a comprehensive explanation of the need for the SMG and its defined concept and were shown a simulation video of the research results. The participants were between the age groups of 20 and 30 and had prior knowledge of and experience with games and Leap Motion.
Google Survey was used so that subjects could access the survey conveniently and take sufficient time, with the aim of obtaining more objective responses. The questionnaire and simulation videos were uploaded to the Google program. The questionnaire comprises four questions, and the detailed contents are shown in Table
User assessment questions.
Questions | Question contents |
---|---|
Q1 | Were the contents appropriately classified according to genre? |
Q2 | Are the class structures of the defined language appropriate in terms of linguistics? |
Q3 | Can the defined motion language be used for the contents? |
Q4 | Are the motions defined in Clay Art useful? |
Demonstration of the Amusement Content.
This study defined the SMG that can be applied to the universal content environment of NUI Leap Motion, moving beyond the conventional content interface environment. Owing to the variation of the defined motions among contents in the content market environment, the contents were classified and the SMG that can be applied universally has been defined. The contents were classified into Amusement and Functional contents.
These two types of content were classified into subcategories: Action, FPS, Simulation-Racing/Flight, and Arcade for Amusement Content, and Experience and Creation as well as Teaching and Learning for Functional Content. The representative motions commonly used in the classified contents were investigated, and ML was defined using the Leap Motion API. For the Amusement Content genres, the motions were divided between the right and left hands and defined. For Experience and Creation and for Teaching and Learning, motions that users can use comfortably were defined. The motions divided between the right and left hands were combined into three types of grammar, while a single ML was also allowed to be a grammar item by itself. The SMG was completed by applying the four types of grammar to all content motions.
Comparisons with a conventional mouse, a keyboard, and other traditional interaction methods are considered to be of sufficient value. It is also necessary to analyse the time required to learn how to interact. This series of experiments should be added as a future study. Further studies that build a database of more comprehensive gestures will be considered for future works as well.
The authors declare that they have no conflicts of interest.
This research was partially funded by National Research Foundation (NRF) (no. 2015R1D1A1A01057725).