Nowadays, every user has dozens of Apps on her/his mobile device. As time passes, it becomes increasingly difficult simply to find the desired App among those installed on the mobile device and launch it. In spite of several attempts to address this challenge, no good solution for this growing problem has yet been found. In this paper, we examine the idea of classifying Apps based on their functionality to allow users to find and access them easily. An App’s functionality is elicited from the textual description of the App, as retrieved from the App store, and enriched by content from additional online publications. The functional representation is then classified into classes and mapped into personal categories using functional hierarchical compact taxonomies that can be easily presented on the small screen of a mobile device. Experiments and user studies demonstrated the potential of this approach.
Smartphones have changed (and continue to change) our lives beyond recognition, as they combine the capabilities of cellular phones with advanced computing and communication technologies. By using various Apps, we are able to perform a variety of tasks using our mobile devices more easily than ever before. While the term “mobile App” is not clearly defined, it commonly refers to computer software developed to be executed on smartphones, tablet computers, automobile interfaces, gaming consoles, and other mobile devices, which usually run a specific operating system. Apps are designed to help users perform specific tasks with a specific purpose: reading and writing emails, calendar management, gaming, factory automation, banking, order tracking, and much more [
According to “The Next Web” magazine [
In 2011, Perez [
One of the biggest problems in App search is that, in many cases, the name of the App gives no indication of its functionality (e.g., Instagram, Zedge, and Shazam), and therefore association-based or word-based searches do not help. Pash [
Self-organizer. Functional classification of Apps.
This work examines the possibility of using machine learning techniques for the organization of Apps based on their functionality. The main questions we faced were as follows: Is it possible to perform Apps categorization based on their functionality? Is it possible to elicit the functionality of Apps from their textual description? Is it possible to provide personalized categorization of installed Apps? Is it possible to present the results in an intuitive visualization?
In our previous work [
The first challenge was already discussed by Böhmer and Krüger [
The rest of this paper is organized as follows. In Section
Our work deals with resolving the problem of finding Apps installed on a personal mobile device. As such, in this section, solutions for App search are surveyed in general, and we discuss what can be learned from PC desktop organization solutions. Finally, we examine the solutions in the App stores regarding the specific problem.
Shanahan and Glover [
The problem we are trying to resolve is somewhat similar to that of desktop file icon arrangement. Hence, before we describe our idea and exploratory study, we examine the long-standing problem of computer desktop organization. Many desktops are cluttered with icons and the users search for documents or folders among tens of icons. During the last two decades, many studies have focused on this subject. Barreau and Nardi [
There are several commercial tools that address the desktop icon arrangement problem. Fences [
Most of the solutions proposed thus far (e.g., Fences) are based on the differences between the file types, as indicated by the file extensions, such as “.pdf,” “.doc,” “.docx,” “.ppt,” and “.pptx.” On a mobile device, however, there is no such option, since all Apps are of the same type (.apk). Hence, it is necessary to understand better what lies behind any App in order to find better and more compact methods of ordering them on the mobile device screen.
There are quite a few Apps developed specifically to support Apps organization, most of which require manual organization of Apps into folders or through label definition (e.g., [
Given the above, the research questions that need to be addressed, each followed by our hypothesis, are as follows:
(1) How can the correct Apps descriptions that describe App functionality be found? H1: Textual descriptions of the Apps, when elicited from the App store and Apps reviews, can provide a precise description of their functionality.
(2) How can the functionality be extracted from the App description? H2: By extracting verb phrases from Apps descriptions, we can identify Apps functionality.
(3) How can Apps be categorized accurately based on their functionality representation? H3: Apps functional descriptions can be used to classify them correctly.
(4) How can Apps categories be organized and presented given the screen size of the mobile device? H4: Semantic relations between Apps’ functional descriptions allow hierarchical organization of Apps in scalable abstraction, where the user zooms in and out of relevant functional categories.
We attempted to address the above challenges in a four-step exploratory study, where each step depends on the results of the previous step, as can be seen in Figure
Functionality-based App organization process. Rectangle: steps; arrows: inputs/outputs.
The first step in our study was to explore the possibility of enriching Apps description by Web mining, in order to determine whether we can find useful information about Apps that can be used to identify their functionality (H1). The challenge was to identify and extract relevant information about Apps’ functionality from the vast amount of information available on the Web. For this purpose, relevant textual descriptions of Apps were needed. The goal of this step was to provide sufficient raw material (representative features) for the next step. Since Apps descriptions are relatively short (as described below and presented in Table
Statistics for Apps descriptions from App store and after enrichment.
Measure (number of words) | App store: total words per raw description | App store: bag of words (BoW) | App store: bag of functionality words (BoFW) | After enrichment: total words per raw description | After enrichment: bag of words (BoW) | After enrichment: bag of functionality words (BoFW) |
---|---|---|---|---|---|---|
Max. | 1962 | 1393 | 239 | 3179 | 2019 | 298 |
Min. | 9 | 3 | 1 | 13 | 4 | 1 |
Average | 290.85 | 122.34 | 34.21 | 379.62 | 153.01 | 42.43 |
Median | 224 | 92 | 25 | 291 | 115 | 31 |
St. dev. | 231.45 | 103.79 | 29.94 | 306.87 | 132.25 | 37.58 |
Apps Web mining step.
For the purpose of this study, a database of Apps was built using the Androidrank Website [
Initial analysis of the Apps descriptions revealed that they are relatively short and vary widely in length (Table
App descriptions word distribution.
Using the Bing search API, we searched the Web for textual descriptions of Apps and collected about 50 results for each App. The query pattern was “[App Name] + ‘App’,” for example, “Kayak App.” The results were filtered: duplicate results were removed, as were results from Google Play (which had already been mined). For Apps that had fewer than 50,000 downloads, an additional check was performed to ensure that each search result described the desired App: we verified that the result contained a link to Google Play with the name of the App for which we searched.
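The filtering rules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the `SearchResult` record, the `filter_results` helper, and the way a Google Play link is matched against the App name are hypothetical. Only the rules themselves (drop duplicates, drop Google Play pages, apply an extra check below 50,000 downloads) come from the text.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

LOW_DOWNLOAD_THRESHOLD = 50_000  # from the text: extra check below this count


@dataclass
class SearchResult:
    """Hypothetical record for one Web search hit."""
    url: str
    text: str
    links: list = field(default_factory=list)  # outgoing links found on the page


def is_google_play(url):
    """True when the URL points at the Google Play site."""
    return urlparse(url).netloc.lower() == "play.google.com"


def links_to_app_on_google_play(result, app_name):
    """Assumed check: some outgoing Google Play link contains the App name."""
    needle = app_name.lower().replace(" ", "")
    return any(
        is_google_play(link) and needle in link.lower()
        for link in result.links
    )


def filter_results(results, app_name, downloads):
    """Drop duplicates and Google Play pages; verify hits for rarely downloaded Apps."""
    seen, kept = set(), []
    for result in results:
        if result.url in seen or is_google_play(result.url):
            continue
        seen.add(result.url)
        if downloads < LOW_DOWNLOAD_THRESHOLD and not links_to_app_on_google_play(result, app_name):
            continue
        kept.append(result)
    return kept
```

Matching the App name against the link text is a deliberately crude stand-in; a real check would likely compare the linked store page title with the searched App.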
Figure
App enriched description example (Kayak App). Red rectangles indicate verbs; blue rectangles indicate nouns.
The right side of Table
Comparison of App store description (blue bars) and description after enrichment (green bars). The graphs present (a) total words per description, (b) BoW, and (c) BoFW datasets.
The second step of the study examined the possibility of extracting the functionality of the App from its textual description (H2). We obtained the enriched App descriptions as an input from the Web mining step. As can be seen in Figure
Eliciting Apps functionality step.
The text preprocessing stage started with classical information retrieval processing: part of speech tagging was performed, stop words were removed, and then stemming was performed. As part of stop word removal, App domain-specific stop words were removed, as were words related to advertising or marketing, directions for operation, technical problems, and version updates (“free,” “service,” “device,” “note,” “best,” “fastest,” “versions,” etc.). The stop word lists were created manually. A histogram of words showed that there are common words whose importance for understanding the functional meaning of the text is low. In some cases, mainly in titles, sequences of connected words appeared. These had to be separated (e.g., “MemoryBooster” and “ToDoList” may be split to obtain “Memory Booster” and “To-Do List”). A simple function that separates connected words was developed; it identifies the start of a new word by the appearance of a capital letter in the string. Textual descriptions not written in English were automatically translated into English by the Bing translation service. This process created a BoW description for every App.
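The capital-letter rule for separating connected words can be sketched in a few lines. This is an illustrative reconstruction rather than the function used in the study; the name `split_connected_words` is hypothetical, and the rule shown (insert a space wherever a lowercase letter is directly followed by an uppercase letter) is one plausible reading of the description.

```python
import re

# A boundary is any position with a lowercase letter before it and an
# uppercase letter after it; the pattern itself matches no characters.
_WORD_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_connected_words(token):
    """Insert a space at each lowercase-to-uppercase transition."""
    return _WORD_BOUNDARY.sub(" ", token)


print(split_connected_words("MemoryBooster"))  # Memory Booster
print(split_connected_words("ToDoList"))       # To Do List
```

Note that this rule alone yields “To Do List”; producing the hyphenated form “To-Do List” would need an additional normalization step that is not shown here.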
For functionality elicitation, we used the enriched BoW representation and extracted the words that indicate functionalities, creating a “bag of functional words” (BoFW) that contained verbs such as “manage,” “explore,” and “track” (see Table
Functionality mining (most common) from Apps descriptions. The table highlights the App store category vagueness as compared to the BoFW approach. For example, Apps numbers 4, 5, and 6 get different App categories but almost the same functionalities.
App number | App name | App category | Function 1 | Function 2 | Function 3 |
---|---|---|---|---|---|
1 | Super-Bright LED | Productivity | Light | Flash | |
2 | Clean Master | Tools | Boost | Clean | |
3 | TED | Education | Watch | Listen | Learn |
4 | Period Calendar | Health & Fitness | Track | Backup | Restore |
5 | My Days-Period & Ovulation | Lifestyle | Track | Forecast/Predict | Backup |
6 | Menstrual Calendar | Medical | Track | Forecast/Predict | Backup |
7 | Netflix | Entertainment | Watch | Browse | Rate |
8 | Calorie Counter-MyFitnessPal | Health & Fitness | Track | Count | |
9 | MP3 Music Download V6 | Entertainment | Manage | Download | Listen |
10 | Mp3 Search and Download Pro | Music & Audio | Search | Download | |
11 | File Manager | Business | Manage | Explore | Browse |
12 | DU Battery Saver | Productivity | Save | Manage | |
13 | CM Security Antivirus | Tools | Safe | Protect | Lock |
14 | Draw Something | Games | Draw | Guess | |
Functionality words were elicited using a part of speech tagger (POS) [
Functionality elicitation.
Measure/verbs number | POS tagger verbs | Verbnet verbs | BoFW-POS tagger and Verbnet intersection |
---|---|---|---|
Max. | 489 | 874 | 298 |
Min. | 1 | 1 | 1 |
Average | 63.13 | 114.68 | 42.43 |
Median | 51 | 108 | 31 |
St. dev. | 46.82 | 54.67 | 37.58 |
For each of the 6633 BoW descriptions, we applied the POS tagger (second column) and Verbnet (third column) and intersected the outputs of the two tools (fourth column). As can be seen in the table, the average outputs of the POS tagger and Verbnet are 63 and 115 verbs, respectively. On average, 66% of the POS tagger verbs (63) and 36% of the Verbnet verbs (115) constitute the intersection (42). Verbnet examines each word individually, in contrast to the POS tagger, which relies on the word context, and therefore Verbnet recognizes many words as verbs when they are not. For this reason, we decided to use the intersection of the outputs in order to achieve an accurate BoFW representation; however, this issue needs to be studied further. The BoFW representation after enrichment contained 42 functional words on average (Table
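The intersection step can be illustrated with a small sketch. It assumes that the POS tagger output is available as (word, tag) pairs with Penn Treebank style verb tags beginning with "VB", and it uses a tiny hand-made word set as a stand-in for a Verbnet lookup; both the input format and the helper name `functional_words` are assumptions made for illustration.

```python
def functional_words(tagged_tokens, verb_lexicon):
    """Keep words tagged as verbs in context that also appear in the verb lexicon."""
    tagger_verbs = {
        word.lower() for word, tag in tagged_tokens if tag.startswith("VB")
    }
    lexicon_verbs = {verb.lower() for verb in verb_lexicon}
    return tagger_verbs & lexicon_verbs


# Toy input: "Track your flight and book hotels"
tagged = [
    ("Track", "VB"), ("your", "PRP$"), ("flight", "NN"),
    ("and", "CC"), ("book", "VB"), ("hotels", "NNS"),
]
# A context-free lexicon lookup would also accept "flight" here, although it
# is a noun in this sentence; the intersection with the tagger filters it out.
toy_lexicon = {"track", "book", "flight", "light"}

print(sorted(functional_words(tagged, toy_lexicon)))  # ['book', 'track']
```

The toy lexicon shows why the intersection is stricter than either source alone: a word is kept only if it can be a verb in general and is used as a verb in this description.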
The input for this step was the App description representations for each (unclassified) App in the database from the eliciting Apps functionality step. In this step, we investigated how Apps can be categorized based on their representations, considering both BoFW and BoW (H3). It should be noted that we focused mainly on the BoFW representation, as one of the main ideas of the study was to categorize Apps according to their functionality. In order to categorize Apps and to evaluate the categorization, we needed a subset of labeled Apps; hence, we developed an automatic App labeling function. The functionality-based classification step returned a classified Apps database.
An automatic App labeling mechanism may be useful not only for experimentation, but also when thousands or even millions of Apps need to be labeled automatically (as this cannot be done manually in a reasonable time). Hence, we preferred to suggest such a mechanism rather than to manually label tens or even hundreds of Apps only for experimentation. For labeling, we took advantage of the App titles’ unique features, when such existed. Although most Apps deal with specific functions, relatively few titles indicate the App functionality explicitly; we relied on the fact that some Apps do include their exact functionality in their title. The App labeling step (see Figure
Apps labeling step.
In order to label the Apps, we extracted the titles of all the Apps in the database and created one document containing all the title words. We retained the pointers from the title terms to the original Apps they represent. Then, we elicited title functionalities by applying the
In order to group similar words (after synonyms abstraction) into meaningful clusters, we generated a similarity matrix using an integrated similarity function between each pair of prevalent words. The function integrated the extended gloss overlaps method [
Equations (
We combined this measure with the semantic distance measure implementation [
The extended gloss overlaps measure and the semantic distance measure are averaged (at this stage, we decided to weight them equally), as can be seen as function
Every pair of words from the prevalent words was compared using the integrated similarity function. Table
Similarity matrix of prevalent words of Apps titles. The letters e and d indicate the generated clusters (“time,” “flashlight”) according to similar words (
Word | Clock | Call | Flashlight | Phone | Screen | Alarm | Auto | Light | Timer | Monitor | Time | Flash | Voice | Volume | Recorder | Bright | Sound | Torch | Car |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Clock | |||||||||||||||||||
Call | 0.57 | ||||||||||||||||||
Flashlight | 0.64 | 0.57 | |||||||||||||||||
Phone | 0.64 | 1 | 0.64 | ||||||||||||||||
Screen | 0.67 | 0.59 | 0.67 | 0.62 | |||||||||||||||
Alarm | 0.96 | 0.73 | 0.74 | 0.71 | 0.78 | ||||||||||||||
Auto | 0.55 | 0.57 | 0.55 | 0.6 | 0.56 | 0.63 | |||||||||||||
Light | 0.74 | 0.67 | 0.9 | 0.71 | 0.78 | 0.78 | 0.63 | ||||||||||||
Timer | 0.91 | 0.57 | 0.64 | 0.64 | 0.67 | 0.87 | 0.55 | 0.74 | |||||||||||
Monitor | 0.67 | 0.63 | 0.67 | 0.89 | 0.9 | 0.78 | 0.7 | 0.78 | 0.57 | ||||||||||
Time | 1 | 0.4 | 0.13 | 0.55 | 0.18 | 0.5 | 0.13 | 0.6 | 0.9 | 0.18 | |||||||||
Flash | 0.67 | 0.67 | 0.9 | 0.78 | 0.7 | 0.73 | 0.6 | 0.9 | 0.67 | 0.78 | 0.86 | ||||||||
Voice | 0.4 | 0.6 | 0.4 | 0.4 | 0.5 | 0.67 | 0.4 | 0.67 | 0.62 | 0.62 | 0.5 | 0.77 | |||||||
Volume | 0.53 | 0.4 | 0.53 | 0.4 | 0.67 | 0.44 | 0.53 | 0.67 | 0.53 | 0.59 | 0.77 | 0.44 | 0.77 | ||||||
Recorder | 0.63 | 0.67 | 0.63 | 0.82 | 0.67 | 0.75 | 0.63 | 0.75 | 0.71 | 0.82 | 0.18 | 0.82 | 0.62 | 0.62 | |||||
Bright | 0.12 | 0.57 | 0.12 | 0.33 | 0.15 | 0.36 | 0.12 | 1 | 0.2 | 0.15 | 0.6 | 0.93 | 0.62 | 0.67 | 0.2 | ||||
Sound | 0.83 | 0.73 | 0.25 | 1 | 0.33 | 0.67 | 0.25 | 0.67 | 0.33 | 0.33 | 0.83 | 0.73 | 1 | 0.83 | 0.33 | 0.67 | |||
Torch | 0.67 | 0.6 | 1 | 0.74 | 0.7 | 0.78 | 0.57 | 0.95 | 0.67 | 0.74 | 0.2 | 0.86 | 0.6 | 0.56 | 0.78 | 0.12 | 0.27 | ||
Car | 0.6 | 0.63 | 0.6 | 0.67 | 0.62 | 0.71 | 1 | 0.71 | 0.6 | 0.76 | 0.15 | 0.67 | 0.44 | 0.59 | 0.71 | 0.13 | 0.29 | 0.63 | |
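The construction of such a pairwise matrix can be sketched as follows. The sketch assumes that both component measures are available as functions returning a score in [0, 1] where higher means more similar; the component scores below are invented for illustration and are not values from the study. Only the equal weighting of the two measures is taken from the text.

```python
from itertools import combinations


def symmetric_lookup(scores):
    """Wrap a dict keyed by word pairs so that argument order does not matter."""
    table = {frozenset(pair): value for pair, value in scores.items()}
    return lambda w1, w2: table[frozenset((w1, w2))]


def integrated_similarity(w1, w2, gloss_overlap, semantic_score):
    """Equal-weight average of the two component measures."""
    return 0.5 * gloss_overlap(w1, w2) + 0.5 * semantic_score(w1, w2)


def similarity_matrix(words, gloss_overlap, semantic_score):
    """Lower-triangle matrix as a dict keyed by (earlier word, later word)."""
    return {
        (w1, w2): integrated_similarity(w1, w2, gloss_overlap, semantic_score)
        for w1, w2 in combinations(words, 2)
    }


# Hypothetical component scores, for illustration only.
gloss = symmetric_lookup(
    {("clock", "timer"): 0.9, ("clock", "torch"): 0.6, ("timer", "torch"): 0.7}
)
semantic = symmetric_lookup(
    {("clock", "timer"): 0.8, ("clock", "torch"): 0.4, ("timer", "torch"): 0.5}
)
matrix = similarity_matrix(["clock", "timer", "torch"], gloss, semantic)
for pair, score in matrix.items():
    print(pair, round(score, 2))
```

Passing the two measures in as functions keeps the averaging independent of how each measure is computed, so either one could be swapped for a different implementation.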
In order to evaluate this labeling/clustering method, a “gold standard,” that is, a set of manually classified Apps, was built. We took prevalent title words (those that had more than 15 occurrences in Apps titles). Each selected word represents at least 15 Apps that include the word in their title and therefore has a high probability of representing the functionality of the Apps. We extracted 143 title words from the BoW collection and 64 title words from the BoFW collection. From these prevalent title words, we manually built 19 BoW and 12 functional clusters. The clusters were created by 5 experts according to the semantic relatedness between the words (at the experts’ discretion), in order to create large groups (at least 80 Apps in each group) of Apps (descriptions) that have the same or similar functionality. Each expert grouped the words separately. The final cluster set was defined by combining the results of the experts, and in cases where there was no agreement, majority voting was used. This set of clusters served as the “gold standard.” Then, we applied the Apps labeling (titles clustering process) described above to the same title words. This process resulted in 21 BoW and 14 BoFW clusters.
In order to evaluate the process results, we compared the machine-generated clusters with the predefined manual clusters. We labeled each automatically generated cluster by majority voting: the cluster received the label of the manual cluster to which most of its member words belonged. Table
Similar words (semantic) clustering measures.
Measure | Precision | Recall | F-measure | Rand index | Accuracy |
---|---|---|---|---|---|
Result | 0.752 | 0.597 | 0.665 | 0.512 | 0.703 |
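For readers unfamiliar with these measures, the sketch below shows one common way to compute precision, recall, F-measure, and the Rand index for a clustering against a gold standard, by counting pairs of items. Whether the study used exactly this pair-counting formulation is an assumption; the word-to-cluster assignments are toy data.

```python
from itertools import combinations


def pair_counts(gold, predicted):
    """Count item pairs by whether they share a cluster in gold and in predicted."""
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(gold), 2):
        same_gold = gold[a] == gold[b]
        same_pred = predicted[a] == predicted[b]
        if same_gold and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def clustering_scores(gold, predicted):
    """Pair-counting precision, recall, F-measure, and Rand index."""
    tp, fp, fn, tn = pair_counts(gold, predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    rand_index = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, rand_index


# Toy data: manual clusters versus automatically generated cluster ids.
gold = {"clock": "time", "timer": "time", "alarm": "time", "flash": "light", "torch": "light"}
predicted = {"clock": 0, "timer": 0, "alarm": 1, "flash": 1, "torch": 1}
print(clustering_scores(gold, predicted))  # (0.5, 0.5, 0.5, 0.6)
```

In the toy data, "alarm" is placed with the light-related words, which costs two correct pairs (false negatives) and creates two wrong ones (false positives).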
We noted in some cases that the words did not always represent the common meaning according to Wordnet. For example, the method found that Auto and Car are highly similar (see Table
Once we had a subset of labeled clusters (classes), some of the original Apps were assigned to the classes based on the words appearing in their titles (recall that pointers were retained from the title words and their abstraction to the original Apps). Out of 6633 Apps, we succeeded in labeling 2747 Apps automatically into 19 BoW classes and 1003 Apps into 12 BoFW classes (see Figures
Twelve functional classes (a) and 19 BoW classes (b) instance distributions.
An initial feasibility study explored the possibility of functionality-based clustering using short textual descriptions. The study compared three clustering algorithms and showed some difficulties in cluster overlap, cluster labels, and cluster diversity. The main conclusion was that the descriptions of Apps seem to be sufficient for clustering them, but further research is required in order to obtain accurate results [
The classification step is described in Figure
Apps classification step.
The library’s optional parameters are specific preprocessing actions, selection of feature representation, classifier, and normalization. Specific preprocessing parameters are tokenization (unigram or bigram, where bigrams are all the contiguous sequences of two words in the given text), stemming, and stop word removal (“SPR”). The feature representation parameters are binary (if a feature appears at least once in the text, it receives 1; otherwise, it receives 0), word count (the number of times a feature appears in the text), term frequency (“TF,” the number of times a feature appears in the text divided by the total number of features), and
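The feature representations defined above can be illustrated with a short sketch. It covers only the binary, word count, and TF variants as they are defined in the text, plus bigram tokenization. The helper names are hypothetical, and reading “total number of features” as the total number of tokens in the text is an assumption.

```python
from collections import Counter


def bigrams(tokens):
    """All contiguous sequences of two words, joined into single features."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def represent(tokens, mode):
    """Map a token list to a feature dict under the chosen representation."""
    counts = Counter(tokens)
    if mode == "binary":
        return {feature: 1 for feature in counts}
    if mode == "count":
        return dict(counts)
    if mode == "tf":
        total = len(tokens)  # assumption: total number of tokens in the text
        return {feature: n / total for feature, n in counts.items()}
    raise ValueError(f"unknown representation: {mode}")


tokens = ["track", "period", "track", "backup"]
print(represent(tokens, "binary"))  # {'track': 1, 'period': 1, 'backup': 1}
print(represent(tokens, "count"))   # {'track': 2, 'period': 1, 'backup': 1}
print(represent(tokens, "tf"))      # {'track': 0.5, 'period': 0.25, 'backup': 0.25}
print(bigrams(tokens))              # ['track period', 'period track', 'track backup']
```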
We examined all 512 parameter combinations through a “brute-force” parameter search, which we applied to the BoFW and BoW representations of the enriched descriptions. We used a classic cross-validation partitioning technique where each partition divided the labeled class instances (see Figure
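A brute-force search of this kind simply enumerates the Cartesian product of the option lists and keeps the best-scoring combination. The sketch below is illustrative only: the option lists are a reduced, hypothetical grid that does not reproduce the 512 combinations of the study, and `toy_score` is a placeholder for whatever returns the cross-validated accuracy of one configuration.

```python
from itertools import product


def brute_force_search(grid, evaluate):
    """Try every combination in the grid; return the best parameters and score."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:  # ties keep the first combination encountered
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical, reduced grid (2 * 2 * 4 = 16 combinations).
grid = {
    "tokenization": ["unigram", "bigram"],
    "stemming": [True, False],
    "representation": ["binary", "count", "tf", "tfidf"],
}


def toy_score(params):
    """Placeholder for cross-validated accuracy; rewards an arbitrary combination."""
    return (
        (params["representation"] == "tfidf")
        + (params["tokenization"] == "bigram")
        + params["stemming"]
    )


print(brute_force_search(grid, toy_score))
```

An exhaustive search is feasible here because the grid is small; each additional option multiplies the number of classifier training runs.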
The best classification results are presented in Table
Functional and BoW classification calibration by test accuracy comparison (best results only).
Calibration/classifier | Functional: Crammer and Singer | Functional: L1 | Functional: L2 | Functional: Logistic regression | BoW: Crammer and Singer | BoW: L1 | BoW: L2 | BoW: Logistic regression |
---|---|---|---|---|---|---|---|---|
Accuracy (correctly classified, %) | 79.16 | | 78.33 | 77.5 | | 88.42 | | 88.42 |
Feature representation | TF/IDF | TF/IDF | TF/IDF | TF/IDF | TF/IDF | Word count or TF | Word count | Word count |
Feature preprocess | Stemming, bigram | No SPR, stemming, and bigram | No stemming, unigram | Stemming, bigram | No SPR, no stemming, and unigram | Stemming | No SPR, stemming, and unigram | Unigram |
Normalization | With | With | With or without | With | With | With or without | Without | With or without |
After we found, for each of the BoW and BoFW datasets, a classifier and parameter combination that generated accurate predictions on the labeled dataset, we classified the unlabeled Apps that remained in the Apps database (3886 Apps) using these trained classifiers. At the end of the classification process, the entire App database (6633 Apps) was tagged and catalogued into the classes. In order to evaluate the effectiveness of this classification of unlabeled Apps, we randomly sampled 310 tagged Apps that were not labeled in the
Trained classifier (unlabeled Apps) evaluation.
Measure | Precision | Recall | F-measure | Rand index | Accuracy (%) |
---|---|---|---|---|---|
Functionality dataset | 0.825 | 0.847 | 0.826 | 0.741 | 82.5 |
BoW dataset | 0.879 | 0.8994 | 0.889 | 0.798 | 87.88 |
As we have already seen, for
Presentation step.
In order to present the Apps hierarchically and in a logical order, we started with the App classes created in the previous step (12 BoFW labels and 19 BoW labels). However, unlike in our experimental scenario, the problem will be exacerbated in the future as the number of Apps increases, and the algorithm will need to scale up to address the millions of Apps that exist in App stores and probably thousands of classes. Note that in the previous step we used a generic Apps grouping mechanism that is now refined before being used for personalized ordering. Using Wordnet and Verbnet, we searched for abstract labels that combine some of the classes in order to form a hierarchy of classes so that the upper level of the hierarchy fits onto the small screen of the mobile device. Semantic relations such as hypernymy and holonymy allowed us to find the closest common ancestor label (an abstract category label) for groups of class labels. In Verbnet, the verb hierarchy is built in: there are verbs, classes of verbs, and types that group a number of classes (such as Searching => Investigate => Test). These relations enabled us to define a hierarchy of class labels, as illustrated in Figure
BoFW-based (diagram (a)) and topic BoW-based (diagram (b)) taxonomies for Tools and Productivity categories. Each color illustrates the depth level of the class tree. The leaves illustrate class associated Apps.
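The search for a closest common ancestor label can be sketched with a toy hierarchy. The `parent` map below is hand-made for illustration and is not taken from Wordnet or Verbnet; in the approach described above, the parent links would come from relations such as hypernymy. The helper names are hypothetical.

```python
def ancestors(label, parent):
    """Return the label followed by its chain of ancestors, nearest first."""
    chain = [label]
    while label in parent:
        label = parent[label]
        chain.append(label)
    return chain


def closest_common_ancestor(labels, parent):
    """First label on the first chain that also lies on every other chain."""
    first, *rest = labels
    other_chains = [set(ancestors(label, parent)) for label in rest]
    for candidate in ancestors(first, parent):
        if all(candidate in chain for chain in other_chains):
            return candidate
    return None


# Toy hierarchy (child -> parent), invented for illustration.
parent = {
    "flashlight": "light source",
    "torch": "light source",
    "light source": "device",
    "clock": "timepiece",
    "timer": "timepiece",
    "timepiece": "device",
}

print(closest_common_ancestor(["flashlight", "torch"], parent))  # light source
print(closest_common_ancestor(["flashlight", "clock"], parent))  # device
```

Class labels that share a near ancestor can be grouped under it, which reduces the number of entries that must fit on the top level of the hierarchy.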
To evaluate the taxonomy creation method, we used the clusters created during the
Having created generic hierarchical taxonomies, we now turn our attention to the individual user. Every user (owner) has different Apps installed on his/her mobile. Therefore, the generated functionality-based clusters must be adapted for each user. The suggested solution is to consider the Apps installed on the user’s device, classify them according to the generic Apps hierarchical taxonomy, and then trim long and empty branches of the taxonomy such that a balanced tree of categories with Apps is created that can nicely fit onto the small screen of the mobile device. First, only the Apps that are installed on the user’s device are mapped to the generic taxonomy (Apps Filtering) on the lowest (leaf) level of the taxonomy (see Figure
Taxonomy trimming process.
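One possible reading of the trimming idea is sketched below: keep only installed Apps at each node, drop branches that end up empty, and shorten chains in which a category has no Apps of its own and a single sub-category. The node layout, the collapse rule, and the App name "Alarm Clock" are assumptions for illustration; the text does not specify the exact trimming algorithm.

```python
def trim(node, installed):
    """Return a trimmed copy of the taxonomy node, or None if nothing remains.

    A node is a dict with "label", optional "children", and optional "apps".
    """
    apps = [app for app in node.get("apps", []) if app in installed]
    children = []
    for child in node.get("children", []):
        trimmed = trim(child, installed)
        if trimmed is not None:
            children.append(trimmed)
    if not apps and not children:
        return None  # empty branch: remove it
    if not apps and len(children) == 1:
        return children[0]  # long branch: skip the intermediate level
    return {"label": node["label"], "children": children, "apps": apps}


# Toy generic taxonomy.
taxonomy = {
    "label": "Tools",
    "children": [
        {"label": "Light", "children": [
            {"label": "Flashlight", "apps": ["Super-Bright LED"]},
        ]},
        {"label": "Clean", "apps": ["Clean Master"]},
        {"label": "Time", "children": [
            {"label": "Alarm", "apps": ["Alarm Clock"]},
        ]},
    ],
}

personal = trim(taxonomy, installed={"Super-Bright LED", "Clean Master"})
print([child["label"] for child in personal["children"]])  # ['Flashlight', 'Clean']
```

In the example, the "Time" branch disappears because none of its Apps is installed, and the "Light" level is skipped because it only wraps "Flashlight".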
As a result, the Apps are organized into the minimal levels of hierarchy that are needed for compact presentation (depending on the number of installed Apps). Within the top layer, there are a few functional classes (that can fit onto the screen of a mobile device), while the layers below it contain related classes based on the taxonomy. The lower (leaf) level contains the Apps. The results can be presented to the user as follows, according to the default setting of the smartphone: classes are presented in functional folders, and BoW classes appear as frames within each cluster, as illustrated, for example, in Figure
Process output. Upper slides indicate functional folders (BoFW); lower slides indicate “Read” folder Apps clustered by (BoW) topic (the icons are for demonstration only).
Thus far, we have examined the individual steps of our suggested approach and showed that we succeeded, to a certain extent, in addressing each of the individual research challenges. We note that although we showed the feasibility of the idea, further research remains to be done to improve the individual steps. Following the individual experiments described in Section
For the study, we developed two user interfaces, one for Reference Approach (Figure
Experimental environment. Interfaces differ only in the category tree. Reference Approach with six fixed categories (image (b)) and Our Approach with hierarchical categories (image (a)).
We conducted a counterbalanced within-subject experiment comprising four stages. Each stage simulated a device with a specific number of installed Apps; the users needed to find a specific App using the device interface. In the first experimental stage, 40 Apps were installed, then 90 Apps, 140 Apps, and finally 240 Apps. Each stage included Apps that were used in the previous stage as well as new ones. In order to balance the learnability in the transition between the approaches, half of the users started from Reference Approach and transitioned to Our Approach and half performed the transition between the approaches in the opposite order. To decrease the learning effect between approaches, we used two test sets, one for each approach across the stages (480 Apps in total). It should be noted that although there were different Apps in each set, the functionality and the order in which they were presented to the user were identical. The categories generated by the prototype were the same for all participants but depended on the configuration. This design created four different configurations in the experimental platform (Reference Approach
In each one of the four stages (40, 90, 140, and 240), there were six tests in each of which the participants were asked to find a specific App. During each test, data were collected, including time from the start of the test until the App was found and clicks on the category tree. Only after all the stages in one approach were completed did the participant move to the next approach.
Prior to each stage, the participants were asked to explore the category tree and the Apps associated with each category carefully, giving them an opportunity to become familiar with the Apps, because smartphone users are familiar only with the Apps installed on their own device (the same opportunity was given for the two applications, to avoid any bias).
In each test, the participants saw the App’s icon for half a second in order to simulate a real situation where the user knows the desired functionality but only vaguely remembers the icon. During a search, when the user encounters the icon, he/she immediately recognizes it as the one for which he/she was searching.
Forty students of the University of Haifa, mainly from the Information System Department, participated in the study and were paid for their time. All had been smartphone users for at least a year and had tens of Apps on their devices. 55% were male and 45% female; they were aged between 20 and 30, with a mean age of 25.38 (SD = 2.26). 37.5% were iOS users, and the rest (62.5%) used Android-based phones. 67.5% of the participants reported they had fewer than 60 Apps installed on their mobile; 22.5% had 60–100 Apps, 5% had 100–150 Apps, and 5% had more than 150 Apps. 62.5% of the participants reported that their App search method is App paging/scrolling (the operating system default), 10% used a dedicated launcher, and the rest (27.5%) reported another method, such as manually created folders and manual home screen icon arrangement. Figure
Participant’s characteristics.
Hypothesis A: We hypothesized that our suggested method for Apps categorization would be more effective than the alternative method. We assumed the following:
(A.1) It will take less time to find a specific App with Our Approach than with Reference Approach.
(A.2) It will take fewer category clicks with Our Approach than with Reference Approach.
Hypothesis B: We hypothesized that our suggested method would be preferred by the users to the alternative methods. We assumed the following:
(B.1) Users will consider Our Approach a better search method than Reference Approach.
(B.2) Users will want to adopt Our Approach more than other App search methods (Reference Approach, manual folders, home screen icon arrangement, and OS default search method).
To evaluate the efficiency of the approaches, we compared them in terms of the time it took the participants to find the right App and the number of clicks on the category tree. We conducted a multivariate analysis with approach and number of Apps as the independent variables, using SAS statistical software [
Our hypothesis was that our proposed solution would be more efficient than the reference system, and indeed, in all four stages, Our Approach outperformed Reference Approach, as it enabled the users to find the Apps faster, as shown in Table
Comparison of approaches according to the time (ms) in each stage until the user finds the right App by various measures.
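Measure | Reference Approach, 40 Apps | Reference Approach, 90 Apps | Reference Approach, 140 Apps | Reference Approach, 240 Apps | Our Approach, 40 Apps | Our Approach, 90 Apps | Our Approach, 140 Apps | Our Approach, 240 Apps |
---|---|---|---|---|---|---|---|---|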
Mean | 11674.1 | 19131.51 | 24258.4 | 27189.9 | 6997.71 | 11084 | 15967.1 | 20329.5 |
Median | 9359.3 | 16543.1 | 23371.7 | 25098.1 | 5727.1 | 10244 | 13787.9 | 18160.5 |
St. dev. | 6471.32 | 8458.89 | 9917.89 | 10830.1 | 5410.49 | 6372.32 | 8739.02 | 10958.2 |
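As a worked reading of the table, the relative reduction in mean search time at each stage can be computed directly from the reported means. The snippet below only restates the table values; the percentages are derived arithmetic and not additional results from the study.

```python
# Mean time (ms) per stage, copied from the table: (Reference Approach, Our Approach).
mean_time_ms = {
    40: (11674.1, 6997.71),
    90: (19131.51, 11084.0),
    140: (24258.4, 15967.1),
    240: (27189.9, 20329.5),
}


def relative_saving(reference, ours):
    """Fraction of the reference time saved by the other approach."""
    return (reference - ours) / reference


savings = {apps: relative_saving(*pair) for apps, pair in mean_time_ms.items()}
for apps, saving in savings.items():
    print(f"{apps} Apps: {saving:.1%} less time on average")
```

The relative saving computed this way is about 40% and 42% for 40 and 90 Apps and falls to about 34% and 25% for 140 and 240 Apps.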
Differences between approaches in terms of median of time (ms) until the user finds the right App.
We also compared the number of user clicks in the two cases and found that Our Approach outperformed Reference Approach in all stages, as it enabled the users to find the Apps with fewer category clicks, as illustrated in Table
Comparison of approaches according to the category clicks and stages by various measures.
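Measure | Reference Approach, 40 Apps | Reference Approach, 90 Apps | Reference Approach, 140 Apps | Reference Approach, 240 Apps | Our Approach, 40 Apps | Our Approach, 90 Apps | Our Approach, 140 Apps | Our Approach, 240 Apps |
---|---|---|---|---|---|---|---|---|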
Mean | 2.65 | 2.82 | 3.09 | 2.49 | 1.66 | 2.36 | 3.11 | 3.46 |
Median | 2.1 | 2.6 | 3 | 2.5 | 1.4 | 1.8 | 2.2 | 2.2 |
St. dev. | 1.92 | 1.53 | 1.33 | 1.04 | 1.01 | 1.6 | 2.69 | 2.93 |
Differences between approaches in terms of median of category clicks number until the user finds the right App.
With Our Approach, the number of category clicks increased from stage to stage as the hierarchy grew (Our Approach uses hierarchical categories, whereas Reference Approach has fixed categories). Still, the median number of category clicks is lower for Our Approach than for Reference Approach. In each approach, at all stages there is at least one user who found the right App using only one category click (lower whisker). In addition, we can see that the green boxes (quartile 2 + quartile 3) for 140 and 240 Apps are comparatively long for Reference Approach, which indicates that the distribution of users’ category clicks is wider. Further, the fact that the maximum number of clicks is higher than that for Reference Approach may be attributed to learning difficulties of some of the users. To test whether these differences are statistically significant, Glimmix analysis was conducted. The results indicate that the differences between Reference Approach and Our Approach in the 40 Apps and 240 Apps stages are significant (
Regarding the hypothesis that categorization of Apps based on their functionality will be preferred by the users to the currently popular approach, as part of the study, the participants filled in the SUS questionnaire [
Figure
Preference questionnaire distributions (A: Reference Approach; B: Our Approach).
We also analyzed the participants’ answers to open-ended questions about their preference (e.g., what are the pros and cons of each approach?) in order to gain more insight into the strengths and weaknesses of each approach. The participants mentioned that the main drawbacks of Our Approach are the multiple subcategories and the need to adapt quickly to the new configuration of App categories. They also noted advantages of Our Approach, such as significant time saving while searching for Apps and the notion that the category tree is specific, easy to navigate, and better organized than the other search methods they know. Regarding the Reference Approach, the participants mentioned that it is less convenient, mainly because its general folders are inconsistent and confusing.
Figure
Word cloud of open-ended questions. The larger the font size is and the greater the contrast is, the more frequently the participants used the word in their comments about Our Approach (cloud (a)) and the Reference Approach (cloud (b)).
In this paper, we suggested a novel approach for addressing the problem of finding installed Apps. We suggested a hierarchical functionality-based Apps ordering approach to help users organize their Apps and ease the process of finding infrequently used installed Apps. We addressed the problem in a three-stage process, illustrated in Figure
Functionality-based-taxonomy Apps organization process steps.
In the user study, we focused specifically on evaluating how the idea is perceived by users given the classification results (including the App misclassifications that may occur with automatic tools). The user study demonstrated that Our Approach outperformed the Reference Approach with respect to the time it took users to find Apps (Hypothesis (A.1) confirmed) and the number of category clicks (Hypothesis (A.2) confirmed). The participants preferred Our Approach to the Reference Approach (Hypothesis (B.1) confirmed). They mentioned advantages such as intuitive organization of Apps (although the categorization was not perfect), effective time saving while searching for Apps, and efficient category tree navigation. The users wanted to adopt Our Approach more than other App search methods (Hypothesis (B.2) confirmed). It is noteworthy that the users preferred our suggested approach even though its structure was more complex than that of the Reference Approach (a hierarchy of App clusters versus a list of App clusters); they valued the categorization granularity with its levels of abstraction. However, although the users preferred Our Approach to the Reference Approach, they preferred manual folders to all other approaches. This is an interesting finding that may be attributed to the fact that most users nowadays are used to organizing items in folders and that finding an installed App is not yet a major problem. The suggested approach is expected to prove beneficial as the number of installed Apps increases and maintaining manual folders becomes a challenge.
As a result, we can conclude that a more detailed layered taxonomy is more beneficial than a flat and approximate list; however, the automation of Apps labeling and taxonomy building needs to be improved in order to achieve a better performance, in particular when we aim at automating the process completely, as the number of installed Apps will increase. It seems that there is room for further research on all the individual stages. Still, it seems that this approach provides an effective solution to the problem. Moreover, when new Apps are downloaded and installed, they can automatically be placed in the correct location according to the categorization. Apps may also be rearranged periodically as the number of Apps grows and some balancing is needed. In addition, this method allows the classic search of Apps to be enriched by using their name together with a search by functionality. For example, suppose Yelp has a label “travel”; it is very likely that a user can find this app simply by typing “T” in an app search box and obtaining a list of functional categories starting with “T,” including travel, and then can look at the list of Apps under travel, rather than searching the tree from its root.
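The combined search by name and by functionality described above can be sketched as follows. This is only a minimal illustration, not the prototype's implementation: the `APP_LABELS` mapping, the labels themselves, and the function names are hypothetical, and we assume that every installed App already carries one or more functional labels.

```python
from collections import defaultdict

# Hypothetical functional labels for a few installed Apps (illustrative only).
APP_LABELS = {
    "Yelp": ["travel", "food"],
    "Trello": ["task management"],
    "Shazam": ["music recognition"],
    "Waze": ["travel", "navigation"],
}


def build_category_index(app_labels):
    """Map each functional label to the sorted list of Apps carrying it."""
    index = defaultdict(list)
    for app, labels in app_labels.items():
        for label in labels:
            index[label.lower()].append(app)
    return {label: sorted(apps) for label, apps in index.items()}


def search(query, app_labels, index):
    """Return Apps whose name starts with the query, plus the functional
    categories starting with the query and the Apps listed under each."""
    q = query.lower()
    apps = sorted(name for name in app_labels if name.lower().startswith(q))
    categories = {
        label: index[label] for label in sorted(index) if label.startswith(q)
    }
    return apps, categories


index = build_category_index(APP_LABELS)
apps, categories = search("t", APP_LABELS, index)
print(apps)        # Apps matched by name
print(categories)  # functional categories matched by prefix, with their Apps
```

With these sample labels, typing “T” matches Trello by name and also surfaces the travel category, under which Yelp appears even though its name does not start with “T.”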
The category tree on the prototype interface renders the task quite remote from what would actually be happening in terms of interaction on a mobile device. Therefore and following the participants’ comments about the simplistic interface of the prototype, we illustrate a possible visual solution for Apps ordering on a mobile device. Our challenge consists of presenting the App groups to mobile users in a limited screen area, where quick App identification is needed. We illustrate the desired user interface (Figure
Scalable hierarchical level of details: the desired state of the user interface is a hierarchical reflection with several levels (three as an example) containing between 100 and 150 Apps. The number of Apps directly affects the number of levels in the hierarchy. The interface is able to include about 20 icons at every level, and hence, with two levels we may handle up to 400 Apps and with three levels up to 8000 Apps, if all categories are fully populated and evenly distributed.
Personalized: the interface presents the categorization that is specific to the user. The categories are created according to the Apps installed on the user’s device.
Ease of use: the interface allows quick and easy access to the Apps. A child function would be visible when visiting a parent function in order to enhance Apps searching. By tapping a particular function, rather than simply transferring to another window, the next level is presented over the whole screen, while the other functions may be shrunk and moved down to the edges of the screen (not shown in the figure). The fluid behavior of the screen allows simple switching between functions within the screen.
Overlapping: there are Apps having characteristics that match several categories and are therefore assigned to all of them, allowing more than a single way to find these Apps.
Suggested mobile UI solution.
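The capacity figures given for the scalable hierarchy follow from simple arithmetic: with about 20 icons per screen, a hierarchy of d levels can hold at most 20^d Apps. The sketch below reproduces this calculation under the same idealized assumption stated in the text (all categories fully populated and evenly distributed); the function names are ours and are introduced only for illustration.

```python
def capacity(icons_per_level, levels):
    """Maximum number of Apps reachable through a fully populated hierarchy."""
    return icons_per_level ** levels


def levels_needed(num_apps, icons_per_level=20):
    """Smallest number of levels whose capacity covers num_apps."""
    if icons_per_level < 2:
        raise ValueError("icons_per_level must be at least 2")
    levels = 1
    # Integer arithmetic avoids rounding problems of a logarithm-based formula.
    while capacity(icons_per_level, levels) < num_apps:
        levels += 1
    return levels


print(capacity(20, 2))     # two levels
print(capacity(20, 3))     # three levels
print(levels_needed(150))  # levels needed for 150 installed Apps
```

In practice categories are rarely balanced, so the real hierarchy may need more levels than this lower bound suggests.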
While suggesting a novel Apps ordering approach and demonstrating its feasibility, this work has a number of limitations regarding the suggested process for the
Process limitations: the approach developed is far from being complete. The individual parts need to be refined further. It was tested and evaluated on an Apps database containing 6633 Apps extracted from 2 specific categories (Tools and Productivity). There are App stores with over a million Apps. These Apps are divided into approximately thirty general store categories. A larger scale research study may be required to examine the ability of the system to handle a realistic, very large number of Apps across a wide variety of functionalities and topics. The study is limited to two major categories of Apps: “Productivity” and “Tools.” While it is true that these two categories can be the major source of confusion in App search because of their functional ambiguity, it might be interesting to explore how the proposed method performs for other categories. There appear to be other methods of App functionality categorization using methods such as LDA [
User study limitations: the user studies took place in a university laboratory and the users tested Our Approach using a PC with a WinForm interface, although we made an effort to simulate everyday situations. A field research study using our suggested approach should be performed to examine it in a real-life situation.
In this paper, we proposed a solution to the need for a method of finding installed Apps. We suggested and evaluated a method that automatically arranges Apps according to their functionality in order to help users organize their Apps. The approach was found to be technically feasible, although its individual stages need further improvement. The approach was found to be useful by the users and was perceived to be better than current simple approaches. Moreover, it seems that, with the growing numbers of Apps, the need for such effective Apps ordering solutions will increase and more sophisticated solutions, like the one suggested in this paper, will be needed. However, as noted, during the study we encountered quite a few challenges that need to be further investigated, and these include the following:
Generalization: we intend to evaluate this solution for larger numbers of diverse Apps in the virtual store, from different categories (instead of thousands of Apps, millions of Apps, from more categories). It appears that a general solution must rely on an automatic classification process. In future work, we intend to examine the use of predefined taxonomies, such as the Yahoo Open Directory project.
Comparison: comparison with other approaches for functionality-based categorization.
The proposed approach: the proposed approach combines flat clustering with semantic-based abstraction for creating hierarchical clustering. It may be interesting to experiment with using hierarchical clustering from the beginning and evaluate the contribution of the semantics.
Field study using mobile App: in order to evaluate the idea in a real-life environment, a mobile App should be developed that demonstrates the idea and allows experimentation in a real-life setting and with the users’ personal devices.
Attractive and interactive user interface: in accordance with the
In addition to the future research suggested based on our work, there are additional related research directions that may further enhance the use of Apps as the number of Apps on personal devices grows:
Personalized recommendation and prefetch based on context awareness: currently, clusters of installed Apps are presented as a taxonomy that suits the user. We can expand the personalization of the technology such that the clusters will appear in context with the user’s behavior. For example, a location-based service will take into account a user’s physical location next to a shopping center, and the shopping cluster of Apps will pop up as a recommendation.
Developers’ decision-supporting system: when a developer uploads his/her created App and its description to the virtual store and cannot decide which category to assign it to, a system applying Our Approach can suggest the correct classification of the App.
The authors declare that there is no conflict of interests regarding the publication of this paper.
The authors would like to thank Dr. Joel Lanir and Dr. Nitsa Barkan of the University of Haifa for helping in the design of the user study and in the analysis of the results.